I'm planning on switching my engine's rendering system from the traditional multi-pass system to a deferred shading system. I've been reading through various articles on the subject, and I understand the concept, but I'm having trouble figuring out how I would go about implementing it.

I know I have to pack the positions, normals, material data and other info into one or more floating-point textures (the G-buffer). Those textures are then sampled in a full-screen post-process pass, where a fragment shader reconstructs the per-pixel scene data and applies the lighting.
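To make sure I'm picturing that first step correctly, here is a rough sketch of how I imagine the G-buffer itself would be created, assuming OpenGL 3.x with framebuffer objects and GLEW; the choice of three RGBA16F attachments (position, normal, albedo + material) and names like gAlbedoGloss are just my guesses for illustration:

```cpp
// Sketch: create a G-buffer as an FBO with three RGBA16F render targets
// (position, normal, albedo + material). Assumes OpenGL 3.x and GLEW;
// window/context creation is omitted.
#include <GL/glew.h>

GLuint createGBuffer(int width, int height,
                     GLuint& gPosition, GLuint& gNormal, GLuint& gAlbedoGloss)
{
    GLuint fbo;
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);

    GLuint* targets[3] = { &gPosition, &gNormal, &gAlbedoGloss };
    for (int i = 0; i < 3; ++i)
    {
        glGenTextures(1, targets[i]);
        glBindTexture(GL_TEXTURE_2D, *targets[i]);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, width, height, 0,
                     GL_RGBA, GL_FLOAT, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i,
                               GL_TEXTURE_2D, *targets[i], 0);
    }

    // Tell GL that the geometry pass writes to all three attachments at once (MRT).
    GLenum attachments[3] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1,
                              GL_COLOR_ATTACHMENT2 };
    glDrawBuffers(3, attachments);

    // Depth buffer so the geometry pass gets ordinary depth testing.
    GLuint depthRbo;
    glGenRenderbuffers(1, &depthRbo);
    glBindRenderbuffer(GL_RENDERBUFFER, depthRbo);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, width, height);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                              GL_RENDERBUFFER, depthRbo);

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    return fbo;
}
```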

Now, suppose I want to use normal mapping in my final lighting shader. I just have to pack the tangent vector into the G-buffer as well, since I can calculate the bitangent on the GPU during the post-process step. However, I still need to get the per-pixel normals out of the normal maps, and there can be many normal maps in a scene, since most objects are textured differently. Do I make a second rendering pass that renders the scene with only the normal maps applied and then copies the screen into a texture? Would I then use that texture together with the G-buffer textures to perform the lighting on the GPU? My lighting shader also uses gloss maps, so with the method described above I would need three rendering passes before actually doing any lighting. Is that still faster than the num_objects*num_lights passes of the traditional approach?
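For reference, the lighting pass I have in mind would read everything back out of the G-buffer textures roughly like this. This is only a GLSL sketch embedded as a C++ string; the uniform names, the view-space convention, and the assumption that gloss lives in the alpha channel of the albedo target are all my own guesses:

```cpp
// GLSL sketch of the full-screen lighting pass: every per-pixel quantity
// (position, normal, albedo, gloss) is sampled from the G-buffer textures.
const char* lightingFragmentSrc = R"GLSL(
#version 330 core
in vec2 vTexCoord;
out vec4 fragColor;

uniform sampler2D gPosition;    // view-space position
uniform sampler2D gNormal;      // view-space normal
uniform sampler2D gAlbedoGloss; // rgb = diffuse colour, a = gloss factor
uniform vec3 lightPosition;     // view space
uniform vec3 lightColor;

void main()
{
    vec3 position    = texture(gPosition, vTexCoord).rgb;
    vec3 normal      = normalize(texture(gNormal, vTexCoord).rgb);
    vec4 albedoGloss = texture(gAlbedoGloss, vTexCoord);

    vec3 L = normalize(lightPosition - position);
    vec3 V = normalize(-position);   // camera sits at the origin in view space
    vec3 H = normalize(L + V);

    float diffuse  = max(dot(normal, L), 0.0);
    float specular = pow(max(dot(normal, H), 0.0), 32.0) * albedoGloss.a;

    fragColor = vec4(lightColor * (albedoGloss.rgb * diffuse + specular), 1.0);
}
)GLSL";
```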

One of the key advantages of deferred shading is supposed to be that the scene geometry only needs to be rendered once. To achieve this, would I just store the scene info in VBOs (vertex buffer objects) and draw them all in a single geometry pass, roughly like the frame loop sketched below?
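Here is the per-frame flow I'm imagining: one geometry pass over the scene's VBOs into the G-buffer, then a full-screen lighting pass. Again just a C++/OpenGL sketch; Mesh, sceneMeshes, geometryProgram, lightingProgram and fullscreenQuadVao are hypothetical names for things my engine would own:

```cpp
#include <vector>
#include <GL/glew.h>

struct Mesh { GLuint vao; GLsizei indexCount; };

void renderFrame(GLuint gBufferFbo, const std::vector<Mesh>& sceneMeshes,
                 GLuint geometryProgram, GLuint lightingProgram,
                 GLuint fullscreenQuadVao)
{
    // Pass 1: geometry pass - every object is drawn exactly once into the G-buffer.
    glBindFramebuffer(GL_FRAMEBUFFER, gBufferFbo);
    glEnable(GL_DEPTH_TEST);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glUseProgram(geometryProgram);
    for (const Mesh& mesh : sceneMeshes)
    {
        // Per-object uniforms (model matrix, bound material textures) go here.
        glBindVertexArray(mesh.vao);
        glDrawElements(GL_TRIANGLES, mesh.indexCount, GL_UNSIGNED_INT, nullptr);
    }

    // Pass 2: lighting pass - a full-screen quad that samples the G-buffer.
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    glDisable(GL_DEPTH_TEST);
    glClear(GL_COLOR_BUFFER_BIT);
    glUseProgram(lightingProgram);
    // G-buffer textures and light uniforms would be bound here.
    glBindVertexArray(fullscreenQuadVao);
    glDrawArrays(GL_TRIANGLES, 0, 6);
}
```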

I'm a little confused, so hopefully some of you can help clear this up.

Thanks.