1. "Using the macros NX, NY and NZ you incur a couple of normalizations, then a couple more, then a cross product or two... point being, you ain't going to get this for cheap regardless. My method is pretty intuitive though:"
Those aren't macros, just variables. But I deleted this code because D3DXMatrixLookAtLH() already does this for me, it's the same code.

Dot product returns the cosine of the angle between two vectors (when both are normalized)... not the arccos. Sorry.
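To make the cosine point concrete, here is a minimal C++ sketch. The `Vec3` struct and the function names are made up for illustration and are not from anyone's code in this thread. The lengths are divided out, so the inputs don't have to be normalized first.

```cpp
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

// Dot product: sum of the component-wise products.
float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Cosine of the angle between a and b. Dividing by both lengths
// makes this valid for any non-zero vectors, not just unit ones.
float cosAngle(const Vec3& a, const Vec3& b) {
    return dot(a, b) / (std::sqrt(dot(a, a)) * std::sqrt(dot(b, b)));
}

// The angle itself (in radians) needs the extra acos step.
float angleBetween(const Vec3& a, const Vec3& b) {
    float c = cosAngle(a, b);
    // Clamp in case rounding pushed c slightly outside [-1, 1].
    if (c > 1.0f) c = 1.0f;
    if (c < -1.0f) c = -1.0f;
    return std::acos(c);
}
```

If both vectors are already unit length, `dot` alone is the cosine and the square roots can be skipped.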

2. Just as long as you have found something that works.

3. Your algorithm is quicker, Darkness, but it seems as if you have to know which axis to rotate on in order to point towards something. Not that that's a bad thing, you're following the first rule of optimization: don't do more work than you have to.

I came across the algorithm Bubba worked out when dealing with Renderware. Every object in Renderware has a "frame" (their fancy term for transform matrix) which is always dealt with using vector math.

4. How is his method quicker? He is using asinf and acosf along with normalization and other vector operations.

The vector method:

Dot product: 3 FMULs and 3 FADDs per vector
Normalize: 3 FMULs, 2 FADDs and 1 FSQRT for the length, then an FDIV per component (or one reciprocal and 3 more FMULs) to scale, per vector
Cross product: 6 FMULs, 3 FSUBs and 3 FSTs or memory accesses per vector - there may also be some FPU stack overhead associated with this function depending on the usage of the FPU.
Matrix: 16 memory accesses
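For reference, here is roughly what those three operations look like in plain C++. This is only a sketch: `Vec3` and the function names are hypothetical, and the operation counts in the comments are my own tally of the expressions as written, not measured instruction counts.

```cpp
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

// 3 multiplies and 2 adds (3 adds if you accumulate from zero).
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// 6 multiplies, 3 subtracts, 3 stores for the result.
inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// One dot product, 1 square root, 1 divide, then 3 multiplies.
// Assumes v is not the zero vector; there is no guard here.
inline Vec3 normalize(const Vec3& v) {
    float invLen = 1.0f / std::sqrt(dot(v, v));
    return { v.x * invLen, v.y * invLen, v.z * invLen };
}
```

Taking the reciprocal once and multiplying three times trades two divides for two multiplies, which is usually the cheaper way round.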

FMUL might be slow but it is not as slow as asinf and acosf. asinf and acosf both incur 1 square root per function and also require another trig function (if I remember the code for a<cos/sin>f correctly).

Notice that you can also use the GPU as a vector processor since dot product, cross product, and normalization are all supported on the chip and in pixel shader 1.0+. I'm not sure about Cg. So you could compute this for next to no cost at all in CPU cycles.

5. I see 3 normalizations in his code, 1 vector subtract, the trig functions, and these two: RotationVector.InitFromAngles(mPitch,mYaw,0) and RotationVector.TransformDir (&FireVector). I'm not quite sure what he's doing EXACTLY inside those, but to do what he's trying to do I'm sure they're not extremely taxing.

His is slightly faster because he's not doing a lot of vector math, which requires an operation for each dimension (in our case 3). Yeah, that can be sped up using SIMD, but his trig functions can be sped up using table lookups instead of calculations. Correct me if I'm wrong, Darkness, but you are expecting both of these objects to be on the same plane. So you don't even have to worry about a 3rd dimension. The other function works with 3 dimensions and arbitrary points anywhere in space. So this is not to say that A is better than B, or vice versa. I was just pointing out some pros and cons for each.
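To show what the trig route looks like in general, here is a rough C++ sketch that turns a direction into yaw and pitch angles. This is not Darkness's actual code, which isn't shown in this thread. The names `aimAngles` and `YawPitch` are made up, and the axis convention (y up, +z forward, yaw measured from +z towards +x) is an assumption.

```cpp
#include <cmath>

// Hypothetical types, for illustration only.
struct Vec3 { float x, y, z; };
struct YawPitch { float yaw, pitch; };  // radians

// Angles needed to aim from 'from' towards 'to'.
// Assumes y is up and +z is "forward" (yaw 0); 'from' and 'to'
// must not be the same point, or the division below yields NaN.
YawPitch aimAngles(const Vec3& from, const Vec3& to) {
    Vec3 d{ to.x - from.x, to.y - from.y, to.z - from.z };
    float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    YawPitch r;
    r.yaw = std::atan2(d.x, d.z);    // rotation about the up axis
    r.pitch = std::asin(d.y / len);  // elevation above the xz plane
    return r;
}
```

When both objects sit on the same plane (`d.y` is zero) the pitch is always 0, so the square root and `asin` can be dropped and only the one `atan2` remains. That is the case where this approach is at its cheapest.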

6. The InitFromAngles thing is part of something external to test the results from the algorithm; it is not a part of the algorithm itself. NX and NY seem to call normalize, and the other variables (which I thought were macros, but I guess I don't know how #local works) call atan. Not that it matters a ton, because they're both going to be relatively slow any way you cut it.

7. Code:
```
// or you can do this instead:
#local NY=vnormalize(T-L);
#local NX=vnormalize(vcross(NY,z)); // z is down in this example
#local NZ=vcross(NX,NY);
object { Gun matrix < NX.x,NX.y,NX.z, NY.x,NY.y,NY.z, NZ.x,NZ.y,NZ.z, L.x,L.y,L.z > }
```
This is the portion of the POV-Ray code that I converted. The first portion uses another method to get the same result, but atan is oh so slow and I'd never use it. The second method is more vector based and is a lot faster. Sorry for the confusion.

I thought my other posts cleared this up:

zaxis = normal(At - Eye)
xaxis = normal(cross(Up, zaxis))
yaxis = cross(zaxis, xaxis)

xaxis.x yaxis.x zaxis.x 0
xaxis.y yaxis.y zaxis.y 0
xaxis.z yaxis.z zaxis.z 0
-dot(xaxis, Eye) -dot(yaxis, Eye) -dot(zaxis, Eye) 1

That's how Direct3D does it. I fail to see how this is extremely slow.
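Written out as standalone C++, the recipe above comes to something like the following. This is a sketch that follows the listed formulas line by line; it is not the D3DX source, and `Vec3`, `Mat4` and `lookAtLH` are hypothetical names. The matrix is row-major with the translation in the last row, matching the layout shown above.

```cpp
#include <cmath>

// Hypothetical minimal types, for illustration only.
struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };  // row-major, row-vector convention

static Vec3 sub(const Vec3& a, const Vec3& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}
static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}
static Vec3 normalize(const Vec3& v) {
    float invLen = 1.0f / std::sqrt(dot(v, v));
    return { v.x * invLen, v.y * invLen, v.z * invLen };
}

// Left-handed look-at view matrix built from the three axes.
// Assumes eye != at and that up is not parallel to (at - eye).
Mat4 lookAtLH(const Vec3& eye, const Vec3& at, const Vec3& up) {
    Vec3 zaxis = normalize(sub(at, eye));
    Vec3 xaxis = normalize(cross(up, zaxis));
    Vec3 yaxis = cross(zaxis, xaxis);  // already unit length
    Mat4 r = {{
        { xaxis.x, yaxis.x, zaxis.x, 0.0f },
        { xaxis.y, yaxis.y, zaxis.y, 0.0f },
        { xaxis.z, yaxis.z, zaxis.z, 0.0f },
        { -dot(xaxis, eye), -dot(yaxis, eye), -dot(zaxis, eye), 1.0f }
    }};
    return r;
}
```

That totals two normalizations, two cross products and three dot products per call, with no trig functions anywhere.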

8. edit:
"I thought my other posts cleared this up:"
Nope... I don't think either of us really knew what you were talking about.

Maybe it's not slow then.

9. It's not slow, just slower than his. In actual measure, they're both pretty damn quick. You can use both functions many times per frame and have little to no performance effect (in my experience).