When you get my source, use the DXEngine.CPP and DXEngine.h files. Create a new project and change the screen mode to 320x200 with 256 colors (8 bits per pixel). That code already retrieves a pointer to the surface representing the screen and a pointer to the backbuffer, so do all of your drawing through the backbuffer pointer. You must then flip (copy) the backbuffer to the surface so that the frame shows up on screen. Here is the code in assembly. I forgot the actual names of my buffers, so just substitute them in where appropriate.
Copies the 64000-byte BackBuffer to the Surface, 32 bits (one DWORD) at a time.
Code:
asm {
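push edi
push esi
mov esi,[BackBuffer] //source: flat 32-bit pointer, no segment load needed
mov edi,[Surface] //destination: flat 32-bit pointer
mov ecx,03E80h //16000 DWORDs in 64000 bytes
cld //make sure movsd copies forward
rep movsd
pop esi
pop edi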
}
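
If you would rather not use inline assembly, the same copy can be sketched in plain C++. This is only a rough equivalent, not what DXEngine.CPP actually contains: the function name CopyBackBufferToSurface, the constants, and the surfacePitch parameter are all made up for illustration. It assumes the backbuffer is a tightly packed 320x200 block of bytes, and it allows for the locked surface having rows that are further apart than 320 bytes, which can happen with DirectDraw surfaces. When the pitch is exactly 320 it does one 64000-byte copy, which is what the rep movsd does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed mode: 320x200 at 8 bits per pixel (one byte per pixel).
const int kWidth = 320;
const int kHeight = 200;

// Hypothetical helper: copies a tightly packed 64000-byte backbuffer to a
// destination whose rows start surfacePitch bytes apart.
void CopyBackBufferToSurface(const std::uint8_t* backBuffer,
                             std::uint8_t* surface,
                             std::ptrdiff_t surfacePitch)
{
    assert(surfacePitch >= kWidth);

    if (surfacePitch == kWidth) {
        // Rows are contiguous: a single 64000-byte copy.
        std::memcpy(surface, backBuffer, kWidth * kHeight);
        return;
    }

    // Rows are padded: copy one 320-byte scanline at a time.
    for (int y = 0; y < kHeight; ++y) {
        std::memcpy(surface + y * surfacePitch,
                    backBuffer + y * kWidth,
                    kWidth);
    }
}
```

Call it once per frame after you finish drawing, passing your own buffer pointers and whatever pitch the surface reports when you lock it.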