Hi, I started writing some inline asm in C++ so I can optimize the drawing functions of my own graphics engine. I managed to learn enough asm to get a nested loop of pixel access and drawing fully working, and I can actually see the resulting image in a window. Rendering got faster and I'm happy about that. However, there's something I don't really understand: why did it work?

To speed up my nested loops, I decided to keep the most-used variables in registers for the whole loop, to cut down on RAM accesses. By "why did it work", I mean: since operating systems are multithreaded and every program needs to read and write registers, how could my algorithm work properly without any of those registers suddenly having their contents changed by another program?
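
To make the kind of loop in question concrete, here is a minimal sketch in plain C++ rather than "_asm". It is not the actual engine code: the name fill_gradient, the 32-bit pixel layout and the gradient itself are assumptions made up for illustration. The hot values (row pointer, counters, colour component) are locals, which a compiler will typically keep in registers for the duration of the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fills a width x height buffer of 32-bit pixels
// with a simple gradient (layout 0xAARRGGBB is an assumption).
void fill_gradient(std::vector<std::uint32_t>& pixels,
                   std::size_t width, std::size_t height) {
    // The buffer must be large enough for every pixel we write.
    assert(pixels.size() >= width * height);

    std::uint32_t* row = pixels.data();  // hot pointer, held in a local
    for (std::size_t y = 0; y < height; ++y) {
        // Computed once per row instead of once per pixel.
        const std::uint32_t green = static_cast<std::uint32_t>(y & 0xFF) << 8;
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint32_t blue = static_cast<std::uint32_t>(x & 0xFF);
            row[x] = 0xFF000000u | green | blue;
        }
        row += width;  // advance to the next row
    }
}
```

Calling it on a std::vector<std::uint32_t> of size width * height fills the whole buffer in one pass.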

Is it possible that while the code in an "_asm" block in C++ is running, all other programs are stopped from executing, at least until the "_asm" code is done? I don't know much about multithreaded programming, but I know there are atomics, mutexes and other concepts I don't really understand that prevent programs from accessing shared memory in the wrong order.
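
One way to probe this worry is to run the same register-heavy loop on several threads at once and check every result: if one thread could overwrite the registers of another while it runs, some of the sums would come out wrong. The following is only a sketch of such an experiment, assuming std::thread is available; sum_to and threads_agree are hypothetical names, and an optimizing compiler may simplify the loop.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: a tight loop whose counter and accumulator are
// locals, so they will typically live in registers.
std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        acc += i;
    }
    return acc;
}

// Runs sum_to(n) on thread_count threads concurrently and returns true
// if every thread produced the closed-form result n * (n + 1) / 2.
bool threads_agree(unsigned thread_count, std::uint64_t n) {
    // Keep n small enough that n * (n + 1) cannot overflow 64 bits.
    assert(n < (1ull << 32));

    std::vector<std::uint64_t> results(thread_count, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        // Each thread writes only its own slot, so no locking is needed.
        workers.emplace_back([&results, t, n] { results[t] = sum_to(n); });
    }
    for (auto& w : workers) {
        w.join();
    }

    const std::uint64_t expected = n * (n + 1) / 2;
    for (std::uint64_t r : results) {
        if (r != expected) {
            return false;
        }
    }
    return true;
}
```

For example, threads_agree(8, 1000000) runs eight copies of the loop side by side and reports whether all of them agree with the expected sum.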

I hope I'm being clear.