What I described is a technique known to work on every multiprocessor/multicore architecture Linux runs on. hell-student specifically excluded everything else.
Feel free to argue about standards and standards compliance with someone else. I'm not interested.