I'd do two things:
1. Use an oscilloscope/logic analyzer to view the waveform.
2. Use single bits to indicate different timing parameters, e.g.
Code:
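// Wait a bit more. This should have no effect, but it does.
outb(0x02, 0x378);    // raise Data1: marks the start of the nanosleep
nanosleep(&ts, NULL);
outb(0x00, 0x378);    // drop Data1: nanosleep has returned
outb(0x01, 0x378);    // raise Data0: start of the 1 ms pulse
usleep(1000);
outb(0x00, 0x378);    // drop Data0: end of the pulse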
This assumes that your frequency analyzer is on Data0 of the parallel port (from memory, that's pin 2, but I could be completely wrong on that). Connect a scope to Data1 (which would be pin 3 if pin 2 above is correct) and see what it does and how it varies.
I would also look at the variation on Data0, as I suspect you'll find that _WITH_ the nanosleep there is more variation than without, because it calls schedule(), which means that your current process gives up the rest of its timeslice and only wakes up whenever the scheduler decides it should run again.
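If you don't have a scope to hand, you can get a rough idea of the same effect in software. The sketch below is only an illustration, not something I've run against your setup: it times nanosleep() with clock_gettime(CLOCK_MONOTONIC) and records the shortest and longest time a sleep actually took. The names elapsed_ns, sleep_stats and measure_nanosleep are made up for this example, and it assumes a POSIX system and a request of less than one second.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds elapsed from *start to *end. */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Shortest and longest measured duration of a sleep, in nanoseconds. */
struct sleep_stats {
    int64_t min_ns;
    int64_t max_ns;
};

/*
 * Call nanosleep() `iterations` times with the same request and time
 * each call. request_ns is assumed to be below one second. A sleep
 * interrupted by a signal is not retried, so it may show up as a
 * short sample. With iterations <= 0, min_ns stays at INT64_MAX.
 */
struct sleep_stats measure_nanosleep(long request_ns, int iterations)
{
    struct timespec req = { 0, request_ns };
    struct sleep_stats stats = { INT64_MAX, 0 };

    for (int i = 0; i < iterations; i++) {
        struct timespec before, after;

        clock_gettime(CLOCK_MONOTONIC, &before);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &after);

        int64_t took = elapsed_ns(&before, &after);
        if (took < stats.min_ns)
            stats.min_ns = took;
        if (took > stats.max_ns)
            stats.max_ns = took;
    }
    return stats;
}
```

Call measure_nanosleep() from your own main() with the same interval you put in ts and, say, 1000 iterations, then print min_ns and max_ns. The gap between the two is a rough measure of the scheduling jitter the sleep adds; it won't replace the scope, since the timing calls themselves take some time.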
--
Mats