If you are having problems with your program not being on the processor enough, why not use renice to give it a lower nice value (i.e. a higher priority)?
I'm not sure if that would have any useful effect on this kind of program. All it's doing is moving files. Seems like a highly I/O bound operation (depending on the filesystem obviously).
I think part of the problem could be that the FTP server is receiving (as he said) up to 50 files per second. Those have to be written to disk at the same time the files are being moved, which might lead to some disk thrashing in itself.
If the problem is disk thrashing, one thing to try is just adding some more RAM. This will get you a bigger block cache, hopefully reducing the thrashing. Or, if the files are small enough, have the FTP server stick them on a ramdisk instead of a real filesystem, and have the moving process copy them to hard storage (while organizing them however it wants to).
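A minimal sketch of the second idea, assuming the FTP server has been pointed at a tmpfs/ramdisk staging directory and a separate process drains it to hard storage (the function and directory names here are hypothetical, not from the thread):

```python
import os
import shutil

def drain(ram_dir, store_dir):
    """Move every regular file from the ramdisk staging area to hard storage.

    shutil.move falls back to copy-then-delete when the source and
    destination are on different filesystems, which is exactly the case
    between a tmpfs mount and a disk-backed filesystem.
    """
    os.makedirs(store_dir, exist_ok=True)
    for name in os.listdir(ram_dir):
        src = os.path.join(ram_dir, name)
        if os.path.isfile(src):
            shutil.move(src, os.path.join(store_dir, name))
```

The draining process is then free to organize `store_dir` however it wants (subdirectories by date, hash, etc.) without the FTP writes competing for the same spindle.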
Hello, after a few days of testing, what I've experienced seems to be exactly what Salem suggested in his post:
The whole problem is that rename speed is not linear but quadratic in the number of files in the directory. Moving 1000 fps is easy with fewer than 5,000 files in the directory, but when I get to 50,000 it easily drops to 500 fps, and things get nastier above 100,000 files.
http://www.redhat.com/archives/rhl-l.../msg03266.html
http://linuxgazette.net/102/piszcz.html
If what they're saying is true, then 25K files in a list is going to be really expensive when it comes to removing a single file from the directory file list. I don't know if you can guess the order, but you might be able to manipulate the order in which you delete files in your favour. Also, blowing away the entire directory in one hit may be more efficient than removing each file individually.
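One way to try the "one hit" approach from a cleanup process is to rename the directory aside, recreate it empty so new files can keep arriving, and then delete the old tree. This is just a sketch under that assumption (the `.old` suffix and function name are illustrative), and whether it actually beats per-file unlinks in place is something to benchmark on the target filesystem:

```python
import os
import shutil

def clear_dir(path):
    """Empty a directory in one sweep: rename it aside, recreate it,
    then remove the old tree. Renaming first means writers see an
    empty directory immediately instead of one being slowly drained."""
    tmp = path + ".old"   # illustrative suffix; must be on the same filesystem
    os.rename(path, tmp)
    os.makedirs(path)
    shutil.rmtree(tmp)    # may be cheaper than unlinking files in-place
```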
zacs7's idea looks a lot better IMO. By processing the files when there are fewer in the directory, you minimise the amount of extra work in manipulating directory file lists.
I would suggest further research on "benchmarking filesystems".
All these tests were made on an ext3 filesystem; the next thing I'm going to try is benchmarking with reiserfs and XFS.
Thanks for all your great suggestions!
Marc.