Anyone know why tmpfile () only returns about 120 handles before failing on Windows XP and Windows 7?
All operating systems have limits on the number of files that can be open at once...
Congratulations, you found it!
Frankly if you're writing code that requires more than 1 (one) tmpfile... you seriously need to rethink what you're doing.
What does the return/error code from tmpfile say? Can you continue to open non-temporary files? Are these all open simultaneously (why would you need 120 temp files open at once)? Most systems have a limit for the number of total open files on the system as well as a per-process limit. I'm not sure what that is in Windows, but maybe you're bumping up against it. Not sure this is really a C question, so you might get better help in a Windows related forum.
If the file system is full, you might find that the limit is zero.
Don't take any apparent limit you find as being in any way a guarantee that you will always be able to create "that many".
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
If at first you don't succeed, try writing your phone number on the exam paper.
I've written a driver that is multi-threaded, and each user might need about three temporary files for sorting and storing results. I'm asking here because I wasn't sure whether the limit was defined somewhere in the C headers or in the operating system. FOPEN_MAX is about 250, which implies that more than 120 temporary files should be possible. Is the operating system limit alterable? I've not found a reference yet. The return code is 2 [ENOENT], but I don't know why after 120 calls. I've just changed my code to leave the temporary files open, and the driver keeps running, opening other files OK, until it tries to call tmpfile () again.
I know you can alter the OS system-wide and per-process limits in Linux, I would imagine you can in Windows too (you could do this in the DOS days with something like FILES=40 in your config.sys). But there are other files open besides your temp files. There is always stdin, stdout and stderr. In Linux, any network sockets you open count against your file limit as do shared libraries, I believe. I don't know how similar Windows is. In Linux, threads share file descriptors (i.e. they're per-process, not per-thread), but that might differ in Windows, so you may be losing 3 file descriptors to every thread for stdin, stdout and stderr.
A driver for what?
To me, a driver runs in the OS context, and is supposed to do the minimal amount of work necessary before handing off the rest of the work to some other process (possibly in user space). It certainly wouldn't get involved in sorting results and writing to a file system.
Or do you have some other definition of a driver? Perhaps you mean a DLL?
tmpfile (CRT)
Failure can occur if you attempt more than TMP_MAX (see STDIO.H) calls with tmpfile.
I think he means a 'test program' / 'program which uses this piece of code'. I've seen it used in that context on here before, at least.
I wouldn't use tmpfile anyway. [On Windows, and with the MS CRT] It stores the files it creates in the root of the current directory's drive. If the current drive is the one where Windows is installed, then it's going to fail on Vista & 7 unless the app is running elevated.
O_o

> it's going to fail on Vista & 7 unless the app is running elevated
Nice catch.
It will also fail on any version of Windows installed on an "NTFS" partition if the user doesn't have permission to create a file at that level. This, I believe, has been the default since 2000 SP3.
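Given that, the usual workaround is to create the temporary file yourself in a directory you know is writable, rather than wherever the CRT's tmpfile () puts it. Here's a hedged sketch; the function name and naming scheme are invented for illustration. A production version on Windows would use GetTempPath () / GetTempFileName (), and on POSIX mkstemp (), both of which avoid the name race this toy version has, and the static counter would need to be made thread-safe for a multi-threaded driver.

```c
#include <stdio.h>

/* Hypothetical replacement for tmpfile() that puts the file in a
   caller-chosen directory instead of the drive root the MS CRT
   uses.  Illustrative only: the static counter is NOT thread-safe
   and the name scheme is racy between processes. */
FILE *tmpfile_in(const char *dir)
{
    static unsigned seq;
    char path[FILENAME_MAX];

    snprintf(path, sizeof path, "%s/app_tmp_%u.tmp", dir, seq++);

    /* "wb+" matches tmpfile()'s binary update mode; unlike
       tmpfile(), the caller must remove() the file afterwards. */
    return fopen(path, "wb+");
}
```

The trade-off versus tmpfile () is that you lose the automatic delete-on-close, so the caller has to track the path and remove () it.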
Soma
Before everyone gets carried away with solutions that don't use tmpfile (), let me say that this is a complex ODBC driver which allows each of any number of independent web users to simultaneously gather and merge selective data from multiple tables in multiple databases, sort the merged data, and then hold the sorted data in a results file until the user calls it down. That is the purpose of an ODBC driver. If there's another way of doing this without tmpfile (), I'll welcome it; but meanwhile, how do I increase the number of simultaneous files that I can have open? Note that when I've exhausted the number of temporary files I can have open, I can continue to open other files normally.
Currently I can open about 120 temporary files, FOPEN_MAX is 250 and TMP_MAX is 65,535. Full Administrator permissions are available to the driver.
It may be that the driver needs a file server, which should be able to handle a huge number of files open simultaneously, but even there I would need to know what parameters affect the maximum number of temporary files that can be open.
One thought would be to use database techniques here... build a blather file with everything in it, then maintain an index of pointers to where each user's chunk of the file starts... 2 files is a lot better than 200...
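The "one blather file plus index" idea above can be sketched in a few lines: append every user's chunk to a single shared data file and keep an (offset, length) entry per user. The struct and function names here are made up for illustration, and real code would need locking around the append for concurrent users.

```c
#include <stdio.h>

/* One shared data file; each user gets an (offset, length) entry
   instead of a whole temp file of their own. */
typedef struct {
    long offset;   /* where this user's chunk starts in the file */
    long length;   /* how many bytes belong to that user */
} chunk_index;

void append_chunk(FILE *data, chunk_index *idx,
                  const char *buf, long len)
{
    fseek(data, 0, SEEK_END);            /* always append at the end */
    idx->offset = ftell(data);
    idx->length = len;
    fwrite(buf, 1, (size_t)len, data);
    fflush(data);
}

long read_chunk(FILE *data, const chunk_index *idx, char *buf)
{
    fseek(data, idx->offset, SEEK_SET);  /* jump to this user's data */
    return (long)fread(buf, 1, (size_t)idx->length, data);
}
```

With this scheme the open-handle count stays constant no matter how many users there are; the cost is that space from departed users isn't reclaimed until you compact the file.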
> and then hold the sorted data in a results file until the user calls it down.
And how long are you planning to wait before deciding to
- flush the temp file to a real file
- delete it, because they went away or lost the connection
Compared to sending a volume of data over a network connection, the cost of opening/closing a file is pretty minimal.
For each client, all you really need is
- the associated filename
- a fseek() position of where you got to, sending them data.
Buffering internally say 100K of data might be a later optimisation.
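The filename-plus-fseek-position scheme above might look something like this; the struct and function names are invented for the sketch. The file is opened, read from the saved offset, and closed again on every send, so no handle stays open between requests.

```c
#include <stdio.h>

/* Per client we keep only a filename and how far we've streamed. */
typedef struct {
    char filename[FILENAME_MAX];
    long sent;                       /* fseek() position reached so far */
} client_state;

size_t send_next(client_state *c, char *buf, size_t bufsz)
{
    FILE *fp = fopen(c->filename, "rb");
    if (fp == NULL)
        return 0;
    fseek(fp, c->sent, SEEK_SET);    /* resume where we left off */
    size_t n = fread(buf, 1, bufsz, fp);
    c->sent += (long)n;              /* remember for the next call */
    fclose(fp);                      /* handle held only briefly */
    return n;
}
```

A return of 0 means either the client is done or the file vanished; either way you can drop the client_state without leaking a handle.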
Speaking of which, this whole exercise of 1000's of temp files seems like one big exercise in premature optimisation.
Your baseline should be based on opening and closing files on demand. This you profile with real requests from real clients to see exactly where all the bottlenecks are. When you know where a real problem exists, you can address it with a real solution.
As opposed to potentially chasing down a blind alley with this "how many temp files" approach, which is going to prove both difficult and ugly, just to find it doesn't make a bean of difference at all.
O_o
That's not really how "ODBC" applications work in the wild.
They act only as an intermediary API, a standard gateway between a third-party application and a third-party database engine. (Yes, IBM, Microsoft, and many others all offer "turnkey" solutions, but this still applies.) When a request comes in, data can be sorted in one of a few standard ways depending on how the connection is established. The "ODBC" application doesn't do its own database management. It depends on the third-party database engine to do its job and do it well.
Consider that all major database engines requiring an administrative server installation support streaming results and sorting by key columns and rows. Take a given database, say "PostgreSQL" for example: all you need to do for conformance is translate the "ODBC" connection configuration into a compatible "SQL" statement, spawn a process or a thread to connect to "PostgreSQL", and stream those results to the client. They will already be sorted and merged from multiple databases because "PostgreSQL" has done the job it was designed to do.
If you are going to add facilities beyond the usual "ODBC" application specifications, like cross-party merge with sorting, you are going to have to fully implement a database engine, use one of the target database engine connections as an implementation server, or define a much better strategy than using multiple temporary files with contents to be managed. You just aren't going to sort 100,000,000 rows of data joined on 15 keys across three target key columns, for a few dozen client connections from three different database engines, fast enough to get any traction just using a big old file.
One option would be streaming the pre-sorted tables from multiple engines, using serial processing techniques to sort them as the data comes in, a la "MergeSort", while still trusting the individual database engines to do their job streaming and caching.
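The streaming merge suggested above reduces, in the simplest case, to combining two already-sorted runs in one serial pass, the way pre-sorted result streams from two engines could be merged without staging either in a temporary file. Ints stand in for keyed rows here, and the function name is illustrative; a k-way version would keep a small heap of cursors, one per engine.

```c
#include <stddef.h>

/* Merge two sorted runs into `out` in a single pass. */
void merge_sorted(const int *a, size_t na,
                  const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb)                  /* take the smaller head */
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < na) out[k++] = a[i++];         /* drain leftovers */
    while (j < nb) out[k++] = b[j++];
}
```

Because each input is consumed strictly front-to-back, the same loop works when `a` and `b` are network streams read row by row instead of in-memory arrays.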
Soma