Thread: why is write slow

  1. #1
    Registered User
    Join Date
    Jun 2009
    Posts
    11

    why is write slow

Why is it that when I'm writing from an array, or writing a byte at a time, write() works so slowly? When copying a file the write function is fairly quick, but if I'm manipulating data or anything like that, an average mp3 takes half a minute and 100 MB takes about an hour. Why is that? Does anyone know?

  2. #2
    Making mistakes
    Join Date
    Dec 2008
    Posts
    476
    Because your computer is writing to disk, which can be awfully slow (long I/O waits, etc.). I suggest using fwrite() and FILE *s rather than write() and file descriptors. Writing a byte at a time is, again, a fair bit slower.
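    A minimal sketch of what that looks like (illustration only, not the OP's code; the file name "out.dat" and the data are made up, error checks mostly omitted):
    Code:
    #include <cstdio>

    int main()
    {
        // Pretend this block came out of whatever processing you're doing.
        unsigned char buf[4096];
        for( int i = 0; i < 4096; ++i )
            buf[i] = (unsigned char)( i & 0x7F );

        // Buffered C stdio: the FILE* keeps its own buffer, so one fwrite()
        // call hands over the whole block instead of 4096 trips into write().
        std::FILE *fp = std::fopen( "out.dat", "wb" );
        if( fp == NULL )
            return 1;
        std::fwrite( buf, 1, sizeof( buf ), fp );
        std::fclose( fp );
        return 0;
    }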

  3. #3
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    What does your code look like?
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  4. #4
    Registered User
    Join Date
    Jun 2009
    Posts
    11
    @ cpjust, I don't have one, it happens whenever I use write.

    @ Brafil, I'll look into learning how to use those. Thanks a lot.

  5. #5
    Making mistakes
    Join Date
    Dec 2008
    Posts
    476
    You should be using them anyway. They're much faster and more portable.

  6. #6
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    838
    probably because whatever you're using to crunch other data types is much more efficient than the code you wrote.

  7. #7
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Are you referring to the write() system call? Calling that to write a single byte at a time is extremely inefficient. Each call to write() entails a context switch to kernel space, copying the user byte into a kernel buffer, and a context switch back to user space.

    Do NOT call write() with a single byte at a time. Use an appropriately buffered I/O method, like a std::ostream. Or for C-style I/O, use fwrite().
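    To make that overhead concrete, here's a rough sketch (mine, not from this post; POSIX write() assumed, return values ignored) of the per-byte pattern next to its buffered counterpart:
    Code:
    #include <cstddef>
    #include <cstdio>
    #include <unistd.h>

    // Slow: one system call (one user/kernel round trip) per byte written.
    void write_bytes_raw( int fd, const unsigned char *data, std::size_t n )
    {
        for( std::size_t i = 0; i < n; ++i )
            write( fd, &data[i], 1 );
    }

    // Buffered: fputc() just drops each byte into the FILE's internal buffer;
    // the library only issues a large write() when that buffer fills up.
    void write_bytes_stdio( std::FILE *fp, const unsigned char *data, std::size_t n )
    {
        for( std::size_t i = 0; i < n; ++i )
            std::fputc( data[i], fp );
    }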
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  8. #8
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Quote Originally Posted by Brafil View Post
    Cuz your computer is writing to disk, which can be awfully slow.
    My hard disk can do 20MB/s. 100MB is therefore 5 seconds -- 100MB in an hour is not typical performance.

    Quote Originally Posted by brewbuck View Post
    Do NOT call write() with a single byte at a time. Use an appropriately buffered I/O method, like a std::ostream. Or for C-style I/O, use fwrite().
    Even then, fwrite() is not a magic fix-all. Re-evaluate what you're doing, and see if it is possible to request larger chunks of data. fwrite()ing 16KB / call is still going to outperform fwrite()ing 1 byte per call.
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  9. #9
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Cactus_Hugger View Post
    fwrite()ing 16KB / call is still going to outperform fwrite()ing 1 byte per call.
    Buffering the data yourself in order to pass it in chunks to fwrite() is simply adding a layer of buffering on top of a layer of buffering. The entire purpose of buffered I/O is to allow you to read or write as little or as much data as is convenient, without worrying about the system call overhead.

    If your code naturally generates data in large chunks, then write it in large chunks. If it naturally generates a byte at a time, then write a byte at a time. You should trust the standard I/O buffering mechanisms, not second-guess them.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  10. #10
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by brewbuck View Post
    Buffering the data yourself in order to pass it in chunks to fwrite() is simply adding a layer of buffering on top of a layer of buffering. The entire purpose of buffered I/O is to allow you to read or write as little or as much data as is convenient, without worrying about the system call overhead.

    If your code naturally generates data in large chunks, then write it in large chunks. If it naturally generates a byte at a time, then write a byte at a time. You should trust the standard I/O buffering mechanisms, not second-guess them.
    I'm sure Cactus_Hugger was referring to the overhead of repeatedly calling fwrite() thousands of times rather than once, but without testing it, I'm not sure how big of a performance hit that would be.
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by cpjust View Post
    I'm sure Cactus_Hugger was referring to the overhead of repeatedly calling fwrite() thousands of times rather than once, but without testing it, I'm not sure how big of a performance hit that would be.
    Okay, I admit I was mistaken here. I'm actually shocked at how badly the C++ ostream buffering SUCKS. Here's what I used to test:

    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    
    class BufferBuffer
    {
    public:
        BufferBuffer( std::ostream &stream, std::size_t bufSize )
    	: mStream( stream ),
    	  mBufSize( bufSize ),
    	  mBuf( new char[ mBufSize ] ),
    	  mBufPos( 0 )
        {
        }
    
        ~BufferBuffer()
        {
    	// Flush any remaining data
    	if( mBufPos > 0 )
    	    mStream.write( mBuf, mBufPos );
    	delete[] mBuf;
        }
    
        void Put( char ch )
        {
    	// Flush buffer if full
    	if( mBufPos == mBufSize )
    	{
    	    mStream.write( mBuf, mBufPos );
    	    mBufPos = 0;
    	}
    	mBuf[ mBufPos++ ] = ch;
        }
    
    private:
        std::ostream &mStream;
        std::size_t   mBufSize;
        char         *mBuf;
        std::size_t   mBufPos;
    };
    
    int main( int argc, char **argv )
    {
        std::ofstream outfile( "./tmp.dat" );
        std::string mode = "direct";
    
        if( argc > 1 )
    	mode = argv[ 1 ];
    
        if( mode == "direct" )
        {
    	for( int i = 0; i < 20000000; ++i )
    	{
    	    char ch = i & 0x7F;
    	    outfile.put( ch );
    	}
        }
        else if( mode == "buffered" )
        {
    	// Use a 16K buffer
    	BufferBuffer buffer( outfile, 16384 );
    	for( int i = 0; i < 20000000; ++i )
    	{
    	    char ch = i & 0x7F;
    	    buffer.Put( ch );
    	}
        }
        else
        {
    	std::cerr << "Mode must be 'buffered' or 'direct'" << std::endl;
    	return -1;
        }
    
        return 0;
    }
    And here's the output:

    Code:
    scott@scott-intel-mini-ubuntu:/tmp$ time ./bufferbench direct
    
    real    0m0.779s
    user    0m0.724s
    sys     0m0.052s
    
    scott@scott-intel-mini-ubuntu:/tmp$ time ./bufferbench buffered
    
    real    0m0.149s
    user    0m0.056s
    sys     0m0.088s
    I still think that buffering on top of buffering is stupid. The real problem here is that the GNU C++ library's stream buffering blows, apparently.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  12. #12
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    I was a bit ambiguous, though... both. brewbuck raises an interesting point, one that I hadn't thought of. However, I've always lived by the advice of handing the I/O layer everything you have when writing, in as few calls as possible, and reading as much as possible per call. What I was trying to get at with the "re-evaluate" is that data very rarely comes in 1-byte chunks. An integer is 4 bytes, a string is however long that string is, etc. Often the size of a larger structure is fixed, and I can I/O the whole thing in one large chunk. For example, a file header is 20 bytes - that's a 20-byte I/O - and then that header reveals 2MB worth of data, which is another 2MB I/O. Very rarely do I find myself reading/writing 1-byte chunks.
    Moreover, calling fwrite() or fread() fewer times means fewer error checks on the file (you do check your errors, don't you? ;-) ), and unless you're in a tight loop I particularly appreciate that benefit. It did sound like the OP was in a tight loop of some sort, but he mentions an array - is it possible to I/O that entire array all at once? Maybe, maybe not - hence "re-evaluate".
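    To illustrate that "one header read, one payload read" pattern (my sketch; the 20-byte header, its layout, and the file name "data.bin" are invented for the example):
    Code:
    #include <cstdio>
    #include <cstdlib>
    #include <cstdint>

    int main()
    {
        std::FILE *fp = std::fopen( "data.bin", "rb" );
        if( !fp )
            return 1;

        // One 20-byte read for the whole header rather than 20 one-byte reads.
        unsigned char header[20];
        if( std::fread( header, 1, sizeof( header ), fp ) != sizeof( header ) )
            return 1;

        // Pretend bytes 0-3 of the header hold the payload size, little-endian.
        std::uint32_t payloadSize =   (std::uint32_t)header[0]
                                    | ((std::uint32_t)header[1] << 8)
                                    | ((std::uint32_t)header[2] << 16)
                                    | ((std::uint32_t)header[3] << 24);

        // ...then one big read for the payload itself.
        unsigned char *payload = (unsigned char *)std::malloc( payloadSize );
        if( payload && std::fread( payload, 1, payloadSize, fp ) == payloadSize )
        {
            // process the payload here
        }

        std::free( payload );
        std::fclose( fp );
        return 0;
    }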

    However, I'll also contend that fread/write'ing more per call is still faster:
    Code:
    $ gcc -o fwrite fwrite.c 
    $ ./fwrite -1
    1GB took 15.091726 seconds.
    $ ./fwrite -2 1024
    1GB took 4.674032 seconds.
    $ cat fwrite.c
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <stdint.h>
    #include <stdlib.h>
    
    double now()
    {
    	struct timeval tv = {0};
    	gettimeofday(&tv, NULL);
    	return tv.tv_sec + tv.tv_usec / 1000000.0;
    }
    
    void output_time(uint64_t count, double time)
    {
    	printf(
    		"\r\x1b[2K%f, %fMB/s",
    		count / 1024.0 / 1024.0,
    		count / 1024.0 / 1024.0 / time);
    	fflush(stdout);
    }
    
    int main(int argc, char *argv[])
    {
    	FILE *fp;
    	int c, i, method;
    	unsigned char *buf;
    	size_t bufsize;
    	uint64_t count = 0, stop = 1 << 30;
    	double start, end;
    
    	if(argc < 2)
    	{
    		fprintf(stderr, "Fail.\n");
    		return 1;
    	}
    
    	if(strcmp(argv[1], "-1") == 0)
    		method = 1;
    	else if(strcmp(argv[1], "-2") == 0)
    	{
    		method = 2;
    		bufsize = atoi(argv[2]);
    		// yes, I know, we leak this...
    		buf = malloc(bufsize);
    	}
    	else
    	{
    		fprintf(stderr, "Fail.\n");
    		return 1;
    	}
    
    	fp = fopen("/dev/zero", "rb");
    	start = now();
    	while(count < stop)
    	{
    //		if(next_count <= count)
    //		{
    //			output_time(count, now() - start);
    //			next_count = count + (1 << 24);
    //		}
    		if(method == 1)
    		{
    			c = fgetc(fp);
    			++count;
    		}
    		else if(method == 2)
    		{
    			fread(buf, bufsize, 1, fp);
    			for(i = 0; i < bufsize; ++i)
    			{
    				c = buf[i];
    				++count;
    			}
    		}
    	}
    	end = now();
    	printf("1GB took %f seconds.\n", end - start);
    	return 1;
    }
    There's a big grain of salt in here though - even though the single-byte fgetc()s were slower, they still I/O'd data at 60MB/s, which, for me, is faster than my disk. (The large-chunk reads were ~200MB/s.)
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  13. #13
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    >> The real problem here is that the GNU C++ library's stream buffering blows, apparently.

    Heh.

    I think the "canonical" default buffer size is either 512 or 1024, which might explain the less-than-stellar performance. You might get some improvement using some_ostream.rdbuf()->pubsetbuf (or setvbuf, if using C). I haven't actually tried either of them out, though, so I'm not really sure how much of a difference it would make.
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  14. #14
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Quote Originally Posted by Sebastiani View Post
    buffer size is either 512 or 1024
    My custom program above was using a 1024-byte buffer, so I'm not sure that's it. But I'd say the point is still academic.
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  15. #15
    Making mistakes
    Join Date
    Dec 2008
    Posts
    476
    The bigger the buffer, the greater the speed (and the memory consumption). And fwrite() really is much faster, since it saves the bytes for later and then calls write() once when the buffer is full. A system call is a very expensive thing. But I think somebody has mentioned that before.
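    If you want to play with that trade-off, setvbuf() lets you choose the stdio buffer size yourself (a sketch; the file name "out.dat" and the 64K size are arbitrary):
    Code:
    #include <cstdio>

    int main()
    {
        std::FILE *fp = std::fopen( "out.dat", "wb" );
        if( !fp )
            return 1;

        // Request a 64K fully-buffered stream instead of the default size.
        // setvbuf() must be called before any other operation on the stream,
        // and the buffer has to outlive the stream, hence 'static'.
        static char buf[ 64 * 1024 ];
        std::setvbuf( fp, buf, _IOFBF, sizeof( buf ) );

        // Each fputc() now just fills the buffer; the underlying write()
        // happens only when all 64K are full (or at fflush/fclose).
        for( long i = 0; i < 20000000L; ++i )
            std::fputc( (int)( i & 0x7F ), fp );

        std::fclose( fp );
        return 0;
    }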
