Is there an efficient way to toss binary data in C++?

**yarft** · 07-23-2007

Below is the code I'm using. I wonder if anyone knows a faster way to do what I'm doing: reading a file byte by byte (piped in from a bash terminal like "./a.out < input > output") and throwing out every other byte, so the final output is half the size of the original.

I find that the way I'm doing it is really slow (it takes about a second to process 1.2 Mbytes). Is there a way to do this faster? I think that reading the file in all at once might speed things up, so that cin doesn't have to be called over and over, but I don't know how to go about doing that.

Code:

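#include <cstdlib>
#include <iostream>

using namespace std;

int main()
{
    int i = 0;   // 0: skip the next byte, 1: output the next byte
    char ch;

    while (cin.get(ch)) {
        if (i == 1) {
            cout << ch;
            i = 0;
        }
        else if (i == 0) {
            i = 1;
        }
        else {
            exit(-1);   // should never happen: i is only ever 0 or 1
        }
    }
}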

**QuestionC** · 07-23-2007

The data from cin is already buffered. You will probably not get a performance improvement by using something other than .get().

**QuestionC** · 07-23-2007

Before optimizing, try to figure out the theoretical limits you are dealing with here.

Code:

#include <iostream>

int main (void) {
   char ch;
   while (std::cin.get(ch)) {
      std::cout << ch;
   }
}

**yarft** · 07-23-2007

I'm pretty sure now that constantly calling cin for every character (8 bits) caused a lot of unnecessary overhead. I read the data files all at once, and they are now processed almost as fast as they are generated. =)

I modified some file-copying code, and here is the result:

Code:

// Copy a file
#include <cstdlib>  // exit
#include <fstream>
using namespace std;

int main () {

  char * buffer;
  char * buffer2;
  long size;
  int i = 0, j = 0;

  ifstream infile ("test.txt",ifstream::binary);
  ofstream outfile ("new.txt",ofstream::binary);

  // get size of file
  infile.seekg(0,ifstream::end);
  size=infile.tellg();
  infile.seekg(0);

  // allocate memory for file content
  buffer = new char [size];
  buffer2 = new char [size/2];

  // read content of infile
  infile.read (buffer,size);

  // filter every other byte
  while(i != size)
    {if((i%2) < 1)
      {buffer2[j] = buffer[i];
      i++; j++;
      }

    else if((i%2) >= 1)
      {i++;}

    else {exit(-1);}
    }

  // write to outfile
  outfile.write (buffer2,size/2);

  // release dynamically-allocated memory
  delete[] buffer;
  delete[] buffer2;

  outfile.close();
  infile.close();
  return 0;
}

**robwhit** · 07-23-2007

Code:

    {buffer2[j] = buffer[i];

You'll overrun buffer2 when size is odd: it holds only size/2 bytes, but the loop copies (size+1)/2 of them.
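
A possible fix is to round the output size up rather than down. This is a rough sketch, not the original poster's code: it swaps the raw new[] buffers for std::vector, and the name keep_even_bytes is invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Keep the bytes at even indices (0, 2, 4, ...), as the loop in the post above does.
std::vector<char> keep_even_bytes(const std::vector<char>& in) {
    std::vector<char> out;
    out.reserve((in.size() + 1) / 2);  // round up so an odd-sized input still fits
    for (std::size_t i = 0; i < in.size(); i += 2) {
        out.push_back(in[i]);
    }
    return out;
}
```

The result can then be written with outfile.write(out.data(), out.size()), so the number of bytes written always matches the number kept.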

**brewbuck** · 07-23-2007

Originally Posted by yarft

I'm pretty sure now that constantly calling cin for every character (8 bits) caused a lot of unnecessary overhead. I read the data files all at once, and they are now processed almost as fast as they are generated. =)

Yep. You've discovered the good old speed-memory tradeoff. You gain speed, but use a lot more RAM.

Buffered I/O layers are a spectrum, with completely unbuffered calls to the OS at one end, and completely buffered input (what you've implemented) at the other. iostreams is somewhere in between. Calling istream::get() repeatedly is much more efficient than calling the OS file reading function repeatedly, but not as efficient as simply reading the whole chunk in one shot.
