Thread: Streaming file input problem

  1. #1
    Registered User
    Join Date
    May 2005
    Posts
    22

    Streaming file input problem

    What's the easiest way to stream unsigned short ints from a binary file into an array of unsigned short ints?
    I expected the language to be able to understand how to stream from an ifstream into the array but it appears
    to need delimiters between each unsigned short int in the file. Unfortunately, my "real" input file won't have delimiters.

    I was doing something like this:

    Code:
       unsigned short int bytesRead[MAX_VALS];
       ifstream inputFile(INPUT_FILE_NAME, ios::in | ios::binary | filebuf::sh_none);
    
       while(!dataFile.eof() && bytesRead <= MAX_VALS)
       {
          inputFile >> values[bytesRead++];
       }

  2. #2
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    You should be able to do this:

    Code:
    unsigned int counter = 0;
    while( counter < MAX_VALS && inputFile.read((char*)(&bytesRead[counter]),sizeof(unsigned short)) ) counter++;
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  3. #3
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    You have several problems with your code:

    1) The >> operator does formatted (text) input: it skips leading whitespace (spaces, tabs, newlines) and then parses characters as a number until it hits one that can't be part of the number. It never copies raw bytes, so it is inappropriate for your needs.

    2) bytesRead[MAX_VALS] is out of bounds. When you declare an array like this:

    int myArray[10];

    the length of the array is 10, but the index positions run from 0-9, so myArray[10] is out of bounds, and reading or writing it is undefined behavior. In your loop control, you have this:

    <= MAX_VALS

    so you are allowing the index of the array to equal MAX_VALS, which is out of bounds.

    3) This doesn't make any sense:

    while(..... bytesRead <= MAX_VALS)

    You declared bytesRead like this:

    unsigned short int bytesRead[MAX_VALS];

    So, you are trying to compare an array name, which is actually an address, to the int MAX_VALS. That means you have a type mismatch, and that produces an error. You can't compare array names to ints.

    4) There is also a problem with the other part of your loop control:

    while(!dataFile.eof() .....)

    Reading from a file can fail for other reasons besides encountering eof. So, you want to make your read statement part of the loop control. If the read statement fails for any reason, then the read statement will evaluate to false and end the loop. So, the general form for reading from files will look like this:

    Code:
    while( inputFile>>someVar )
    {
    	...
    }
    
    or
    
    while( inputFile.getline(buffer, bufferSize) )
    {
    	....
    }
    
    or
    
    while( inputFile.read(buffer, numBytes) )
    {
    	...
    }
    The return value from each of those read statements is a stream object, and if there is anything wrong with the stream that will prevent further reading from it, the stream object will evaluate to false causing the loop to end.
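    A minimal sketch of that pattern applied to the original question, assuming the data really is raw unsigned shorts in the machine's own byte order. The names readShorts and readShortsFromBytes are hypothetical; the second one only exists so the loop can be tried on an in-memory string instead of a real file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads up to maxVals raw unsigned shorts into out and returns how many were read.
std::size_t readShorts(std::istream& in, unsigned short* out, std::size_t maxVals)
{
    assert(out != nullptr);
    std::size_t counter = 0;
    // The read is the loop condition: it fails at eof or on any other stream
    // error, and the bounds check runs first so out is never overrun.
    while (counter < maxVals &&
           in.read(reinterpret_cast<char*>(&out[counter]), sizeof out[counter]))
    {
        ++counter;
    }
    return counter;
}

// Hypothetical helper: runs the same loop over an in-memory byte string.
std::size_t readShortsFromBytes(const std::string& bytes,
                                unsigned short* out, std::size_t maxVals)
{
    std::istringstream in(bytes);
    return readShorts(in, out, maxVals);
}
```

    With a real file you would pass an ifstream opened with ios::binary to readShorts. A trailing partial value (an odd byte at the end) makes the last read fail, so it is simply not counted.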
    Last edited by 7stud; 07-27-2005 at 12:12 PM.

  4. #4
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    What's the easiest way to stream unsigned short ints from a binary file
    As I understand it, a file that is accessed in binary mode does not have unsigned short ints in it. All the program sees is a series of bytes. If those bytes are text, you have to convert the characters to a number. For instance, if the file contains the text 23, then you have to come up with a way to read the characters '2' and '3' from the file and convert them into the number 23. Not an easy task.

    If I understand it correctly, hk_mp5kpdw's solution first involves writing the number to the file using an identical cast. When writing to the file, the cast says to go to the memory address of the variable and treat the computer's internal binary representation of the number as a string, and write that string to the file. For instance if you were writing the unsigned short int 2 to the file, the computer's internal binary representation of 2 would look like this:

    0000 0000 0000 0010

    The cast says to treat that as a string and write it to the file.

    Then, when you want to read that back from the file, you read the proper number of bytes, and take that string and use the same cast to write the string into the memory address of another unsigned short int variable. After doing that, your computer just sees the binary representation of the number 2 at the memory address for the variable.
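    The round trip described above can be sketched roughly like this. It is only an illustration: toRawBytes and fromRawBytes are made-up names, and string streams stand in for the file so the bytes can be inspected directly.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Writes the in-memory bytes of value to a stream, with no text conversion.
std::string toRawBytes(unsigned short value)
{
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return out.str();
}

// Copies the same bytes back into the memory of another unsigned short.
unsigned short fromRawBytes(const std::string& bytes)
{
    assert(bytes.size() == sizeof(unsigned short));
    unsigned short value = 0;
    std::istringstream in(bytes);
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return value;
}
```

    Note that toRawBytes(12) is not the text "12"; it is sizeof(unsigned short) bytes holding the binary representation, in whatever byte order the machine uses.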

    That's my understanding of what is happening. The bottom line is that using binary mode is fraught with perils, and from what I can tell most experienced people on this forum don't recommend using binary mode.
    Last edited by 7stud; 07-27-2005 at 03:35 PM.

  5. #5
    Registered User
    Join Date
    May 2005
    Posts
    22

    endian issue?

    I think I see where you're coming from. I used hk_mp5kpdw's approach and I am at least getting data in now. My problem is that it appears to be reverse endian from what it should be (i.e. big-endian instead of little-endian).

    I am using the following program to create my input file...
    Code:
    #include <fstream>
    #include <iostream>
    
    using namespace std;
    
    int main(int argc, char* argv[])
    {
       unsigned short int num;
    
       ofstream outputFile("input.bin", ios::trunc);
       for(num = 100; num < 200; num++)
       {
          outputFile << num;
          cout << "added " << num << "...\n";
       }
       return 0;
    }
    and the following to read it into my program and write the values to the console...
    Code:
       while( counter < MAX_VALS && dataFile.read((char*)(&values[counter]),sizeof(VALUE_TYPE)) )
       {
          counter++;
          cout << "#vals = " << counter - 1 << "  value read = " << (VALUE_TYPE)values[counter - 1] << endl;
       }
    My output looks like this...
    #vals = 0 value read = 12337
    #vals = 1 value read = 12592
    #vals = 2 value read = 12592
    #vals = 3 value read = 12337
    #vals = 4 value read = 12594
    #vals = 5 value read = 13104
    #vals = 6 value read = 12337
    #vals = 7 value read = 12596
    #vals = 8 value read = 13616
    #vals = 9 value read = 12337

    If you convert 12337 from decimal to hex, you get 3031 (ascii for 01). If you convert the next value, 12592, you get 3130 (ascii for 10). The digits for the first value in my input file should be 100, and if you flip the bytes here (change endian order) you get 10 01; the first three characters are my value of 100.
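    To check that arithmetic, here is a small sketch that pairs up the characters of the file the way a little-endian machine fills a 16-bit variable. It assumes an ASCII character set, and pairsAsLittleEndian is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Treats text as a run of 2-byte values, with the first byte of each pair
// becoming the low byte (little-endian order).
std::vector<unsigned> pairsAsLittleEndian(const std::string& text)
{
    assert(text.size() % 2 == 0);  // whole 2-byte pairs only
    std::vector<unsigned> values;
    for (std::size_t i = 0; i + 1 < text.size(); i += 2)
    {
        unsigned low  = static_cast<unsigned char>(text[i]);
        unsigned high = static_cast<unsigned char>(text[i + 1]);
        values.push_back(low | (high << 8));
    }
    return values;
}
```

    Feeding it "100101102103" (the first four numbers written with <<) gives 12337, 12592, 12592, 12337, 12594, 13104, which matches the first six lines of the output above.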

  6. #6
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Since what you are reading is ascii (versus binary), you would need to know how many bytes each int occupies in the file to reconstruct it. Unfortunately, looking at your output, there appear to be no spaces (20 hex) between the ints, so unless you know how wide each number is, you can't reconstruct it: how would you know where one int ends and the next begins?

    If each number is a fixed number of ascii characters, then you're ok. Read it one char at a time, then convert the ascii to a digit, and combine all digits.
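    A rough sketch of that fixed-width approach, assuming the input contains nothing but ASCII digits and every number uses exactly the same number of digits. parseFixedWidth is a hypothetical name, not something from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits undelimited ASCII digits into numbers of exactly width digits each.
// A trailing group shorter than width is ignored.
std::vector<unsigned short> parseFixedWidth(const std::string& digits, std::size_t width)
{
    assert(width > 0);
    std::vector<unsigned short> values;
    for (std::size_t pos = 0; pos + width <= digits.size(); pos += width)
    {
        unsigned value = 0;
        for (std::size_t i = 0; i < width; ++i)
        {
            // '0'..'9' are contiguous, so subtracting '0' yields the digit's value.
            value = value * 10 + static_cast<unsigned>(digits[pos + i] - '0');
        }
        values.push_back(static_cast<unsigned short>(value));
    }
    return values;
}
```

    For the test file in this thread every number from 100 to 199 has three digits, so parseFixedWidth(contents, 3) would recover them. It stops working as soon as the numbers vary in width.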

  7. #7
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    If you do something like this:

    outFile<<12;

    and then open the file in a text editor, what do you see? You should see "12". However, a text editor assumes everything written in the file is an ascii code and it converts the ascii codes to their corresponding characters for display. That means if you see "12", what is written to the file has to be the ascii codes for the characters "1" and "2". The ascii codes for the characters "1" and "2" are 49 and 50, and in binary format the codes look like:
    Code:
      "1"       "2"
    00110001 00110010  (there wouldn't be a space between them in your file)
    However, what you want to do is write the unsigned short int 12 to the file, which looks like this in binary format:

    0000000000001100

    You can't do that with the << operator. The << operator converts numbers to characters.

    With your code, you end up reading the ascii codes for "1" and "2" into your unsigned short int variable:

    0011000100110010

    which is the number 12,594. However, there is something I don't quite grasp about the process because when I run the following code it displays 12,849 instead. Maybe someone else can comment on why that is.
    Code:
    #include <string>
    #include <iostream>
    #include <fstream>
    
    using namespace std;
    
    int main()
    {
    	unsigned short int num = 12;
    	ofstream outputFile("C:\\TestData\\data.txt");
    	
    	outputFile<<num;
    
    	outputFile.close();
    	
    	ifstream dataFile("C:\\TestData\\data.txt");
    	unsigned short int input;
    	dataFile.read( reinterpret_cast<char*>(&input), sizeof(unsigned short int) );
    	
    	cout << "value read = " << input << endl;
    
    	return 0;
    }
    Last edited by 7stud; 07-28-2005 at 12:12 PM.

  8. #8
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    because when I run the following code it displays 12,849 instead. Maybe someone else can comment on why that is.
    Hint: 3231 hex = 12849

  9. #9
    Registered User
    Join Date
    May 2005
    Posts
    22
    Quote Originally Posted by 7stud
    However, what you want to do is write the unsigned short int 12 to the file, which looks like this in binary format:

    0000000000001100
    Actually, what I want is the data to be a binary stream of the numbers from 100-199 (for my testing purposes).
    Meaning I want binary 01100100 followed by 01100101, etc, with no spaces between them. Then my program would need to read those numbers back in (into unsigned shorts).

  10. #10
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    I want binary 01100100 followed by 01100101, etc, with no spaces between them.
    Yes, you've already said that. It's my understanding that's what opening a file in binary mode means.
    Last edited by 7stud; 07-28-2005 at 01:14 PM.

  11. #11
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    Hint: 3231 hex = 12849


    It was my expectation that my program would display 12,594.

  12. #12
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    >It was my expectation that my program would display 12,594.
    Well you could go buy a Sun workstation.

  13. #13
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    >Actually, what I want is the data to be a binary stream of the numbers from 100-199 (for my testing purposes).
    Then you should use write() to output your data.
    Code:
    #include <fstream>
    #include <iostream>
    
    using namespace std;
    
    int main(int argc, char* argv[])
    {
       unsigned short int num;
    
       ofstream outputFile("input.bin", ios::binary | ios::trunc);
       for(num = 100; num < 200; num++)
       {
          outputFile.write( reinterpret_cast<char*> (&num), sizeof(num) );
          cout << "added " << num << "...\n";
       }
       return 0;
    }
    Use write() and read() for binary I/O.
    Last edited by swoopy; 07-28-2005 at 02:06 PM.

  14. #14
    Registered User
    Join Date
    Apr 2003
    Posts
    2,663
    Sorry, I don't understand. Why doesn't my program display 12,594?

  15. #15
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    >Sorry, I don't understand. Why doesn't my program display 12,594?
    Because you are on a little endian machine. Let me guess, a PC?

    This is one of the reasons Prelude stresses portability so much.
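    To make the 12,594 versus 12,849 difference concrete, here is a small sketch that combines the same two bytes under each byte order and also shows what the current machine does. The function names are hypothetical, and it assumes ASCII, so '1' is 0x31 and '2' is 0x32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// first and second are the bytes in the order they appear in the file.
unsigned littleEndian16(unsigned char first, unsigned char second)
{
    return first | (static_cast<unsigned>(second) << 8);  // first byte is the low byte
}

unsigned bigEndian16(unsigned char first, unsigned char second)
{
    return (static_cast<unsigned>(first) << 8) | second;  // first byte is the high byte
}

// What this machine produces when the two bytes are copied straight into a
// 16-bit variable, which is what read() into an unsigned short amounts to.
unsigned hostOrder16(unsigned char first, unsigned char second)
{
    unsigned char bytes[2] = { first, second };
    std::uint16_t value = 0;
    std::memcpy(&value, bytes, sizeof value);
    assert(value == littleEndian16(first, second) ||
           value == bigEndian16(first, second));
    return value;
}
```

    littleEndian16('1', '2') is 0x3231 = 12849 and bigEndian16('1', '2') is 0x3132 = 12594. On a PC, hostOrder16('1', '2') returns the little-endian result, which is why the program above printed 12,849. Code that has to read the same file on both kinds of machine can combine the bytes explicitly like this instead of relying on the host's order.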
