Thread: Formatting a text file...

  1. #1
    Registered User
    Join Date
    Apr 2008
    Posts
    6

    Formatting a text file...

    Hello, I've been trying to carry out the fairly simple task of formatting a text file and am stuck on a number of points. I have spent quite some time trawling Google and have ended up with more questions than I had in the first place.

    Specifically I want to capitalise any lowercase characters in the file. I figure the following steps are needed:

    1. User passes in a command with the file name as the argument.
    2. Read file into buffer.
    3. Convert file to all uppercase using "toupper".
    4. Overwrite the contents of the file with the contents of the buffer.

    Having got some working code for opening a file, reading it into a buffer and then writing it out on the screen, that I understand *most* of (I certainly won't pretend to understand exactly what every line is doing), I have been trying to alter that.

    Firstly I'll show my current code (please don't look if you're of a weak disposition, it may give you quite a fright):

    Code:
    #include <unistd.h>      
    #include <sys/file.h>     
    #include <string.h>      
    
    const int BUFSIZE=4096;  		//Set buffer size
    
    void fail( char file[] );
    void copy( char file[] );
    
    int main( int argc, char *argv[], char *env[] )
    {
      for ( int i=1; i<argc; i++ )      	// for each file passed as
      {					// an argument write the 
        copy( argv[i] );           		// contents to the screen            
      }					// (or some other operation)
    }
    
    void copy( char file[] )
    {
      int fd = open( file, O_RDONLY, 0 );	// Open the file       
      if ( fd >= 0 )   			// If the file descriptor has some 
                                                                    // info in it then do the following
      {					
        bool eof = false;     		// Have we reached the end of the file                  
        char buf[BUFSIZE];             	// Buffer1 to put file contents in         
        char bufx[BUFSIZE];			// Buffer2 to put altered contents in
        while ( !eof )          		// While we are not at the end of the file                
        {
          int bytes = read( fd, buf, BUFSIZE ); 	// bytes = the whole contents of the file, I think?
          if ( bytes > 0 )    			// Is there anything in the file?                  
    	
    	//c = toupper( c );		// Misplaced "toupper"
       write( 1, buf, bytes );            	// write the contents of "buf", which is "bytes" long
          //write( 1, bufx, bytes );
          else
            eof = true;  			// Nothing left to write in the file so change eof flag                      
        }
        close( fd );   			// Close the file (referred to by its file descriptor)
      } else {
        fail( file );                          
      }
    }
    
    
    void fail( char file[] )		// Display info about why it didn’t work
    {
      write( 2, "cat: ", 5 );
      write( 2, file, strlen( file ) );
      write( 2, " no such file or directory\n", 27 );
    }

    I think I need a loop just before the write call that uses “lseek” to move through the file one character at a time, applying "toupper" as I go and writing the result to a second buffer. When that loop has covered the whole file and the second buffer contains the altered text, I write the contents of the buffer to the file, making sure I overwrite rather than append. However, I'm sure this shouldn't need a second buffer, and does "write" work properly for this? I'll list my questions in a neat, orderly fashion for anyone kind enough to either answer them or point me in the direction of answers:

    1. Am I going about this the right way? Is there an easier and quicker way?
    2. I read up on “pwrite”, and initially I thought I would need to use it to write to the file, then I read something else that changed my mind. Does write work OK, or is it just for displaying information on the screen? (I am confused by the documentation.)
    3. Do I need to include “cctype.h” or “ctype.h” to use toupper? I've found examples and what not which show the use of both of them and again I'm confused...
    4. Should I be thinking about strings at all? Or stick with chars?

    Any help would be much appreciated

    Thank you

    ~Dagorsul


    I accidentally posted this in the C++ forums when it should really have been posted here. Below is the first reply I got and my response:

    Quote Originally Posted by robwhit View Post
    A few things first:

    This is C, not C++. You posted in the C++ forum. Did you want to write C?
    You are using UNIX functions, not standard C or C++ functions. It won't run on other platforms, like Windows. Do you want this?

    cctype.h is not a header. It's either ctype.h if you're using C or cctype if you're using C++.
    I forgot to mention that I want to use unix functions (note to self.. they are not methods...) doh :P It can be written in either C or C++, but as you say, seeing as how the chunk of code I got ahold of and tried to put to use was written in C rather than C++, it would indeed make sense to write it in C :S. Thank you for pointing out my mistake with the header.

    I guess I will repost this query in the C forum and put a link to the new thread in this post (I will shortly edit it). My apologies for posting in the wrong place.
    (I will request the thread in the C++ forums is deleted)

    Thank you for your time

    ~Dagorsul

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    > 1. Am I going about this the right way? Is there an easier and quicker way?

    Well, you copied part of the file into a buffer. You now have an opportunity to process the output, so I would loop buffer[i] = toupper( buffer[i] ); over the bytes you read before you write out the buffer.
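
    A minimal sketch of that loop, assuming a helper called upcase_buffer (a hypothetical name, not from this thread). The cast to unsigned char is there because toupper is only defined for values in the unsigned char range (or EOF), and plain char may be signed:

```c
#include <ctype.h>
#include <stddef.h>

/* Uppercase the first n bytes of buf in place.
   upcase_buffer is a hypothetical helper name used for illustration. */
static void upcase_buffer(char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* toupper expects a value representable as unsigned char (or EOF),
           so cast first to avoid undefined behaviour on negative chars. */
        buf[i] = (char)toupper((unsigned char)buf[i]);
    }
}
```

    In the posted program this would be called as upcase_buffer( buf, bytes ) between the read and the write, so only the bytes actually read are touched.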

    > 2. I read up on “pwrite”, and initially I thought I would need to use it to write to the file, then
    > I read something else that changed my mind, does write work ok, or is it just for displaying
    > information to the screen (I am confused by the documentation)

    write is fine, although I have no idea why you are using the old file descriptor API instead of the FILE* API. pwrite is not needed here: it writes at an explicit byte offset without moving the file position, and it does not work on pipes at all.
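
    To make the difference concrete, here is a rough sketch of what pwrite does, assuming a helper called overwrite_at (a hypothetical name). write puts data at the descriptor's current position and advances it; pwrite puts data at the offset you pass and leaves the current position alone:

```c
#include <sys/types.h>
#include <unistd.h>

/* Overwrite len bytes at byte offset off in an already-open, seekable file.
   Returns 0 on success, -1 on error. overwrite_at is a hypothetical name. */
static int overwrite_at(int fd, const char *data, size_t len, off_t off)
{
    while (len > 0) {
        /* pwrite may write fewer bytes than requested, so loop until done. */
        ssize_t n = pwrite(fd, data, len, off);
        if (n <= 0)
            return -1;      /* e.g. errno is ESPIPE if fd is a pipe */
        data += n;
        len  -= (size_t)n;
        off  += n;
    }
    return 0;
}
```

    Either call can be used to put bytes into a file; which one is more convenient depends on whether you want the file position to move.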

    > 3. Do I need to include “cctype.h” or “ctype.h” to use toupper? I

    <cctype> is a C++ library header, <ctype.h> is a C library header. A chief difference in the headers is that C++ declares all the function names as part of the std namespace. C doesn't have namespaces, so what may be appropriate depends on the language you really want to write this in.

    > 4. Should I be thinking about strings at all? Or stick with chars?

    You seem to be on the right track, so don't think too hard.

  3. #3
    Registered User
    Join Date
    Apr 2008
    Posts
    6
    Thank you for the reply citizen, I have dutifully popped a loop into the code to use with toupper and corrected my perception of the differences between headers in C and C++

    I've been fiddling with it for a few hours more and here's what I've got at the moment:

    Code:
    #include <unistd.h>      
    #include <sys/file.h>     
    #include <string.h>  
    #include <ctype.h>    
    
    const int BUFSIZE=4096;  	//Set buffer size
    
    void fail( char file[] );
    void copy( char file[] );
    
    int main( int argc, char *argv[], char *env[] )
    {
      for ( int i=1; i<argc; i++ )      	// for each file passed as
      {			                // an argument write the 		
        copy( argv[i] );                    // contents to the screen            
      }			                // (or some other operation)		
    }
    
    void copy( char file[] )
    {
      int fd = open( file, O_RDONLY, 0 );	// Open the file       
      if ( fd >= 0 )   			// If the file descriptor has some 
      {				// info in it then do the following
        bool eof = false;     	// Have we reached the end of the file                  
        char buf[BUFSIZE];          // Buffer to put file contents in         
        while ( !eof )          	// While we are not at the end of the file                
        {
          int bytes = read( fd, buf, BUFSIZE ); 	// bytes = the number
          if ( bytes > 0 )    		        // of characters in the file?                 
    	{
    	for ( int u=bytes; u>0; u-- )      // Loop move through the array
    	{			           // of char's within the file one
    	buf[u] = toupper( buf[u] );        // at a time and convert to upper case
    	}
    	write( 1, buf, bytes );            // write the contents of 
          else                                 // "buf", which is "bytes" long
            eof = true;	// Nothing left to write in the file so change eof flag      
    	}								                
        }
        close( fd );           // Close the file                       
      } else 
    	{
        fail( file );                          
    	}
    }
    
    
    void fail( char file[] )		// Display info about why it didn’t work
    {
      write( 2, "cat: ", 5 );
      write( 2, file, strlen( file ) );
      write( 2, " no such file or directory\n", 27 );
    }
    One thing I'm now not too sure about is the "bytes" int variable, is this in fact a count of the number of characters in the file?

    At the moment I am getting the compile error:

    expected primary-expression before "else"
    expected `;' before "else"
    I've tried checking my syntax in a number of different ways but still cannot get past this road block. I wonder if I do need that second buffer after all? Although that probably has nothing to do with my current error. Perhaps I do not properly understand how arrays work in C? I have used them in java (I'm not much good with java either, just a little more experienced at using it to bludgeon problems with).

    Any help would again be most welcome, at the moment it's either my head or the wall, and I'm not sure which is going to give in first

  4. #4
    Nub SWE
    Join Date
    Mar 2008
    Location
    Dallas, TX
    Posts
    133
    You're just missing a closing curly brace for this statement:
    Code:
    if ( bytes > 0 )
    ...inside your while(!eof) loop, which, incidentally, is not the correct way of using EOF to control a loop. Read the FAQ on EOF and loops.

    You also need to return 0 from your main().

  5. #5
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    inside your while(!eof) loop, which, incidentally, is not the correct way of using EOF to control a loop. Read the FAQ on EOF and loops.
    It is OK here because eof is just an application flag indicating that read returned nothing more. Of course, we can get rid of it if the while loop is written as
    Code:
    int bytes;
    while ( (bytes = read( fd, buf, BUFSIZE )) > 0 )    // While read returns data (0 = end of file, -1 = error)
    {
    	for ( int u = 0; u < bytes; u++ )    // Walk the bytes just read, indices 0 .. bytes-1
    	{
    		buf[u] = toupper( (unsigned char) buf[u] );    // Convert each one to upper case
    	}
    	write( 1, buf, bytes );    // Write the "bytes" converted characters to stdout
    }
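    On the earlier question about "bytes": read returns the number of bytes it placed in the buffer on that one call (at most the size you asked for), 0 at end of file and -1 on error. It is the size of the current chunk, not of the whole file. A small sketch that makes this visible by adding up the chunks, assuming a helper called count_bytes and a CHUNK constant (both hypothetical names):

```c
#include <sys/types.h>
#include <unistd.h>

enum { CHUNK = 4096 };   /* same role as BUFSIZE in the posted code */

/* Add up the sizes of all chunks read from fd's current position.
   Returns the total, or -1 if read reported an error. */
static long count_bytes(int fd)
{
    char buf[CHUNK];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;          /* n is the size of this chunk only */

    return n < 0 ? -1 : total;
}
```

    For a file larger than the buffer, the loop body runs several times, which is why the conversion and the write belong inside the loop.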
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  6. #6
    Nub SWE
    Join Date
    Mar 2008
    Location
    Dallas, TX
    Posts
    133
    Good catch. My apologies.

  7. #7
    Registered User
    Join Date
    Apr 2008
    Posts
    6
    Quote Originally Posted by JDGATX View Post
    You're just missing a closing curly brace for this statement:
    Code:
    if ( bytes > 0 )
    ...inside your while(!eof) loop, which, incidentally, is not the correct way of using EOF to control a loop. Read the FAQ on EOF and loops.

    You also need to return 0 from your main().
    Thank you for pointing me at the correct bit of code. It turned out I was not only missing an end curly bracket but a beginning one as well :P I have read various bits of the tutorials, but not much on EOF, so I'll go and look it up.

    Quote Originally Posted by vart View Post
    It is OK here because eof is just an application flag indicating that read returned nothing more. Of course, we can get rid of it if the while loop is written as
    Code:
    int bytes;
    while ( (bytes = read( fd, buf, BUFSIZE )) > 0 )    // While read returns data (0 = end of file, -1 = error)
    {
    	for ( int u = 0; u < bytes; u++ )    // Walk the bytes just read, indices 0 .. bytes-1
    	{
    		buf[u] = toupper( (unsigned char) buf[u] );    // Convert each one to upper case
    	}
    	write( 1, buf, bytes );    // Write the "bytes" converted characters to stdout
    }
    That does indeed look a simpler way of doing it, I'll give it a go as soon as I've gotten the darn thing to write the buffer to the file

    Which brings me onto another issue I'm now having, the buffer works just as it should and prints out exactly what I want it to (after a tweak), however I would like to now get it to print to the file rather than the screen. I thought this might work:

    Code:
    write( fd, buf, bytes );
    But no such luck. I've spent ages sifting through Google, and there are plenty of examples out there, but I find most of them a bit too cryptic, with little explanation in plain English of what's going on:

    Code:
    write(int fildes, const void *buf, size_t nbyte);
    
    Does this mean I should end up with something like this:
    
    write( int fd, const char buf[BUFSIZE], bytes );

    Thank you all for the help so far, I appreciate your time

  8. #8
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    write( fd, buf, bytes );
    should be ok

    fd should be a file descriptor opened for writing (O_WRONLY or O_RDWR)

  9. #9
    Nub SWE
    Join Date
    Mar 2008
    Location
    Dallas, TX
    Posts
    133
    Run a search for fopen(). This function establishes your file pointer, which is FILE * in type.

  10. #10
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    Quote Originally Posted by JDGATX View Post
    Run a search for fopen(). This function establishes your file pointer, which is FILE * in type.
    fopen uses FILE *s, sure, but UNIX functions use file descriptors. They use an int for a file descriptor, and write takes an int file descriptor, not a FILE *, because it is a UNIX function, which the OP said he/she wanted to use.

  11. #11
    Nub SWE
    Join Date
    Mar 2008
    Location
    Dallas, TX
    Posts
    133
    I'm missing a lot of things on this post. I'll slink back to the darkness.

  12. #12
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    Did you mean you wanted me to explain it better or you missed what the OP said?

  13. #13
    Registered User
    Join Date
    Apr 2008
    Posts
    6
    Well, I've spent a while trying... but

    write( fd, buf, bytes );

    just doesn't seem to work. I wonder if anyone could suggest a test I could perform to find out what's happening? It seems that the buffer simply isn't being written to the file, although I know the buffer contains the correct information...

    ***EDIT

    Doh... It's always daft things...

    int fd = open( file, O_RDONLY, 0 );

    Am I blind? Probably....
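
    For anyone finding this later, a sketch of how the pieces from this thread could fit together once the file is opened with O_RDWR instead of O_RDONLY. This is one possible arrangement rather than the exact code I ended up with; upcase_file and BUF_SIZE are names made up for the example. It reads a chunk, converts it, and uses pwrite to put the chunk back at the offset it came from, so the read position is never disturbed:

```c
#include <ctype.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

enum { BUF_SIZE = 4096 };

/* Uppercase the file at path in place.
   Returns 0 on success, -1 on any error. upcase_file is a hypothetical name. */
static int upcase_file(const char *path)
{
    int fd = open(path, O_RDWR);        /* read AND write access */
    if (fd < 0)
        return -1;

    char buf[BUF_SIZE];
    off_t pos = 0;                      /* file offset of the current chunk */
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);

        /* Write the chunk back where it came from; handle short writes. */
        ssize_t done = 0;
        while (done < n) {
            ssize_t w = pwrite(fd, buf + done, (size_t)(n - done), pos + done);
            if (w <= 0) {
                close(fd);
                return -1;
            }
            done += w;
        }
        pos += n;
    }

    int rc = (n < 0) ? -1 : 0;          /* n < 0 means read failed */
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

    Because upper-casing never changes the length of the data, overwriting in place is safe here; a transformation that grows or shrinks the text would need a temporary file instead.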
    Last edited by dagorsul; 05-02-2008 at 04:18 AM.
