Thread: Copying Files

  1. #1
    Registered User ch4
    Join Date
    Jan 2007
    Posts
    154

    Copying Files

    Check the code below; it is for copying files.

    So my question is: is using an int for copying safe for every kind of file, e.g. MP3s, PDFs, etc.? Or is there another way, such as a byte (bit) copy, just for safety?



    Code:
    #include <stdio.h> 
    #include <stdlib.h> 
    
    static FILE *open_file ( char *file, char *mode )
    {
      FILE *fp = fopen ( file, mode );
    
      if ( fp == NULL ) {
        perror ( "Unable to open file" );
        exit ( EXIT_FAILURE );
      }
    
      return fp;
    }
    
    int main ( int argc, char *argv[] )
    {
      int ch;
      FILE *in;
      FILE *out;
    
      if ( argc != 3 ) {
        fprintf ( stderr, "Usage: %s <readfile1> <writefile2>\n", argv[0] );
        exit ( EXIT_FAILURE );
      }
    
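      /* Binary mode ("rb"/"wb") keeps bytes unchanged on platforms that
         translate line endings in text mode (e.g. Windows). */
      in = open_file ( argv[1], "rb" );
      out = open_file ( argv[2], "wb" );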
    
      while ( ( ch = fgetc ( in ) ) != EOF )
        fputc ( ch, out );
    
      fclose ( in );
      fclose ( out );
    
      return EXIT_SUCCESS;
    }
    Last edited by ch4; 02-14-2009 at 01:55 PM.

  2. #2
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by ch4
    So my question is: is using an int for copying safe for every kind of file, e.g. MP3s, PDFs, etc.? Or is there another way, such as a byte (bit) copy, just for safety?
    If you are using fgetc, you need to use an int as the return value. However, if you want to copy a "binary" (such as an mp3, etc), you would want to use fread into a signed char buffer. This is because a signed char is literally one byte, and a file is literally a series of bytes, which are signed char values (-128 to 127). Although this is a number, it is not at all a C integer, which is 4 bytes (or 8 bytes on 64bit) long.

    There may be a problem with how fgetc reacts to 0, which can exist in a binary file but does not occur in text files.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by MK27
    If you are using fgetc, you need to use an int as the return value. However, if you want to copy a "binary" (such as an mp3, etc), you would want to use fread into a signed char buffer. This is because a signed char is literally one byte, and a file is literally a series of bytes, which are signed char values (-128 to 127). Although this is a number, it is not at all a C integer, which is 4 bytes (or 8 bytes on 64bit) long.

    There may be a problem with how fgetc reacts to 0, which can exist in a binary file but does not occur in text files.
    A more standard way would be to use an unsigned char buffer with fread and to represent binary data as hex 0x00-0xFF.

    fgetc has no problem with a 0x00 char. Why do you doubt that?

    You need an int to store the return value of fgetc because data read from the file uses the range 0 to 255, and the EOF condition is signaled with the negative value EOF (-1 on most compilers).
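
    The point about the 0 to 255 range can be checked with a small sketch. The helper names read_back_byte and eof_follows_byte are made up for this illustration, and -2 is an arbitrary failure sentinel, not anything from the standard library. The sketch assumes an 8-bit unsigned char and a platform where tmpfile() is usable.

```c
#include <stdio.h>

/* Hypothetical helper: writes one byte to a temporary file, reads it back
   with fgetc, and returns what fgetc reported. Returns -2 (an arbitrary
   sentinel chosen for this sketch) if the temporary file cannot be set up. */
int read_back_byte(unsigned char byte)
{
  FILE *fp = tmpfile();   /* temporary file opened for binary update */
  int ch;

  if (fp == NULL)
    return -2;

  if (fputc(byte, fp) == EOF) {
    fclose(fp);
    return -2;
  }
  rewind(fp);

  ch = fgetc(fp);         /* a data byte comes back as 0..UCHAR_MAX */
  fclose(fp);
  return ch;
}

/* Hypothetical helper: returns 1 if fgetc reports EOF only after the
   single data byte has been read, 0 otherwise. */
int eof_follows_byte(unsigned char byte)
{
  FILE *fp = tmpfile();
  int first, second;

  if (fp == NULL)
    return 0;

  fputc(byte, fp);
  rewind(fp);
  first = fgetc(fp);      /* the data byte */
  second = fgetc(fp);     /* nothing left, so this should be EOF */
  fclose(fp);

  return first != EOF && second == EOF;
}
```

    Calling read_back_byte(0xFF) should give 255 rather than EOF, which is why the int in the original copy loop can tell a 0xFF data byte apart from the end of the file.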
    Last edited by vart; 02-14-2009 at 02:24 PM.
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  4. #4
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by vart
    A more standard way would be to use an unsigned char buffer with fread and to represent binary data as hex 0x00-0xFF.

    fgetc has no problem with a 0x00 char. Why do you doubt that?
    Fair enough. I don't use fgetc, so that's why I wrote "may". I suppose there is no difference between a signed and an unsigned char in this context, as long as you do not mix them up.

    You need an int to store the return value of fgetc because data read from the file uses the range 0 to 255, and the EOF condition is signaled with the negative value EOF (-1 on most compilers).
    That would definitely be a problem then, since a binary file will contain a -1 sooner or later. Use fread -- or just use file descriptors and read(), which I think is how fread is actually implemented.

  5. #5
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by MK27
    That would definitely be a problem then, since a binary file will contain a -1 sooner or later.
    Ah, but as the 1999 edition of the C Standard puts it, "the fgetc function obtains that character as an unsigned char converted to an int".
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  6. #6
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight
    Ah, but as the 1999 edition of the C Standard puts it, "the fgetc function obtains that character as an unsigned char converted to an int".
    So fgetc should be fine, is what you're saying. I kind of thought that was the case, since an EOF is not actually a -1. However, it will be awkward if you want to get the whole file into memory before you copy it out -- or for that matter retain anything more than a single byte.

  7. #7
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by MK27
    an EOF is not actually a -1.
    As vart pointed out, EOF is typically -1, and in any case it must be a negative int constant.

    Quote Originally Posted by MK27
    However, it will be awkward if you want to get the whole file into memory before you copy it out -- or for that matter retain anything more than a single byte.
    Awkward? Not likely, since you can just copy into an array. On the other hand, one might as well read in blocks.
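
    Reading in blocks could look roughly like the sketch below. The function name copy_stream and the 4096-byte buffer size are choices made for this illustration, not something given in the thread. It assumes both streams were opened in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper: copies everything from `in` to `out` in fixed-size
   blocks. Returns 0 on success, -1 on a read or write error. */
int copy_stream(FILE *in, FILE *out)
{
  unsigned char buf[4096];   /* block size is an arbitrary choice */
  size_t n;

  /* fread returns the number of bytes actually read; 0 means end of
     file or an error, which ferror() distinguishes below. */
  while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
    if (fwrite(buf, 1, n, out) != n)
      return -1;             /* short write */
  }

  return ferror(in) ? -1 : 0;
}
```

    In the original program this would replace the fgetc/fputc loop with a single call such as copy_stream(in, out), with the return value checked before the files are closed.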
    Last edited by laserlight; 02-14-2009 at 03:37 PM.

  8. #8
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight
    As vart pointed out, EOF is typically -1, and in any case it must be a negative int constant.
    If you use a signed char, you will find lots of -1 in (e.g.) a .jpg image. And if you use an unsigned char, there is no such thing as -1. I do not think the EOF is even one real byte, in fact.

  9. #9
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by MK27
    If you use a signed char, you will find lots of -1 in (e.g.) a .jpg image. And if you use an unsigned char, there is no such thing as -1.
    fgetc() does not return a signed char or an unsigned char, but an int.

    Quote Originally Posted by MK27
    I do not think the EOF is even one real byte, in fact.
    It is not. EOF is a value used to denote "end of file", or some error. This is why vart is so certain that "data read from file use range from 0 to 255", and that there will not be conflicts between actual data read by fgetc() and EOF (though vart does assume that the range of unsigned char is [0,255], but that is a reasonable assumption).
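
    The conflict only appears when the int from fgetc() is narrowed before the comparison. A rough sketch of that pitfall follows. The function names count_with_int and count_with_signed_char are invented for this illustration. The second function is wrong on purpose, and its behaviour relies on the usual implementation-defined conversion of 255 to -1 in a signed char together with EOF being -1, which holds on common platforms but is not guaranteed.

```c
#include <stdio.h>

/* Correct: the result of fgetc is kept in an int, so every data byte
   (0..255) stays distinct from EOF. */
long count_with_int(FILE *fp)
{
  long n = 0;
  int ch;

  while ((ch = fgetc(fp)) != EOF)
    n++;
  return n;
}

/* Buggy on purpose: the result is narrowed to signed char first, so a
   0xFF data byte typically becomes -1 and compares equal to EOF,
   ending the loop too early. */
long count_with_signed_char(FILE *fp)
{
  long n = 0;
  signed char ch;

  while ((ch = (signed char)fgetc(fp)) != EOF)
    n++;
  return n;
}
```

    On a three-byte file containing 0x41, 0xFF, 0x42, the first function should count 3 bytes, while the second typically stops at 1.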
