Thread: File reading and writing

  1. #16
    ATH0 quzah
    Join Date
    Oct 2001
    Posts
    14,826
    re: CPP vs C in using malloc...

    C++ is a bit more strict where void* is concerned. C doesn't require a cast on malloc's return value; as you've noticed, C++ does. Since your file ends in .cpp, the compiler assumes it's C++ code.
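
    For example, here's a minimal sketch of the difference (the names are only for illustration) - it builds as a .c file, but rename it to .cpp and the uncast malloc line is the one the compiler rejects:
    Code:
    #include <stdlib.h>
    
    int main(void){
        /* fine in C; a C++ compiler rejects the implicit void* conversion */
        int *a = malloc(10 * sizeof *a);
    
        /* accepted by both C and C++ because of the explicit cast */
        int *b = (int *)malloc(10 * sizeof *b);
    
        free(a);
        free(b);
        return 0;
    }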

    Quzah.
    Hope is the first step on the road to disappointment.

  2. #17
    Registered User
    Join Date
    Sep 2001
    Posts
    752
    You know, I can't help but wonder what exactly you want to do with these files?
    Callou collei we'll code the way
    Of prime numbers and pings!

  3. #18
    Code Warrior
    Join Date
    Nov 2001
    Posts
    669
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    struct file{
    	char *p;
    };
    
    int main(void){
    	FILE *in, *out;
    	struct file f;
    	in = fopen("c:\\windows\\desktop\\in.txt", "r");
    	out = fopen("c:\\windows\\desktop\\out.txt", "w");
    
    	f.p = (char *)malloc(10 * sizeof(f.p));
    
    	fread(f.p, 10, 1, in);
    	fwrite(f.p, 10, 1, out);
    
    	return EXIT_SUCCESS;
    }
    I tried this, and it is not working. Why not?
    Current projects:
    1) User Interface Development Kit (C++)
    2) HTML SDK (C++)
    3) Classes (C++)
    4) INI Editor (Delphi)

  4. #19
    ATH0 quzah
    Join Date
    Oct 2001
    Posts
    14,826
    Because it's horrible?

    Code:
    int c;
    while( (c=fgetc(fin)) != EOF ) fputc( c, fout );

    Or did you want the real reason?

    Check your values to make sure your files have actually opened correctly. Use printf() to see what you've read first. This way you can track down your problem.
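
    Something along these lines is a rough first check (in.txt and the 10-byte read are just placeholders matching the earlier example):
    Code:
    #include <stdio.h>
    
    int main(void){
        char buf[11];
        size_t got;
        FILE *in = fopen("in.txt", "r");
    
        if (in == NULL) {
            printf("couldn't open in.txt\n");
            return 1;
        }
    
        got = fread(buf, 1, 10, in);
        buf[got] = '\0';
        printf("read %u bytes: \"%s\"\n", (unsigned)got, buf);
    
        fclose(in);
        return 0;
    }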

    Quzah.
    Hope is the first step on the road to disappointment.

  5. #20
    Code Warrior
    Join Date
    Nov 2001
    Posts
    669
    Code:
    int c; 
    while( (c=fgetc(fin)) != EOF ) fputc( c, fout );
    This code is slow!

    I know how to copy files. Three days ago I created a file splitter. It's easy, but it's slow. I tested it with a 300 MB file and split it into 3 parts (100 MB each). The program took about 7 minutes to do that. That's why I want to learn how to copy with pointers (malloc). So please help me with that.
    Last edited by GaPe; 12-30-2001 at 01:49 PM.
    Current projects:
    1) User Interface Development Kit (C++)
    2) HTML SDK (C++)
    3) Classes (C++)
    4) INI Editor (Delphi)

  6. #21
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    When you copy a file, everything in that file has to be read and written, so there's only so much speed to be had. Breaking the file into parts for copying is good because it keeps your program from choking (for the most part). You tried it with a 300 MB file and it took about 7 minutes; that sounds about right for copying a 300 MB retail program from a CD to a hard disk.

    You can move the data with pointers, but you're still moving the data. Pointers won't speed up the throughput of your computer; they can help with the efficiency of the program, but not with the data.

    You know what the most common way of dealing with slow copying is? Hiding it with pictures and status bars and little rotating hourglasses that tell the user the computer has not in fact frozen AND keep them interested enough not to terminate the program.

    -Prelude
    Last edited by Prelude; 12-30-2001 at 01:55 PM.
    My best code is written with the delete key.

  7. #22
    Code Warrior
    Join Date
    Nov 2001
    Posts
    669
    But what about the Windows Commander "Split File" command? I tested it with the same file (300 MB) and it took about 30 seconds.
    Why?

    And Prelude, please tell me why your code for copying files is not working.
    Last edited by GaPe; 12-30-2001 at 03:04 PM.
    Current projects:
    1) User Interface Development Kit (C++)
    2) HTML SDK (C++)
    3) Classes (C++)
    4) INI Editor (Delphi)

  8. #23
    Code Goddess Prelude
    Join Date
    Sep 2001
    Posts
    9,897
    >I tested it with the same file (300 MB) and it took about 30 seconds. Why?
    Not a clue; when you find out, be sure to let me know though

    If you're using it exactly as I wrote it, then you're probably transferring too little data. Create a variable to hold the size of your file and replace every reference to 10 with that variable. In that example program I was only copying 10 bytes because that was the size of my test file.
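
    For instance, a bare-bones sketch of that idea (the file names are placeholders and most error checking is left out): get the size with fseek()/ftell(), then use it wherever the 10 was:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void){
        FILE *in  = fopen("in.txt", "rb");
        FILE *out = fopen("out.txt", "wb");
        long size;
        char *p;
    
        if (in == NULL || out == NULL) return 1;
    
        fseek(in, 0, SEEK_END);     /* jump to the end to find the size... */
        size = ftell(in);
        rewind(in);                 /* ...then back to the start */
    
        p = (char *)malloc(size);   /* the file size replaces the hard-coded 10 */
        fread(p, 1, size, in);
        fwrite(p, 1, size, out);
    
        free(p);
        fclose(in);
        fclose(out);
        return 0;
    }
    This is fine for small files; Salem's post below shows why a fixed-size buffer is the better approach once the files get really big.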

    -Prelude
    My best code is written with the delete key.

  9. #24
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    In order to speed things up, you need to read/write much larger blocks of data - one char at a time is simply too expensive.

    Here's my idea
    Code:
    #include <stdio.h>
    #include <time.h>
    
    #define FRAGMENT_SIZE   (10*1000*1000L)     // size of file fragments
    
    int main ( ) {
        char buff[BUFSIZ];
        FILE *in, *out = NULL;
        long int num_blocks = FRAGMENT_SIZE / BUFSIZ;
        long int remainder  = FRAGMENT_SIZE % BUFSIZ;
        long int count = 0;
        int      num_out_files = 0;
        size_t   read_size;
        clock_t  start, end;
    
        start = clock();
        in = fopen( "f:in.dat", "rb" );
        if ( in == NULL ) {                     // nothing to split if the input won't open
            perror( "f:in.dat" );
            return 1;
        }
        while ( !feof(in) ) {
            // determine the maximum amount of data to read
            if ( count == num_blocks ) {
                read_size = remainder;
            } else {
                read_size = BUFSIZ;
            }
    
            // read/write that amount - the actual amount to write
            // depends on the actual amount read, which will be less than
            // read_size at the end of the in file
            read_size = fread( buff, 1, read_size, in );
            if ( read_size != 0 ) {
                if ( count == 0 ) {
                    char filename[50];
                    if ( out != NULL ) {            // close previous file
                        fclose( out );
                    }
                    sprintf( filename, "f:out%03d.dat", num_out_files++ );
                    out = fopen( filename, "wb" );  // open next file
                }
                fwrite( buff, 1, read_size, out );  // should check result...
            }
    
            // determine if we need to read more blocks to fill the current
            // out file, or start a new out file
            if ( count == num_blocks ) {
                count = 0;
            } else {
                count++;
            }
        }
        if ( out != NULL ) {
            fclose( out );
        }
        end = clock();
        printf( "Splitting took %d seconds\n",
                (end-start)/CLOCKS_PER_SEC );
        return 0;
    }
    BUFSIZ is supposed to be optimal for average use, but you might like to experiment with other values, particularly those which are multiples of the basic block size on your file system (e.g. 512 bytes), or perhaps the cluster size.
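
    For example, here's a throwaway copy-and-time sketch for comparing buffer sizes (the f: file names just follow the example above, and 64K is an arbitrary starting point, not a recommendation):
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    #define COPY_BUF_SIZE  (64 * 1024L)     // try 512, 4096, 64K, ... and compare the times
    
    int main ( ) {
        char *buff = malloc( COPY_BUF_SIZE );
        FILE *in  = fopen( "f:in.dat",   "rb" );
        FILE *out = fopen( "f:copy.dat", "wb" );
        size_t n;
        clock_t start = clock();
    
        if ( buff == NULL || in == NULL || out == NULL ) return 1;
    
        while ( (n = fread( buff, 1, COPY_BUF_SIZE, in )) > 0 ) {
            fwrite( buff, 1, n, out );
        }
    
        printf( "Copy took %ld seconds\n",
                (long)((clock() - start) / CLOCKS_PER_SEC) );
    
        free( buff );
        fclose( in );
        fclose( out );
        return 0;
    }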

  10. #25
    Code Warrior
    Join Date
    Nov 2001
    Posts
    669
    Salem

    I see that you use "!feof(in)". Whenever I used it, it never read the whole file; there were always a few KB left over.

    Salem, can YOU tell me why Windows Commander's Split File takes only 30 seconds to split a 300 MB file?
    Current projects:
    1) User Interface Development Kit (C++)
    2) HTML SDK (C++)
    3) Classes (C++)
    4) INI Editor (Delphi)

  11. #26
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    > Whenever I used it, it never read the whole file; there were always a few KB left over
    Perhaps posting some code so we can see how YOU use feof() would help determine why.

    > why Windows Commander's Split File takes only 30 seconds to split a 300 MB file?
    Because it uses the same mechanism as I do - a large buffer.
    Did you try my code?

  12. #27
    Code Warrior
    Join Date
    Nov 2001
    Posts
    669

    Yes, I tried it. It's working.

    I have a few questions:

    1) FRAGMENT_SIZE (10*1000*1000L)

    What does that "L" mean? And is this size in bytes?

    2) num_blocks = FRAGMENT_SIZE / BUFSIZ;
    remainder = FRAGMENT_SIZE % BUFSIZ;

    Why do you need this, and what is it for?

    3) fread( buff, 1 (*Why just 1?*), read_size(*Why so many times?*), in );
    Current projects:
    1) User Interface Development Kit (C++)
    2) HTML SDK (C++)
    3) Classes (C++)
    4) INI Editor (Delphi)

  13. #28
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    > What does that "L" mean?
    1000L means a long value, not a short or an int
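    (On a compiler with 16-bit ints, for instance, 10*1000*1000 wouldn't even fit in a plain int, so the L forces the multiplication to be done as a long. And judging from how it's used - FRAGMENT_SIZE / BUFSIZ, and the fread() sizes - yes, it's a size in bytes, roughly 10 MB per output file.)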

    > Why do you need this, and what is it for?
    Because
    buff = malloc( FRAGMENT_SIZE );
    will probably fail if you're trying to split a file into 100 MB blocks.
    So instead, use a fixed-size block, and work out how many times we need to read a full buffer (num_blocks) and the number of remainder bytes (remainder) to total FRAGMENT_SIZE:

    FRAGMENT_SIZE = num_blocks * BUFSIZ + remainder;
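    (If BUFSIZ happens to be 512, for example, that works out to num_blocks = 10000000 / 512 = 19531 full buffers and remainder = 10000000 % 512 = 128 bytes, and sure enough 19531 * 512 + 128 = 10000000.)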

    > 3) fread( buff, 1 (*Why just 1?*), read_size(*Why so many times?*), in );
    Read the manual
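    (The short version from the manual: fread( ptr, size, nmemb, stream ) reads up to nmemb items of size bytes each and returns how many complete items it actually read. With size set to 1 each item is a single byte, so the return value is simply the number of bytes read - which is what the code stores back into read_size.)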

  14. #29
    Code Warrior
    Join Date
    Nov 2001
    Posts
    669
    Thank you Salem and happy new year!
    Current projects:
    1) User Interface Development Kit (C++)
    2) HTML SDK (C++)
    3) Classes (C++)
    4) INI Editor (Delphi)
