Thread: Best way to Reverse a file

  1. #1
    Registered User
    Join Date
    Aug 2005
    Posts
    113

    Best way to Reverse a file

    How to reverse a text file character by character that is last character of file must be first character of output file.

    Though I did it by taking the text of whole file in character array and then applying string reverse function and then writing reversed variable in new file. But I think this is not a good approach since some files can be pretty large so that they cannot be stored in a variable.

  2. #2
    Registered User
    Join Date
    Aug 2005
    Posts
    1,267
    you can create a second temp file, read the original file backwards and store each character read into the temp file. When finished, delete the original file and rename the temp file to the name of the original file.

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > But I think this is not a good approach since some files can be pretty large
    So like all problems with large files (eg sorting), you apply the tranformation on parts of the file, producing tmp1, tmp2, tmp3 etc, then you combine the results to produce your final output.

    Or you could just treat the whole file as a big array and mimic arr[n] accesses with appropriate calls to seek(), get() and put()
    Assuming you know how to reverse an array that is.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >What about a deque template STL type. Read it into that and you can than read it into a new file in reverse fashion.
    Why a deque? Sure, it would work (sort of), but so would any other container that supports reverse iterators. I say sort of because even if your container grows to meet its needs, the file might be larger than the available fast memory. You'll end up getting a lot of thrashing, and the program would be painfully slow. That's a big reason why Salem's suggestion is so common. If you handle the problem in smaller pieces you can decrease the chances of slowdowns like that.
    My best code is written with the delete key.

  5. #5
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Um...like load the file into a buffer, reverse the buffer, and save the buffer to the file.

  6. #6
    Registered User
    Join Date
    May 2006
    Posts
    903
    I'd do this:
    Code:
    std::ifstream file("myfile.txt");
    std::string tmp;
    std::stringstream file_content;
    
    while(std::getline(std::cin, tmp, '\n'))
    {
      file_content << tmp;
      tmp.clear();
    }
    
    std::reverse(file_content.begin(), file_content.end());
    
    std::ofstream new_file("newfile.txt");
    new_file << file_content.str();

  7. #7
    Registered User
    Join Date
    Aug 2005
    Posts
    113
    Can We write something to beginning of a file?

  8. #8
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Quote Originally Posted by Salem
    Or you could just treat the whole file as a big array and mimic arr[n] accesses with appropriate calls to seek(), get() and put()
    And then call the resulting executable "die_hard_disk.exe"

    Nah, Salem's first solution is best. Just read the file in chunks of some megabytes (ideally, you'd adapt the size to the amount of RAM the current computer has - say, read in chunks of 1/100th of the RAM), reverse those, and write them back to another file, then at the end delete the old and rename the new.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Why so?
    Every single seek / get / put doesn't imply a disk access. Any reasonable level of caching should be able to keep at least two sectors-worth of the file in memory.

    Time for an experiment, if there are any volunteers (vaibhav).
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  10. #10
    Registered User
    Join Date
    Aug 2005
    Posts
    1,267
    Ok here is a little quickt in C (I don't know how to do it using c++ fstream class)
    Code:
    #include <stdio.h>
    
    int main(int argc, char* argv[])
    {
    	char iobuf[BUFSIZ];
    	FILE* in = fopen("infile.txt","rb");
    	if(in == NULL)
    	{
    		printf( "cannot open input file\n");
    		return 1;
    	}
    	FILE* out = fopen("outfile.txt","wb");
    
    	fseek(in,0,SEEK_END);
    	unsigned long file_size = ftell(in);
    	unsigned long offset = file_size;
    	unsigned long nBytesToRead = BUFSIZ;
    	if(offset > BUFSIZ)
    		offset -= BUFSIZ;
    	else
    		offset = 0;
    	while(nBytesToRead > 0)
    	{
    		fseek(in,offset,SEEK_SET);
    		size_t nBytes = fread(iobuf,1,nBytesToRead,in);
    		char* ptr = iobuf + nBytes;
    		while(nBytes > 0)
    		{
    			putc(*ptr,out);
    			ptr--;
    			nBytes--;
    		}
    		if(offset > BUFSIZ)
    			offset -= BUFSIZ;
    		else if(offset > 0)
    		{
    			// read remainder of file
    			nBytesToRead = offset;
    			offset = 0;
    		}
    		else
    			nBytesToRead  = 0; // all done folks!
    		
    	}
    	fclose(in);
    	fclose(out);
    
    	return 0;
    }
    Last edited by Ancient Dragon; 07-31-2006 at 11:55 AM.

  11. #11
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Byte-by-byte reversal as suggested by Salem.

    Code:
    #include <stdio.h>
    
    int reverse_file(const char *filename);
    
    int main(int argc, char **argv)
    {
    	if(argc < 2) {
    		fprintf(stderr, "Usage: reverse <file>\n");
    		return 1;
    	}
    
    	return reverse_file(argv[1]);
    }
    
    int reverse_file(const char *filename)
    {
    	FILE *file;
    	long front = 0, back = -1, halflength;
    
    	file = fopen(filename, "r+b");
    	fseek(file, 0, SEEK_END);
    	halflength = ftell(file) / 2;
    	rewind(file);
    
    	for(; front < halflength; ++front, --back) {
    		int t1, t2;
    
    		fseek(file, front, SEEK_SET);
    		t1 = fgetc(file);
    		if(t1 == EOF) {
    			fclose(file);
    			return 1;
    		}
    
    		fseek(file, back, SEEK_END);
    		t2 = fgetc(file);
    		if(t2 == EOF) {
    			fclose(file);
    			return 1;
    		}
    
    		fseek(file, back, SEEK_END);
    		if(fputc(t1, file) == EOF) {
    			fclose(file);
    			return 1;
    		}
    
    		fseek(file, front, SEEK_SET);
    		if(fputc(t2, file) == EOF) {
    			fclose(file);
    			return 1;
    		}
    	}
    
    	return 0;
    }
    I killed it after it took 20 minutes to reverse not even 10 MB of a 100MB file.

    I'll write a blockwise reverser now.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  12. #12
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Here's the block and temporary file version. This one uses a 1MB block.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    #define BLOCK (1024 * 1024)
    #define REVERSETMP "tmp_reverse"
    
    int reverse_file(const char *filename);
    void reverse_mem(unsigned char *mem, size_t size);
    
    int main(int argc, char **argv)
    {
    	if(argc < 2) {
    		fprintf(stderr, "Usage: reverse <file>\n");
    		return 1;
    	}
    
    	return reverse_file(argv[1]);
    }
    
    int reverse_file(const char *filename)
    {
    	FILE *file;
    	FILE *temp;
    	int tempfd;
    	long size, offset;
    	unsigned char *buffer;
    
    	file = fopen(filename, "rb");
    	if(!file) {
    		return 1;
    	}
    	temp = fopen(REVERSETMP, "wb");
    	if(!temp) {
    		fclose(file);
    		return 1;
    	}
    	buffer = malloc(BLOCK);
    	if(!buffer) {
    		fclose(file);
    		fclose(temp);
    		return 1;
    	}
    
    	fseek(file, 0, SEEK_END);
    	size = ftell(file);
    
    	for(offset = size - BLOCK; offset >= 0; offset -= BLOCK) {
    		fseek(file, offset, SEEK_SET);
    		fread(buffer, 1, BLOCK, file);
    
    		if(ferror(file)) {
    			fclose(file);
    			fclose(temp);
    			free(buffer);
    			return 1;
    		}
    
    		reverse_mem(buffer, BLOCK);
    
    		fwrite(buffer, 1, BLOCK, temp);
    
    		if(ferror(temp)) {
    			fclose(file);
    			fclose(temp);
    			free(buffer);
    			return 1;
    		}
    	}
    
    	/** There may remain a bit of the file. */
    	if(offset < 0) {
    		offset += BLOCK;
    		rewind(file);
    		fread(buffer, 1, offset, file);
    
    		if(ferror(file)) {
    			fclose(file);
    			fclose(temp);
    			free(buffer);
    			return 1;
    		}
    
    		reverse_mem(buffer, offset);
    
    		fwrite(buffer, 1, offset, temp);
    
    		if(ferror(temp)) {
    			fclose(file);
    			fclose(temp);
    			free(buffer);
    			return 1;
    		}
    	}
    
    	fclose(file);
    	fclose(temp);
    	free(buffer);
    
    	unlink(filename);
    	rename(REVERSETMP, filename);
    
    	return 0;
    }
    
    void reverse_mem(unsigned char *mem, size_t size)
    {
    	unsigned char *start = mem, *end = mem + size - 1;
    	unsigned char t;
    
    	for(; start < end; ++start, --end) {
    		t = *start;
    		*start = *end;
    		*end = t;
    	}
    }
    Takes less than 2 seconds for the 100MB file.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  13. #13
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Finally, for what it's worth, here's the program I used to generate the test files.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int generate_file(const char *filename, long size);
    
    int main(int argc, char **argv)
    {
    	if(argc < 3) {
    		fprintf(stderr, "Usage: generate <name> <size in kb>\n");
    		return 1;
    	}
    
    	return generate_file(argv[1], atol(argv[2]) * 1024L);
    }
    
    int generate_file(const char *filename, long size)
    {
    	FILE *file;
    	unsigned char value = 0;
    
    	file = fopen(filename, "wb");
    	if(!file) {
    		return 1;
    	}
    
    	for(; size > 0; --size) {
    		fputc(value++, file);
    	}
    
    	return 0;
    }
    It simply creates a file of the specified size that cycles through the byte values.

    A final note on these programs. All of them are, except for the failure program exit code which should be EXIT_ERROR (or whatever that constant is called) instead of 1, 100% ANSI C. They should compile and run just about anywhere that is capable of allocating 1MB of continuous memory. They are also (in theory) capable of handling files up to 2^63-1 bytes long, if compiled on a system where long is 64-bit.
    Last edited by CornedBee; 07-31-2006 at 02:02 PM.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  14. #14
    Registered User
    Join Date
    Aug 2005
    Posts
    1,267
    I used Cornedbee's program to geenrate a 10M (10,000K) file. My program processed it in under 1/2 seconds (405 clock ticks using clock()).
    [edit]100Mg (100,000Kb) file in 13 seconds[/edit]
    Last edited by Ancient Dragon; 07-31-2006 at 03:41 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Encryption program
    By zeiffelz in forum C Programming
    Replies: 1
    Last Post: 06-15-2005, 03:39 AM
  2. Simple File encryption
    By caroundw5h in forum C Programming
    Replies: 2
    Last Post: 10-13-2004, 10:51 PM
  3. Unknown Memory Leak in Init() Function
    By CodeHacker in forum Windows Programming
    Replies: 3
    Last Post: 07-09-2004, 09:54 AM
  4. Making a LIB file from a DEF file for a DLL
    By JMPACS in forum C++ Programming
    Replies: 0
    Last Post: 08-02-2003, 08:19 PM
  5. Need a suggestion on a school project..
    By Screwz Luse in forum C Programming
    Replies: 5
    Last Post: 11-27-2001, 02:58 AM