Thread: File Splitter and Merger in C++

  1. #1
    Registered User
    Join Date
    Mar 2010
    Location
    pkr
    Posts
    32

    File Splitter and Merger in C++

    I want to do a project on this topic.
    What would I have to learn to complete it?
    Which topics should I cover?

    I have recently completed the book OOP in C++ by Robert Lafore.

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    How well you do on this project depends on how well you know file I/O, and this type of project can certainly teach you file I/O if you don't know it yet.

  3. #3
    Registered User
    Join Date
    Mar 2010
    Location
    pkr
    Posts
    32
    I think I do not know that much about file I/O then.
    It's not an assignment sort of thing; I just want to do it to learn. So where should I be heading now?

  4. #4
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Where must you be heading? There isn't a single answer; it depends on the resources you have. If you have a local library, check out a book on C++ programming and find the chapter(s) that discuss I/O and/or files. If you prefer the internet, you could search the forum for threads about file writing and reading, or read tutorials at cprogramming.com or elsewhere. If you go to school, you could ask a computer science teacher. All roads lead to Rome.

  5. #5
    Registered User
    Join Date
    Mar 2010
    Location
    pkr
    Posts
    32
    I made it this far. Please help me from here.


    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    
    using namespace std;
    int main(int argc, char** argv)
    {
    	int parts;
    	cout <<"Enter the number of parts";
    	cin >> parts;
    	ifstream bigfile;
    	bigfile.open(argv[1],ios::in | ios::binary);
    	int size=0;	
    	
       int achunck = size/parts;
       while( !bigfile.eof() )
    			{
    			char ch;
    			bigfile.get(ch);
    			cout <<ch;		
    			size++;
    			}
    	// seek the file to the beginning here
    	for (int i = 0; i < parts; i++)
    	{
    		// extract a chunk here and write it to the next file
    	}
    	
       return 0;
    }
    Last edited by meadhikari; 09-07-2010 at 09:21 PM.

  6. #6
    Registered User
    Join Date
    Mar 2010
    Location
    pkr
    Posts
    32
    I am having a problem using seekg to point to the start of the file and getting a chunk out of the file.

    Please help me with this.

  7. #7
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    If we look at the reference page for seekg(), we have three directions: ios_base::beg, ios_base::cur, and ios_base::end. According to that page, the direction is the position the offset is applied relative to.

    So it follows that a call:

    seekg(-5, ios_base::cur);

    would move five characters back from the current place in the file. So if you know exactly how much you've read, that is one way to achieve the desired effect. Looking at it another way, consider what applying an offset of zero to the beginning of the file means.

    Also, to read files there are several functions available, one of which is called read(). Once you know how read works I expect this program will be easy to complete.
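
    As a rough sketch of how seekg(), tellg(), and read() fit together: the helper below measures a file by seeking to its end, rewinds with an offset of zero from the beginning, and then reads the first few bytes. The name measure_and_read and its parameters are made up for illustration, and it assumes the file opens in binary mode.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: stores the file size in size_out (-1 if the file
// cannot be opened) and returns up to `count` bytes from the start.
std::vector<char> measure_and_read(const std::string& path, std::size_t count,
                                   std::streamoff& size_out) {
    std::vector<char> data;
    size_out = -1;
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return data;

    in.seekg(0, std::ios::end);   // offset 0 applied to the end of the file
    size_out = in.tellg();        // position at the end == size in bytes
    in.seekg(0, std::ios::beg);   // offset 0 applied to the beginning: rewind
    if (size_out <= 0 || count == 0) return data;

    data.resize(count);
    in.read(&data[0], static_cast<std::streamsize>(count));
    // read() may deliver fewer bytes than requested; gcount() says how many.
    data.resize(static_cast<std::size_t>(in.gcount()));
    return data;
}
```

    Calling measure_and_read("some.bin", 4, size) on a ten-byte file would leave size at 10 and return the first four bytes.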

  8. #8
    Registered User
    Join Date
    Mar 2010
    Location
    pkr
    Posts
    32
    The splitter is working, but there is a problem when the file is big.
    I receive an error saying segmentation fault.
    I am having a problem storing the size of a large file.
    What data type should the length be so that it can hold the size of a 1GB file in bytes?

    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <sstream>
    #include <cstdlib>
    
    using namespace std;
    int main(int argc, char** argv)
    {
    	int parts = atoi(argv[2]);
    	
    	ifstream bigfile;
    	bigfile.open(argv[1],ios::in | ios::binary);
    		
    	// get length of file:
      bigfile.seekg (0, ios::end);
      double length = bigfile.tellg();
      bigfile.seekg (0, ios::beg);
       int achunk = length/parts;
       char buffer [achunk];
       for (int i = 0; i < parts; i++)
    	{
    		//creating the file name
    		 //Build File Name
             string partname = argv[1];  //The original filename
             string charnum;     //archive number
             stringstream out;
             out << "." << i;
             charnum = out.str();
             partname.append(charnum);  //put the part name together
    		//chunking and writing to small files
    		bigfile.read (buffer,achunk);
    		ofstream smallfile;
    		smallfile.open(partname.c_str(),ios::out | ios::binary);
    		smallfile <<buffer;
    		smallfile.close();
    			
    	}
    	
       return 0;
    }

  9. #9
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Code:
       int achunk = length/parts;
       char buffer [achunk];
    In standard C++ you are only allowed to do that with constant expressions. Rather than declare an array and worry about where to store it and things like that, you should try adapting vector to this purpose.

    Code:
    vector<char> buffer (achunk);
    
    bigfile.read (&buffer[0], achunk);
    
    smallfile.write (&buffer[0], achunk);
    Something like that will work well. vector's storage is contiguous, like an array's, so an expression like &buffer[0] amounts to char *.
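
    To make that concrete, here is a minimal sketch of a vector used as the buffer between read() and write(). The function name copy_prefix is hypothetical. Note that write() is given the byte count reported by gcount(), so data containing zero bytes is copied intact, which operator<< on a char buffer would not do.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copies up to `count` bytes from the start of `from`
// into `to` and returns how many bytes were actually copied.
std::streamsize copy_prefix(const std::string& from, const std::string& to,
                            std::size_t count) {
    std::ifstream in(from.c_str(), std::ios::in | std::ios::binary);
    std::ofstream out(to.c_str(), std::ios::out | std::ios::binary);
    if (!in || !out || count == 0) return 0;

    std::vector<char> buffer(count);   // size chosen at run time
    in.read(&buffer[0], static_cast<std::streamsize>(count));
    const std::streamsize got = in.gcount();   // bytes actually read

    out.write(&buffer[0], got);        // write exactly what was read
    return out ? got : 0;
}
```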

    I think this resolves the segfault problem, but be sure to process files of 1GB or more in several smaller chunks, at least as an intermediate step. That isn't because the files themselves can't be big: if you try to commit too much memory to temporarily storing really big parts, you can run into allocation problems even if the RAM is physically there. Perhaps I'm being pessimistic. There are, for instance, platform-specific allocators that can request a lot more RAM. I expect you don't want to delve into that, though.
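
    One way to keep memory use small is to copy each part through a fixed-size buffer instead of holding a whole part in memory. The sketch below is only an illustration under some assumptions: the names split_file and part_name are invented, the 64 KiB block size is arbitrary, the length is kept in a long long so sizes beyond the range of int are representable, and the last part also receives the remainder of the division.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds "name.0", "name.1", ... as in the thread.
std::string part_name(const std::string& base, int index) {
    std::ostringstream out;
    out << base << "." << index;
    return out.str();
}

// Splits `path` into `parts` files, streaming through a small buffer.
bool split_file(const std::string& path, int parts,
                std::size_t block_size = 64 * 1024) {
    if (parts <= 0 || block_size == 0) return false;
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return false;

    in.seekg(0, std::ios::end);
    const std::streamoff end_pos = in.tellg();
    in.seekg(0, std::ios::beg);
    if (end_pos < 0) return false;

    const long long length = end_pos;
    const long long chunk = length / parts;
    std::vector<char> buffer(block_size);

    for (int i = 0; i < parts; ++i) {
        // Every part gets `chunk` bytes; the last one also takes the remainder.
        long long remaining =
            (i == parts - 1) ? length - chunk * (parts - 1) : chunk;
        std::ofstream out(part_name(path, i).c_str(),
                          std::ios::out | std::ios::binary);
        if (!out) return false;
        while (remaining > 0) {
            const std::streamsize want = static_cast<std::streamsize>(
                std::min<long long>(remaining,
                                    static_cast<long long>(buffer.size())));
            in.read(&buffer[0], want);
            const std::streamsize got = in.gcount();
            if (got <= 0) return false;      // unexpected end of input
            out.write(&buffer[0], got);
            if (!out) return false;
            remaining -= got;
        }
    }
    return true;
}
```

    With this shape, the buffer stays at one block no matter how large the input or the individual parts are.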

    You need to spend a moment to make your indentation straight. It's hard to read.

  10. #10
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Hmm....WinMerge and Araxis Merge come to my mind.

  11. #11
    Registered User
    Join Date
    Mar 2010
    Location
    pkr
    Posts
    32

    Splitter Finally Over but Merger has quite a few problems

    Splitter: it's working fine.
    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <sstream>
    #include <cstdlib>
    #include <vector>
    using namespace std;
    int main(int argc, char** argv)
    {
    	int parts = atoi(argv[2]);
    	
    	ifstream bigfile;
    	bigfile.open(argv[1],ios::in | ios::binary);
    		
    	// get length of file:
    	bigfile.seekg (0, ios::end);
    	int length = bigfile.tellg();
    	bigfile.seekg (0, ios::beg);
    	
    	int achunk = length/parts;
           vector<char> buffer (achunk);
       
           for (int i = 0; i < parts; i++)
    	{
    		 
    		 //Build File Name
    		 string partname = argv[1];
                     string charnum;     //archive number
                     stringstream out;
                     out << "." << i;
                     charnum = out.str();
                     partname.append(charnum);  //put the part name together
    		
    		//chunking and writing to small files
    		bigfile.read (&buffer[0], achunk);
    		ofstream smallfile;
    		smallfile.open(partname.c_str(),ios::out | ios::binary);
    		smallfile.write (&buffer[0], achunk);
    		smallfile.close();
    			
    	}
    	
       return 0;
    }
    Merger, with a few errors:
    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <sstream>
    #include <cstring>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>
    
    
    using namespace std;
    int main(int argc, char** argv)
    {
    	int parts = atoi(argv[2]);
    	
       
    	//obtaining the name of the main file
        string temp = argv[1];
        int lengthoftemp =temp.size();
        int lengthofbigfile = lengthoftemp-1;
        string  bigfilename = temp.substr(0,lengthofbigfile);
        
        //opening the main file 
        ifstream smallfile;
         smallfile.open(argv[1],ios::out | ios::binary); // bigfilename should be used here instead of argv[1], but why can't that be done?
    	
    	
        // get length of file:
        smallfile.seekg (0, ios::end);
        int length = smallfile.tellg();
        smallfile.seekg (0, ios::beg);
        vector<char> buffer (length);
        smallfile.close();
      
       
    	 for (int i = 0; i < parts; i++)
    	{
    		  
    		 
    		 //Build File Name for small files
    		 string partname = bigfilename;
                     string charnum;     //archive number
                     stringstream out;
                     out << i;
                     charnum = out.str();
                     partname.append(charnum);  //put the part name together
             
    		 // writing chunk to bigfile (am I conceptually wrong somewhere here?)
    		 ifstream smallfile;
    		 smallfile.open(partname.c_str(),ios::out | ios::binary);
    		 smallfile.read (&buffer[0], length);
    		 ofstream bigfile;
    		 bigfile.open(argv[1],ios::out | ios::binary);
    		 bigfile <<smallfile;
    		 smallfile.close();
    	}	
    	return 0;
    }
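
    For comparison, a minimal sketch of one way a merger could be arranged. It is not a fix of the code above line by line. It assumes the parts are named "name.0", "name.1", and so on, as the splitter produces them. It opens the output file once, before the loop, so it is not truncated on every pass, and it opens each part as an input stream. The name merge_files and the 64 KiB block size are assumptions for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: concatenates base.0 .. base.(parts-1) into `base`.
bool merge_files(const std::string& base, int parts,
                 std::size_t block_size = 64 * 1024) {
    if (parts <= 0 || block_size == 0) return false;

    // Open the output once; reopening it inside the loop would truncate it.
    std::ofstream out(base.c_str(),
                      std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) return false;

    std::vector<char> buffer(block_size);
    for (int i = 0; i < parts; ++i) {
        std::ostringstream name;
        name << base << "." << i;
        std::ifstream in(name.str().c_str(), std::ios::in | std::ios::binary);
        if (!in) return false;               // a part is missing

        while (in) {
            in.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));
            const std::streamsize got = in.gcount();
            if (got > 0) out.write(&buffer[0], got);  // append what was read
        }
        if (!out) return false;
    }
    return true;
}
```

    Calling merge_files("movie.avi", 3) would then read movie.avi.0, movie.avi.1, and movie.avi.2 in order and write their contents to movie.avi.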
