Thread: Search through file

  1. #1
    Registered User
    Join Date
    Aug 2007
    Posts
    66

    Search through file

    I want to write a program that reads the text file link.txt
    and then looks for the phrase "video_id=".

    Imagine that link.txt contains something like this:
    dSDJSD723jsd843jsd(#$7dkhs=sd=342video_id=YfDHTDhfd/sds093hjf_dfs';

    I want to output the text AFTER "video_id=".
    For example, the output for the line above would be:
    YfDHTDhfd/sds093hjf_dfs

    NOTE: without the trailing ';.

    How can I write this as a real C++ program?

  2. #2
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    Well, it would help to know where, approximately, this "video_id" will be. This way, you could read in chunks of a certain size knowing that you will cover the whole string "video_id" entirely within one of the chunks. You also could read the entire file into memory, depending on how big it is.
    Is this a video file? If so, then you will need to account for the binary nature of the file, and, of course, its size.

    Say you know that the file is arranged in 1 KB packets of information. Then you can read 1 KB at a time and be assured that you won't split the string "video_id=".
    Code:
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <iostream>
    
    int main()
    {
        std::ifstream fi("file path",std::ios::binary);
        const unsigned bfsz = 1024;
        char* buf = new char[bfsz];
        const std::string tag("video_id=");
    
        while(fi.read(buf,bfsz))
        {
            std::string s(buf);
            std::size_t vipos = s.find(tag);
            if(vipos == std::string::npos)
                continue;
    
            for(unsigned i = 0; i < (bfsz - vipos+tag.size()); ++i)
                fi.unget();
    
            fi.read(buf,bfsz);
            std::istringstream iss(std::string(buf));
            std::getline(iss, s, ';'); //this line produces an error on my compiler, but I think it's standard...
            std::cout << "This is the id: " << s << std::endl;
            delete [] buf;
            return 0;
        }
        std::cout << '\"' << tag << "\" not found in file." << std::endl;
        delete [] buf;
    }
    This is just the idea:
    Code:
    while(read packet)
    {
       look for video_id
       if not found, skip the rest of loop
       
       find out where the '=' is, and "go back" in the file to the char right after '='
       read until you encounter a ';' but discard the ';'
       print what you read
       exit program
    }
    getting here means we didn't exit program, so we didn't find video_id
    exit program
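
    The outline above can be sketched roughly as follows. This is not the code from the post, only one possible reading of it. It assumes the tag may straddle two chunks, so it keeps the last tag.size() - 1 characters of each chunk before reading the next one. It also uses gcount() so that a short final chunk is still searched. The name find_after_tag and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: scans `in` in chunks of `chunk_size` bytes and stores in
// `out` the text between the first occurrence of `tag` and the next `stop`
// character (or the end of input). Returns false if `tag` never appears.
bool find_after_tag(std::istream& in, const std::string& tag, char stop,
                    std::size_t chunk_size, std::string& out)
{
    assert(!tag.empty() && chunk_size > 0);
    std::vector<char> buf(chunk_size);
    std::string window;  // unsearched tail of earlier chunks plus the current chunk
    bool found = false;

    while (true) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = in.gcount();  // valid even after a short read
        if (got <= 0)
            break;
        window.append(buf.data(), static_cast<std::size_t>(got));

        if (!found) {
            const std::size_t pos = window.find(tag);
            if (pos == std::string::npos) {
                // Keep only the characters that could still begin a split tag.
                if (window.size() >= tag.size())
                    window.erase(0, window.size() - (tag.size() - 1));
                continue;
            }
            found = true;
            window.erase(0, pos + tag.size());  // drop everything up to and including the tag
        }

        const std::size_t end = window.find(stop);
        if (end != std::string::npos) {
            out = window.substr(0, end);  // the stop character itself is discarded
            return true;
        }
    }

    if (found) {  // tag seen, but the input ended before a stop character
        out = window;
        return true;
    }
    return false;
}
```

    With an std::ifstream opened on link.txt, a call such as find_after_tag(file, "video_id=", '\'', 1024, result) would fill result with the text after the tag, assuming the value ends at a single quote.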
    Last edited by CodeMonkey; 02-22-2009 at 04:00 PM.
    "If you tell the truth, you don't have to remember anything"
    -Mark Twain

  3. #3
    Registered User
    Join Date
    Aug 2007
    Posts
    66
    for the real thing the link.txt contains the following:

    The link.txt file :
    var fullscreenUrl = '/watch_fullscreen?video_id=UHvAXLBMWGI&l=51&t=OEgsToPDskKBFLkte03dOnia39Rzd63j&sk=7Ayx8rGROMLJrkh6LkXj7AC&fs=1&title=CompizFusion';
    I want to isolate:
    UHvAXLBMWGI&l=51&t=OEgsToPDskKBFLkte03dOnia39Rzd63j&sk=7Ayx8rGROMLJrkh6LkXj7AC&fs=1&title=CompizFusion

  4. #4
    Registered User
    Join Date
    Aug 2007
    Posts
    66
    Thanks, my friend.

    Unfortunately, that line produces an error with my compiler too...
    Can you fix this?

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Two issues:
    1) stringstream does not appear to have a constructor that takes just a string, or at least gcc reports that iss is not a stringstream object but a function taking a string and returning a stringstream. Changing to
    Code:
    std::stringstream iss(std::string(buf), std::ios_base::out|std::ios_base::in);
    worked wonders.
    2) I changed istringstream to stringstream just because I thought it looked weird to read out of an istringstream. It may well work with an istringstream as well.
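
    For what it's worth, the gcc message in point 1 matches what is usually called the "most vexing parse". A statement of the form std::istringstream iss(std::string(buf)); can be read as a declaration of a function named iss that takes a std::string parameter named buf, and the compiler is required to read it that way. Adding a second constructor argument, as above, removes the ambiguity. An extra pair of parentheses does so too. A minimal sketch, where first_field is a hypothetical helper used only for illustration:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text of `buf` up to (not including)
// the first `delim`, or all of `buf` if `delim` does not occur.
std::string first_field(const char* buf, char delim)
{
    assert(buf != nullptr);

    // std::istringstream iss(std::string(buf));  // declares a function named iss
    std::istringstream iss((std::string(buf)));   // extra parentheses: an object

    std::string field;
    std::getline(iss, field, delim);  // reads up to delim and discards delim
    return field;
}
```

    Passing buf directly, as in std::istringstream iss(buf);, should also work here, since a std::string can be constructed implicitly from a const char*.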

  6. #6
    Registered User
    Join Date
    Aug 2007
    Posts
    66
    If the program is this:

    Code:
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <iostream>
    
    int main()
    {
        std::ifstream fi("link.txt",std::ios::binary);
        const unsigned bfsz = 1024;
        char* buf = new char[bfsz];
        const std::string tag("video_id=");
    
        while(fi.read(buf,bfsz))
        {
            std::string s(buf);
            std::size_t vipos = s.find(tag);
            if(vipos == std::string::npos)
                continue;
    
            for(unsigned i = 0; i < (bfsz - vipos+tag.size()); ++i)
                fi.unget();
    
            fi.read(buf,bfsz);
            std::stringstream iss(std::string(buf), std::ios_base::out|std::ios_base::in);
            std::getline(iss, s, ';'); //this line produces an error on my compiler, but I think it's standard...
            std::cout << "This is the id: " << s << std::endl;
            delete [] buf;
            return 0;
        }

    and in the same directory I have the link.txt which is:
    panosg89@siduxbox:~/scripts$ cat link.txt
    Code:
                    var fullscreenUrl = '/watch_fullscreen?fs=1&fexp=52460&plid=AARjhkSgdRDLA7PmAAAAoAQcYQA&iv_storage_server=http%3A%2F%2Fwww.google.com%2Freviews%2Fy%2F&creator=tearbd2&video_id=xC5uEe5OzNQ&l=272&sk=rRLOCm6jr8EoL24c2OEYVJLeztH3gxOUC&fmt_map=34%2F0%2F9%2F0%2F115&t=vjVQa1PpcFMTYcmsyIsJHMSgPFLendWhhpUseGsZwWI%3D&hl=en&cr=US&vq=None&iv_module=http%3A%2F%2Fs.ytimg.com%2Fyt%2Fswf%2Fiv_module-vfl79529.swf&title=WINDOWS VISTA AERO VS LINUX UBUNTU BERYL';
    I get output saying that video_id was NOT found...

  7. #7
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Note that the code above will always print that the tag is not found. The important part is, does it print something up above it?

    Now, the answer is probably no, because read will set the failbit even if it read something, if it didn't read the full 1K. So you should probably change that while loop into a do-while loop.

    Admittedly, when I did that, it didn't print the actual video_id, because the read failed, because the failbit was still set. So look at this as an idea, and not the solution itself, and we'll all be much happier.
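
    To illustrate the failbit point: when read() reaches end-of-file before filling the buffer, it sets both eofbit and failbit, but the bytes that did arrive are in the buffer and gcount() reports how many there are. While failbit is set, a later seekg() or unget() does nothing until the flags are reset with clear(). A small sketch, where read_chunk is a hypothetical helper and not code from this thread:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to `max_bytes` from `in` and returns exactly
// the bytes that were read, even if the stream reports failure afterwards.
std::string read_chunk(std::istream& in, std::size_t max_bytes)
{
    assert(max_bytes > 0);
    std::vector<char> buf(max_bytes);
    in.read(buf.data(), static_cast<std::streamsize>(buf.size()));

    // gcount() is the number of characters extracted by the last read(),
    // which may be fewer than requested when end-of-file is reached.
    return std::string(buf.data(), static_cast<std::size_t>(in.gcount()));
}
```

    Building the string from the pointer and the count also avoids relying on a terminating '\0', which read() never writes.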

  8. #8
    Registered User
    Join Date
    Aug 2007
    Posts
    66
    Well, all I need is a C++ program to do the job, one that outputs:
    Code:
    xC5uEe5OzNQ&l=272&sk=rRLOCm6jr8EoL24c2OEYVJLeztH3gxOUC&fmt_map=34%2F0%2F9%2F0%2F115&t=vjVQa1PpcFMTYcmsyIsJHMSgPFLendWhhpUseGsZwWI%3D&hl=en&cr=US&vq=None&iv_module=http%3A%2F%2Fs.ytimg.com%2Fyt%2Fswf%2Fiv_module-vfl79529.swf&title=WINDOWS VISTA AERO VS LINUX UBUNTU BERYL';
    I simply ask: can you write it?
    I can't write it myself, because I do not have the knowledge that you have.

  9. #9
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    I can make it -- I fixed my stupid error, which was not the fault of failbit but me instead.

    I suppose we can discuss further if your company would pay my invoice for your on-the-job training; otherwise you should be able to take the very very very large hints provided here and turn out a working program.

  10. #10
    Registered User
    Join Date
    Aug 2007
    Posts
    66
    I am not employed... I am a university student...
    I am running out of time here because I have to submit this to my teacher on Tuesday morning.

    I know that you are helping me with your advice, but I can't find the free time to study C++ and file I/O in order to sit down and understand the code. But I will definitely do it sometime in the future.

    Anyway... this is a good forum, and I think there will be others with a more open-source mindset who will give me the working source code without asking for money.

  11. #11
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by BlackSlash12
    I am not employed... I am a university student...
    I am running out of time here because I have to submit this to my teacher on Tuesday morning.

    I know that you are helping me with your advice, but I can't find the free time to study C++ and file I/O in order to sit down and understand the code. But I will definitely do it sometime in the future.

    Anyway... this is a good forum, and I think there will be others with a more open-source mindset who will give me the working source code without asking for money.
    Heh. I was hoping that you weren't trying to cheat your way through school. Guess I was wrong.

    (Edit: And it's not like I would have gotten the money either -- it would have gone to the monkey.)
    Last edited by tabstop; 02-22-2009 at 06:03 PM.

  12. #12
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    Why do people at my university graciously take my help in passing their C class, and then drop it anyway?

    Ah -- unwillingness to think.

    Edit: Yes, now that I look at it again that code I wrote was quite flawed. All of the functions and methods you'd need are there, though. Except perhaps for the do-while tabstop mentioned.
    Last edited by CodeMonkey; 02-22-2009 at 06:30 PM.

  13. #13
    Registered User
    Join Date
    Aug 2007
    Posts
    66
    Thank you, guys. I finally got it working:

    Code:
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <iostream>
    using namespace std;
    
    int main()
    {
        char FileName[] = "link.txt";
        ifstream myFile;    // create object myFile from the ifstream class
        myFile.open (FileName, ios::binary );
        if (!myFile)
        {
            std::cout << "Cannot open " << FileName << std::endl;
            return 1;
        }
        const string sTag("video_id=");
    
        // Get length of file
        myFile.seekg (0, ios::end);
        const streamsize length = myFile.tellg();
        myFile.seekg (0, ios::beg);
        
        // allocate memory
        char *buffer = new char [length];
    
        myFile.read(buffer,length); // the contents of the file are now in buffer
        
        string s(buffer, myFile.gcount()); // copy exactly the bytes read; buffer is not '\0'-terminated
        size_t vipos = s.find(sTag); // vipos holds the offset of sTag from the start, or npos
        if (vipos == string::npos)
        {
            std::cout << '\"' << sTag << "\" not found in file " << FileName << std::endl;
            delete [] buffer;
            return 1;
        }
    
        myFile.clear(); // reset any error flags before repositioning
        myFile.seekg(vipos + sTag.size(), ios::beg); // go back to the first character after sTag
    
        myFile.read(buffer,length); // short read: sets failbit, but gcount() still has the count
        std::stringstream iss(std::string(buffer, myFile.gcount()), std::ios_base::in);
        std::getline(iss, s, '\''); // read up to the closing quote and discard it
        std::cout <<s;
        delete [] buffer;
        return 0;
    }

  14. #14
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    What works, works.

    Still, since you read the entire file into memory in one go, you don't need to do all of that rewinding that I wrote.

    Instead, you could use std::string::substr() and get your result in one line, without having to refer to the file again, as you now do.
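
    A minimal sketch of that substr() idea, assuming the whole file fits in memory and that the value ends at the next single quote. The names text_after_tag and read_file are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text between the first `tag` and the next
// `stop` character in `text`. Returns an empty string if `tag` is absent, and
// everything after `tag` if `stop` never occurs.
std::string text_after_tag(const std::string& text, const std::string& tag, char stop)
{
    assert(!tag.empty());
    std::size_t start = text.find(tag);
    if (start == std::string::npos)
        return std::string();
    start += tag.size();  // first character after the tag

    const std::size_t end = text.find(stop, start);
    const std::size_t count =
        (end == std::string::npos) ? std::string::npos : end - start;
    return text.substr(start, count);
}

// Hypothetical helper: reads an entire file into a string in one go.
// Yields an empty string if the file cannot be opened.
std::string read_file(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}
```

    The main program then reduces to something like std::cout << text_after_tag(read_file("link.txt"), "video_id=", '\''); with no need to reposition or re-read the file.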

    edit: You have cut and paste syndrome.
