Thread: Extract Title from plain text file

  1. #1
    Registered User
    Join Date
    Apr 2008
    Posts
    122

    Extract Title from plain text file

    I am writing a program that works like Project Gutenberg: it takes plain text files as input and outputs them in HTML format. I am just beginning the project and have started by trying to extract the title from a file. What I know is that the title ALWAYS begins with "Title:" and only lasts for the line that "Title:" is on. I have written code, but it doesn't work (of course, that's why I came here!). To me, my code seems logically correct, but it just doesn't run. Any help would be appreciated.

    getTitle() function:

    Code:
    string Book::getTitle(string title)
    {
        size_t pos;
    
        pos = title.find("Title:");
    
        title = title.substr(pos);
        if(title.find("Title:"))
        {
            return title;
        }
        else
            return "";
    }
    Process book to store title in variable:

    Code:
    Book::Book(std::istream& in)
    {
        std::string title, line;
        string::size_type start = 0;
    
        getline(in, line);
        while(in)
        {
            if(title[start] == ' ')
            {
                getline(in, line);
            }
            else
            {
                title = getTitle(line);
                break;
            }
        }
    }

  2. #2
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    Code:
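    string Book::getTitle(string title)
    {
        size_t title_start;
        size_t title_end;
    
        title_start = title.find("Title:");
        if ( title_start == title.npos )
           return "";
    
        // Look for the line end only after the marker. If the line came from
        // std::getline() there may be no '\n' left, so take the rest instead.
        title_end = title.find("\n", title_start);
        if ( title_end == title.npos )
           return title.substr(title_start);
    
        // substr() takes (position, count), not (start, end).
        return title.substr(title_start, title_end - title_start);
    }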
    I may be wrong, but I think that is right.

    edit: I can't remember if getline discards the '\n' character or not.
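
    On the getline question: std::getline() extracts the delimiter from the stream but does not store it in the string. A minimal sketch to check this, where firstLine is a made-up helper name and not something from the code above:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: read the first line of `text` the same way the
// Book constructor reads from its stream.
std::string firstLine(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);   // stops at '\n' and drops it
    return line;
}

// True when the line returned by std::getline() still holds a '\n'.
bool keepsNewline(const std::string& text)
{
    return firstLine(text).find('\n') != std::string::npos;
}
```

    Calling keepsNewline("Title: This, should work\nThis is a test.\n") should give false, so searching a getline result for "\n" will normally come back as npos.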

  3. #3
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Hmm, it compiles and runs, but it doesn't print any title. It just prints a bunch of empty lines.

  4. #4
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    What does the file you are parsing look like? Then maybe I can come up with a better solution.

    edit: also, if the file isn't very big, you could just load the entire thing into memory and do the searches that way (by big I mean over 10 megabytes).
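
    A rough sketch of the load-it-all-into-memory idea, assuming the input is small enough to hold in one string. readAll and findTitleLine are illustrative names, not part of the Book class, and this version accepts "Title:" anywhere in the text rather than only at the start of a line:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read an entire stream into one string.
std::string readAll(std::istream& in)
{
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Hypothetical helper: return the text from the first "Title:" up to the
// end of that line, or "" when the marker does not occur.
std::string findTitleLine(const std::string& text)
{
    const std::string::size_type start = text.find("Title:");
    if (start == std::string::npos)
        return "";

    const std::string::size_type end = text.find('\n', start);
    // substr() takes (position, count); npos as the count means "to the end".
    const std::string::size_type count =
        (end == std::string::npos) ? std::string::npos : end - start;
    return text.substr(start, count);
}
```

    Usage would be something like findTitleLine(readAll(in)) on the stream passed to the constructor.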

  5. #5
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Well, I just made a really simple text file to test this theory, but the actual test files are going to be much, much bigger. Here is the test I am using:

    Code:
    dfghdfg
    dfghdfgh
    dfghdfgh
    Title: This, should work
    
    This is a test.
    We want to see what happens.
    Awesome.

  6. #6
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    Try changing this line
    Code:
    if(title[start] == ' ')
    to
    Code:
    if ( title.empty() )
    and use this getTitle instead
    Code:
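    // Returns by value: a const string& here would refer to the local
    // parameter (or a temporary), which is gone once the function returns.
    string Book::getTitle(string title)
    {
        size_t title_start;
    
        title_start = title.find("Title:");
    
        if ( title_start == title.npos  )
           return "";
        title = title.substr(title_start);
        return title;
    }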
    as I am pretty sure std::getline() discards the delimiting character.

  7. #7
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    It's still printing blanks

  8. #8
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    > It just prints a bunch of empty lines.
    Why do you call .find() for the same thing twice?

    Especially since you've already removed the string you're looking for.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  9. #9
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    I don't see where I do that?

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    pos = title.find("Title:");
    title = title.substr(pos);
    if(title.find("Title:"))
    These?

    Is the 2nd find ever going to succeed?

    Try a test file with
    Title: OK, does this Title: count
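
    To spell out why the second find never "succeeds": std::string::find returns a position, not a bool. A match at the very start is position 0, which an if treats as false, while "not found" is npos, which is non-zero and therefore true. A small sketch with made-up function names:

```cpp
#include <string>

// Mirrors the test in the original getTitle(): uses the returned position
// directly as a condition.
bool foundByTruthiness(const std::string& s, const std::string& needle)
{
    if (s.find(needle))   // 0 (match at start) is false, npos is true
        return true;
    return false;
}

// Compares against npos, which is how find() reports "not found".
bool foundByNpos(const std::string& s, const std::string& needle)
{
    return s.find(needle) != std::string::npos;
}
```

    For "Title: OK, does this Title: count" the first version reports false even though the line starts with the marker, and for a line with no "Title:" at all it reports true. Note also that calling substr(pos) with pos equal to npos throws std::out_of_range, so the check has to come before the substr call.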

  11. #11
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Quote Originally Posted by Salem View Post
    pos = title.find("Title:");
    title = title.substr(pos);
    if(title.find("Title:"))
    These?

    Is the 2nd find ever going to succeed?

    Try a test file with
    Title: OK, does this Title: count
    Ahh that's right. Sorry, simple mistake. Got it working! Thanks a lot.
