Thread: Reading Multiple Multi-Part Strings From a Text File

  1. #1
    Registered User
    Join Date
    Sep 2007
    Posts
    127

    Reading Multiple Multi-Part Strings From a Text File

    Hi,

    I made a simple program that reads 3 names from a text file, each of which has a first name, middle name and surname. I think there would be a few ways to improve the program, although I'm not sure how to go about it. So I have a couple of questions, if anyone could help:

    1. When I create the three vectors of strings, if I don't start off by putting something into them, the program crashes when it's run (lines 19-21). How come? Is there any way I can save code by not having to put something in them when I create them?

    2. Is there any way of avoiding having to declare three separate vector<string> variables? Would it be possible to do this using just a vector of vectors? Or could you do it using one vector that's just 3 x 3 strings? Would a list of vectors be better?

    Thanks guys.


    In the code, the file "names.dat" is just a text file with the following in:

    Arnold Adam Andrews
    Brian Barry Buttfield
    Colin Charles Coleman

    Code:
    #include <iostream>
    #include <string>
    #include <vector>
    #include <fstream>
    #define FIRSTNAME 0
    #define MIDNAME 1
    #define SURNAME 2
    using namespace std;
    
    ifstream file_in("names.dat");
    
    int main()
    {
        char charname[99];
        string string_from_file;
        int pos;
        int num_of_lines = 0;
        int typeofname = 0;
        vector<string> first_names;
        vector<string> middle_names;
        vector<string> last_names;
        first_names.push_back(string_from_file);
        middle_names.push_back(string_from_file);
        last_names.push_back(string_from_file);
    
        while(!file_in.eof())   // Get the names from the names.dat file.
        {
            file_in.getline(charname, 99);
            pos = 0;
            while (typeofname <= SURNAME)
            {
                string_from_file[string_from_file.size()] = NULL;
                string_from_file.erase(0, 99);
                while (((typeofname<=MIDNAME) && (charname[pos] != ' ')) || ((typeofname==SURNAME) && (charname[pos])))
                {
                    string_from_file += charname[pos];
                    pos++;
                }
                pos++;
                if (typeofname == FIRSTNAME)
                    first_names.push_back(string_from_file);
                else if (typeofname == MIDNAME)
                    middle_names.push_back(string_from_file);
                else if (typeofname == SURNAME)
                    last_names.push_back(string_from_file);
                typeofname++;
            }
            num_of_lines++;
            typeofname = 0;
        }
        for (typeofname = 0; typeofname <= SURNAME; typeofname++)   // Output the names by type.
        {
            pos = 0;
            while (pos <= num_of_lines)
            {
                if (typeofname == FIRSTNAME)
                    cout << first_names[pos] << endl;
                else if (typeofname == MIDNAME)
                    cout << middle_names[pos] << endl;
                else if (typeofname == SURNAME)
                    cout << last_names[pos] << endl;
                pos++;
            }
        }
    }

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Ok, so I must confess to the lack of reading skills: I started reviewing the code, then read the questions. So first the "code review comments" - which actually re-asks/answers your questions in part:


    Code:
    #define FIRSTNAME 0
    #define MIDNAME 1
    #define SURNAME 2
    Why is this #define, and not an enum?

    Code:
        int typeofname = 0;
    Not = FIRSTNAME? (Same other places?)

    Code:
                string_from_file[string_from_file.size()] = NULL;
    Seems extremely fishy.

    Code:
                string_from_file.erase(0, 99);
    Why not?
    Code:
        string_from_file.clear();
    (It also works if your string should ever happen to be longer than 99, for example).

    Code:
        first_names.push_back(string_from_file);
        middle_names.push_back(string_from_file);
        last_names.push_back(string_from_file);
    What exactly are you pushing back here, before you've read from the file?

    Code:
            pos = 0;
            while (pos <= num_of_lines)
            {
                if (typeofname == FIRSTNAME)
                    cout << first_names[pos] << endl;
                else if (typeofname == MIDNAME)
                    cout << middle_names[pos] << endl;
                else if (typeofname == SURNAME)
                    cout << last_names[pos] << endl;
                pos++;
            }
    Why not a for-loop.
    And think about how many loop iterations you actually do here. Just walk through the code for example imagining that num_of_lines is 4. How many lines get printed, and how much data do you expect to have in such a case? [Answers Q1]

    But overall: Why do you use getline, when it would make more sense to use the >> operator? That way, you could get rid of the entire while-loop in the middle of the code [the loop splitting the string] - which by the way could probably be simplified by just doing "charname[pos] != ' ' && charname[pos]" - if you reach end of string BEFORE you get to surname, it's probably wrong anyways, but currently, you don't even detect that scenario, and traipse on to the next line. If someone put "Matthew Mailer" (no middle name) in your file, you would then go on reading whatever rubbish happens to be after "\0" in memory - this will definitely lead to bad data output if it's not leading to a crash.

    Further, is it part of your assignment to use three parallel arrays, rather than a single array containing a struct/class with first, middle and last name? [Sort of answer to Q2]

    And on a pedantic note: perhaps you want to decide whether you use the term surname or last name - currently you have both in your code.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Sep 2007
    Posts
    127
    Ok, so I must confess to the lack of reading skills: I started reviewing the code, then read the questions. So first the "code review comments" - which actually re-asks/answers your questions in part:


    Code:

    #define FIRSTNAME 0
    #define MIDNAME 1
    #define SURNAME 2

    Why is this #define, and not an enum?
    Well I've been learning from this book, C++ Without Fear, and the guy never used enum in it. But thanks for the advice, I'll change it. I looked at a tutorial about enum and from that, it seems like I'm supposed to add this to the start of the program:

    Code:
    enum nametypes_t {FIRSTNAME = 0, MIDNAME = 1, SURNAME = 2};
    Is that all I should change with this? I also tried changing typeofname to an enum but the compiler told me I can't use ++ with it in the loops.

    Code:

    int typeofname = 0;

    Not = FIRSTNAME? (Same other places?)
    Yeah it should be that's a mistake. Oops.

    Code:

    string_from_file[string_from_file.size()] = NULL;

    Seems extremely fishy.

    Code:

    string_from_file.erase(0, 99);

    Why not?
    Code:

    string_from_file.clear();

    (It also works if your string should ever happen to be longer than 99, for example).
    Ok, cool, thanks. I'll try that.


    Code:

    first_names.push_back(string_from_file);
    middle_names.push_back(string_from_file);
    last_names.push_back(string_from_file);

    What exactly are you pushing back here, before you've read from the file?
    Well, to be honest I'm not sure why that's necessary but like I said if I don't do that it crashes when it runs. I'm not really sure why I tried doing that there in the first place but it just seemed to make it work. Of course if there's a better way to do it I'd rather learn that.


    Code:

    pos = 0;
    while (pos <= num_of_lines)
    {
    if (typeofname == FIRSTNAME)
    cout << first_names[pos] << endl;
    else if (typeofname == MIDNAME)
    cout << middle_names[pos] << endl;
    else if (typeofname == SURNAME)
    cout << last_names[pos] << endl;
    pos++;
    }

    Why not a for-loop.
    Ok, makes sense.

    And think about how many loop iterations you actually do here. Just walk through the code for example imagining that num_of_lines is 4. How many lines get printed, and how much data do you expect to have in such a case? [Answers Q1]
    Sorry, I don't mean to sound stupid, but I'm not sure what you mean. If num_of_lines was 4 there would be 4 different sets of names to print out so surely it would all print out ok?

    But overall: Why do you use getline, when it would make more sense to use the >> operator? That way, you could get rid of the entire while-loop in the middle of the code [the loop splitting the string] - which by the way could probably be simplified by just doing "charname[pos] != ' ' && charname[pos]" - if you reach end of string BEFORE you get to surname, it's probably wrong anyways, but currently, you don't even detect that scenario, and traipse on to the next line. If someone put "Matthew Mailer" (no middle name) in your file, you would then go on reading whatever rubbish happens to be after "\0" in memory - this will definitely lead to bad data output if it's not leading to a crash.
    I used getline because I'm experimenting with stuff that will let me read more complicated text than 3 names separated by spaces. I'm learning to program so that I can eventually make some kind of chat bot type things for games. Basically the dialogue might end up being something like the following, with * being replaced by a bot's name:

    Hey, *, how's it going?
    *, how you doing?
    Hi, *. Good to see you. How's things?

    So for each line I'd end up getting a string for part 1 which would be upto the *, and then another string for part 2, which would be from the * to the end of the line. And then for each bot you could replace the * with the name later. Ultimately I'd make a load of different lines like that, having them stored in a text file, so that you could use combinations of different lines to make conversations up. I used names in the example program just because it was simpler.

    Further, is it part of your assignment to use three parallel arrays, rather than a single array containing a struct/class with first, middle and last name? [Sort of answer to Q2]
    It doesn't really matter how I do it, whatever works best.

    Anyway thanks for the response, that's some useful feedback there.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Reading text file twice, can it be done once?
    By CaeZaR in forum C++ Programming
    Replies: 4
    Last Post: 02-04-2006, 04:18 PM
  2. Reading text file twice, can it be done once?
    By CaeZaR in forum C Programming
    Replies: 2
    Last Post: 02-03-2006, 03:34 PM
  3. A bunch of Linker Errors...
    By Junior89 in forum Windows Programming
    Replies: 4
    Last Post: 01-06-2006, 02:59 PM
  4. reading from text file
    By jamez in forum C Programming
    Replies: 3
    Last Post: 11-30-2005, 07:13 PM
  5. Reading Part Of A Text File
    By Okiesmokie in forum C++ Programming
    Replies: 1
    Last Post: 03-01-2002, 01:59 PM