Thread: Vector of a struct containing another vector

  1. #46
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Okay why can't I do this:

    Code:
    void Index::addWord(string word, int pageNumber)
    {
        if(word.length() > 3)
        {
            indexWord index;
            for (vector<indexWord>::iterator current=data.begin(); current!=data.end(); ++current)
            {
                if(word != current->value)
                {
                    index.value = word;
                    index.locations.push_back(pageNumber);
                    //sort(data.begin(), data.end());
                    data.push_back(index);
                }
            }
        }
    }
    Once I take the iterator statement out it works fine but I can't compare the string portion of the vector to a string?? I am so confused?!?

  2. #47
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Can't as in compile error? If so, what error?
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  3. #48
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    No there is no error. It just doesn't add to the vector correctly.

  4. #49
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Not correctly as in "it's not there" or as in "it's in the wrong place"?
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  5. #50
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Here's what your addWord does:

    Suppose word (the one passed as argument 1 of addWord) is "Markov". And suppose you already have 100 indexWords in the vector data, but none of them has value=="Markov".

    addWord looks at the first of the 100 elements of data, finds that its value member is not "Markov", so it assigns index.value="Markov", assigns index.locations[0]=pageNumber (this is a "new" indexWord so its locations vector starts out empty), and pushes this index onto the back of the vector data.

    Then it looks at the next element of data and does the same thing again, and so on ...
    ...so it ultimately pushes 100 copies of "Markov" onto the vector, then continues, looking at each of those new copies but since for each of them the value is "Markov", it doesn't add any more copies.

    On the other hand, if an indexWord with value=="Markov" is already in data, addWord simply skips over that element and goes on to the next, and never adds a second element to the locations vector of any indexWord.

  6. #51
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    God, I must be completely blind.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  7. #52
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Quote Originally Posted by R.Stiltskin View Post
    Here's what your addWord does:

    Suppose word (the one passed as argument 1 of addWord) is "Markov". And suppose you already have 100 indexWords in the vector data, but none of them has value=="Markov".

    addWord looks at the first of the 100 elements of data, finds that its value member is not "Markov", so it assigns index.value="Markov", assigns index.locations[0]=pageNumber (this is a "new" indexWord so its locations vector starts out empty), and pushes this index onto the back of the vector data.

    Then it looks at the next element of data and does the same thing again, and so on ...
    ...so it ultimately pushes 100 copies of "Markov" onto the vector, then continues, looking at each of those new copies but since for each of them the value is "Markov", it doesn't add any more copies.

    On the other hand, if an indexWord with value=="Markov" is already in data, addWord simply skips over that element and goes on to the next, and never adds a second element to the locations vector of any indexWord.
    That's exactly what is SHOULD do

  8. #53
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Really? So if, the first time you read "Markov" there are already 100 other words in the index, you want to now have 200 entries in the index (100 of them with value "Markov")?

  9. #54
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Quote Originally Posted by R.Stiltskin View Post
    Really? So if, the first time you read "Markov" there are already 100 other words in the index, you want to now have 200 entries in the index (100 of them with value "Markov")?
    Oh wait no. That is what it does now which is bad. Sorry, didn't read it close enough. Trying to get rid of that. It prints out like 100 of the same string. I am trying to fix it by doing a binary search to see if it's in the vector and if it is, just add the page number to the vector it is associated with...if you get my point.

  10. #55
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    What a waste of time. A vector of structs with vectors is a map. But if you want to do an O(n) search through the first vector I guess that is your choice.

  11. #56
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Quote Originally Posted by Bubba View Post
    What a waste of time. A vector of structs with vectors is a map. But if you want to do an O(n) search through the first vector I guess that is your choice.
    I'm sure it is a waste of time but I haven't learned maps yet

  12. #57
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    I would say then this is a good opportunity to learn about them. Fact is if you have used one container in the STL you've almost used them all. Most follow the iterator pattern so if you've done iterators in vectors doing them in maps is nearly the same thing.

    A map is a container where each item in the map has a key and an associated value. You can do a search on the key and the map will give you the value. If the key is not in the map then it will return one past the end of the map or simply map_object.end(). In a map each item has a unique key. There are also multimaps where items multiple items can have the same key. There is also a hash_map (although not yet standardized) which allows you to specifiy a hash function and 'can' be faster than map given the hash function is the correct one for the job.

    Vectors are nice and they are fast but if you need to search through them they are much slower than a map. As well when you remove an item from a vector it invalidates everything past that point. So if you are relying on the indices of the vector as object ID's and you remove an object in the middle of the vector, all IDs beyond the point of removal are now incorrect. A map of your structs would give you better lookup time and also makes it easier on you to code. A vector of structs with a vector is a bit confusing as opposed to a map keyed off by the name in your struct which then gives you the vector.

    You can sort the vector but when there is a container more suited to your needs than vector is I see no point. It's not that vector cannot do what you want it's that map can probably do it more efficiently. The STL containers are extremely robust and the functions that can be used on them are numerous. It is very possible to gain most of the functionality you need out of almost every container but that does not mean you have chosen the right container for the job.

    So a vector of integers is:
    Code:
    std::vector<int> IntVector;
    And a map of integers keyed off by strings is:
    Code:
    std::map<std::string,int> IntMap;
    Maps require 'pairs' which can be done via std::pair<> or via std::map<>::value_type(). To insert into the IntMap I've shown you would do:

    Code:
    //Using std::pair
    IntMap.insert(std::pair<std::string,int>("Test",100));
    
    //Using map::value_type
    IntMap.insert(std::map<std::string,int>::value_type("Test",100));
    There is another way to insert using the array access operator, however, I highly recommend you steer clear of the array access operator when it comes to maps. IMO, it has some 'hidden' behavior that is a bug waiting to happen.

    Many of us could tell you tons of ways to optimize your vector and make it faster but in the end if you are using the wrong container for the job, none of the optimizations will be as fast or suited to the task as using the right container for the job.

    My general rule of thumb is when you see a bunch of vectors inside of vectors then perhaps you are using the wrong container for the job. It really depends on the situation and the task as to what container you use. But the most important lesson here when it comes to the STL is use the right container and the STL will reward you.
    Last edited by VirtualAce; 12-05-2008 at 06:10 PM.

  13. #58
    The larch
    Join Date
    May 2006
    Posts
    3,573
    A map<string, set<unsigned> > is what I thought about too.

    Here's how it might look like. Note that inserting a keyword and a pagenumber, while making sure that they are in sorted order and unique takes just one simple line.

    Code:
    #include <iostream>
    #include <string>
    #include <map>
    #include <set>
    #include <cctype>
    
    int main()
    {
        typedef std::map<std::string, std::set<unsigned> > Index;
        Index index;
        std::string line;
        unsigned line_num = 0;
    
        //enter empty line to stop
        while (std::cout << ++line_num << ": " && std::getline(std::cin, line) && !line.empty()) {
            //tokenize
            std::string::iterator word_start, word_end = line.begin();
            while (
                word_start = std::find_if(word_end, line.end(), std::ptr_fun<int, int>(std::isalnum)),
                word_end = std::find_if(word_start, line.end(), std::not1(std::ptr_fun<int, int>(std::isalnum))),
                word_start != word_end
            ) {
                //could be converted to lower-case
                if (word_end - word_start > 3)
                    index[std::string(word_start, word_end)].insert(line_num);
            }
        }
        //display what we got
        for (Index::iterator word_it = index.begin(); word_it != index.end(); ++word_it) {
            std::cout << word_it->first << ':';
            for (std::set<unsigned>::iterator line_it = word_it->second.begin(); line_it != word_it->second.end(); ++line_it) {
                std::cout << ' ' << *line_it;
            }
            std::cout << '\n';
        }
    }
    However, perhaps if you are going to index once but use the index a lot, you might also just throw the line (page) numbers into a vector and finally sort them and remove duplicates.
    Last edited by anon; 12-05-2008 at 06:17 PM.
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  14. #59
    Registered User VirtualAce's Avatar
    Join Date
    Aug 2001
    Posts
    9,607
    Nice I didn't think about a map of sets. Tis why I love the STL containers. So much functionality just waiting to be used.

  15. #60
    Registered User
    Join Date
    Apr 2008
    Posts
    122
    Quote Originally Posted by anon View Post
    A map<string, set<unsigned> > is what I thought about too.

    Here's how it might look like. Note that inserting a keyword and a pagenumber, while making sure that they are in sorted order and unique takes just one simple line.

    Code:
    #include <iostream>
    #include <string>
    #include <map>
    #include <set>
    
    int main()
    {
        typedef std::map<std::string, std::set<unsigned> > Index;
        Index index;
        std::string line;
        unsigned line_num = 0;
    
        //enter empty line to stop
        while (std::cout << ++line_num << ": " && std::getline(std::cin, line) && !line.empty()) {
            //tokenize
            std::string::iterator word_start, word_end = line.begin();
            while (
                word_start = std::find_if(word_end, line.end(), std::ptr_fun<int, int>(std::isalnum)),
                word_end = std::find_if(word_start, line.end(), std::not1(std::ptr_fun<int, int>(std::isalnum))),
                word_start != word_end
            ) {
                //could be converted to lower-case
                if (word_end - word_start > 3)
                    index[std::string(word_start, word_end)].insert(line_num);
            }
        }
        //display what we got
        for (Index::iterator word_it = index.begin(); word_it != index.end(); ++word_it) {
            std::cout << word_it->first << ':';
            for (std::set<unsigned>::iterator line_it = word_it->second.begin(); line_it != word_it->second.end(); ++line_it) {
                std::cout << ' ' << *line_it;
            }
            std::cout << '\n';
        }
    }
    However, perhaps if you are going to index once but use the index a lot, you might also just throw the line (page) numbers into a vector and finally sort them and remove duplicates.
    Wow that's perfect. Don't really have a clue about how it works though. Would be perfect for this project though.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. can some one please tell me the cause of the error ?
    By broli86 in forum C Programming
    Replies: 8
    Last Post: 06-26-2008, 08:36 PM
  2. Vector class
    By Desolation in forum C++ Programming
    Replies: 2
    Last Post: 05-12-2007, 05:44 PM
  3. Binary Search Trees Part III
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 16
    Last Post: 10-02-2004, 03:00 PM
  4. towers of hanoi problem
    By aik_21 in forum C Programming
    Replies: 1
    Last Post: 10-02-2004, 01:34 PM
  5. Operators for 3D Vector Mathematics
    By Anarchist in forum C++ Programming
    Replies: 10
    Last Post: 01-31-2003, 07:33 PM