Thread: map, unordered_map or vector ?

  1. #1
    Registered User
    Join Date
    Jan 2011
    Posts
    222

    map, unordered_map or vector ?

    Hi i have a very simple question. I am trying to store a 2d array of integers but i don't know should i choose vector or map or unordered_map to store the data. So my particular case is:

    A function in my program generates integers based on some input and those integers can span from 0 to 5*10^10. usually i get around two million ints that i need to associate to some other 2 mill ints. latter on i have to find and modify every stored int at least two times. exmp:


    input > generate ints > store them into hash table

    2 -> 198776
    43239 -> 55
    356 -> 762212
    ...

    now i have to locate int associated with 2 and repeat this process for all numbers in the table -> i have a while loop from 0 to 5*10^10 and as a reach a particular number i check if it exists in my table and then change its value according to the present value (i repeat this twice) )

    so given the above case what is the best way to store my numbers: map, vector or unordered_map? I need something that uses the smallest amount of memory and is fast. in other words efficient.

    baxy

  2. #2
    [](){}(); manasij7479's Avatar
    Join Date
    Feb 2011
    Location
    *nullptr
    Posts
    2,657
    Quote Originally Posted by baxy View Post
    I need something that uses the smallest amount of memory and is fast. in other words efficient.
    std::vector will use the smallest amount of memory and will be the slowest. (Plain array underneath)
    std::map will have binary tree like performance. (Both params between vector and unordered_map)
    std::unordered_map will be the fastest and use the most memory.
    Use it if you want a hashtable.
    The memory isn't really that much.

  3. #3
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Quote Originally Posted by manasij7479 View Post
    The memory isn't really that much.
    if it's an embedded system, memory is everything.

  4. #4
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Elkvis View Post
    if it's an embedded system, memory is everything.
    I didn't see any mention of embedded. At any rate, if you're dealing with 2*2 million integers, no matter how you store them, it's going to take up some space (at least 32 megabytes).

    The requirements were not fully specified, but I would suggest a std::vector<std:: pair<int, int> >, kept in sorted order (sorted by the first element), and use binary search to locate elements. Same lookup complexity as std::map<> but with zero memory overhead (you store the integers themselves and nothing else)
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  5. #5
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    i have a while loop from 0 to 5*10^10 and as a reach a particular number i check if it exists in my table and then change its value according to the present value
    O_o

    Cache misses are going to eat you alive.

    Call me Sally-joe Mcgee, but why is it necessary to iterate over the possible values to check if they exist instead of just directly iterating over the values that are known to be in the input because you've just read them!? (It is remarkably trivial to figure out where you'd be in the domain from that information anyway if you are using simple iteration.) Or is every single pass intended to update a value multiple times because the position in the set changed? Why can't you do all updates in a single pass?

    How did we get those three containers as choices? They do different things. The `std:vector' isn't an associative container. If you need an associative container, why is listed? Alternatively, if you don't need an associative container, how are you managing the associated values? Where are these associated values coming from? Are they calculated? Are they supplied by a file? Are the associations reversible?

    As far as that goes, what does anything you said have to do with a "2d array"? An array isn't associative regardless of its dimensionality. (I don't want to hear arguments about indices as association. He has "5*10^10" domain; that would almost certainly require more RAM than he has available.)

    std::unordered_map will be the fastest and use the most memory.
    If he is updating the same value in the list multiple times in a single pass he is guaranteed to get useless expansions.

    I don't think I'd recommend a `std::unordered_map' here, but then I'm not sure what he is doing either so you may be exactly right.

    Same lookup complexity as std::map<> but with zero memory overhead
    The cost of keeping the list sorted through all the potential mutations might be brutal.

    Or if he doesn't need to process the list with the adjusted value he could use a tuple with that strategy.

    Soma

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    There is nowhere near enough information to make a correct decision.

    All sorts of things matter, like whether the first number can appear in multiple pairs. E.g.
    2 -> 198776
    43239 -> 55
    2 -> 42
    356 -> 762212
    2 -> 12345
    As that would make a map and unordered_map off the table entirely. You'd then probably be looking at multi_map, unordered_multimap, [sorted] vector, or even a map of int to int-vector etc. There are far more choices than you realise.
    The frequency distribution of the numbers also matters.
    What device it's for, i.e. how important it is to conserve memory matters.
    How often you insert into the structure matters.
    How often you perform direct lookups matters.
    Whether you need the values in order matters.

    Provide more background information.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Converting string vector to integer vector
    By CPlus in forum C++ Programming
    Replies: 4
    Last Post: 05-08-2010, 05:43 AM
  2. Storing a vector inside a vector
    By csonx_p in forum C++ Programming
    Replies: 1
    Last Post: 09-14-2008, 05:15 AM
  3. Replies: 2
    Last Post: 05-06-2008, 10:41 AM
  4. Simple Question memset(vector<int>, 0, sizeof(vector<int>))
    By HyperShadow in forum C++ Programming
    Replies: 6
    Last Post: 12-10-2007, 04:56 PM
  5. connect vector<int> and vector<char>
    By hdragon in forum C++ Programming
    Replies: 1
    Last Post: 04-27-2006, 01:16 PM