Thread: searching files bottom up.

  1. #1
    Registered User
    Join Date
    Oct 2012
    Posts
    2

    searching files bottom up.

    Hello. I'm trying to read a text file from the bottom up. This text file contains name/value pairs. The names don't get changed but the values do periodically. I need to read the latest value for any given name. So the txt file might look like this:
    Code:
    ubuntu=12
    fedora=13
    bsd=14
    mint=199
    fedora=76
    fedora=96
    fedora=62
    centos=9
    ubuntu=56
    Say I want to read the "fedora" value. I am only interest in the latest entry (ie 62).

    What's the best way to do this?

  2. #2
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Read all the values in normal order (top down), but only retain the last value.

    It is only worth the added complexity to read the file in "reverse order", unless the file is known to be large (say hundreds of kilobytes or more) and the interesting entries are likely to be (repeated) near the end of the file. In that case one might read just the last couple of kilobytes of the file opportunistically, to see if the entry is listed there. Reading backwards is complicated, because you must overlap by a full entry. It's just not worth it in the typical cases.

    Implementation-wise, you'll need a loop that reads each line at a time.

    If you want just one key ("fedora"), you can check if the beginning of the line matches the key. If the key matches, read and store the value, otherwise ignore the line.

    If you want all entries, but only the last value for each, I'd read the name and value, then store the name (and associated value) in a hash table. Instead of adding a new entry, the add-to-hash-table function would only update the value if the name is already in the hash table.

    If you consider this awk snippet,
    Code:
    awk 'BEGIN { FS="[\t\v\f ]*=[\t\v\f ]*" } { data[$1]=$2 } END { for (k in data) printf "%s=%s\n", k, data[k] }' input-file
    it actually does exactly what I described for all entries case; practically all awk variants use a hash table for associative arrays. If you run that, you'll also notice that the order in which the entries are listed, does not match the input; that's indicative of a hash table, too. Given your input data, GNU awk 3.1.8 outputs
    Code:
    bsd=14
    fedora=62
    mint=199
    centos=9
    ubuntu=56
    If you are not familiar with hash tables, they're not complicated to implement -- at least if you know basic dynamic memory management (malloc(), realloc(), free()); I think this would be a perfect exercise for hash tables.

  3. #3
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Regardless of the direction, you'll need to read the entire file in order to find all the occurrences of unique names.

  4. #4
    - - - - - - - - oogabooga's Avatar
    Join Date
    Jan 2008
    Posts
    2,808
    If you want to scan the file for all the lastest unique entries then, as rcgldr said, the entire file must be read, so direction is irrelevent. As nominal animal said, a hash table would do nicely here.

    If you want to scan the file for the latest of a single (or a few) key values, then it's still not worth reading the file in reverse it it isn't that large (and I have a feeling it probably isn't in your case).

    If the file IS large and you're looking for just one or a few key values, the it might be best to read it in reverse using memory-mapping (see mmap() on *nix). That would actually be fairly easy, the only tricky bit is dealing with records that span the buffer size. (This problem actually exists when reading the file forwards too, but is handled for you by the buffering mechanism of FILE objects.)
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Searching for files on the net
    By geek@02 in forum Windows Programming
    Replies: 4
    Last Post: 12-09-2009, 10:59 AM
  2. Replies: 19
    Last Post: 09-14-2006, 10:36 AM
  3. Searching through files
    By Atropos in forum C++ Programming
    Replies: 5
    Last Post: 03-31-2004, 03:09 PM
  4. searching in files
    By bigzigi@hotmail in forum C Programming
    Replies: 1
    Last Post: 04-03-2003, 07:52 AM
  5. searching files
    By Gil22 in forum C++ Programming
    Replies: 1
    Last Post: 03-03-2003, 06:10 PM