searching files bottom up.

**dirtyfishtank** · 08-22-2013

Hello. I'm trying to read a text file from the bottom up. This text file contains name/value pairs. The names don't get changed but the values do periodically. I need to read the latest value for any given name. So the txt file might look like this:

Code:

ubuntu=12
fedora=13
bsd=14
mint=199
fedora=76
fedora=96
fedora=62
centos=9
ubuntu=56

Say I want to read the "fedora" value. I am only interest in the latest entry (ie 62).

What's the best way to do this?

**Nominal Animal** · 08-22-2013

Read all the values in normal order (top down), but only retain the last value.

It is only worth the added complexity to read the file in "reverse order", unless the file is known to be large (say hundreds of kilobytes or more) and the interesting entries are likely to be (repeated) near the end of the file. In that case one might read just the last couple of kilobytes of the file opportunistically, to see if the entry is listed there. Reading backwards is complicated, because you must overlap by a full entry. It's just not worth it in the typical cases.

Implementation-wise, you'll need a loop that reads each line at a time.

If you want just one key ("fedora"), you can check if the beginning of the line matches the key. If the key matches, read and store the value, otherwise ignore the line.

If you want all entries, but only the last value for each, I'd read the name and value, then store the name (and associated value) in a hash table. Instead of adding a new entry, the add-to-hash-table function would only update the value if the name is already in the hash table.

If you consider this awk snippet,

Code:

awk 'BEGIN { FS="[\t\v\f ]*=[\t\v\f ]*" } { data[$1]=$2 } END { for (k in data) printf "%s=%s\n", k, data[k] }' input-file

it actually does exactly what I described for all entries case; practically all awk variants use a hash table for associative arrays. If you run that, you'll also notice that the order in which the entries are listed, does not match the input; that's indicative of a hash table, too. Given your input data, GNU awk 3.1.8 outputs

Code:

bsd=14
fedora=62
mint=199
centos=9
ubuntu=56

If you are not familiar with hash tables, they're not complicated to implement -- at least if you know basic dynamic memory management (malloc(), realloc(), free()); I think this would be a perfect exercise for hash tables.

**rcgldr** · 08-22-2013

Regardless of the direction, you'll need to read the entire file in order to find all the occurrences of unique names.

**oogabooga** · 08-22-2013

If you want to scan the file for all the lastest unique entries then, as rcgldr said, the entire file must be read, so direction is irrelevent. As nominal animal said, a hash table would do nicely here.

If you want to scan the file for the latest of a single (or a few) key values, then it's still not worth reading the file in reverse it it isn't that large (and I have a feeling it probably isn't in your case).

If the file IS large and you're looking for just one or a few key values, the it might be best to read it in reverse using memory-mapping (see mmap() on *nix). That would actually be fairly easy, the only tricky bit is dealing with records that span the buffer size. (This problem actually exists when reading the file forwards too, but is handled for you by the buffering mechanism of FILE objects.)

Thread: searching files bottom up.

Thread Tools

Search Thread

Display

searching files bottom up.

Similar Threads

Searching for files on the net

How do we know the memory arrangement using in microprocessors? Top-Bottom,Bottom-Top

Searching through files

searching in files

searching files