Here is our first article for discussion. It is on Searching Techniques.

Printable View

- 01-28-2005kermi3Our First Article - Searching Techniques By Prelude
Here is our first article for discussion. It is on Searching Techniques.

- 01-28-2005jverkoey
Very good article, good read, however, in the first few lines it talks about hashing, and I personally haven't ever worked with hashing before...is it relatively simple to implement as far as a searching algorithm goes? I would be really interested in learning how to use this sort of thing and also, what data type works best with hashing...if any?

Also, the article mentions binary trees for searching faster, but would a binary tree be faster than just implementing a binary search, or are the two things the same thing in a sense? - 01-29-2005major_small
well written indeed. my only thought is that the first bit of code may be overly complex for what it is demonstrating. what I mean is that some people may get lost in the syntax, when what is more important is the algorithm behind it.

for example, something like this may get the point across better:

Code:`#include<iostream>`

int main()

{

int arr[]={1,32,4,6,4,253,6,1,24,10,35};

int size=sizeof(arr)/sizeof(int);

int index=-1;

for(int i=0;i<size;i++)

{

if(arr[i]==10)

{

index=i;

break;

}

}

if(index>=0)

std::cout<<"10 was found at index "<<index<<std::flush;

else

std::cout<<"10 was not found"<<std::flush;

std::cin.get();

return 0;

}

- 01-29-2005PerspectiveQuote:

Originally Posted by**jverkoey**

Quote:

Originally Posted by**jverkoey**

Quote:

Originally Posted by**jverkoey**

- 01-29-2005PerspectiveQuote:

Originally Posted by**major_small**

My biggest complaint with the example you posted would be that the it doesnt offer an "as is" usable function because the entire algorithm is coded in main. Algorithms like this should be broken into [re]usable components. The code in the article is usable "as is".

I do agree with your underlying point though, the article could supply a driver program to reduce confusion and give the reader something concrete to exeriment with. (though i did notice that writing a driver program was the intent of Exercise 2) - 01-29-2005quzah
A binary tree is probably faster than a binary search. With a tree, all you're doing is moving down to the next pointer. With a binary search, you're preforming at least some math each time. I suspect it will always be faster to simply access one more pointer than it will be to find the next spot to jump to using said math.

Code:`middle = ( beginning + end ) >> 1;`

If I'm incorrect, feel free to let me know, because I haven't played around calculating notations and the like.

Quzah. - 01-29-2005Sang-drax
Good article.

Perhaps it could be mentioned that interpolation search will only be O(log log n) if the data is uniformly distributed in the range between low and high, because the interploation used is linear, unlike binary search where O(log n) is the worst case.

Quote:

Originally Posted by**Perspective**

- 01-29-2005Sang-drax
If I'm not mistaken, interpolation search can actually be slower than binary search even for extremely huge lists, if the entries in the list are chosen carefully.

Consider the list L with N elements (sorted).

The element l[n] at position n is defined as:

l(n) = N^n

So, the list looks like this:

{N^0, N^1, N^2, ... , N^N}

When searching the list with binary search, the time complexity will be on average O(log2 n), because each iteration halves the number of entries where the wanted element might be.

Now, when using interpolation search, the first iteration will yield (using Prelude's variable names):

key = N^n

low_diff = N^n - 1

range_diff = N^N - 1

count_diff = N - 0 = N

range = N*(N^n - 1) / (N^N - 1) + 0

range < N*(N^n - 1) / N^N = 1/ N^(N-n-1) - 1/N^(N-1) < 1 for all n<N.

So, if the element searched for isn't the last (n<N), the first calculation of range will equal 0 (when truncated) and the new value of low will be 1 instead of 0.

It continues like this, the low boundary will increase by one each iteration, so the complexity of the interpolation search will be O(n).

Note that this is by no means any argument against using interpolation search. This is more of theoretical interest, if it's of any interest at all.

EDIT:

Hmm, actually the list hasn't N elements; it has N+1. The argument still holds though. Remove the first element from the list (N^0). - 01-29-2005Hunter2
An idea about a possible application for the article (where hashing is impractical):

You may have a hash table that 'maps' keys to values, but what if you want the inverse operation (find the key of a value)? I suppose you could generate two different hash tables, one for each direction, but if it's more difficult to hash the value than the key (i.e. integer key, structure value), it may be more expedient to simply implement a search for the inverse operation rather than going to the trouble of coming up with an effective hash algorithm. - 01-29-2005LuckYQuote:

I personally haven't ever worked with hashing before...is it relatively simple to implement as far as a searching algorithm goes?

Quote:

Also, the article mentions binary trees for searching faster, but would a binary tree be faster than just implementing a binary search, or are the two things the same thing in a sense?

- 01-30-2005jverkoeyQuote:

Originally Posted by**LuckY**

Code:`unsigned int ParseTag(char* tag)`

{

srand((unsigned int)tag[0]);

return (tag[0] + rand()%128) + (tag[1] + rand()%64) + (tag[2] + rand()%32) + (tag[3] + rand()%16);

}

- 01-30-2005CornedBee
The argument about hashing - well, the article talks about searching algorithms on common (perhaps sorted) arrays and linear lists. Not about binary trees or hash tables. In that case, skiplists should be discussed too. So perhaps we should have an article about data structures.

Hash functions must be deterministic, jverkoey. They must not contain a random element. - 01-30-2005VirtualAce
Good article. And I don't think the code is hard to read at all. Rather the functions look very succint and to the point. Very easy to follow.

So what's with having Prelude write the first article....yeah like we are all just gonna jump right into writing articles now. :D Tough act to follow.

;) - 01-30-2005Zach L.Quote:

Originally Posted by**CornedBee**

Anyway, a great article. I much prefer the code in usable functions as they are as opposed to in a small program as major_small suggested. The target audience could rather readily write small test programs that use these functions.

I really liked Prelude's use of problems in the article as well. They were all of an appropriate level, I thought, and all very valuable to the learning process. The code was clear as well.

I'll comment more (in a more helpful manner) later on.

Cheers - 01-31-2005CornedBee
Right, Zach, my fault. I didn't look at the argument to srand() and assumed the usual time(NULL).