search algorithm - making statistical decisions

Let's assume I have two containers.

Code:

```cpp
std::vector<unsigned long> vec1;
std::vector<unsigned long> vec2;
```

Both containers will hold a large number of possible values. However, __they are mutually exclusive__ (no value appears in both). Also, the containers __are both already sorted__.

Now,

I want to search for one particular value, and the search will be done repeatedly for a large number of values. Naturally, I start with one container and, if I don't find the value, I move on to the other.

But... what about optimizing this process based on probabilities? Instead of simply starting on Vec1 and then moving to Vec2, I'm thinking of avoiding, as best I can, the linear complexity of the find algorithm and writing my own search, based also on the fact that the vectors are sorted.

For this, I first determine the distance of the value to be searched (let's call it *val*) from both containers' boundaries:

Code:

```cpp
unsigned long distVec1Max = vec1.back() - val;   // and the same for vec2
unsigned long distVec1Min = val - vec1.front();
```

Based on this info, I decide which container to search first. Typically I want to start with the vector in which the distance to the value is smaller, doing a reverse search if appropriate. If I reach a value that is higher than the value being searched (or lower, if doing a reverse search), I stop and move on to the other container.

Question is... Can you see a possible gain if I do this? Could the calculations prior to the search, plus the switching between containers, put a tax on performance that is, probabilistically, not justified compared with simply doing a linear search on each vector in turn?

(Note: a set would solve this, of course, but I just used a simple example here with containers of unsigned long. Let's assume the actual contents do not, and cannot, have a < operator.)

EDIT: Another alternative is searching both containers at the same time, one index at a time, still using the distances to determine what kind of search to do on each container (forward or reversed)...