We're now trying to figure out a different model though in which you would somehow sort the domain tree (no longer a binary search tree) in a method that you would have an uneven distribution of users (more likely) and the highest user lists would sit on the top of the tree. Can't quite figure how you'd transverse this list though.
This seems hard, unnecessarily complicated, and not necessarily an improvement to efficiency.
I think your best solution is to just change your sorting algorithm. Sort by the domain name first. For example, you might represent your email addresses as:
Code:
struct email_addr {
std::string domain;
std::string name;
bool operator<(const email_addr& foo) {
return (domain < foo.domain) || (domain == foo.domain && name < foo.name);
}
};
Then you can efficiently find the first email address with that domain name. To find all email addresses with the domain name, just find the first and then iterate forward until you get a different domain name.
This gives the same order of efficiency as the datastructure you're proposing, assuming that iterating is on average O(1) operation.
One caveat with this approach is that you're storing a lot of redundant domain names. If this is really important to you, you could always use a pointer to a string, and then, on insertion, check neighboring nodes to see if the domain name is redundant or not, and on deletion, check neighboring nodes to see if you need to free the pointer. Not fun. And depending on the implementation of std::string, that's probably not the best solution.
In fact, ignore everything I've said up to this point.
It's probably simplest to just use an algorithm that rotates a string about the at sign. Pardon any incorrect C++ (with templates); I'm kind of rusty.
Code:
template <class iter, class value_type>
bool rotate_about(iter beg, iter end, value_type ch) {
iter pivot
= std::find(beg, end, ch);
if (pivot == end) {
return false;
}
std::reverse(beg, pivot);
std::reverse(pivot + 1, end);
std::reverse(beg, end);
return true;
}
Call rotate_about(s.beg(), s.end(), '@') before inserting email addresses (or call some equivalent procedure), and then use ordinary string comparison, with an ordinary std::set. You can use myset.lower_bound(domain) to get the first domain name, and then iterate forward to get further email addresses with that domain name. If you want to search just by user names of a particular domain name, you'll have to do it by constructing a string with the domain name, first.