# String hashing

Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last
• 09-10-2009
jack_carver
String hashing
Is there any way to design a hash function to detect strings of type:

abcdef, defabc

Notice that one is formed just by rotation of the other .. is it possible to make a hash function H such that

H("abcdef")=H("defabc")

Thank you
• 09-10-2009
itCbitC
Yes but do you know the axis of rotation which splits the array into two parts which are then swapped.
• 09-10-2009
jack_carver
Re

But cant if be generalized ? As in

H("abc")=H("bca")=H("cba")=... etc

??
• 09-10-2009
sean
Sure, it CAN be generalized, but that doesn't mean it would do what you need. You need to be more specific.

For example, you could just add up the numerical value of each character in the string and consider that the "checksum". Then H("abcdef") == H("defabc") would be true. However, that wouldn't create a UNIQUE sum for each combination of letters*, and the sum would vary depending on the character set used of the host machine**. So again - it depends on exactly what you want to do.

* H("bc") == H("ad") would also be true, to cite just one example

** If one used ASCII, and the other didn't
• 09-10-2009
Salem
Well simple addition works, if you can tolerate "abcd" being the same as "aadd"

Failing that, sort the characters and then hash, then you remove the internal ordering.
• 09-10-2009
jack_carver
sorting wont work
Hmm..sorting wont work because it will give same result for string like "abad" & "aabd"
• 09-10-2009
sean
So what you're saying is that the string has to be exactly the same except for one rotation at any arbitrary location in the string? You need to be way more specific - we can't guess what you're trying to do. Come up with the exact rules, and then put that into pseudocode, and then put that into C.
• 09-10-2009
jack_carver
Re
sorry for being so vague...
Here is exactly what i want to implement

If two strings are gives say A,B i want to know if B can be obtained just by pure rotation for charecters in A. I know alternate ways to do this such as brute force or using factoradics ... etc
But some one told me that hashing is one the ways to do this efficienlty

ex: given "abcde" .. possible rotations are "bcdea","cdeab","deabc","eabcd'.

So given any two strings from the set S={"abcde", "bcdea","cdeab","deabc","eabcd"} i need a function that will return true
H("abcde")==H("deabc") ...etc

hope u get it

thank u
• 09-10-2009
itCbitC
So if I get it the only strings that matches "abcde" in set S is "cdeab" and "bcdea" and "deabc" obtained with a single rotation of the original substring?
• 09-10-2009
jack_carver
Re
the number of rotations do not matter ... :)

it should return "equal" for any two strings in that set
• 09-10-2009
Salem
Are you always limited to a small alphabet, with a limited string length?
I doubt you'll get a unique function that will work on any length of string (lets say a DNA fragment).

One simple way would be to strcat() a copy of one string to itself, then use strstr() on the other.

Say
Code:

```strcpy(temp,"cdeab"); strcat(temp,"cdeab"); // find cdeabcdeab if ( strstr( temp, "abcde" ) ) {   // match }```
• 09-10-2009
sean
Quote:

Hmm..sorting wont work because it will give same result for string like "abad" & "aabd"
Unless I'm misunderstanding rotations, can't you derive one of those strings by rotations on the other?
• 09-10-2009
itCbitC
Quote:

Originally Posted by jack_carver
the number of rotations do not matter ... :)

it should return "equal" for any two strings in that set

IMHO what you're trying to do is break up the string into two (un)equal parts and switch those around.
Now you want to test if this flipped-around string is the same as the original one. Isn't that correct?
• 09-10-2009
iMalc
Quote:

Originally Posted by jack_carver
If two strings are gives say A,B i want to know if B can be obtained just by pure rotation for charecters in A. I know alternate ways to do this such as brute force or using factoradics ... etc
But some one told me that hashing is one the ways to do this efficienlty

That's correct, it will help the process to be efficient. However what the hash does is tell you if of the two strings one definitely cannot be rotated to turn one into the other. It does not tell you if one can be rotated to be the other.
In other words, you use it as a fast rejection method, and in the rare cases where a hash match is found, you do the same hard work you were going to without hashing.
• 09-10-2009
jack_carver
@ iMalc
Quote:

However what the hash does is tell you if of the two strings one definitely cannot be rotated to turn one into the other. It does not tell you if one can be rotated to be the other.
How can i make such a function ?

Thank u
Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last