# Thread: Algorithm help: Find a string in another string?

1. ## Algorithm help: Find a string in another string?

My friend gave me this question that he got on a job interview. You're trying to see if its possible for firstString to equal secondString by simply deleting characters in the firstString. You're not actually altering anything, just checking to see if its possible and then returning true or false. In this case, its possible and I have worked the logic out on paper but I'm stumped on translating it into code. Can anyone point me in the right direction?

Code:
```#include<stdio.h>
#include<string.h>

void checkString(char firstString[], char secondString[]);

int main(){

char firstString[20] = "aaabababaaasdfg";
char secondString[20] = "absd";
checkString(firstString, secondString);

}
void checkString(char firstString[], char secondString[]){

int i,j;

for(j = 0; j < strlen(secondString); j++){
for(i = 0; i < strlen(firstString); i++){
if(secondString[j] != firstString[i]){
....
}
else{

....
}
}
}
}```

2. What is the logic that you worked out on paper?

3. This was my systematic method of checking if the first string contains the second:
"aaabababaaasdfg" - (first) string
"absd" - (second) string

a (second) = a (first), so
move to next character b(second), compare to all the characters after a(first). Stop at b(first) because b(second) = b(first).
Move to next character s(second), compare to all the characters after b(first). Stop at s(first) because s(second) = s(first).
Move to next character d(second), compare to all the characters after s(second). Stop at d(first) because d(second) = d(first). Done.

4. Originally Posted by atac
This was my systematic method of checking if the first string contains the second:
"aaabababaaasdfg" - (first) string
"absd" - (second) string

a (second) = a (first), so
move to next character b(second), compare to all the characters after a(first). Stop at b(first) because b(second) = b(first).
Move to next character s(second), compare to all the characters after b(first). Stop at s(first) because s(second) = s(first).
Move to next character d(second), compare to all the characters after s(second). Stop at d(first) because d(second) = d(first). Done.
That's difficult logic to make work. You're dealing with every letter from both strings, across every position of the letter and value.

Instead, think of something more as a summary of the count of each letter.

This is a good trick to remember, btw. I've used it several times - oddly, just yesterday. This is the body of the compareStrings function. I just defined MAX to be 127, which is easiest to work with.

Code:
```   for(i=0;i<MAX;i++) {
count1[firstString[i]]++;
count2[secondString[i]]++;
}
for(i=0,ok=1;secondString[i];i++) {
if(count2[secondString[i]] > count1[secondString[i]]) {
ok=0;
}
}```

5. Originally Posted by atac
This was my systematic method of checking if the first string contains the second:
I think your method is a correct method. The thing is, you effectively have two pointers: one points to the current character of the first string; the other points to the current character of the second string. Then, you make a single pass through each string. This means that you should not have a nested loop that resets the index to 0. (You might not need nested loops to begin with.)

Originally Posted by Adak
Instead, think of something more as a summary of the count of each letter.

This is a good trick to remember, btw. I've used it several times - oddly, just yesterday. This is the body of the compareStrings function. I just defined MAX to be 127, which is easiest to work with.
Only deletion is allowed, not tranposition, so a count would not suffice as order matters.

6. OK. This is even easier. I used the void return type for checkString(), like your code had, although your description mentions an int return to it is needed.

Code:
```#include<stdio.h>
#include<string.h>

#define MAX 20

void checkString(char string1[], char string2[]);

int main(void) {
char string1[MAX] = "abbcbasdfg";
char string2[MAX] = "absdg";
checkString(string1, string2);
printf("\n");
return 0;
}
void checkString(char string1[MAX], char string2[MAX]){

int i,j,ok=0;

for(i=0,j=0;string1[i];i++) {     //smoke & mirrors! ;)
if(string2[j]==string1[i]) {
++j;
}
}
if(j==strlen(string2)) {
ok=1;
printf("Yes, the second string can be made from letters in the first string\n");
}else {
printf("No, the second string can't be made from letters in the first string\n");
}
}```

7. Here is my approach:
Code:
```#include <errno.h>

/* Return 0 if deleting characters from sequence can produce required.
* Otherwise, return nonzero (ENOENT, or EINVAL if sequence is NULL).
*/
int partial(const char *sequence, const char *required)
{
/* Abort if sequence is NULL. */
if (!sequence)
return EINVAL;

/* If required is NULL, return success. */
if (!required)
return 0;

/* Scan each character in required. */
while (*required) {

/* Find the character in sequence that
* matches the current required character. */
while (*sequence != *required)
if (!*(sequence++))
return ENOENT; /* End of sequence, so we fail. */

/* This pair matches, so we advance both. */
required++;
sequence++;
}

/* All characters in required have been matched,
* and the rest of the characters in sequence may be skipped.
* Therefore, success. */
return 0;
}```
Although I'd describe the logic a bit differently (as instructions instead of results), I think it is pretty much exactly the logic atac described in post #3.

Of course, I used char pointers instead of indices, and reverse return value (mostly to make sure you don't just copy this), but you should be able to see the logic easier:

The outer loop loops over the required characters in order.

The inner loop skips characters in the sequence that do not match the required character. Note that this loop continues where it left off the last time, it does not restart at the beginning of the string. If it encounters the end of sequence, then the function fails.

If you have no more characters in the required string left, you know you can always just "delete" whatever is left in the sequence (if anything). So, the function returns success then.

atac, I suspect your problem in trying to get it your logic expressed in C was because you decided on a for loop, which happens to be the one that does not map that easily to the case here -- you don't reset the inner loop, but continue from where you left off the last iteration of the outer loop, so you do not have any start value (except where you left off last time).

Popular pages Recent additions