Thread: Substring finding ideas

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    17

    Substring finding ideas

    I have some strings, ex:
    A[10] = "message";
    B[25] = "AAmSSeDDsFFsGGaHHgJJe";
    C[10] = "messaeg";

    So i need some ideas for finding substrings like this: in the example, A is substring of B coz with eleminating some characters (only eleminating, not moving their place!) we can get string A, and A is not substring of C.
    I have tried with strncmp() and strchr() but no success.

    Any ideas about this ?

  2. #2
    Registered User
    Join Date
    Sep 2008
    Posts
    58
    You should post your code for people to see where you went wrong.

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    You could use a loop which does strchr() to find the a particular character, and if you find that character, search from that position onwards for the next character, and so on. If you find the end of the "needle" string, then you have a substring by your definition.
    And by lucky coincidence, strchr() returns a pointer to the string you want to search for, so you could just use that for the next string - just make sure that if strchr() returns NULL, you end the search.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Based solely on what you've posted, I'm guessing that what you've failed to take into account is that you must start searching for character 2 from the place where you found character 1, and not from the start of the string.

  5. #5
    Registered User
    Join Date
    Feb 2009
    Posts
    17
    Quote Originally Posted by blurx View Post
    You should post your code for people to see where you went wrong.
    I did not go wrong anywhere because i don't even have the code - i was asking for some ideas how to solve the problem, that's all.

    @matsp:
    Thanks matsp, you gave really good ideas.

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by adrian2009 View Post
    I did not go wrong anywhere because i don't even have the code - i was asking for some ideas how to solve the problem, that's all.
    One is forced to wonder how you tried strncmp and strchr if you don't even have code.

  7. #7
    Registered User
    Join Date
    Feb 2009
    Posts
    17
    OK, so I am stuck in here again.

    I have tried to solve this problem, but with no success. I tried the matsp's idea and i think that i implemented it well, but i have no results on strings with characters that are repeating.

    Here is the code:

    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main()
    {
    int y, x[100], i, j = 0, k, n, z, g = 0;
    char a[251], b[251];
    FILE *file;
    file = fopen("text.txt", "r");
    fscanf(file, "%d", &n);
    for(i = 0;i < n;i++)
    {
    fscanf(file, "%s %s", a, b);
    y = strlen(a);
    z = strlen(b);
    for(k = 0;k < y;k++)
    {
    for(j;j < z;j++)
    {
    if(a[k] == b[j])
    	{
    	x[k] = j;
    	j++;
    	break;
    	}
    }
    printf("x%d-[%d] = %d\n", i, k, x[k]);
    }
    
    }
    fclose(file);
    return 0;
    
    }

  8. #8
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Quote Originally Posted by adrian2009 View Post
    but i have no results on strings with characters that are repeating ...
    Depends if repeating characters are considered valid and part of the string.

  9. #9
    Registered User
    Join Date
    Feb 2009
    Posts
    17
    Quote Originally Posted by itCbitC View Post
    Depends if repeating characters are considered valid and part of the string.
    Yes, it is allowed to have repeating characters in the string. Ex:
    message IS substring of mmmeeesssssssgggaaagggeeeeggggwwwaaassseee

    Some help about this ?

  10. #10
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Quote Originally Posted by adrian2009 View Post
    Yes, it is allowed to have repeating characters in the string. Ex:
    message IS substring of mmmeeesssssssgggaaagggeeeeggggwwwaaassseee

    Some help about this ?
    Why is that any different than normal processing?
    Just skip o'er the duplicates until a matching char is found?
    Make sure to advance the pointer to the next char on the substring.

  11. #11
    Registered User
    Join Date
    Feb 2009
    Posts
    17
    Quote Originally Posted by itCbitC View Post
    Why is that any different than normal processing?
    Just skip o'er the duplicates until a matching char is found?
    Make sure to advance the pointer to the next char on the substring.
    The problem is that anything i try simply does not give me any positive result.
    Here is the code of what i have tried to do, based on the idea you gave.
    Still no results.

    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main()
    {
    int i, n, k, c, d[251], y, z , j = 0, u = 0, r = 0;
    char a[251], b[251];
    FILE *file;
    file = fopen("text.txt", "r");
    fscanf(file, "%d", &n);
    for(i = 0;i < n;i++)
    {
    fscanf(file, "%s %s", a, b);
    y = strlen(a);
    z = strlen(b);
    
    for(k = 0;k < y;k++)
    {
    for(j;j < z;j++)
    {
    if(a[k] == b[j]) {
    	if(j = d[k]) break;
    	d[k] = j;	
    	j++;
    		
    	}
    }
    printf("d[%d] = %i\n", k, d[k]);
    
    }
    
    }
    fclose(file);
    return 0;
    }

  12. #12
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Can you explain the code as it's not very obvious which is the string and which is the substring.
    Note that the pointer over the substring doesn't move until a match is found but not so for the string.

  13. #13
    Registered User
    Join Date
    Feb 2009
    Posts
    17
    b is the string (the larger one), and a should be substring of b.

  14. #14
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    I suggest that you simplify your test program:
    Code:
    #include <stdio.h>
    
    int juxtaposeFind(const char *search, const char *subject);
    
    void testFound(const char *search, const char *subject);
    void testNotFound(const char *search, const char *subject);
    
    int main(void)
    {
        testFound("message", "AAmSSeDDsFFsGGaHHgJJe");
        testNotFound("messaeg", "AAmSSeDDsFFsGGaHHgJJe");
    
        return 0;
    }
    
    int juxtaposeFind(const char *search, const char *subject)
    {
        /* ... */
    }
    
    void testFound(const char *search, const char *subject)
    {
        printf("%s\n", juxtaposeFind(search, subject) ? "pass" : "fail");
    }
    
    void testNotFound(const char *search, const char *subject)
    {
        printf("%s\n", !juxtaposeFind(search, subject) ? "pass" : "fail");
    }
    Without loss of generality, implement juxtaposeFind() in the above program such that you pass both test cases (and more, which can easily be added via the testFound() and testNotFound() functions). Once you are certain that your algorithm and its implementation are correct, you can then add the code that reads from a file.
    Last edited by laserlight; 03-07-2009 at 03:04 PM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  15. #15
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Here's how the algorithm looks like:
    Code:
    read in substring a and string b
    set a's index to zero
        for loop (until b's end)
            if (elements of a and b match)
                increment a's index by one, and
                if (a[index] is NULL)
                    print match found and exit
        print match not found and exit
    Last edited by itCbitC; 03-07-2009 at 03:06 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. finding substring
    By rits in forum C Programming
    Replies: 14
    Last Post: 03-12-2009, 09:02 AM
  2. I need help with creating a substring program
    By CProgramingBegg in forum C Programming
    Replies: 9
    Last Post: 02-06-2009, 09:50 AM
  3. Replacing a substring with another
    By Dan17 in forum C Programming
    Replies: 3
    Last Post: 09-14-2006, 10:37 AM
  4. Finding URLS
    By johnchain in forum C Programming
    Replies: 5
    Last Post: 10-07-2005, 07:56 AM
  5. idea's idea's idea's
    By mithrandir in forum A Brief History of Cprogramming.com
    Replies: 10
    Last Post: 04-29-2002, 12:30 AM