Thread: Get a substring from between two different substrings.

  1. #1
    Registered User
    Join Date
    Jun 2018
    Posts
    3

    Get a substring from between two different substrings.

    I've been trying for weeks now to write a C program that gets the string between two substrings. I can't get it to work.
    I made some progress, but my program has a bug I can't find: it
    works on some strings sometimes but not others.

    I've tried, but I'm new to C and need help.

    The data is inside a large (40 MB) text file that I read in. Here are examples...

    Pseudocode example...
    char string[80] = "This is a test string, it has som err < ors insi[} de of it.";

    const char *d1 = ", it";
    const char *d2 = "} d";
    printf ("extracted substring= %s", getsub(string,d1,d2));

    printf output= extracted substring=  has som err < ors insi[
    Here's my DEFECTIVE C coding attempt...
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
      FILE *fp;
      char s[80];
      char fname[] = "tst-file";
      const char *PATTERN1 = "<h . som >errw";
      const char *PATTERN2 = "</h gg-txtw>";

      if ((fp = fopen(fname, "r")) == NULL) {
        printf("cannot open file\n");
        exit(1);
      }
      /* Loop on the result of fgets(); testing feof() before reading
         would process the last line a second time. */
      while (fgets(s,80,fp) != NULL) {
        char *target = NULL;
        char *start, *end;

        /* Find PATTERN1, then look for PATTERN2 after it in the same line. */
        if ((start = strstr(s, PATTERN1)) != NULL) {
          start += strlen(PATTERN1);
          if ((end = strstr(start, PATTERN2)) != NULL) {
            /* Copy the text between the two patterns. */
            target = malloc(end - start + 1);
            if (target != NULL) {
              memcpy(target, start, end - start);
              target[end - start] = '\0';
            }
          }
        }
        if (target) printf("%s\n", target);
        free(target);
      }
      fclose(fp);
      return 0;
    }

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > The data is inside a large (40 meg) text file I read in , here are examples...
    ...
    > fgets(s,80,fp)

    So are all your error conditions within just one line of 80 characters in your large file?

    Or can pattern1 be on one line, and pattern2 on a line later in the file?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Jun 2018
    Posts
    3
    Just so you know, I'm in high school. This problem is for my English teacher, to help recover data that was lost in this file.
    I'm trying to learn C. I would like to be a programmer someday.

    The patterns can span multiple lines: PATTERN1 can be on one line and PATTERN2 can be 5 lines later, and there can also be blank lines in between.
    I've also tried making the buffer larger, like 256 or 1024. I have struggled because I'm very, very new to C. I understand pointers somewhat, I guess.
    I wrote this program myself. I would think this should be fairly simple for someone with advanced skills.

    I've almost been successful, and I've only been programming in C for a short time, but I'm really stuck and can't fix this. And I really need to, for my
    teacher. I've looked everywhere trying to make this work.

    I've worked hard, but can't get it to work right, and I'm all out of ideas.

    So thanks so very much for any help.
    Last edited by commanderklag; 01-22-2019 at 08:20 AM.

  4. #4
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    40MB is no biggie for modern desktop/laptop machines.

    You may as well just read the whole file into memory.
    Code:
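    size_t size = 1024 * 1024 * 40; // 40MB; must be at least the file size
    unsigned char *buff = malloc( size );
    fp= fopen(fname, "rb"); // must be "binary" mode.
    // Element size 1, so the return value is the number of bytes actually read.
    size_t len = fread( buff, 1, size, fp );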
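    If the exact file size isn't known up front, a minimal sketch that grows the buffer while reading might look like the following. read_whole_file is a hypothetical helper name made up for this example, and the 64 KB starting capacity is an arbitrary assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read all of `path` into a malloc'd buffer.
   Returns NULL on failure; on success stores the byte count in *len.
   The caller is responsible for free()ing the result. */
unsigned char *read_whole_file(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");   /* binary mode: no newline translation */
    if (fp == NULL)
        return NULL;

    size_t cap = (size_t)1 << 16;   /* assumed starting capacity (64 KB) */
    size_t used = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        /* Element size 1, so fread() returns a byte count. */
        used += fread(buf + used, 1, cap - used, fp);
        if (used < cap)
            break;                  /* short read: end of file or error */

        /* Buffer is full: double it and keep reading. */
        unsigned char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            fclose(fp);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {               /* distinguish a read error from EOF */
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);
    *len = used;
    return buf;
}
```

    The length comes back separately because recovered data may contain '\0' bytes, so strlen() on the buffer can't be trusted.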
    Since you're talking about "recovery" data, there's no guarantee that anything you have qualifies as printable text or strings.
    So you should use memcmp() to walk your way through the buffer to find things of interest.
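    A minimal sketch of that memcmp() walk, assuming the file contents are already in memory as a byte buffer with a known length. find_bytes and between are hypothetical names made up for this example; the patterns themselves are assumed to be ordinary C strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: first occurrence of pat (plen bytes) inside
   buf (blen bytes), or NULL if absent. It compares raw bytes, so a
   '\0' inside the buffer does not end the search as it would with strstr(). */
const unsigned char *find_bytes(const unsigned char *buf, size_t blen,
                                const void *pat, size_t plen)
{
    if (plen == 0 || plen > blen)
        return NULL;
    for (size_t i = 0; i + plen <= blen; i++) {
        if (memcmp(buf + i, pat, plen) == 0)
            return buf + i;
    }
    return NULL;
}

/* Hypothetical helper: locate the bytes strictly between the first p1
   and the next p2 that follows it. Returns a pointer into buf and
   stores the length in *out_len, or returns NULL if either pattern
   is missing. Nothing is copied or allocated. */
const unsigned char *between(const unsigned char *buf, size_t blen,
                             const char *p1, const char *p2,
                             size_t *out_len)
{
    size_t l1 = strlen(p1);
    size_t l2 = strlen(p2);

    const unsigned char *start = find_bytes(buf, blen, p1, l1);
    if (start == NULL)
        return NULL;
    start += l1;                                  /* skip past p1 itself */

    size_t remaining = blen - (size_t)(start - buf);
    const unsigned char *end = find_bytes(start, remaining, p2, l2);
    if (end == NULL)
        return NULL;

    *out_len = (size_t)(end - start);
    return start;
}
```

    Since the whole file is one buffer, newlines and blank lines between the two patterns are just more bytes. The result is a pointer plus a length, not a NUL-terminated string, so print it with fwrite(p, 1, n, stdout) rather than printf("%s").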
