Thread: Get an unknown string of known length from the middle of another string

  1. #1
    Registered User
    Join Date
    Jul 2010
    Posts
    28

    Get an unknown string of known length from the middle of another string

    Hi all,

    This is my second time posting here and I need help from you all again.

    I don't know how I should search this forum for my case, so I guess I'll just start a thread instead. Please link me to any similar problem if you spot one.

    Here's the problem.
    Say, I have this string
    Code:
     2.56942078E+00-8.59741137E-05 4.19484589E-08
     3.28253784E+00 1.48308754E-03-7.57966669E-07
    I want to make it into
    Code:
     2.56942078E+00
    -8.59741137E-05
     4.19484589E-08
    
     3.28253784E+00
     1.48308754E-03
    -7.57966669E-07
    strtok doesn't work because I don't have a consistent delimiter (sometimes it's -, sometimes it's space) in between them.
    I tried using strncpy,
    something like
    Code:
    strncpy(temp,&buff[i][0],15);
    strncpy(temp2,&buff[i][4],15);
    where buff is the string, and temp and temp2 are the buffers I want to copy into.
    It looks correct to me, but it doesn't work.

    That part of the code produces:
    Code:
     2.56942078E+00         8.59741137E-05
     3.28253784E+00         1.48308754E-03-
    for temp and temp2 respectively.

    That surprised me. I expected &buff[i][15] to give me the second number, starting from the 16th character. Instead, by trial and error, &buff[i][4] comes closer, but that actually reads from the 17th character of the string.

    How could I solve this?
    Last edited by tanjinjack; 03-07-2011 at 11:06 AM.

  2. #2
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    strtok() should be able to search for multiple separators such as "- +=:" without too much trouble and will match on any character in the separator string...

    From the Pelles C help file...
    Purpose:
    Breaks up a string into tokens. The safer strtok_s function is also available.

    Syntax:
    char * strtok(char * restrict string1, const char * restrict string2);

    Declared in:
    <string.h>

    Description:
    A sequence of calls to the strtok function breaks the string pointed to by string1 into a sequence of tokens, each of which is delimited by a character from the string pointed to by string2. The first call in the sequence has a non-null first argument; subsequent calls in the sequence have a null first argument. The separator string pointed to by string2 may be different from call to call.

    The first call in the sequence searches the string pointed to by string1 for the first character that is not contained in the current separator string pointed to by string2. If no such character is found, then there are no tokens in the string pointed to by string1 and the strtok function returns a null pointer. If such a character is found, it is the start of the first token.

    The strtok function then searches from there for a character that is contained in the current separator string. If no such character is found, the current token extends to the end of the string pointed to by string1, and subsequent searches for a token will return a null pointer. If such a character is found, it is overwritten by a null character, which terminates the current token. The strtok function saves a pointer to the following character, from which the next search for a token will start.

    Each subsequent call, with a null pointer as the value of the first argument, starts searching from the saved pointer and behaves as described above.

    Note! The strtok function uses an internal static object to remember the search position. Therefore, you cannot search more than one string at a time. However, you can use strtok simultaneously from multiple threads.

    Returns:
    A pointer to the first character of a token, or a null pointer if there is no token.


    See also:
    The wcstok function.

  3. #3
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    As Tater pointed out, strtok will work with multiple delimiters, but the bigger problem is that it will also split on the negative sign in your exponent, not only on the one at the beginning of a number. Some other suggestions:
    1. If they are always exactly 15 characters, then your strncpy solution should work. You're probably doing something wrong (can't tell, since I can't see all the definitions and code) that's making it seem to fail.
    2. Use sscanf to parse the buffer and sprintf to print it back into a buffer in the desired format.
    3. Write your own parser.

  4. #4
    Registered User
    Join Date
    Jul 2010
    Posts
    28
    Quote Originally Posted by anduril462
    As Tater pointed out, strtok will work with multiple delimiters, but the bigger problem is that it will also split on the negative sign in your exponent, not only on the one at the beginning of a number. Some other suggestions:
    1. If they are always exactly 15 characters, then your strncpy solution should work. You're probably doing something wrong (can't tell, since I can't see all the definitions and code) that's making it seem to fail.
    2. Use sscanf to parse the buffer and sprintf to print it back into a buffer in the desired format.
    3. Write your own parser.
    Indeed. That is the reason why I am staying away from strtok.

    Here's my code.
    There are some if conditions imposed, but I don't think they affect the result.
    Any idea where I got it wrong?

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    void main()
    {
        //printf("something");
        char *buff[50][81];
        char *temp[15];
        char *temp2[15];
        double dec1;
        double dec2;
        int n=0;
        int i=0;
        float a=0;
        int len;
        FILE *fp;
        fp=fopen("thermo2.dat","r");
    
    
        while ( fgets( buff[i], sizeof(buff[i]), fp ) != NULL)
        {
    
      if(strlen(buff[i])==81)
    {
        if(i%4==1)
        {
    puts(buff[i]);
    strncpy(temp,&buff[i][0],15);
    strncpy(temp2,&buff[i][4],15);
    printf("%s\t\t%s\n",temp,temp2);
       // for(n=0;n<80;n++) printf("%p\n",&buff[n]);
    
        }
    
    i++;
    }
    
    
        }
        fclose(fp);
    
    
    }

  5. #5
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    Judging by this
    Code:
    char *buff[50][81];
    char *temp[15];
    char *temp2[15];
    I don't believe you have an understanding of C character arrays and are severely complicating your program.

  6. #6
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    So there are several problems to address here:
    1. It's int main(void) and return 0 at the end.
    2. buff, temp and temp2 should be arrays of chars, not char *s.
    3. If you want to copy in 15 characters, you need arrays of size 16 (15 + 1 for the '\0') if you plan on using them as strings (which it looks like you do).

  7. #7
    Registered User
    Join Date
    Jul 2010
    Posts
    28
    Quote Originally Posted by rags_to_riches
    Judging by this
    Code:
    char *buff[50][81];
    char *temp[15];
    char *temp2[15];
    I don't believe you have an understanding of C character arrays and are severely complicating your program.
    Indeed, that is something I don't understand well. Please direct me to any necessary reading or point out the errors I am making, thanks.

    Quote Originally Posted by anduril462
    So there are several problems to address here:
    1. It's int main(void) and return 0 at the end.
    2. buff, temp and temp2 should be arrays of chars, not char *s.
    3. If you want to copy in 15 characters, you need arrays of size 16 (15 + 1 for the '\0') if you plan on using them as strings (which it looks like you do).
    1. I think I don't have a return 0, do I?
    2. Exactly. Sometimes I just assign by trial and error and stick with whatever works. For buff, if I use char instead of char *, my fgets doesn't work anymore.
    3. I will be converting them to decimals later on via atof, so I guess it's okay without the \0.
    Last edited by tanjinjack; 03-07-2011 at 12:26 PM.

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    We have some tutorials on this site: Cprogramming.com - Programming Tutorials: C++ Made Easy and C Made Easy. Google is also your friend, and a good book never hurts (various recommendations here: C Book Recommendations).

    To summarize your problem, you were making arrays of pointers to chars, not actual chars. That doesn't provide space to store characters, only to store the address where your program could find characters. But the pointers were uninitialized and there was no space reserved to put characters, hence your problems.

    Quote Originally Posted by tanjinjack
    1. I think I don't have a return 0, do I?
    Nope, which is why I mentioned it. The shell of your program should look like:
    Code:
    int main(void)
    {
        // code here
        return 0;
    }
    2. Exactly. Sometimes I just assign by trial and error and stick with whatever works. For buff, if I use char instead of char *, my fgets doesn't work anymore.
    Not the best strategy. Spend your time up front reading, researching and thinking it through, and you'll save lots of headaches. Also, once buff is an array of chars, your fgets works fine like this: fgets(buff[i], sizeof(buff[i]), fp). fgets already reserves one byte for the '\0', so there is no need to subtract 1.

    3. I will be converting them to decimals later on via atof, so I guess it's okay without the \0.
    Nope, atof requires a null terminator since it considers the string a string, not just a bunch of bytes. You should look into strtod instead, since it provides better error handling. strtod also requires a null terminator.
    Last edited by anduril462; 03-07-2011 at 12:31 PM.

  9. #9
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by tanjinjack
    1. I think I don't have a return 0, do I?
    2. Exactly. Sometimes I just assign by trial and error and stick with whatever works. For buff, if I use char instead of char *, my fgets doesn't work anymore.
    3. I will be converting them to decimals later on via atof, so I guess it's okay without the \0.
    Rule #1 in programming ... never let your pride get in the way of learning.

    That is... don't make excuses for bad coding practices... fix the problems.

  10. #10
    Registered User
    Join Date
    Jul 2010
    Posts
    28
    Hey guys,

    I have read the tutorial and now have a better idea of the declarations.

    Here's my updated code.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main()
    {
    
        char buff[82];
        char temp[5][16];
        int n=0;
        int i=0;
        float a=0;
        int len;
        FILE *fp;
        fp=fopen("thermo2.dat","r");
    
        while ( fgets( buff, sizeof(buff)-1, fp ) != NULL)
        {
    len=strlen(buff);
    if(len>75)
    {
        if(i%4==1)
        {
    puts(buff);
    
    for(n=0;n<5;n++)
    {
    strncpy(temp[n],&buff[n*15],15);
    strncat(temp[n][15],"\0",1);
    printf("%30s\t",temp[n]);
    a=atof(temp[n]);
    printf("atof:%e\n",a);
    }
    
    printf("\n\n");
        }
    
    i++;
    }
    
    
        }
        fclose(fp);
    
    return 0;
    }
    It produces output that looks like:
    Code:
     2.56942078E+00-8.59741137E-05 4.19484589E-08-1.00177799E-11 1.22833691E-15    2
    
                  2.56942078E+005  atof:2.569421e+000
                   -8.59741137E-05  atof:-8.597411e-005
              4.19484589E-083窴"  atof:0.000000e+000
                   -1.00177799E-11  atof:-1.001778e-011
     1.22833691E-15?2.56942078E+00-8.59741137E-05 4.19484589E-08-1.00177799E-11 1.22
    833691E-15    2 atof:1.228337e-015
    I know that the issue lies in assigning a "\0" at the end of my temp[n].
    I tried with:
    Code:
    strncat(temp[n][15],"\0",1);
    and it doesn't seem to work.

    However, atof does work, but that's not neat enough and it could go wrong when I try other files.
    EDIT: The 3rd atof already doesn't work. I have to fix this.

    Where did I go wrong this time?

  11. #11
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    Are you simply trying to output the numbers one after another each on its own line?

  12. #12
    Registered User
    Join Date
    Jul 2010
    Posts
    28
    Quote Originally Posted by rags_to_riches
    Are you simply trying to output the numbers one after another each on its own line?
    I am trying to get all the work done in one loop, from getting the characters, to adding "\0", to using atof to convert. Please let me know if that's not good practice.

  13. #13
    Registered User
    Join Date
    May 2010
    Location
    Naypyidaw
    Posts
    1,314
    Do you want to convert them to numbers or just strings?
    Can't you just use fscanf()?

  14. #14
    Registered User
    Join Date
    Jul 2010
    Posts
    28
    Quote Originally Posted by Bayint Naung
    Do you want to convert them to numbers or just strings?
    Can't you just use fscanf()?
    Eventually they should become numbers. But even as strings, they will need the "\0" terminator, won't they?

    I don't think fscanf works. If you check my earlier post, the formatting of the file is not that regular.

  15. #15
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    This program:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    /* The field width of all numbers */
    #define NUM_WIDTH 15
    
    int main(void)
    {
        char buf[BUFSIZ] = { 0 };
        FILE *fp = fopen("nums.txt", "r");
        if (fp)
        {
            while (fgets(buf, sizeof(buf), fp))
            {
                /*                                                                  
                   Get the number of numbers on the line.                           
                   Subtract one from the length for the terminating newline         
                   saved by fgets.                                                  
                */
                int numberOfNumbers = (strlen(buf) - 1) / NUM_WIDTH;
                int i = 0;
                for (; i < numberOfNumbers; ++i)
                {
                    /*                                                              
                       Walk through the buffer by the width of the field,           
                       printing only the number of characters specified             
                       by the width                                                 
                    */
                    printf("%.*s\n", NUM_WIDTH, &buf[i * NUM_WIDTH]);
                }
            }
            fclose(fp);
        }
    
        return 0;
    }
    reads each line from your input file and outputs each number on its own line. Perhaps you can expand on this.

    atof does accept scientific notation, but it needs a null-terminated string and gives you no way to detect errors, so if these need to be converted to doubles, you will need to do a little more work, for example with strtod.
    Last edited by rags_to_riches; 03-08-2011 at 06:36 AM.
