Thread: Parsing bittorrent metafile..

  1. #1
    Registered User
    Join Date
    Dec 2004
    Location
    The Netherlands
    Posts
    91

    Parsing bittorrent metafile..

    Hello,

    I am trying to extract information from a BitTorrent metafile. Strings in it use the following format:
    Code:
    <string length encoded in base ten ASCII>:<string data>
    Example:
    Code:
    5:Hello
    This should return the string Hello.

    Beginning of a BitTorrent metafile:
    Code:
    d8:announce39:http://www.point-blank.cc:6969/announce13:announce-list
    After a few attempts, using strchr to find ':', I got stuck. Are there any other ways of extracting the right values from those files?

    OS: Windows XP
    Last edited by apsync; 09-04-2006 at 02:51 PM.

  2. #2
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Code:
    #include <stdio.h>
    
    int main()
    {
       static const char filename[] = "file.txt";
       FILE *file = fopen(filename, "r");
       if ( file != NULL )
       {
          char line[BUFSIZ];
          while ( fgets(line, sizeof line, file) != NULL )
          {
             char text[20];
             if ( sscanf(line, "%*d:%19s", text) == 1 )
             {
                printf("text = \"%s\"\n", text);
             }
          }
          fclose(file); /* release the stream once all lines are read */
       }
       else
       {
          perror(filename);
       }
       return 0;
    }
    
    /* my output
    text = "Hello"
    */
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  3. #3
    Registered User Tonto's Avatar
    Join Date
    Jun 2005
    Location
    New York
    Posts
    1,465
    Code:
    sscanf(line, "%*d:%19s", text)
    Well, if you can guarantee that the data won't exceed that number.

    Code:
    d8:announce39:http://www.point-blank.cc:6969/announce13:announce-list
    I don't know what that 'd' is in the data. Here's an outline of what you could do.

    Code:
    char num_buffer[256]; // or char * buffer, if you want to reallocate memory as needed
    char * data_buffer;
    int c, i;
    
    while(true)
    {
            while(char read into c != ':')
            {
                    store char into num_buffer;
            }
            if(num_buffer looks valid)
            {
                    convert_it_to_an_int();
            }
            else fail();
    
            allocate_memory_for_data_buffer();
    
            for(i < num_from_buf)
            {
                    read data_into_buffer;
            }
    }
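
    The outline above could be sketched in C roughly as follows. This is only one possible reading of it, assuming the whole metafile has already been read into memory and its size is known. The name read_bytestring and its parameters are hypothetical, not part of any library.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse one "<length>:<data>" byte string from
 * buf[0..len). On success, returns a malloc'd, NUL-terminated copy of the
 * data and stores the number of bytes consumed in *used. Returns NULL on
 * malformed input or allocation failure. The caller frees the result. */
char *read_bytestring(const char *buf, size_t len, size_t *used)
{
    size_t i = 0, n = 0;
    char *data;

    assert(buf != NULL && used != NULL);

    /* Accumulate the decimal digits that precede ':' */
    while (i < len && isdigit((unsigned char)buf[i])) {
        if (n > (SIZE_MAX - 9) / 10)
            return NULL;            /* length would overflow size_t */
        n = n * 10 + (size_t)(buf[i] - '0');
        ++i;
    }
    if (i == 0 || i == len || buf[i] != ':')
        return NULL;                /* no digits, or no ':' after them */
    ++i;                            /* skip the ':' */
    if (n > len - i)
        return NULL;                /* declared length runs past the buffer */

    data = malloc(n + 1);
    if (data == NULL)
        return NULL;
    memcpy(data, buf + i, n);       /* memcpy, since the data may contain NULs */
    data[n] = '\0';
    *used = i + n;
    return data;
}
```

    Calling it in a loop and advancing the buffer pointer by *used after each success walks through consecutive strings such as 8:announce39:http://... without having to guess a maximum size.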

  4. #4
    Registered User
    Join Date
    Dec 2004
    Location
    The Netherlands
    Posts
    91
    Thank you for the responses, I can work further now I guess, if not, I will post my questions here.

    The 'd' stands for dictionaries, you can find more info here
    Last edited by apsync; 09-04-2006 at 06:07 PM.

  5. #5
    Registered User
    Join Date
    Dec 2004
    Location
    The Netherlands
    Posts
    91
    Hi again,

    One thing is still bothering me: how can I find out how many digits the number before the ':' has?

  6. #6
    Registered User
    Join Date
    Mar 2006
    Posts
    725
    One does wonder......

    Code:
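    /* Number of digits needed to write n in the given base. */
    int getintlen(int n, int base)
    {
        int len = 0;
        do /* do-while so that n == 0 counts as one digit */
        {
            n /= base;
            ++len;
        } while(n);
        return len;
    }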
    Code:
    #include <stdio.h>
    
    void J(char*a){int f,i=0,c='1';for(;a[i]!='0';++i)if(i==81){
    puts(a);return;}for(;c<='9';++c){for(f=0;f<9;++f)if(a[i-i%27+i%9
    /3*3+f/3*9+f%3]==c||a[i%9+f*9]==c||a[i-i%9+f]==c)goto e;a[i]=c;J(a);a[i]
    ='0';e:;}}int main(int c,char**v){int t=0;if(c>1){for(;v[1][
    t];++t);if(t==81){J(v[1]);return 0;}}puts("sudoku [0-9]{81}");return 1;}

  7. #7
    Registered User
    Join Date
    Dec 2004
    Location
    The Netherlands
    Posts
    91
    Hi, sorry to bring this up again, but I am still having difficulties with this. I cannot simply call getintlen(str[1], 10); because that would only look at one digit. Or is checking each byte with isdigit() good enough? I think there must be an easier way.

  8. #8
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    jafet's code counts the number of digits in an integral number, such as an int, not a string. It sounds like you're trying to check how many digits the number is after you've converted it to a string. You can do this, too, but not with jafet's code.
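
    For the string case, one possible approach is to count the leading digit characters with strspn and convert them with strtol. This is a minimal sketch, assuming the text is NUL-terminated and starts at the length prefix; the name length_prefix is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns how many decimal digits s starts with.
 * If there is at least one, the number they spell is stored in *value. */
size_t length_prefix(const char *s, long *value)
{
    size_t digits;

    assert(s != NULL && value != NULL);

    /* strspn returns the length of the initial run made only of these chars */
    digits = strspn(s, "0123456789");
    if (digits > 0)
        *value = strtol(s, NULL, 10);   /* stops at the first non-digit */
    return digits;
}
```

    For "39:http://..." this reports 2 digits and the value 39, so the ':' sits at index 2 and the data starts at index 3.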
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  9. #9
    Registered User
    Join Date
    Dec 2004
    Location
    The Netherlands
    Posts
    91
    Quote Originally Posted by dwks
    such as an int, not a string.
    Yeah, I was thinking about casting it anyway, but that usually doesn't end well, right?

    I am thinking about using strchr and strstr now. I hope I can work it out; in the meantime, any information about this is appreciated.

  10. #10
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    So you have a string that represents a number and you want to figure out how many digits are in it? Can you use strlen() or are there other things in the string?

  11. #11
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by apsync
    Thank you for the responses, I can work further now I guess, if not, I will post my questions here.

    The 'd' stands for dictionaries, you can find more info here
    I'd still go with sscanf, but that's me. For example a byte string and an integer could be done this way:
    Code:
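    #include <stdio.h>
    #include <string.h>
    
    int bt_bytestring(const char **text)
    {
       int size, n = 0;
       /* %n is only reached if ':' matched, so n stays 0 otherwise;
          also reject a negative size or one that runs past the end */
       if ( sscanf(*text, "%d:%n", &size, &n) == 1 && n > 0 &&
            size >= 0 && (size_t)size <= strlen(*text + n) )
       {
          *text += n;
          printf("\"%.*s\"", size, *text);
          *text += size;
          return 1;
       }
       return 0;
    }
    
    int bt_integer(const char **text)
    {
       int value, n = 0;
       /* n > 0 confirms that the closing 'e' was present */
       if ( sscanf(*text, "i%de%n", &value, &n) == 1 && n > 0 )
       {
          *text += n;
          printf("%d", value);
          return 1;
       }
       return 0;
    }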
    These could be called by some overall function that looks at the current location in a string.
    Code:
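    int bt_list(const char **text);       /* defined further down */
    int bt_dictionary(const char **text); /* defined further down */
    
    /* Try each bencode type in turn at the current position. */
    int process(const char **text)
    {
       return bt_bytestring (text) || 
              bt_integer    (text) || 
              bt_list       (text) || 
              bt_dictionary (text);
    }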
    The list and the dictionary are a bit more troublesome, but they are based on the previous.
    Code:
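    int process(const char **text); /* lists and dictionaries recurse */
    
    int bt_list(const char **text)
    {
       if ( **text != 'l' )
       {
          return 0;
       }
       fputs("[ ", stdout);
       ++*text;
       while ( **text != 'e' ) /* also accepts the empty list "le" */
       {
          if ( !process(text) )
          {
             return 0; /* malformed or truncated element */
          }
          if ( **text != 'e' )
          {
             fputs(", ", stdout);
          }
       }
       fputs(" ]", stdout);
       ++*text;
       return 1;
    }
    
    /* One key/value pair of a dictionary. */
    int bt_dict(const char **text)
    {
       if ( !bt_bytestring(text) ) /* bencode keys are always byte strings */
       {
          return 0;
       }
       fputs(" => ", stdout);
       if ( !process(text) )
       {
          return 0;
       }
       return 1;
    }
    
    int bt_dictionary(const char **text)
    {
       if ( **text != 'd' )
       {
          return 0;
       }
       fputs("{ ", stdout);
       ++*text;
       while ( **text != 'e' )
       {
          if ( !bt_dict(text) )
          {
             return 0; /* malformed or truncated entry */
          }
          if ( **text != 'e' )
          {
             fputs(", ", stdout);
          }
       }
       fputs(" }", stdout);
       ++*text;
       return 1;
    }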
    But a little test might look like this.
    Code:
    int main(void)
    {
       static const char *text[] =
       {
          "4:spam",
          "i3e",
          "l4:spam4:eggse",
          "d3:cow3:moo4:spam4:eggse",
          "d4:spaml1:a1:bee",
       };
       size_t i;
       for ( i = 0; i < sizeof text / sizeof *text; ++i )
       {
          printf("\"%s\" : ", text[i]);
          if ( !process(&text[i]) )
          {
             fputs("bad format", stdout);
          }
          putchar('\n');
       }
       return 0;
    }
    
    /* my output
    "4:spam" : "spam"
    "i3e" : 3
    "l4:spam4:eggse" : [ "spam", "eggs" ]
    "d3:cow3:moo4:spam4:eggse" : { "cow" => "moo", "spam" => "eggs" }
    "d4:spaml1:a1:bee" : { "spam" => [ "a", "b" ] }
    */
