Thread: how to parse this file?

  1. #1
    Registered User
    Join Date
    Aug 2009
    Posts
    168

    how to parse this file?

    I have a file as follows:
    Code:
    cons: chromosome06 31329000 0 T Zero coverage
    cons: chromosome06 31329001 0 T Zero coverage
    cons: chromosome06 31329002 0 G Zero coverage
    cons: chromosome06 31329003 0 G Zero coverage
    cons: chromosome06 31329004 0 A Zero coverage
    cons: chromosome06 31329005 0 C Zero coverage
    cons: chromosome06 31329006 0 A Zero coverage
    cons: chromosome06 31329007 0 C Zero coverage
    ..
    cons: chromosome06 31329000 0 T Zero coverage
    I want to extract the content highlighted in blue,
    but I don't know how to process it. I am a newbie who has just begun learning C++.
    Can anyone help me?
    I want to load it into memory using a two-dimensional array:
    Code:
    char **chr;
    chr[6][31329001] = 0;

  2. #2
    In the Land of Diddly-Doo g4j31a5's Avatar
    Join Date
    Jul 2006
    Posts
    476
    Use a string tokenizer to get each string that is separated by a space. For the first value (the chromosome 06), you just have to parse it yourself after you get it from the tokenizer. Just remember that the outputs are strings (i.e. pointers to char), so you have to convert them to integers before using them as array indices. Hope that helps.

    PS: Last time I used the STL string tokenizer, it was not working correctly (dunno now), so I suggest you use Boost C++'s tokenizer or write one yourself.
    ERROR: Brain not found. Please insert a new brain!

    “Do nothing which is of no use.” - Miyamoto Musashi.

  3. #3
    Ex scientia vera
    Join Date
    Sep 2007
    Posts
    477
    Quote Originally Posted by g4j31a5 View Post
    Use a string tokenizer to get each string that is separated by a space. For the first value (the chromosome 06), you just have to parse it yourself after you get it from the tokenizer. Just remember that the outputs are strings (i.e. pointers to char), so you have to convert them to integers before using them as array indices. Hope that helps.

    PS: Last time I used the STL string tokenizer, it was not working correctly (dunno now), so I suggest you use Boost C++'s tokenizer or write one yourself.
    There is no standard STL string tokenizer. Your only option is to write one yourself (using STL stuff if you prefer) or use someone else's.
    "What's up, Doc?"
    "'Up' is a relative concept. It has no intrinsic value."

  4. #4
    Registered User
    Join Date
    Mar 2010
    Posts
    109
    Quote Originally Posted by IceDane View Post
    There is no standard STL string tokenizer. Your only option is to write one yourself (using STL stuff if you prefer) or use someone else's.
    Doesn't strtok() qualify as a standard STL string tokenizer?

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by syzygy View Post
    Doesn't strtok() qualify as a standard STL string tokenizer?
    No. It's part of the C standard, not the STL, and it doesn't work on STL strings.

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by syzygy
    Doesn't strtok() qualify as a standard STL string tokenizer?
    No, strtok() was never part of the STL.

    Anyway, to stick to the standard library, tokenisation can be performed by using getline() to read each line, then initialise a stringstream with the line and read formatted input using an overloaded operator>>. As g4j31a5 indicated, a little more work would be needed to extract the part that is at the end of "chromosome".

  7. #7
    In the Land of Diddly-Doo g4j31a5's Avatar
    Join Date
    Jul 2006
    Posts
    476
    Quote Originally Posted by IceDane View Post
    There is no standard STL string tokenizer. Your only option is to write one yourself (using STL stuff if you prefer) or use someone else's.
    Oh yeah, my bad. I was talking about strtok(). :P

  8. #8
    Registered User
    Join Date
    Aug 2009
    Posts
    168
    I wrote some test code using strtok, but it doesn't work.
    Why?
    Code:
          //#include "syslib.h"
          #include <string.h>
          #include <stdio.h>
          using namespace std;
          main()
          {
            char *s="Golden Global View";
            char *d=" ";
            char *p;
            
            //clrscr();
            
            p=strtok(s," ");
            
            while(p)
            {
              printf("%s\n",s);
              strtok(NULL,d);
            }
            
            //getchar();
            return 0;
          }

  9. #9
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    strtok is destructive, and as such can't be called on nonmodifiable/const data, which s is (it points to a string literal, which is not memory you own or can change).

  10. #10
    Registered User
    Join Date
    Aug 2009
    Posts
    168
    Quote Originally Posted by tabstop View Post
    strtok is destructive, and as such can't be called on nonmodifiable/const data, which s is (it points to a string literal, which is not memory you own or can change).
    I see! Thanks.
