Thread: Storing tokenized strings into different variables

  1. #16
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Yes, I do want to do this with strtok, and I have been advised by my lecturer that this function should be used.
    How else would I be able to separate the data attributes in the file and process them?
    You might find that sscanf() is more helpful here than strtok...

    sscanf - C++ Reference
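
    A minimal sketch of the sscanf approach, assuming each line of the input file has the form "<id> <quanta> <priority>" as the later posts suggest. The struct proc_fields type, the parse_line function and the ID_LEN limit are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

#define ID_LEN 32  /* hypothetical maximum id length, including the terminator */

/* Hypothetical record type; the field names are illustrative only. */
struct proc_fields {
    char id[ID_LEN];
    int quanta;
    int priority;
};

/* Parse one line of the assumed form "<id> <quanta> <priority>".
   Returns 1 if all three fields were converted, 0 otherwise. */
int parse_line(const char *line, struct proc_fields *out)
{
    assert(line != NULL && out != NULL);
    /* %31s stores at most ID_LEN - 1 characters plus the terminator. */
    return sscanf(line, "%31s %d %d",
                  out->id, &out->quanta, &out->priority) == 3;
}
```

    Because sscanf reports how many fields it converted, a short or malformed line shows up as a return value other than 3 instead of a NULL pointer that has to be checked after every token.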

  2. #17
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    As tater said, it's probably more common in this case to use sscanf. But you can stick with strtok if you wish.

    You weren't updating your i variable, so I changed your outer loop to add i and simplified the loop by removing the eof variable. You don't seem to need the tmpbuff so I got rid of it.

    You can turn the strings read with strtok into integers by using atoi as below.

    But the code still reads one more time than it should. I'm not sure why.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h> 
    #include <sys/stat.h> 
    #include <fcntl.h> 
    #include <string.h> /* strtok() */
    #include <unistd.h>  
    #define BUFFSIZE 400
    #define MAXPROCS 1000
    
    struct process {
        char *id; 
        int state; 
        int priority;
        int qaunta;
        int working; 
        int waiting;
        struct process *next;
    };
    struct process proc[MAXPROCS]; /* Current process */
    
    int main (int argc, char **argv)
    { 
        int fd, i;
        char buffer[BUFFSIZE];
        char *procstr = NULL;
        char *delim = " \n";
        int ticks; /* Number of ticks for process */
        int ticktime; 
        int status;
    
        fd = open(argv[1], O_RDONLY);
        if ( fd == -1 ) 
        {
            printf("There was an error opening the file \n"); 
            exit(1);
        }
    
        for (i = 0; i < MAXPROCS; i++) /* quit loop if more than MAXPROCS */
        {
            status = read(fd, buffer, sizeof(buffer));
            if (status <= 0)
                break; /* quit loop if EOF or other error */
        
            /* Tokenize 'process string' attributes */
            procstr = strtok(buffer, delim);
            while ( procstr != NULL )
            {
                proc[i].id = strdup(procstr);
                printf("id: %s\n", proc[i].id);
    
                procstr = strtok( NULL,delim);
                proc[i].qaunta = atoi(procstr);
                printf("Qaunta: %d\n", proc[i].qaunta); 
    
                procstr = strtok( NULL ,delim);
                proc[i].priority = atoi(procstr);
                printf("Priority: %d\n", proc[i].priority); 
    
                procstr = strtok( NULL ,delim); 
            }
        }
        close(fd);
        return 0; 
    }

  3. #18
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    Forget the above code, it doesn't count processes properly. See the code below instead.

    Also, I figured out what was wrong with the operation of the strtok function. The buffer is not null-terminated after the read, so you must terminate it yourself, using the return value of read as the index of the terminator.

    Also note that I've fixed your spacing, which is very important.

    The variable named i below should probably be changed to numProcs or some such.

    Code:
    #include <stdio.h> 
    #include <stdlib.h>
    #include <sys/types.h> 
    #include <sys/stat.h> 
    #include <fcntl.h> 
    #include <string.h> /* strtok() */
    #include <unistd.h>  
    #define BUFFSIZE 400
    #define MAXPROCS 1000
    
    struct process {
        char *id; 
        int state; 
        int priority;
        int qaunta;
        int working; 
        int waiting;
        struct process *next;
    };
    struct process proc[MAXPROCS]; /* Current process */
    
    int main (int argc, char **argv)
    { 
        int fd, i, j;
        char buffer[BUFFSIZE+1]; /* extra byte for null */
        char *procstr = NULL;
        char *delim = " \n";
        int ticks; /* Number of ticks for process */
        int ticktime; 
        int status;
    
        fd = open(argv[1], O_RDONLY);
        if ( fd == -1 ) 
        {
            printf("There was an error opening the file \n"); 
            exit(1);
        }
    
        i = 0;
        while ((status = read(fd, buffer, BUFFSIZE)) > 0) /* read until EOF or error */
        {
            /* null terminate buffer */
            buffer[status] = 0;
     
            /* Tokenize 'process string' attributes */
            procstr = strtok(buffer, delim);
            while (procstr)
            {
                proc[i].id = strdup(procstr);
                //printf("id: %s\n", proc[i].id);
    
                procstr = strtok(NULL, delim);
                proc[i].qaunta = atoi(procstr);
                //printf("Qaunta: %d\n", proc[i].qaunta); 
    
                procstr = strtok(NULL, delim);
                proc[i].priority = atoi(procstr);
                //printf("Priority: %d\n", proc[i].priority); 
    
                procstr = strtok(NULL, delim); 
                if (++i >= MAXPROCS)
                    goto break_outer; /* quit loop if more than MAXPROCS */
            }
        }
    break_outer:
    
        printf("Size: %d\n", i);
        for (j = 0; j < i; j++)
        {
            printf("%s %3d %3d\n", proc[j].id, proc[j].qaunta, proc[j].priority);
        }
    
        close(fd);
        return 0; 
    }
    Last edited by oogabooga; 12-31-2011 at 02:00 PM.
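
    One thing the listing above does not guard against: strtok returns NULL when no tokens remain, and passing NULL to atoi is undefined behaviour. A minimal sketch of checking every token follows, assuming the same " \n" delimiters. The next_record function is a hypothetical helper, not part of the code posted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pull the next id/quanta/priority triple out of a NUL-terminated buffer.
   Pass the buffer on the first call and NULL afterwards, as with strtok.
   Returns 1 if a complete triple was found, 0 if the tokens ran out. */
int next_record(char *buf, const char *delim,
                char **id, int *quanta, int *priority)
{
    char *tok;

    assert(delim != NULL && id != NULL && quanta != NULL && priority != NULL);

    tok = strtok(buf, delim);
    if (tok == NULL)
        return 0;               /* nothing left at all */
    *id = tok;

    tok = strtok(NULL, delim);
    if (tok == NULL)
        return 0;               /* record cut off after the id */
    *quanta = atoi(tok);

    tok = strtok(NULL, delim);
    if (tok == NULL)
        return 0;               /* record cut off after the quanta */
    *priority = atoi(tok);

    return 1;
}
```

    A return of 0 partway through a record only tells the caller that the buffer ended early. It does not recover a field that was split across two reads.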

  4. #19
    Registered User spendotw
    Join Date
    Dec 2011
    Location
    England
    Posts
    40
    Thanks oogabooga

    This seems to have cleared my program up, and thank you for resolving the whitespace problem. However, I still get the segmentation fault, and the program only reads up to process24, which is far from the last.

  5. #20
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    This is more difficult than I thought. Your problem probably occurs after the first buffer is used up and you need to read the next one. But the buffer can end at any point, not just at the end of a line. So you need to check for end-of-buffer after every strtok. Worse than that, it's very possible that a field will start at the end of one buffer and continue at the beginning of the next!

  6. #21
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    You might consider reading one character at a time and doing the "end of line" control yourself. You read one char at a time, putting each one in subsequent spots in your temporary buffer. When you read a new line character, you have one full entry. Stop calling read for a moment, null terminate the buffer and strtok the temp buf with just a space as a delimiter, extracting your id, quanta and priority. When you're all done there, go back to reading another line. It's a bit painful, but perhaps less painful than dealing with partial line/field reads and gluing the next bit onto your temporary buffer.
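
    The byte-at-a-time idea can be sketched roughly as follows, assuming a POSIX system. The read_line_slow function and its truncation policy are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Read one line from fd into line, one byte per read() call.
   The newline is consumed but not stored, and the result is NUL-terminated.
   Returns the line length, or -1 on a read error or at end of file with
   nothing collected. Lines longer than cap - 1 bytes are truncated and the
   excess is discarded. */
int read_line_slow(int fd, char *line, int cap)
{
    int len = 0;
    char c;
    ssize_t n;

    assert(line != NULL && cap > 0);

    while ((n = read(fd, &c, 1)) == 1) {
        if (c == '\n') {
            line[len] = '\0';
            return len;         /* one full entry is ready */
        }
        if (len < cap - 1)
            line[len++] = c;
    }

    line[len] = '\0';
    if (n < 0)
        return -1;              /* read error */
    return len > 0 ? len : -1;  /* last line without newline, or EOF */
}
```

    Each returned line can then be handed to strtok with a space as the only delimiter. The cost is one system call per byte, which is slow on large files but avoids partial lines entirely.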

  7. #22
    Registered User spendotw
    Join Date
    Dec 2011
    Location
    England
    Posts
    40
    Quote Originally Posted by oogabooga View Post
    This is more difficult than I thought. Your problem probably occurs after the first buffer is used up and you need to read the next one. But the buffer can end at any point, not just at the end of a line. So you need to check for end-of-buffer after every strtok. Worse than that, it's very possible that a field will start at the end of one buffer and continue at the beginning of the next!
    Yes, it's becoming a real pain to fix :\

    How is it possible that the buffer can end at any point, when it should be NULL-terminated once it meets a delimiter?

    When you say end of buffer, do you mean each time it reaches the end-of-line character?
    Starting from the code you provided previously, I added a few 'if' statements to test whether the buffer had a newline character, but when I compile and run it, the printf output shows that there's no newline character.

    Thanks
    Code:
    {
                proc[i].id = strdup(procstr);
                printf("id: %s\n", proc[i].id);
                if (buffer[i] == '\n')
                printf("End of line detected! \n");
     
                procstr = strtok(NULL, delim);
                proc[i].qaunta = atoi(procstr);
                printf("Qaunta: %d\n", proc[i].qaunta);
                if (buffer[i] == '\n')
                printf("End of line detected! \n");
     
                procstr = strtok(NULL, delim);
                proc[i].priority = atoi(procstr);
                printf("Priority: %d\n", proc[i].priority); 
                if (buffer[i] == '\n')
                printf("End of line detected! \n");
                
     	
                
                procstr = strtok(NULL, delim); 
                if (++i >= MAXPROCS)
                    goto break_outer; /* quit loop if more than MAXPROCS */
            }

  8. #23
    Registered User spendotw
    Join Date
    Dec 2011
    Location
    England
    Posts
    40
    Quote Originally Posted by anduril462 View Post
    You might consider reading one character at a time and doing the "end of line" control yourself. You read one char at a time, putting each one in subsequent spots in your temporary buffer. When you read a new line character, you have one full entry. Stop calling read for a moment, null terminate the buffer and strtok the temp buf with just a space as a delimiter, extracting your id, quanta and priority. When you're all done there, go back to reading another line. It's a bit painful, but perhaps less painful than dealing with partial line/field reads and gluing the next bit onto your temporary buffer.
    Well, I went back to my old code to try your suggestion, since it copies the data read so far into tmpbuff each time it reaches a newline character:
    Code:
     
    			if (buffer[i] == '\n')
    			{
    			memcpy(tmpbuff, buffer,BUFFSIZE);
    	 		}
    I also used only a space as the delimiter, but this only extracted "Process0" and its qaunta and priority for a short time before ending in a segmentation fault.
    Code:
    	 procstr = strtok(tmpbuff, " ");
            while (procstr)
            {
                proc[i].id = strdup(procstr);
                printf("id: %s\n", proc[i].id);
     
                procstr = strtok(NULL, " ");
                proc[i].qaunta = atoi(procstr);
                printf("Qaunta: %d\n", proc[i].qaunta);
    
     
                procstr = strtok(NULL, " ");
                proc[i].priority = atoi(procstr);
                printf("Priority: %d\n", proc[i].priority); 
                
                procstr = strtok(NULL, delim);

  9. #24
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    How is it possible that the buffer can end at any point?
    Your question shows that you do not understand buffers. Since that is what your assignment is apparently about you should probably have learned about it, unless you're confused about having to use "open" and "read" instead of the more usual "fopen" and "fread" (or fgets or fscanf), which handle the buffer for you.

    The code you've given is not even close to what you'd need. What I would do in this situation is make my own function similar to fgets, but that takes a file descriptor instead of a FILE* (and will either need to be passed "buffer" and "buffer_pos" variables (or a struct) or have them as static locals). This is not too difficult, but not entirely trivial either since lines will occasionally span buffer boundaries. With such a function, I would write my main program to read the file line-by-line, and use sscanf (if you're allowed to!?) to extract the data from each line.

    As for "how can the buffer end at any point", suppose the buffer were 10 chars long and you read a file containing the chars "hello world\n":
    Code:
    0123456789
    hello worl
    Notice that the d and newline won't fit. The buffer doesn't know anything about end of lines, it's just a fixed chunk of bytes. The next buffer-full of the file will have to be read before the line can be completed. So we'd copy what we have so far in the buffer to our line string, read the next buffer:
    Code:
    0 123456789
    d\n........
    and append everything up to the newline to the end of the line string.

    BTW, a bandaid solution to your problem would be to make your buffer size bigger than your file size. But that's cheating if this assignment is all about implementing your own buffer handling.
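
    A minimal sketch of such an fgets-like function over a file descriptor, assuming a POSIX system. The struct fd_reader type and the fd_reader_init and fd_getline functions are hypothetical names, and CHUNK is set to 10 only to mirror the ten-character illustration above.

```c
#include <assert.h>
#include <unistd.h>

#define CHUNK 10  /* deliberately tiny, to match the 10-char illustration */

/* Hypothetical reader state: one chunk of the file and our position in it. */
struct fd_reader {
    int fd;
    char buf[CHUNK];
    ssize_t len;  /* bytes currently held in buf */
    ssize_t pos;  /* index of the next unread byte in buf */
};

void fd_reader_init(struct fd_reader *r, int fd)
{
    assert(r != NULL);
    r->fd = fd;
    r->len = 0;
    r->pos = 0;
}

/* Copy bytes up to the next newline into line (newline dropped),
   refilling buf with read() whenever it runs dry, so a line may span
   any number of chunks. Returns the line length, or -1 at EOF or on
   error with nothing collected. Overlong lines are truncated. */
int fd_getline(struct fd_reader *r, char *line, int cap)
{
    int n = 0;
    char c;

    assert(r != NULL && line != NULL && cap > 0);

    for (;;) {
        if (r->pos >= r->len) {          /* chunk used up: fetch the next */
            r->len = read(r->fd, r->buf, CHUNK);
            r->pos = 0;
            if (r->len <= 0) {           /* EOF or read error */
                r->len = 0;
                line[n] = '\0';
                return n > 0 ? n : -1;
            }
        }
        c = r->buf[r->pos++];
        if (c == '\n') {
            line[n] = '\0';
            return n;
        }
        if (n < cap - 1)
            line[n++] = c;
    }
}
```

    Keeping the state in a struct that the caller passes in, rather than in static locals, lets the same function serve more than one descriptor at a time. Each completed line can then be parsed with sscanf or strtok.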

  10. #25
    Registered User spendotw
    Join Date
    Dec 2011
    Location
    England
    Posts
    40
    Quote Originally Posted by oogabooga View Post
    Your question shows that you do not understand buffers. Since that is what your assignment is apparently about you should probably have learned about it, unless you're confused about having to use "open" and "read" instead of the more usual "fopen" and "fread" (or fgets or fscanf), which handle the buffer for you.

    The code you've given is not even close to what you'd need. What I would do in this situation is make my own function similar to fgets, but that takes a file descriptor instead of a FILE* (and will either need to be passed "buffer" and "buffer_pos" variables (or a struct) or have them as static locals). This is not too difficult, but not entirely trivial either since lines will occasionally span buffer boundaries. With such a function, I would write my main program to read the file line-by-line, and use sscanf (if you're allowed to!?) to extract the data from each line.

    As for "how can the buffer end at any point", suppose the buffer were 10 chars long and you read a file containing the chars "hello world\n":
    Code:
    0123456789
    hello worl
    Notice that the d and newline won't fit. The buffer doesn't know anything about end of lines, it's just a fixed chunk of bytes. The next buffer-full of the file will have to be read before the line can be completed. So we'd copy what we have so far in the buffer to our line string, read the next buffer:
    Code:
    0 123456789
    d\n........
    and append everything up to the newline to the end of the line string.

    BTW, a bandaid solution to your problem would be to make your buffer size bigger than your file size. But that's cheating if this assignment is all about implementing your own buffer handling.
    Thanks for your reply!

    Well, my assignment is to create a process scheduler using only the system functions available, but nothing specifically based on buffers.

    I used your bandaid solution, which worked. Although I probably won't learn the most from this method, it works well enough! Thanks oogabooga!
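
    For reference, one way to size the buffer to the file instead of guessing a large constant is to ask fstat for the file size. This is a sketch assuming a regular file on a POSIX system; slurp_fd is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Read a whole regular file into a freshly malloc'd, NUL-terminated buffer.
   Stores the byte count in *out_len. Returns NULL on failure; the caller
   frees the result. */
char *slurp_fd(int fd, size_t *out_len)
{
    struct stat st;
    char *buf;
    size_t size, total = 0;
    ssize_t n;

    assert(out_len != NULL);

    if (fstat(fd, &st) == -1 || st.st_size < 0)
        return NULL;
    size = (size_t)st.st_size;

    buf = malloc(size + 1);              /* extra byte for the terminator */
    if (buf == NULL)
        return NULL;

    /* read() may return fewer bytes than requested, so loop until done. */
    while (total < size) {
        n = read(fd, buf + total, size - total);
        if (n < 0) {
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;                       /* file was shorter than expected */
        total += (size_t)n;
    }

    buf[total] = '\0';
    *out_len = total;
    return buf;
}
```

    With the whole file in one NUL-terminated buffer, no field can be split across reads, so a single strtok pass over it sees every record intact.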
