Thread: c code: split string || multiple/optional delimiters || memcpy - optimize

  1. #1
    Registered User
    Join Date
    Oct 2013
    Posts
    87

    Hi devs,

    I am writing code to parse strings containing genetic information. The data are messy and inconsistently formatted.
    For example,
    GENE1;GENE2
    GENE21,GENE22
    GENE12
    I need the name of each gene separately: GENE1, GENE2, and so on. A string may contain no comma or semicolon at all, as in GENE12.

    strtok didn't seem helpful in this kind of situation, which is why I worked on this piece of code.

    I wrote the code below. It works, but I think it is error-prone.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    
    //  gcc -Wpedantic -Wextra -Wall hello.c -o print
    int main(void)
    {
    
    
        //char tab[] = "hello;morgan;chase;capital,house";
        char tab[] = "hello,discover"; //split this string
    //    char tab[] = "hello"; //split this string
        printf("we have value as tab %s\n", tab);
    
    
        int itr; //use to iterate
        int str_len = strlen(tab); //use for length variable of string
    
    
        int temp_itr = 0; //use to keep value of itr when needed
        char temp_gene[200]; //store value in this of sub-string
    
    
        for (itr = 0; itr < str_len; itr++)
        {
            if (tab[itr] == ',' || tab[itr] == ';')
            {
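                if (temp_itr == 0)
                {
                    //if no comma or ; has been found 
                    memcpy(temp_gene, tab + temp_itr, itr - temp_itr);
                    temp_gene[itr - temp_itr] = '\0'; //end with null character
                }
                else
                {
                    //skip the previous delimiter, so the token is one char shorter
                    memcpy(temp_gene, tab + temp_itr + 1, itr - temp_itr - 1);
                    temp_gene[itr - temp_itr - 1] = '\0'; //end with null character
                }
    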
                printf("split wise is %s\n", temp_gene);
                temp_itr = itr;
            }
            temp_gene[0] = '\0'; //set first char null
        }
    
    
        temp_gene[0] = '\0'; //set first char null
        
        if (temp_itr == 0)
        {
            //if no comma or ; has been found 
            memcpy(temp_gene, tab + temp_itr, str_len - temp_itr);
            temp_gene[str_len - temp_itr] = '\0'; //end with null character
        }
        else
        {
            //if we already had temp_itr initiated and string has , or ;
            //skip the delimiter at temp_itr, so the token is one char shorter
            memcpy(temp_gene, tab + temp_itr + 1, str_len - temp_itr - 1);
            temp_gene[str_len - temp_itr - 1] = '\0'; //end with null character
        }
    
    
        printf("final split wise is %s\n", temp_gene);
    
    
        return 0;
    }
    I find this code a little makeshift.

    I would like to improve this rather than rely on multiple if checks, for instance the if (temp_itr == 0) check at the end that handles the case where no comma or semicolon is found.

    Thank you.
    Last edited by deathmetal; 05-10-2021 at 07:51 AM.
