Thread: My Very First (somewhat) Useful Program!

  1. #1
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11

    My Very First (somewhat) Useful Program!

    I finally wrote something with a purpose! This program takes a three-letter codon and tells the user which amino acid that codon codes for. My next goal is to create a program that can take a continuous stream of codons and print out the corresponding amino acid chain.

    Anyway, this is my first useful program (prior to five days ago I had zero programming experience), so I'm quite proud of myself. Nonetheless, I know it must look cumbersome to a seasoned programmer, and I'm quite certain the same result could be achieved using more advanced coding techniques. Any suggestions are welcome, as well as things I should be thinking about as I continue my journey learning the C programming language!

    Code:
    //
    //  main.c
    //  PageA2
    //  Created by Phoebe August 31, 2012
    //
    //  A program to determine the amino acid from a three letter codon
    
    
    #include <stdio.h>
    #include <string.h>
    
    
    char amino_acids[][40] = {
        
        "GCU GCC GCA GCG         Ala",
        "CGU CGC CGA CGG AGA AGG Arg",
        "AAU AAC                 Asn",
        "GAU GAC                 Asp",
        "UGU UGC                 Cys",
        "CAA CAG                 Gln",
        "GAA GAG                 Glu",
        "GGU GGC GGA GGG         Gly",
        "CAU CAC                 His",
        "AUU AUC AUA             Ile",
        "AUG                     START/MET",
        "UUA UUG CUU CUC CUA CUG Leu",
        "AAA AAG                 Lys",
        "UUU UUC                 Phe",
        "CCU CCC CCA CCG         Pro",
        "UCU UCC UCA UCG AGU AGC Ser",
        "ACU ACC ACA ACG         Thr",
        "UGG                     Trp",
        "UAU UAC                 Tyr",
        "GUU GUC GUA GUG         Val",
        "UAA UGA UAG             STOP",
        
    };
    
    
    void find_amino_acid(char search_for[]){
        int i;
        for (i = 0 ; i < 21 ; i ++){
            if(strstr(amino_acids[i], search_for))
                printf("Amino Acid: '%s'", amino_acids[i]+24);
        }
    }
    
    
    int main(void)
    {
        char search_for[4];
        printf("Enter a three letter codon in ALL CAPS: ");
        fgets(search_for, 4, stdin);

        /* Strip the newline, if fgets stored one */
        char *test = strchr(search_for, '\n');
        if (test != NULL)
            *test = '\0';

        find_amino_acid(search_for);

        return 0;
    }

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    >> Enter a three letter codon in ALL CAPS:

    I know this is typical for this particular area of biology, but you might want to work on "cooking" your input. If your program could handle codons that aren't capitalized, and strings that are too long or too short, it would be better simply because it would be easier to use.
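One way to "cook" the input could look like the sketch below. It assumes the line was read with fgets into a buffer larger than four bytes (so the newline fits), and `cook_codon` is a hypothetical helper name, not something from the original program.

```c
#include <string.h>

/* Hypothetical helper (not from the thread): strips a trailing newline and
   reports whether what remains is exactly three RNA base letters, in either
   case. Returns 1 if the line looks like a codon, 0 otherwise. */
int cook_codon(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';    /* cut at the first CR or LF */
    if (strlen(line) != 3)
        return 0;                          /* too short or too long */
    return strspn(line, "ACGUacgu") == 3;  /* only base letters allowed */
}
```

A caller could print a short reminder and ask again whenever this returns 0, rather than silently doing nothing.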

  3. #3
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11
    Thanks for the advice. I'll try doing that.

  4. #4
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11
    Here's the updated code. It accepts lower case letters. If the user enters something that is not a codon, it does nothing.
    Code:
    //
    //  main.c
    //  PageA2
    //  Created by Phoebe August 31, 2012
    //
    //  A program to determine the amino acid from a three letter codon
    
    
    #include <stdio.h>
    #include <string.h>
    
    
    char amino_acids[][58] = {
        
        "GCU GCC GCA GCG gcu gcc gca gcg                 Ala",
        "CGU CGC CGA CGG AGA AGG cgu cgc cga cgg aga agg Arg",
        "AAU AAC aau aac                                 Asn",
        "GAU GAC gau gac                                 Asp",
        "UGU UGC ugu ugc                                 Cys",
        "CAA CAG caa cag                                 Gln",
        "GAA GAG gaa gag                                 Glu",
        "GGU GGC GGA GGG ggu ggc gga ggg                 Gly",
        "CAU CAC cau cac                                 His",
        "AUU AUC AUA auu auc aua                         Ile",
        "AUG aug                                         START/Met",
        "UUA UUG CUU CUC CUA CUG uua uug cuu cuc cua cug Leu",
        "AAA AAG aaa aag                                 Lys",
        "UUU UUC uuu uuc                                 Phe",
        "CCU CCC CCA CCG ccu ccc cca ccg                 Pro",
        "UCU UCC UCA UCG AGU AGC ucu ucc uca ucg agu agc Ser",
        "ACU ACC ACA ACG acu acc aca acg                 Thr",
        "UGG ugg                                         Trp",
        "UAU UAC uau uac                                 Tyr",
        "GUU GUC GUA GUG guu guc gua gug                 Val",
        "UAA UGA UAG uaa uga uag                         STOP",
        
    };
    
    
    void find_amino_acid(char search_for[]){
        int i;
        for (i = 0 ; i < 21 ; i ++){
            if(strstr(amino_acids[i], search_for))
                printf("Amino Acid: '%s'", amino_acids[i]+48);
        }
    }
    
    
    int main(void)
    {
        char search_for[4] = "";   /* initialised so the first loop test reads a defined value */
        puts("Enter a three letter codon: ");
        while (search_for[0] != 'X') {
            fgets(search_for, 4, stdin);

            find_amino_acid(search_for);
        }
        return 0;
    }

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > "GCU GCC GCA GCG gcu gcc gca gcg Ala",
    And if your user decides to get creative, and type in GcG, then what?

    Look up the functions in ctype.h to see if you can normalise the user input to be, say, all uppercase.

  6. #6
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by Phoebe View Post
    Here's the updated code. It accepts lower case letters. If the user enters something that is not a codon, it does nothing.
    When you start coding something up that has a lot of repetition to it, back away from it. 99.9% of the time, the really smart people who designed C will have built a way into the language that makes it easy to avoid all that repetitious coding.

    Standard practice is to get the input from the user and immediately convert it to either uppercase or lowercase, using the built-in functions from the ctype.h header: toupper() and tolower(), called in a loop (they change only one char at a time, so a loop is needed).
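A minimal sketch of that loop, assuming the input is already a NUL-terminated string; `to_upper_in_place` is a hypothetical name chosen for illustration.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases a string in place, one char at a time. */
void to_upper_in_place(char *s)
{
    for (; *s != '\0'; s++) {
        /* The unsigned char cast avoids undefined behaviour when a plain
           char holds a negative value. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

With the input normalised this way, mixed-case input such as GcG becomes GCG, and the table only needs the uppercase codons.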

    If run-time is a consideration, using a binary search will speed up the search considerably. (May not be a consideration, depending on how many searches, and how many more strings you may need to search through.)
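As a rough sketch of the binary search idea, the standard library's bsearch() can do the searching if the codons are stored one per entry in sorted order. The struct layout and the names `codon_entry`, `codon_table`, and `lookup_codon` are assumptions made for this example; the input is assumed to be uppercase already.

```c
#include <stdlib.h>
#include <string.h>

struct codon_entry {
    char codon[4];       /* three bases plus the terminating '\0' */
    const char *amino;   /* name as used in the thread's table */
};

/* All 64 codons in strcmp order (A < C < G < U), as bsearch requires. */
static const struct codon_entry codon_table[] = {
    {"AAA","Lys"}, {"AAC","Asn"}, {"AAG","Lys"}, {"AAU","Asn"},
    {"ACA","Thr"}, {"ACC","Thr"}, {"ACG","Thr"}, {"ACU","Thr"},
    {"AGA","Arg"}, {"AGC","Ser"}, {"AGG","Arg"}, {"AGU","Ser"},
    {"AUA","Ile"}, {"AUC","Ile"}, {"AUG","START/Met"}, {"AUU","Ile"},
    {"CAA","Gln"}, {"CAC","His"}, {"CAG","Gln"}, {"CAU","His"},
    {"CCA","Pro"}, {"CCC","Pro"}, {"CCG","Pro"}, {"CCU","Pro"},
    {"CGA","Arg"}, {"CGC","Arg"}, {"CGG","Arg"}, {"CGU","Arg"},
    {"CUA","Leu"}, {"CUC","Leu"}, {"CUG","Leu"}, {"CUU","Leu"},
    {"GAA","Glu"}, {"GAC","Asp"}, {"GAG","Glu"}, {"GAU","Asp"},
    {"GCA","Ala"}, {"GCC","Ala"}, {"GCG","Ala"}, {"GCU","Ala"},
    {"GGA","Gly"}, {"GGC","Gly"}, {"GGG","Gly"}, {"GGU","Gly"},
    {"GUA","Val"}, {"GUC","Val"}, {"GUG","Val"}, {"GUU","Val"},
    {"UAA","STOP"}, {"UAC","Tyr"}, {"UAG","STOP"}, {"UAU","Tyr"},
    {"UCA","Ser"}, {"UCC","Ser"}, {"UCG","Ser"}, {"UCU","Ser"},
    {"UGA","STOP"}, {"UGC","Cys"}, {"UGG","Trp"}, {"UGU","Cys"},
    {"UUA","Leu"}, {"UUC","Phe"}, {"UUG","Leu"}, {"UUU","Phe"},
};

/* bsearch callback: compares the search key with one table entry. */
static int compare_codon(const void *key, const void *elem)
{
    const struct codon_entry *entry = elem;
    return strcmp((const char *)key, entry->codon);
}

/* Returns the amino acid name, or NULL if the codon is not in the table. */
const char *lookup_codon(const char *codon)
{
    const struct codon_entry *hit = bsearch(codon, codon_table,
        sizeof codon_table / sizeof codon_table[0],
        sizeof codon_table[0], compare_codon);
    return hit ? hit->amino : NULL;
}
```

With only 64 entries the speed-up over a linear scan is small, so this is mostly worth it when a great many codons are translated.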

    So far, you're off to a great start!

  7. #7
    Registered User
    Join Date
    May 2012
    Location
    Italy
    Posts
    53
    Also remember to always check the return value of a function.
    In your case you have to check the return values of fgets and strstr.
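For fgets, a checked read might look like the sketch below. `read_line` is a hypothetical wrapper name; the only library facts relied on are that fgets returns NULL on end of input or error, and that it stores the newline when there is room for it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: reads one line from `in` into buf and strips the
   newline. Returns 1 on success, 0 on end of input or a read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;                    /* nothing was read; buf is not usable */
    buf[strcspn(buf, "\n")] = '\0';  /* drop the newline if one was stored */
    return 1;
}
```

A loop written as `while (read_line(stdin, buf, sizeof buf) && buf[0] != 'X')` then stops cleanly at end of input, and it never tests the buffer before something has been read into it.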

  8. #8
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11
    Wow! Thanks for all the advice. Looks like I have some serious homework to do. Thanks again!

  9. #9
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11
    I've noticed something funny about my program. Although I designed it to accept only three letters (one codon) at a time, it seems to be able to accept a continuous stream of codons.

    Input: AUG
    Output: START/Met

    Input: AUGUUACCUACUUAA
    Output: START/Met, Leu, Pro, Thr, STOP

    At first I was extremely happy with this discovery. It seemingly increased the usefulness of the program a lot. But something weird happens when you input three different codons in a row that code for the same amino acid.

    Input: GUUGUCGUA
    Expected Output: Val, Val, Val
    Actual Output: Val Val Ala Gly Val

    What happened? I've been tinkering around with the program for a while trying to figure it out.
    Last edited by Phoebe; 09-03-2012 at 12:18 AM.

  10. #10
    Registered User
    Join Date
    May 2012
    Posts
    1,066
    Show your current code.

    Bye, Andreas

  11. #11
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11
    Code:
    //
    //  A program to determine the amino acid chain from an RNA input
    
    
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>
    
    
    
    
    char amino_acids[][58] = {
        
        "GCU GCC GCA GCG gcu gcc gca gcg                 Ala",
        "CGU CGC CGA CGG AGA AGG cgu cgc cga cgg aga agg Arg",
        "AAU AAC aau aac                                 Asn",
        "GAU GAC gau gac                                 Asp",
        "UGU UGC ugu ugc                                 Cys",
        "CAA CAG caa cag                                 Gln",
        "GAA GAG gaa gag                                 Glu",
        "GGU GGC GGA GGG ggu ggc gga ggg                 Gly",
        "CAU CAC cau cac                                 His",
        "AUU AUC AUA auu auc aua                         Ile",
        "AUG aug                                         START/Met",
        "UUA UUG CUU CUC CUA CUG uua uug cuu cuc cua cug Leu",
        "AAA AAG aaa aag                                 Lys",
        "UUU UUC uuu uuc                                 Phe",
        "CCU CCC CCA CCG ccu ccc cca ccg                 Pro",
        "UCU UCC UCA UCG AGU AGC ucu ucc uca ucg agu agc Ser",
        "ACU ACC ACA ACG acu acc aca acg                 Thr",
        "UGG ugg                                         Trp",
        "UAU UAC uau uac                                 Tyr",
        "GUU GUC GUA GUG guu guc gua gug                 Val",
        "UAA UGA UAG uaa uga uag                         STOP",
        
    };
    
    
    void find_amino_acid(char search_for[]){
        int i;
        for (i = 0 ; i < 21 ; i ++){
            if(strstr(amino_acids[i], search_for))
                printf(" %s ", amino_acids[i]+48);
        }
    }
    
    
    int main(void)
    {
        char search_for[4] = "";   /* initialised so the first loop test reads a defined value */
        puts("This program translates RNA code into an amino acid chain.  Enter the RNA code and press return.  When you are done, enter X and return to quit.");
        while (search_for[0] != 'X') {
            fgets(search_for, 4, stdin);

            find_amino_acid(search_for);
        }
        return 0;
    }

  12. #12
    Registered User
    Join Date
    May 2012
    Posts
    1,066
    I can't reproduce your problem:
    Code:
    $ ./test
    This program translates RNA code into an amino acid chain.  Enter the RNA code and press return.  When you are done, enter X and return to quit.
    GUUGUCGUA
     Val  Val  Val 
    CAACAG
     Gln  Gln 
    AUUAUCAUA
     Ile  Ile  Ile 
    cgucgccgacggagaagg
     Arg  Arg  Arg  Arg  Arg  Arg    
    gcugccagaaggcgg
     Ala  Ala  Arg  Arg  Arg
    Are you sure you didn't mistype the codons?

    Bye, Andreas

  13. #13
    Registered User Phoebe's Avatar
    Join Date
    Aug 2012
    Posts
    11
    Huh! The problem goes away when I run the program at the command prompt like you did. The problem only seems to occur when run within the Xcode IDE.
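As for why a whole stream of codons works at all: fgets with a size of 4 stores at most three characters per call and leaves the rest of the line waiting in the input, so each pass through the loop picks up the next three letters. The sketch below only illustrates that chunking; `show_fgets_chunks` is a hypothetical name, and it does not explain the extra output seen inside Xcode.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demonstration: reads `in` with the same 4-byte buffer the
   program uses and writes every chunk fgets returned into `out`, each one
   followed by '|', so the chunk boundaries are visible. */
void show_fgets_chunks(FILE *in, char *out, size_t out_size)
{
    char chunk[4];
    size_t used = 0;

    if (out_size == 0)
        return;
    out[0] = '\0';
    while (fgets(chunk, sizeof chunk, in) != NULL) {
        size_t n = strlen(chunk);
        if (used + n + 2 > out_size)
            break;                   /* no room for chunk, '|' and '\0' */
        memcpy(out + used, chunk, n);
        used += n;
        out[used++] = '|';
        out[used] = '\0';
    }
}
```

For the line GUUGUCGUA followed by a newline, the chunks are GUU, GUC, GUA, and finally the newline on its own, which matches no row of the table and so prints nothing.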

  14. #14
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    How about having the database in a text file that gets read at program start up?

    That way, if you decide to add some more data, the program doesn't have to be recompiled.

    It's generally a good idea to separate data out of code, so that the code only contains logic.
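A rough sketch of reading the table at start-up follows. The file format is an assumption made for this example (one amino acid per line, the name first, then its codons, all separated by spaces, e.g. "Ala GCU GCC GCA GCG"), and `codon_record` and `load_codons` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

struct codon_record {
    char codon[4];    /* three bases plus '\0' */
    char amino[16];   /* amino acid name, e.g. "START/Met" */
};

/* Reads lines of the assumed form "Name CODON CODON ..." from `in` and
   stores one record per codon. Returns the number of records stored. */
int load_codons(FILE *in, struct codon_record table[], int max)
{
    char line[128];
    int count = 0;

    while (count < max && fgets(line, sizeof line, in) != NULL) {
        char *amino = strtok(line, " \t\r\n");
        char *codon;

        if (amino == NULL)
            continue;                /* blank line */
        while (count < max && (codon = strtok(NULL, " \t\r\n")) != NULL) {
            if (strlen(codon) != 3 ||
                strlen(amino) >= sizeof table[count].amino)
                continue;            /* skip malformed entries */
            strcpy(table[count].codon, codon);
            strcpy(table[count].amino, amino);
            count++;
        }
    }
    return count;
}
```

The caller would open the data file with fopen, check the result for NULL, and pass the stream in; keeping the parsing separate from the file handling also makes it easy to test.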

  15. #15
    Registered User
    Join Date
    Apr 2011
    Posts
    2
    Quote Originally Posted by cyberfish View Post
    How about having the database in a text file that gets read at program start up?

    That way, if you decide to add some more data, the program doesn't have to be recompiled.

    It's generally a good idea to separate data out of code, so that the code only contains logic.
    In general this is good advice, but I don't see it as very useful in this scenario. There are 20 naturally occurring amino acids, each encoded by a frame of 3 bases; that won't be changing any time soon. Perhaps faster lookup would be more suitable for this program, since translations usually have to be done in high volume. Use of a binary tree here should be able to drastically reduce lookup times.
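Since the table is fixed at 4 x 4 x 4 = 64 codons, one alternative to a binary tree (a different technique, swapped in here for illustration) is to turn each codon into an array index and look the answer up directly. The names `base_index`, `amino_by_index`, and `lookup_codon_direct` are hypothetical, and the base order U, C, A, G is just a convention chosen for this sketch.

```c
#include <stddef.h>

/* Maps a base letter to 0..3 in the order U, C, A, G; -1 for anything else. */
static int base_index(char b)
{
    switch (b) {
    case 'U': case 'u': return 0;
    case 'C': case 'c': return 1;
    case 'A': case 'a': return 2;
    case 'G': case 'g': return 3;
    default:            return -1;
    }
}

/* Entry i is the amino acid for the codon whose bases have indices
   (i / 16, i / 4 % 4, i % 4). */
static const char *const amino_by_index[64] = {
    /* UUx */ "Phe", "Phe", "Leu", "Leu",
    /* UCx */ "Ser", "Ser", "Ser", "Ser",
    /* UAx */ "Tyr", "Tyr", "STOP", "STOP",
    /* UGx */ "Cys", "Cys", "STOP", "Trp",
    /* CUx */ "Leu", "Leu", "Leu", "Leu",
    /* CCx */ "Pro", "Pro", "Pro", "Pro",
    /* CAx */ "His", "His", "Gln", "Gln",
    /* CGx */ "Arg", "Arg", "Arg", "Arg",
    /* AUx */ "Ile", "Ile", "Ile", "START/Met",
    /* ACx */ "Thr", "Thr", "Thr", "Thr",
    /* AAx */ "Asn", "Asn", "Lys", "Lys",
    /* AGx */ "Ser", "Ser", "Arg", "Arg",
    /* GUx */ "Val", "Val", "Val", "Val",
    /* GCx */ "Ala", "Ala", "Ala", "Ala",
    /* GAx */ "Asp", "Asp", "Glu", "Glu",
    /* GGx */ "Gly", "Gly", "Gly", "Gly",
};

/* Returns the amino acid for the first three characters of `codon`, or
   NULL if any of them is not a base letter (including a short string). */
const char *lookup_codon_direct(const char *codon)
{
    int first, second, third;

    if ((first = base_index(codon[0])) < 0)
        return NULL;
    if ((second = base_index(codon[1])) < 0)
        return NULL;
    if ((third = base_index(codon[2])) < 0)
        return NULL;
    return amino_by_index[first * 16 + second * 4 + third];
}
```

Each lookup is three switch statements and one array access, with no string comparisons, and it accepts either case without a separate normalisation step.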
