Thread: Using scanf with enum types

  1. #1
    Registered User
    Join Date
    Mar 2010
    Posts
    40

    Using scanf with enum types

    Hi all,

    I've got my own enum type set up as follows

    Code:
    typedef enum aa 
    {
    	a= 0, ala= 0, alanine= 0,
    	r= 1, arg= 1, arginine= 1,
    	n= 2, asn= 2, asparagine= 2,
    	d= 3, asp= 3, aspartic_acid= 3,
    	c= 4, cys= 4, cysteine= 4,
    	q= 5, gln= 5, glutamine= 5,
    	e= 6, glu= 6, glutamic_acid= 6,
    	g= 7, gly= 7, glycine= 7,
    	h= 8, his= 8, histidine= 8,
    	i= 9, ile= 9, isoleucine= 9,
    	l= 10, leu= 10, leucine= 10,
    	k= 11, lys= 11, lysine= 11,
    	m= 12, met= 12, methionine= 12,
    	f= 13, phe= 13, phenylalanine= 13,
    	p= 14, pro= 14, proline= 14,
    	s= 15, ser= 15, serine= 15,
    	t= 16, thr= 16, threonine= 16,
    	w= 17, trp= 17, tryptophan= 17,
    	y= 18, tyr= 18, tyrosine= 18,
    	v= 19, val= 19, valine= 19
    	
    } aa;
    And now I'd like to use scanf to read a line and store one part of it as this datatype aa.

    The data I'm reading in takes the format of integercharacter, such as 19a, 23c, 243q, etc.

    Currently I'm reading it in as follows

    Code:
    int focal_position;
    char focal_amino_acid;
    
    if (sscanf(string, "%d%c", &focal_position, &focal_amino_acid)!=2)
    	{
    		printf("could not read in data from string in function tokenizer()\n");
    		exit(255);
    	}
    i.e. I'm reading in the a in 19a (for example) as a character, but I'd prefer to read it in as this enum type. Is there easy way to do that?

    Thanks,
    Brad

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    There is no way to turn a run-time string into a C identifier, i.e. you can't read in a character or string of user input and then use what was entered as an identifier (variable name, enum value, etc.). You can either create a mapping that relates the a in 19a to alanine or whatever, or you can try an interpreted language like PHP that can look up a variable by name at run time.

  3. #3
    Registered User
    Join Date
    Mar 2010
    Posts
    40
    I don't need to be able to identify what they entered as a specific identifier. I know that the number will always represent the variable position, and that the character that immediately follows it will always represent the amino acid at the specified position. What I want to be able to do is read in the character that comes immediately after the specified position (e.g. the a that follows 19, in 19a) as the value a of the typedef enum specified above, and not as the character 'a'.

    I can think of crude/brute-force approaches using a switch statement, but I figured that there must be a more elegant way to read in a character and store its value in an enum'd datatype as appropriate.
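
    One way to avoid a long switch, sketched here under some assumptions: keep the one-letter codes in a string whose order matches the enum numbering from the first post, and let strchr find the position. The names aa_letters and parse_residue are made up for this illustration and are not from the thread; the function hands back a plain int in the range 0..19.

```c
#include <stdio.h>
#include <string.h>

/* One-letter codes in the same order as the enum values in the first post:
   index 0 is 'a' (alanine), 1 is 'r' (arginine), ..., 19 is 'v' (valine). */
static const char aa_letters[] = "arndcqeghilkmfpstwyv";

/* Hypothetical helper: parses text such as "19a" into a position and an
   enum-compatible value (0..19). Returns 1 on success, 0 on failure. */
static int parse_residue(const char *text, int *position, int *aa_value)
{
    char letter;
    const char *hit;

    if (sscanf(text, "%d%c", position, &letter) != 2)
        return 0;                       /* not in <integer><character> form */

    if (letter == '\0')
        return 0;                       /* strchr would match the terminator */

    hit = strchr(aa_letters, letter);
    if (hit == NULL)
        return 0;                       /* not a known one-letter code */

    *aa_value = (int)(hit - aa_letters);
    return 1;
}
```

    With the typedef from the first post in scope, the result can be stored with a cast, e.g. aa residue = (aa)value; after a successful call.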

  4. #4
    Registered User
    Join Date
    Mar 2010
    Posts
    40
    By the way, strtok allows you to tokenize strings in C, doesn't it?

  5. #5
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by bhdavis1978
    By the way, strtok allows you to tokenize strings in C, doesn't it?
    Yes, but the strings are broken into tokens that have nothing to do with the tokens in your C program.
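
    To make the distinction concrete, here is a minimal sketch of what strtok does: it cuts a writable character buffer into pieces at run time and returns pointers to them. The helper name split_words is an assumption for illustration only. The pieces are ordinary strings, not identifiers the compiler knows about.

```c
#include <string.h>

/* Hypothetical helper: splits a writable buffer at spaces and stores up to
   max pointers to the pieces. Returns the number of pieces stored.
   strtok overwrites the delimiters with '\0', so the buffer must not be a
   string literal. */
static size_t split_words(char *buffer, char *words[], size_t max)
{
    size_t count = 0;
    char *word = strtok(buffer, " ");

    while (word != NULL && count < max) {
        words[count++] = word;
        word = strtok(NULL, " ");   /* NULL means "continue with the same buffer" */
    }
    return count;
}
```

    Each stored pointer refers to text such as "19a", which still has to be parsed (for example with sscanf) before it means anything to the program.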

  6. #6
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Ahh, yes, but with a different meaning for "tokenize". As strtok means it, it breaks the string up into tokens, or chunks, based on certain characters, e.g. strtok("this is a string", " ") would break the string up into words at the space characters. The way I'm using tokenize refers to tokens from the compiler's standpoint: a word the compiler can understand and map to a known identifier. If I have something like (this might not compile, it's for instructional use only):

    Code:
    enum {
        x = 17
    };
    char c = 'x';
    
    if (c == x)
    ...
    There is no way for the compiler to take the value in c and use that as an identifier. c holds the ASCII character 'x', which has a numerical value of 120, while the identifier x has a numerical value of 17, so the comparison is false. If this were PHP (a language I suck at), I think something like this would work:
    Code:
    $x = 17;
    $c = "x";
    
    if ($$c == $x) // this would evaluate to true.
    There are clever tricks with something like X-macros (C preprocessor - Wikipedia, the free encyclopedia), or you could store the mappings explicitly somewhere, in a giant array at the top of your file (kinda ugly), or in a text file to be read in and processed at startup (a bit nicer).
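
    A rough sketch of the X-macro idea, with a shortened list and invented names (AMINO_ACIDS, AA_ALANINE, aa_from_letter and so on are assumptions, not anything from the posts above): a single list macro is expanded several times, once to build the enum and once per lookup table, so the letters, enum values and names cannot drift apart.

```c
#include <stddef.h>

/* Hypothetical, shortened list: X(one-letter code, enum name, full name). */
#define AMINO_ACIDS(X)                      \
    X('a', AA_ALANINE,    "alanine")        \
    X('r', AA_ARGININE,   "arginine")       \
    X('n', AA_ASPARAGINE, "asparagine")

/* Expansion 1: the enum itself; AA_COUNT ends up as the number of entries. */
#define AS_ENUM(letter, id, name) id,
typedef enum { AMINO_ACIDS(AS_ENUM) AA_COUNT } amino_acid;

/* Expansions 2 and 3: parallel tables indexed by the enum value. */
#define AS_LETTER(letter, id, name) letter,
#define AS_NAME(letter, id, name) name,
static const char aa_letter[] = { AMINO_ACIDS(AS_LETTER) };
static const char *const aa_name[] = { AMINO_ACIDS(AS_NAME) };

/* Returns the enum value for a one-letter code, or AA_COUNT if unknown. */
static amino_acid aa_from_letter(char letter)
{
    size_t i;

    for (i = 0; i < (size_t)AA_COUNT; ++i)
        if (aa_letter[i] == letter)
            return (amino_acid)i;
    return AA_COUNT;
}
```

    Adding a residue then means adding one X(...) line; the enum and both tables pick it up automatically.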

  7. #7
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    Not knowing your end game, how I'd approach this:

    Code:
    #include <stdio.h>
    
    typedef struct AminoAcid_
    {
        char abb1;
        char abb2[4];
        char fullName[14];
    } AminoAcid;
    
    static const AminoAcid AminoAcids[] =
    {
        { 'a', "ala", "alanine" },
        { 'r', "arg", "arginine" },
        { 'n', "asn", "asparagine" },
        { 'd', "asp", "aspartic acid" },
        { 'c', "cys", "cysteine" },
        { 'q', "gln", "glutamine" }
    };
    
    static int getAminoAcidIndex(const char aa)
    {
        size_t i = 0;   /* unsigned, to match the type of the sizeof expression */
        while (i < sizeof(AminoAcids)/sizeof(AminoAcids[0]))
        {
            if (AminoAcids[i].abb1 == aa)
                return i;
            ++i;
        }
        return -1;
    }
    
    int main(void)
    {
        int focal_position = 0;
        char focal_aa = ' ';
        while (1)
        {
            printf("Enter a number/AA combo (-1 to stop): ");
            if (scanf("%d%c", &focal_position, &focal_aa) == 2)
            {
                if (focal_position == -1) break;
    
                int aaIndex = getAminoAcidIndex(focal_aa);
                if (aaIndex != -1)
                {
                    printf("Found %s (%s) @ %d\n",
                           AminoAcids[aaIndex].fullName,
                           AminoAcids[aaIndex].abb2, focal_position);
                }
            }
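            int c = 0;
            /* Discard the rest of the line; also stop at end of input so the loop cannot spin forever. */
            while ((c = getchar()) != '\n' && c != EOF) ;
            if (c == EOF) break;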
        }
    
        return 0;
    }

  8. #8
    Novice
    Join Date
    Jul 2009
    Posts
    568
    You can't read it directly as an enum, but you can convert a character into an appropriate number easily enough, like so.
    Code:
    char ch;
    sscanf( string, "%c", &ch );
    int val = ch - 'a';    // 'A' if your characters are uppercase.
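    You can then assign val to your enum variable, but only if the enum numbers its values alphabetically by letter (a= 0, b= 1, and so on). The enum in the first post does not (r is 1 there, while 'r' - 'a' is 17), so for that enum val has to be translated through a lookup table first.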
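
    A possible way to keep the subtraction idea while matching the numbering from the first post is to use ch - 'a' only as an index into a 26-entry table. This is a sketch, not a definitive implementation: the names aa_by_letter and aa_value_from_char are invented here, and the code assumes a character set such as ASCII in which 'a' to 'z' are contiguous.

```c
/* Enum value for each lowercase letter 'a'..'z', using the numbering from
   the first post; -1 marks letters that are not one-letter codes there. */
static const signed char aa_by_letter[26] = {
    /* a */  0, /* b */ -1, /* c */  4, /* d */  3, /* e */  6, /* f */ 13,
    /* g */  7, /* h */  8, /* i */  9, /* j */ -1, /* k */ 11, /* l */ 10,
    /* m */ 12, /* n */  2, /* o */ -1, /* p */ 14, /* q */  5, /* r */  1,
    /* s */ 15, /* t */ 16, /* u */ -1, /* v */ 19, /* w */ 17, /* x */ -1,
    /* y */ 18, /* z */ -1
};

/* Hypothetical helper: returns 0..19 for a valid lowercase code, else -1.
   Assumes 'a'..'z' are contiguous, as they are in ASCII. */
static int aa_value_from_char(char ch)
{
    if (ch < 'a' || ch > 'z')
        return -1;
    return aa_by_letter[ch - 'a'];
}
```

    The lookup is a single array access, at the cost of having to keep the table in step with the enum by hand.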

  9. #9
    Registered User
    Join Date
    Mar 2009
    Posts
    48
    If the format of the given string is strict, i.e. with no spaces between the number and the amino-acid name, then instead of making an enum I would rather treat it as a static C-string array with each element's index as its focal position.
    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <errno.h>
    
    static const char *const amino_acids[ ] = {
        "alanine",
        "arginine",
        "asparagine"
    };
    
    int parse_string( char s[ ] )
    {
        char *p;
        long pos;
        
        pos = -1;
        p = NULL;
        if ( s == NULL ) {
             return -1;
        }
        errno = 0;
        pos = strtol( s, &p, 10 );
        
        /* p == s means strtol consumed no digits; otherwise p points at the text after the number */
        if ( errno == ERANGE  || pos < 0 || p == s || *p == '\0' || ( size_t )pos >= sizeof amino_acids / sizeof *amino_acids
            || strstr( amino_acids[ pos ], p ) == NULL ) {
            return -1;
        }
    
        return pos;
    
    }
    
    int main( void )
    {
        char s[ BUFSIZ ];
    
        while ( fgets( s, sizeof s, stdin ) != NULL ) {
              char *p;
              int t;
    
    	  p = NULL;
    	  if ( ( p = strchr( s, '\n' ) ) != NULL ) {
    	     *p = '\0'; 
    	  }
    	  if ( ( t = parse_string( s ) ) != -1 ) {
                  printf( "%s \n",amino_acids[ t ] );
              } else {
                 puts( "Not found. Incorrect format" );
              }
        }
    
        return 0;
    }
    edit:
    Caveats( bugs ):
    1. Because the check uses strstr, any fragment of the name at the given focal position is accepted (e.g. "nine" matches alanine), not just a proper abbreviation.
    2. If your abbreviation format changes, e.g. alanine is abbreviated as aln, then you may want to develop your own function to deal with amino acid abbreviations.
    Last edited by zalezog; 11-11-2010 at 12:46 PM.
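
    For the abbreviation case mentioned in the caveats, a dedicated function could look roughly like the sketch below. It assumes input of the form <position><three-letter abbreviation>, such as 19ala, and it treats the number as a position rather than as an array index; the table is shortened and the names aa_abbrev and parse_abbrev are invented for this example. Using strcmp instead of strstr means only exact abbreviations are accepted.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, shortened table of three-letter abbreviations, in the same
   order as the enum in the first post (ala = 0, arg = 1, ...). */
static const char *const aa_abbrev[] = { "ala", "arg", "asn", "asp", "cys", "gln" };

/* Parses "<position><abbreviation>", e.g. "19ala". Stores the position and
   returns the table index of the abbreviation, or -1 on any error. */
static int parse_abbrev(const char *text, long *position)
{
    char *rest;
    size_t i;

    errno = 0;
    *position = strtol(text, &rest, 10);
    if (rest == text || errno == ERANGE || *position < 0)
        return -1;                      /* no digits, overflow, or negative */

    for (i = 0; i < sizeof aa_abbrev / sizeof aa_abbrev[0]; ++i)
        if (strcmp(rest, aa_abbrev[i]) == 0)
            return (int)i;              /* exact match only, no substrings */
    return -1;
}
```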
