Thread: Array of getchar / count

  1. #1
    Registered User
    Join Date
    Oct 2008
    Posts
    10

    Array of getchar / count

    Desired Output:
    Code:
    Please begin your input:
    H
    e
    l
    l
    o
    <user presses ^D here>
    The counts were as follows:
    E - 1
    H - 1
    L - 2
    O - 1
    Array "a" will gather the input with this code:
    (scan, a[], and x are ints)
    Code:
            while ((scan = getchar()) != EOF)
            {
                    if(scan != '\n')
                    {
                            a[x] = scan;
                            x++;
                    }
            }
    Now, my second array, b[], will have all of its values set to 0.
    (b[] and y are ints)
    Code:
            for(y = 0; y <= 100; y++)
            {
                    b[y] = 0;
            }
    I know I need to use another for loop to increase the value of b[z] by one whenever there is a duplicate of an a[] value, but how do I go about doing this? Do I need to use "toascii"? Will a bubble sort help me print this out in alphabetical order or mess up the counting? I know how to do it with a ton of variables and if/switch statements, but the arrays are confusing, help!

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Why not use the input as the array index? 'E' and 'H' and so on are perfectly good array indices. That's probably why you have 100 elements in your b array, as there are not actually 100 letters in the alphabet.
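    A minimal sketch of that idea, assuming only letters should be counted and that case is folded to upper case, as in the desired output. The names count_letters and TABLE_SIZE are made up for illustration; the table is sized so that any unsigned char value is a valid index.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical size: one slot for every possible unsigned char value. */
#define TABLE_SIZE (UCHAR_MAX + 1)

/* Hypothetical helper: tally each letter in text, folding case to upper.
   counts must have TABLE_SIZE elements, all zeroed by the caller. */
void count_letters(const char *text, unsigned counts[])
{
    for (; *text != '\0'; text++) {
        unsigned char ch = (unsigned char)*text;
        if (isalpha(ch)) {
            counts[toupper(ch)]++;   /* the character itself is the index */
        }
    }
}
```

    In the original getchar() loop the same step would be counts[toupper(scan)]++ for each scan that passes isalpha, so the first array a[] is not needed at all.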

  3. #3
    Registered User
    Join Date
    Oct 2008
    Posts
    10
    Hmm, I still can't figure out how to avoid duplicating the letters in the output, or how to add one to the right count in the second array...

    So, you're telling me to put the b[] array in the while loop and use the incoming input as the elements or something?

  4. #4
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Huh? 2nd array? What!?
    Code:
    unsigned int something[256] = {0};  /* maybe don't hardcode this */
    size_t index = 0;
    
    /* look we got an a */
    index = 'a';
    something[index]++;
    
    /* and a z! */
    index = 'z';
    something[index]++;
    Last edited by zacs7; 11-18-2008 at 10:15 PM.

  5. #5
    Why bbebfe is not bbebfe? bbebfe's Avatar
    Join Date
    Nov 2008
    Location
    Earth
    Posts
    27
    tabstop means using the ASCII value of the character as the index into the array.

    It seems that your program is case-insensitive, so try the function below to get the corresponding index for an input char.

    Code:
    #include <stdio.h>

    int to_array_index(int c) {
        if (c >= 97 && c <= 122) /* ascii a - z */
        {
            return c - 97;
        }
        else if (c >= 65 && c <= 90)  /* ascii A - Z */
        {
            return c - 65;
        }
        else {
            return -1; /* error, not letter */
        }
    }
    
    /* your main func*/
    int main(int argc, char** argv) {
            int a[26] = {0}; /* every count must start at zero */
            int index = 0;
            int scan;
            while ((scan = getchar()) != EOF)
            {
                    if(scan != '\n')
                    {
                            index = to_array_index(scan);
                            if (index >= 0) {
                                a[index]++;
                            }
                    }
            }
            return 0;
    }
    Thanks to zacs7 for pointing out that getchar() returns int, so scan should be of type int.
    Last edited by bbebfe; 11-18-2008 at 11:11 PM.
    Do you know why bbebfe is NOT bbebfe?

  6. #6
    Why bbebfe is not bbebfe? bbebfe's Avatar
    Join Date
    Nov 2008
    Location
    Earth
    Posts
    27
    When you want to print the results, just traverse the array and print each entry like this:
    Code:
    printf("%c - %d\n", index+65 /* upper case out */, a[index]);
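    For completeness, a sketch of the whole traversal. The helper name print_counts is hypothetical, and like the line above it assumes the letters are contiguous (as in ASCII), with counts[0] holding 'A' and counts[25] holding 'Z'. Letters with a zero count are skipped to match the desired output.

```c
#include <stdio.h>

/* Hypothetical helper: print one "LETTER - count" line per letter seen.
   Assumes counts[0] is 'A' ... counts[25] is 'Z' and that the letters are
   contiguous in the character set, as in ASCII. Returns lines printed. */
int print_counts(FILE *out, const int counts[26])
{
    int index;
    int lines = 0;

    for (index = 0; index < 26; index++) {
        if (counts[index] > 0) {          /* skip letters never typed */
            fprintf(out, "%c - %d\n", index + 'A', counts[index]);
            lines++;
        }
    }
    return lines;
}
```

    Because the loop walks the indices from 0 to 25, the output is already in alphabetical order, so no bubble sort is needed.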

  7. #7
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Code:
    char scan;
            while ((scan = getchar()) != EOF)
    Naughty, see the FAQ (I think it's in there). EOF doesn't fit in a char, hence getchar() returns int and scan should be of type int.
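    A small sketch of why the narrower type fails; both function names are invented for illustration. It shows the unsigned case, where the stored value can never compare equal to EOF, so the loop would never end. Where plain char is signed the failure is different: a legitimate byte can typically be mistaken for EOF instead.

```c
#include <stdio.h>

/* Returns 1 if value still compares equal to EOF after being kept in an int. */
int int_sees_eof(int value)
{
    int scan = value;
    return scan == EOF;
}

/* Returns 1 if value still compares equal to EOF after being squeezed into
   an unsigned char. The stored value is never negative, so (assuming
   UCHAR_MAX <= INT_MAX) the result is always 0. */
int unsigned_char_sees_eof(int value)
{
    unsigned char scan = (unsigned char)value;
    int widened = scan;               /* what the comparison really sees */
    return widened == EOF;
}
```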

  8. #8
    Why bbebfe is not bbebfe? bbebfe's Avatar
    Join Date
    Nov 2008
    Location
    Earth
    Posts
    27
    You are right: getchar reads the character as an unsigned char and returns it converted to int.

    so the modified code is
    Code:
    int scan;
    while ((scan = getchar()) != EOF)
    But getchar is equivalent to getc with stdin as the stream, so it can still return EOF. I think an additional "else" is necessary here:
    Code:
    if(scan != '\n') {
        ...
    }
    else {
        break;
    }
    I have edited my code above so that it does not mislead anyone else.
    Last edited by bbebfe; 11-18-2008 at 11:14 PM.

  9. #9
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    While "all the world runs ASCII", to abuse the saying, I would respectfully recommend a more portable method of converting alphabetic characters to indices. Something like the following would work:
    Code:
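    #include <ctype.h>
    #include <string.h>

    int aindex(char c)
    {
      const char ab[] = "abcdefghijklmnopqrstuvwxyz";
      const char *p = strchr(ab, tolower((unsigned char)c));
      /* strchr also matches the terminating '\0', so reject that case */
      if(p != NULL && *p != '\0')
      {
        return p - ab;
      }
    
      return -1;
    }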
    Or it might be faster (does it matter?) to use a switch:
    Code:
    int aindex(char c)
    {
      switch(tolower((unsigned char)c))
      {
        case 'a': return 0; case 'b': return 1; case 'c': return 2; case 'd': return 3;
        case 'e': return 4; case 'f': return 5; case 'g': return 6; case 'h': return 7;
        case 'i': return 8; case 'j': return 9; case 'k': return 10; case 'l': return 11;
        case 'm': return 12; case 'n': return 13; case 'o': return 14; case 'p': return 15;
        case 'q': return 16; case 'r': return 17; case 's': return 18; case 't': return 19;
        case 'u': return 20; case 'v': return 21; case 'w': return 22; case 'x': return 23;
        case 'y': return 24; case 'z': return 25;
      }
    
      return -1;
    }
    I suppose this is a rather quixotic quest (especially when UTF-8 has the same values as ASCII for a-z and A-Z), but I think I'll continue to tilt at the windmills that represent portability.

  10. #10
    HelpingYouHelpUsHelpUsAll
    Join Date
    Dec 2007
    Location
    In your nightmares
    Posts
    223
    Better replace the switch w/ this:

    Code:
    int aindex(char c)
    {
      c = tolower((unsigned char)c);
      return c - 97; //you can then check the range of the returned value
    }
    This should be faster than the switch. Also, you can obviously modify it to differentiate between upper and lowercase.
    long time no C; //seige
    You miss 100% of the people you don't C;
    Code:
    if (language != LANG_C && language != LANG_CPP)
        drown(language);

  11. #11
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    Quote Originally Posted by P4R4N01D View Post
    Better replace the switch w/ this:

    Code:
    int aindex(char c)
    {
      c = tolower((unsigned char)c);
      return c - 97; //you can then check the range of the returned value
    }
    This should be faster than the switch. Also, you can obviously modify it to differentiate between upper and lowercase.
    That completely misses the point of the switch. My method doesn't assume a particular character set, while yours does.

  12. #12
    HelpingYouHelpUsHelpUsAll
    Join Date
    Dec 2007
    Location
    In your nightmares
    Posts
    223
    Well, this can be modified more easily, and anyway, there must be a better way to avoid all that repetition. Also, I didn't realise that multiple character sets need to be accounted for; I thought only ASCII codes were used. Well then, add something to convert c to ASCII. I'd argue against the switch; either of the other two recent suggestions is fine, only because they are easier to modify in future scenarios.

  13. #13
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    If you don't care about portability beyond ASCII, then you can use something simple such as subtracting 'a' or 'A'. If you do care about portability, solving the problem isn't quite as easy as you imply. The "two recent suggestions" are not portable beyond ASCII, which was the entire point of my original post.

    Actually, zacs7's method would be pretty good as long as UCHAR_MAX is 255 and the indices are cast to unsigned char; but it's still not completely portable (because chars can be larger than 8 bits).
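    A rough sketch of how that table could be read back without assuming ASCII. The names letters_seen and TABLE_SIZE are hypothetical; the table is sized from UCHAR_MAX instead of a hardcoded 256, and the letters are visited through a string literal, so their order does not depend on their numeric codes.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical size: one slot for every possible unsigned char value. */
#define TABLE_SIZE (UCHAR_MAX + 1)

/* Hypothetical helper: copy every upper-case letter with a nonzero count into
   seen, in alphabetical order, and return how many were copied.
   counts must have TABLE_SIZE elements. */
size_t letters_seen(const unsigned counts[], char seen[27])
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t i;
    size_t n = 0;

    for (i = 0; alphabet[i] != '\0'; i++) {
        /* cast so the index is never negative, whatever the char type */
        if (counts[(unsigned char)alphabet[i]] > 0) {
            seen[n++] = alphabet[i];
        }
    }
    seen[n] = '\0';
    return n;
}
```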

  14. #14
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Well you could assume that a comes before b, and b before c etc.

    Any character set where that isn't the case is retarded...

    Otherwise, I'd use a hash table.

  15. #15
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    A very real character set, EBCDIC, does not have a sequential alphabet. Which may be ridiculous, but it does exist.
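    If in doubt, the assumption can be checked at run time. This is only a sketch, and alphabet_is_contiguous is a made-up name. On an ASCII-based system it returns 1; on EBCDIC it should return 0, because there are gaps after 'i' and after 'r', even though the letters are still in ascending order there.

```c
/* Returns 1 if 'a'..'z' have consecutive codes in the execution character
   set (true for ASCII, false for EBCDIC), 0 otherwise. */
int alphabet_is_contiguous(void)
{
    static const char ab[] = "abcdefghijklmnopqrstuvwxyz";
    int i;

    for (i = 0; i < 25; i++) {
        if (ab[i + 1] - ab[i] != 1) {
            return 0;   /* found a gap between two neighbouring letters */
        }
    }
    return 1;
}
```

    Code that relies on c - 'a' could call this once at startup and fall back to the strchr or switch version when it returns 0.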
