Thread: Filtering out non-ASCII keyboard input

  1. #1
    Registered User
    Join Date
    May 2017
    Posts
    6

    Filtering out non-ASCII keyboard input

    Hello all - I have a simple program that mimics taking in a password in the terminal. To do this I turn off echoing and ICANON so that I can process each char as it is typed. I check that the current char value is in the printable ASCII range before printing it; however, this fails to catch non-ASCII input such as the Page Down or Home key.

    Here's the part of the code that I am having trouble with:
    Code:
      //check ASCII range, don't allow non-printable chars
      //the var c is the current character that was input
      if(c > 32 && c < 127){
          buf[len++] = c;
          putchar('*');      //print an '*' for each char input
      }
      else{
        //print a different char just for debugging
        putchar('#');
      }
    The else block does get called when I hit a non-ASCII key like Page Up, but the program then prints a few '*', as if it processed more than just one character. Here is the output when I run it and enter only a single Page Up keystroke:
    Enter password: #***

    So the else block does catch it at first, yet then 3 more characters are printed even though my loop should only process one at a time since I only have a single char variable and terminal buffering is off. Any help would be great, as I'm pretty stuck on how to filter out these invalid keystrokes.

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    ANSI Keyboard Key Codes
    All the extended keys are prefixed with either '\x00' or '\xE0', so you need to take this into account when simply reading strings of bytes.

    You need to parse the stream to detect when an extended key has been pressed. In your case, the two prefixes can be used to always throw away the next byte.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    May 2017
    Posts
    6
    Thanks so much! Now that I know I can identify them by prefix I should be good to go - much appreciated

  4. #4
    Registered User
    Join Date
    May 2017
    Posts
    6
    So the issue now is that, since I'm only processing one byte at a time, I can strip/skip the leading \x00 or \xE0, but since those are followed by one or two bytes, the rest still prints. At least they are no longer control characters, but it's still not quite what I want. So let's pretend the key arrives as \x005~: my code will skip the \x00 part, but 5~ is still printed. I could hard-code it to skip two bytes, but some keys have only one byte following the hex prefix, so I can't just assume that. Any thoughts?

  5. #5
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Actually, that's DOS not UNIX/Linux.
    Code:
    #include <stdio.h>
    #include <termios.h>
    #include <unistd.h>
    
    int main ( void ) 
    {
      int ch;
      struct termios oldt, newt;
      
      tcgetattr ( STDIN_FILENO, &oldt );
      newt = oldt;
      newt.c_lflag &= ~( ICANON | ECHO );
      tcsetattr ( STDIN_FILENO, TCSANOW, &newt );
      while ( (ch = getchar()) != EOF ) {
        printf("Code=%d\n", ch);
      }
      tcsetattr ( STDIN_FILENO, TCSANOW, &oldt );
      
      return 0;
    }
    Experiment with this.

  6. #6
    Registered User
    Join Date
    May 2017
    Posts
    6
    I liked the idea of the snippet you posted, but it raises a few more questions for me. Since I have buffering off I only get one byte at a time, but these keys seem to send anywhere from 2 to 4 bytes, so I don't get all the associated bytes at once. For instance, when I hit Page Up your code outputs:
    Code=27
    Code=91
    Code=53
    Code=126
    All of these come from one single keystroke, and individually each one is in the valid ASCII range that I want except for the first value (27, which is Escape). So I know when I see 27 that an escape sequence is coming in, but I have no way to know how many bytes belong to that sequence, since all the bytes that follow are valid and the number of bytes is variable. This is turning out to be a more interesting problem than I expected!

    P.S. thanks for the quick replies and follow through, I really appreciate the help
    Last edited by sliceofpi; 05-26-2017 at 10:24 AM.

  7. #7
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    The pattern is something like this:
    Code:
    Escape
        '['
            'A' - 'Z'
            '0' - '9'
                optionally more digits
                '~'
        'O'
            'Q','R','S'   F2,F3,F4
    Esc O P is probably F1, but I keep getting help.
    Some of the other function keys are like F5: Esc [ 1 5 ~

    Code:
    #include <stdio.h>
    #include <termios.h>
    #include <unistd.h>
    
    #define ESC '\033'
    #define PRNC(c) printf("%c ", c)  //  "%d "  "%02X "
     
    int main (void)
    {
      int ch;
      struct termios oldt, newt;
    
      tcgetattr (STDIN_FILENO, &oldt);
      newt = oldt;
      newt.c_lflag &= ~(ICANON | ECHO);
      tcsetattr (STDIN_FILENO, TCSANOW, &newt);
    
      while ((ch = getchar()) != EOF) {
        if (ch != ESC)
            PRNC (ch);
        else {
          printf("Esc ");
          ch = getchar();
          PRNC (ch);
          if (ch == '[') {
            ch = getchar ();
            PRNC (ch);
            if (ch >= '0' && ch <= '9') {
              do {
                ch = getchar ();
                PRNC (ch);
              } while (ch != '~' && ch != EOF);  // stop at EOF too, so a closed stdin cannot loop forever
            }
          }
          else if (ch == 'O') {  // some function keys
            ch = getchar ();
            PRNC (ch);
          }
          else
              printf (" UNKNOWN ");
        }
        putchar ('\n');
      }
    
      tcsetattr (STDIN_FILENO, TCSANOW, &oldt);   
      return 0;
    }
