Thread: text comparing

  1. #16
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    maybe
    int ret = sscanf("A 2222222","%1[A-Z]%1[A-Z] %3[0-9]",temp1,temp2,temp3);

    or use %n to determine how many characters are read
    or strlen(temp1)
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  2. #17
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Here's a completely different way, using bits to represent the pattern. It should be fairly fast.

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>   /* isdigit */
    
    int main(void)
    {
            char buffer[12], *ptr;
    
            /* 0 = letter, 1 = digit */
    
            unsigned int alternating_mask = 0x2AA;   /* 1010101010 first possible sequence */
            unsigned int preceding_mask = 0x3E0;    /* 1111100000 2nd possible sequence */
            unsigned int following_mask = 0x1F;     /* 0000011111 3rd possible sequence */
    
            unsigned int i;                         /* binary representation of the input */
    
            while (1)
            {
    
                    if (fgets(buffer, sizeof(buffer), stdin) == NULL)
                            break;   /* stop at end of input instead of looping forever */
    
                    if ((ptr = strchr(buffer, '\n')) != NULL)
                            *ptr = '\0';
    
                    i = 0;
                    for (ptr = buffer; *ptr != '\0'; ptr++)
                    {
                            i <<= 1;
    
                            if (isdigit((unsigned char)*ptr))
                                    i += 1;
                    }
    
                    if (i == alternating_mask) printf("Pattern #1\n");
                    else if (i == preceding_mask) printf("Pattern #2\n");
                    else if (i == following_mask) printf("Pattern #3\n");
                    else printf("Unknown pattern\n");
            }
    
            return 0;
    }
    output:
    1q2w3e4r5t
    Pattern #1

    11111wwwww
    Pattern #2

    ttttt55555
    Pattern #3

    qq2233ee5r
    Unknown pattern

    11111wwww
    Unknown pattern

    I haven't tested it all that much, but it looks like it's working, and you can see how it's set up.

  3. #18
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    sl4nted, I like your code. It is exactly what I was looking for at the beginning, but now I think it will get complicated because I will need too many patterns. The old Spanish plates can start with one or two letters and also end with one or two (the ones written with () in another message), so I will need 7 patterns. It doesn't seem to be a good solution, but I like it a lot.

    vart, I don't know how to use %n. I can't copy the code because I am using two computers, but it is more or less like this:

    Code:
    #include <stdio.h>
    #include <string.h>
    
    // the trailing %9s is bounded so it cannot overflow the 10-byte buffers
    #define PATERN1 "%4[0-9] %3[A-Z] %9s"       //1111 AAA
    #define PATERN2 "%2[A-Z] %4[0-9] %2[A-Z] %9s"      //A(A) 1111 A(A)
    #define PATERN3 "%2[A-Z] %5[0-9] %9s"   // A(A) 11111
    // I put an extra rule in the pattern to check that there aren't more characters
    
    
    int main(){
    
    char entry[80], one[10], two[10], three[10], four[10];
    strcpy(entry,"1111 AAA"); // an example
    if(sscanf(entry,PATERN1,one,two,three) == 2 && strlen(one) == 4 && strlen(two) == 3){
    printf("Spanish");
    }else if (sscanf(entry,PATERN2,one,two,three,four) == 3 && strlen(two) == 4){
    printf("Spanish");
    }else if (sscanf(entry,PATERN3,one,two,three) == 2 && strlen(two) == 5){
    printf("Spanish");
    }else{
    printf("Not spanish");
    }
    return 0;
    }

    So if you have an idea to make it better please tell me.

    Thanks a lot.

  4. #19
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    The reason I originally said to look at sscanf was that I figured you would be parsing and using the numbers/letters for something, so you could kill two birds with one stone using sscanf. Now I see that you really just want to see which pattern, if any, the plate is in. I would suggest going with vart's first answer and implementing it with isdigit and isalpha. There's really no need for sscanf here. Yes, it may be a bit more complicated in the sense that it has more lines, loops, etc., but it's a better way to do it.

  5. #20
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    I discarded this option at the beginning because I thought it would take too long to execute.

    I need the program to be as quick as it can be, so I prefer the fastest solution over what may be the most correct one.

    I don't care whether the code is complicated or not, so if you think it will be a better solution to do it with isalpha and isdigit, I will trust you and change it. But if you think that my solution is good and fast, I will also trust you.

    I know very little about C programming. I have been working with it for only one month and nobody has taught me, so I will do as you say.

    Thanks a lot again.

  6. #21
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    I haven't looked at the source for sscanf, but I would assume that yes, the isdigit/isalpha method is faster.

    Here's some reasoning: each time you call sscanf, it traverses the string.
    In the isdigit/isalpha method, with patterns like the ones I showed, the string is traversed once and that's it.
    After that there are just simple comparisons, which are fast.

    Also, strlen is going to add time to the sscanf version as well.
    Last edited by sl4nted; 12-12-2006 at 11:07 AM.

  7. #22
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    OK, so I'll use isdigit with the patterns like you said. But if a letter is 0 and a digit is 1, what is a space? The OCR gives me the answer with spaces if the plate has spaces.

  8. #23
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    I think I have an idea.

    It is more or less the example you gave me, but with strings instead of bits.

    But I have a technical problem, something I am sure is possible to do but I don't know how.

    The problem is that I have a char[] and I want to add a new letter at the end, just like

    char out[]
    out = out + "A"

    for example.

    This way I can add an A if it is a letter, a 1 if it is a number and a space if it is a space, and compare the result with the patterns.

    I don't know what the code to do this is.

    I hope you understand what I want to do.

  9. #24
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    use something like
    Code:
    char out[MAX_LENGTH];
    memset(out,0,sizeof(out));
    for(i=0;i<len;i++)
    {
       if(isalpha(input[i]))
          out[i]='A';
       else ...
       ...
       else
          out[i] = ' ';
    }
    strcmp(out,PATTERN)

  10. #25
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    I had an idea to use only 3 patterns instead of 7.

    This is the code I use:

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>
    
    #define LETTER "A" // symbol appended when we find a letter
    #define NUMBER "1" // symbol appended when we find a digit
    #define SPACE " " // symbol appended when we find a space
    
    #define PATTERN1 "1111 AAA"
    #define PATTERN2 "AA 1111 AA" 
    #define PATTERN3 "AA 11111"
    
    
    int main(){
    	// out needs room for 10 symbols, up to 2 padding symbols and the '\0'
    	char entry[81], out[16]="";
    	strcpy(entry, "SS 1235 J");
    
    	int size = (int)strlen(entry);
    	if(size >= 7 && size <= 10){  //size of the spanish plates: "A 11111" (7) to "AA 1111 AA" (10)
    		for(int i=0; i<size; i++){
    			
    			if(isdigit((unsigned char)entry[i])){
    				strcat(out,NUMBER);
    			
    			}else if(isalpha((unsigned char)entry[i])){
    				strcat(out,LETTER);
    			
    				//if the plate is AA 1111 A we make it become AA 1111 AA
    				if(i==size-1 && strlen(out) == 9){
    					strcat(out,LETTER);
    				}
    			
    			}else{
    				// if the plate starts with "A " we make it become "AA "
    				if(i==1 && strcmp(&out[0],LETTER) == 0){
    					strcat(out,LETTER);
    				}
    				strcat(out,SPACE);
    			}
    		}
    		
    		if(strcmp(out,PATTERN1) == 0 || strcmp(out,PATTERN2) == 0 || strcmp(out,PATTERN3) == 0){
    			printf("spanish plate \n");
    			return 0;
    		}
    	}
    	printf("no spanish plate \n");
    	return 1;
    		
    	
    }

    I hope it is better than the other.

    Thanks a lot.

  11. #26
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    strcmp(&out[0],LETTER) == 0
    when i == 1 is equivalent to
    out[0] == 'A'

    Also, using strcat to add one symbol is a little bit overkill...
    you can just use a single char assignment.

  12. #27
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    You're getting back into a bunch of string comparisons, lengths, etc. What about spaces?
    If you find a space, the current i has to be one of the following:
    binary 1111 if 4 digits precede it
    0 if 2 letters precede it

    Notice that in the case where 00111100 is the pattern, a space at 00 1111<here> can still just be tested against 1111.

  13. #28
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
    #include<stdio.h>
    
    size_t parse( char *s )
    {
        if( s )
        {
            size_t d[][3] =
            {
                { 4, 3, 0 },
                { 2, 4, 2 },
                { 2, 5, 0 },
            };
    
            size_t a = 0, b = 0, c = 0, x;
            while( *s && ' ' != *s++ ) a++;
            while( *s && ' ' != *s++ ) b++;
            while( *s && ' ' != *s++ ) c++;
    
            for( x = 0; x < sizeof d / sizeof d[0]; x++ )
            {
                if( d[ x ][ 0 ] == a && d[ x ][ 1 ] == b && d[ x ][ 2 ] == c )
                    return 1;
            }
    
        }
        return 0;
    }
    
    int main( void )
    {
        char p1[] = "1111 AAA";
        char p2[] = "AA 1111 AA";
        char p3[] = "AA 11111";
        
        printf("%zu %zu %zu\n", parse( p1 ), parse( p2 ), parse( p3 ) );
    
        return 0;
    }
    Something like that?

    Quzah.
    Hope is the first step on the road to disappointment.

  14. #29
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    There's a whole lot of uninitialized variables in there.

    I changed mine to work for the 3 patterns you gave
    and tested with these inputs:
    111aa
    11 111 a
    1111 aaa
    1111aaa
    aa 11
    aa 1111 aa
    aa1111aa
    aa1111a
    aa111aaa
    aa 11111
    aa 1111
    a 1111
    a1111

    output was:
    111aa
    Unknown pattern
    Processing Time: 0.406494 milliseconds
    11 111 a
    Unknown pattern
    Processing Time: 0.090096 milliseconds
    1111 aaa
    Pattern #1
    Processing Time: 0.151079 milliseconds
    1111aaa
    Unknown pattern
    Processing Time: 0.138146 milliseconds
    aa 11
    Unknown pattern
    Processing Time: 0.128862 milliseconds
    aa 1111 aa
    Pattern #2
    Processing Time: 0.100856 milliseconds
    aa1111aa
    Unknown pattern
    Processing Time: 0.114127 milliseconds
    aa1111aa^?
    Unknown pattern
    Processing Time: 0.142931 milliseconds
    aa1111aaa
    Unknown pattern
    Processing Time: 0.136956 milliseconds
    aa 11111
    Pattern #3
    Processing Time: 0.110694 milliseconds
    aa 1111
    Unknown pattern
    Processing Time: 0.083907 milliseconds
    a 1111
    Unknown pattern
    Processing Time: 0.103737 milliseconds
    a1111
    Unknown pattern
    Processing Time: 0.146526 milliseconds

    averaging 0.142647 milliseconds per input

    So there's a speed goal for you to aim for.

  15. #30
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by sl4nted
    There's a whole lot of uninitialized variables in there.
    Not in mine there isn't. Who are you talking to? If you need more patterns, add them. It only takes one line of code. You can also change one line of code to allow for <= instead of ==.


    Quzah.
    Last edited by quzah; 12-12-2006 at 06:28 PM.
