Thread: text comparing

  1. #31
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    No, not yours... in his. And not uninitialised, undefined.

  2. #32
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Letters and digits are in specific places though, quzah, not just spaces.

  3. #33
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    Mine has a lot of uninitialised variables because I forgot to translate a lot of things. My code is in Spanish and I tried to translate it into English so you could understand it better, but it was late and I missed several names, for example LETRA = LETTER and salida = out. Here it is rewritten:

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>
    
    #define LETTER "A" //pattern when we find a letter
    #define NUMBER "1" //pattern when we find a number
    #define SPACE " " //pattern when we find a space
    
    #define PATTERN1 "1111 AAA"
    #define PATTERN2 "AA 1111 AA" 
    #define PATTERN3 "AA 11111"
    
    
    int main(){
    	char entry[81], out[16]=""; //out must hold "AA 1111 AA" plus '\0', with room to spare
    	strcpy(entry, "SS 1235 J");
    
    	int size = strnlen(entry, 81);
    	if(size >= 7 && size <= 10){  //size of the spanish plates, from "A 11111" to "AA 1111 AA"
    		for(int i=0; i<size; i++){
    			
    			if(isdigit((unsigned char)entry[i])){
    				strcat(out,NUMBER);
    			
    			}else if(isalpha((unsigned char)entry[i])){
    				strcat(out,LETTER);
    			
    				//if the plate is AA 1111 A we make it become AA 1111 AA
    				if(i==size-1 && strnlen(out, sizeof(out)) == 9){
    					strcat(out,LETTER);
    				}
    			
    			}else{
    				// if the plate start with "A " we make it become "AA "
    				if(i==1 && strcmp(&out[0],LETTER) == 0){
    					strcat(out,LETTER);
    				}
    				strcat(out,SPACE);
    			}
    		}
    		
    		if(strcmp(out,PATTERN1) == 0 || strcmp(out,PATTERN2) == 0 || strcmp(out,PATTERN3) == 0){
    			printf("spanish plate \n");
    			return 0;
    		}
    	}
    	printf("no spanish plate \n");
    	return 1;
    		
    	
    }

    The example you gave me has a problem that makes it complicated. Spanish plates can be:
    1111 AAA
    A 1111 A
    A 1111 AA
    AA 1111 A
    AA 1111 AA
    A 11111
    AA 11111

    With your example you would need 7 patterns. I tried to arrange it so that only 3 are needed.

    strcmp(&salida[0],LETRA) == 0
    when i == 1 is equivalent to
    salida[0] == 'A'
    It is not exactly that. What I tried to do is: if we find a space in the second position and there is a letter in the first one, we add a letter in the second position and then append the space. We need this for a plate like A 11111, to turn it into AA 11111, so we use only one pattern for this case instead of 2.

    Or maybe I misunderstood, and what you mean is to use " salida[0] == 'A' " instead of "strcmp". I will try that.
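
    To make the difference between the two tests concrete, here is a small sketch. The helper names is_exactly_letter and starts_with_letter are made up for illustration and are not part of the program above. strcmp compares the whole string, so it only matches while the buffer holds exactly "A"; the character test only looks at the first position. At i == 1 the buffer holds a single character, so there the two agree.

```c
#include <string.h>

#define LETTER "A"

/* True only when buf is exactly the one-character string "A". */
int is_exactly_letter(const char *buf)
{
    return strcmp(buf, LETTER) == 0;
}

/* True whenever the first character is 'A', whatever follows it. */
int starts_with_letter(const char *buf)
{
    return buf[0] == 'A';
}
```

    So with "A" both return 1, but with "AA" only the second one does.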


    Maybe you think it is better to use more patterns and keep the code simpler.

    I am sorry, I think I am causing a lot of trouble.

  4. #34
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    1111 AAA
    A 1111 A
    A 1111 AA
    AA 1111 A
    AA 1111 AA
    A 11111
    AA 11111
    Like I said earlier, if you need more cases, just add them:
    Code:
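    size_t d[][3] =
            {
                { 4, 3, 0 }, /* 1111 AAA   */
                { 1, 4, 1 }, /* A 1111 A   */
                { 1, 4, 2 }, /* A 1111 AA  */
                { 2, 4, 1 }, /* AA 1111 A  */
                { 2, 4, 2 }, /* AA 1111 AA */
                { 1, 5, 0 }, /* A 11111    */
                { 2, 5, 0 }, /* AA 11111   */
            };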
    Sure you can optimize it if you really need to, also like I said earlier, by changing the if check slightly. In either case, what I posted should do the trick. There's no need to complicate it any more than a few simple loops. (And you can do it in fewer loops than I did if you really want to.)
    Quote Originally Posted by sl4nted
    Letters and digits are in specific places though, quzah, not just spaces.
    Of course they're in specific places. That's what the patterns they've described are for. Try to understand the logic of what it's doing.

    However, it really doesn't matter if it's a letter or a number. If for some reason you need that, then simply return a different value, say the index of the pattern match. Naturally you'll want to modify it to return an error value instead of zero, or make an empty pattern to be at index zero and return that on error.
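
    A minimal sketch of the "return the index" variant, assuming an empty row at index zero that doubles as the error value. The names plate_groups and match_plate_groups are invented for this example, and the rows cover the seven shapes listed in this thread. Like the original idea, it only compares group lengths, not letters versus digits.

```c
#include <stddef.h>

/* Index 0 is a deliberately empty pattern that stands for "no match". */
static const size_t plate_groups[][3] = {
    { 0, 0, 0 },   /* 0: no match   */
    { 4, 3, 0 },   /* 1: 1111 AAA   */
    { 1, 4, 1 },   /* 2: A 1111 A   */
    { 1, 4, 2 },   /* 3: A 1111 AA  */
    { 2, 4, 1 },   /* 4: AA 1111 A  */
    { 2, 4, 2 },   /* 5: AA 1111 AA */
    { 1, 5, 0 },   /* 6: A 11111    */
    { 2, 5, 0 },   /* 7: AA 11111   */
};

/* Count the lengths of up to three space-separated groups in s and
   return the index of the matching row, or 0 when nothing matches. */
size_t match_plate_groups(const char *s)
{
    size_t len[3] = { 0, 0, 0 };
    size_t g = 0, x;

    for (; *s != '\0'; s++) {
        if (*s == ' ') {
            if (++g == 3)
                return 0;          /* a third space: too many groups */
        } else {
            len[g]++;
        }
    }
    /* Start at 1 so the empty row is never reported as a match. */
    for (x = 1; x < sizeof plate_groups / sizeof plate_groups[0]; x++)
        if (plate_groups[x][0] == len[0] &&
            plate_groups[x][1] == len[1] &&
            plate_groups[x][2] == len[2])
            return x;
    return 0;
}
```

    A caller can then treat any nonzero result as "standard plate, shape number x" and zero as "not a standard plate".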


    Quzah.
    Last edited by quzah; 12-13-2006 at 03:39 AM.
    Hope is the first step on the road to disappointment.

  5. #35
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    OK, I think I'm trying to complicate it too much.

    I will test with your examples and choose the fastest one.

    Maybe I am overcomplicating it by trying to use fewer patterns; it may be faster to compare against 7 patterns than to do strange things to get down to 3.

    Thanks a lot for your help

  6. #36
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    You can do it with three. Just check each value to see if it falls in a range. Or you can do something fun, like this:
    Code:
    size_t d[] =
            {
                430, /*{ 4, 3, 0 },*/
                141, /*{ 1, 4, 1 },*/
                142, /*{ 1, 4, 2 },*/
                241, /*{ 2, 4, 1 },*/
                242, /*{ 2, 4, 2 },*/
                150, /*{ 1, 5, 0 },*/
                250, /*{ 2, 5, 0 },*/
            };
    Then you just do:
    Code:
    while( *s && ' ' != *s++ ) a++;
    while( *s && ' ' != *s++ ) b++;
    while( *s && ' ' != *s++ ) c++;
    num = (a * 100) + (b * 10) + c;
    There. Same thing, tiny edit. Now you can loop through your array...
    Code:
    for( x = 0; x < sizeof d / sizeof d[0]; x++ )
        if( d[ x ] == num )
            return num;
    return 0;
    Or you can use a switch:
    Code:
    switch( num )
    {
        case 430:
        case 141:
        ...
        case 250: return num;
        default: return 0;
    }
    Or something else amusing...
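
    For the "check each value to see if it falls in a range" idea from the top of this post, a rough sketch could look like the following. The function name plate_family and its return codes are assumptions made for the example; a, b and c are the three group lengths counted by the loops above.

```c
#include <stddef.h>

/* Returns 1, 2 or 3 for the matching family of shapes, 0 otherwise.
   a, b and c are the lengths of the three space-separated groups. */
int plate_family(size_t a, size_t b, size_t c)
{
    if (a == 4 && b == 3 && c == 0)
        return 1;                       /* 1111 AAA                 */
    if (a >= 1 && a <= 2 && b == 4 && c >= 1 && c <= 2)
        return 2;                       /* A 1111 A ... AA 1111 AA  */
    if (a >= 1 && a <= 2 && b == 5 && c == 0)
        return 3;                       /* A 11111, AA 11111        */
    return 0;
}
```

    That folds the seven shapes into three checks without rewriting the input first.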


    Quzah.
    Last edited by quzah; 12-13-2006 at 04:52 AM.

  7. #37
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    Maybe I am confusing you more, but my boss just told me that we are interested not in which plates are Spanish, but in which are not.

    So what I really need to know is which plates do not match the patterns.

    Maybe with this information we can write a different program, or maybe I have to use the same one.

  8. #38
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    I'm not confused at all. It's quite simple. Anything that returns a zero from the above example would not be a standard plate, because it wouldn't match any pattern as you've described them.


    Quzah.

  9. #39
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Just pointing out that your previous version didn't work with the 3 patterns, i.e. A3AA 333 would return 1 when it should return 0. And he's trying to make it as fast as possible, so I really think converting the plate to a binary number and testing it with if (pattern == 145), for example, will be by far the quickest option.

  10. #40
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    test data
    Code:
        char *tests[] = { "911aB", "AAA 564", "6114 DFa", "1111Aaa", "aa 11", "GF 1511 SS",
                          "aa111aa", "aa111a", "aa111aaa", "XX 11111", "a@ 11311", "a 4511",
                          "a1111" };
    Right Benchmark: 2.931 microsecs
    Right Benchmark: 1.996 microsecs
    Right Benchmark: 1.422 microsecs
    Right Benchmark: 1.470 microsecs
    Right Benchmark: 1.446 microsecs
    Right Benchmark: 1.550 microsecs
    Right Benchmark: 1.428 microsecs
    Right Benchmark: 1.429 microsecs
    Right Benchmark: 1.488 microsecs
    Right Benchmark: 1.595 microsecs
    Wrong Benchmark: 1.604 microsecs
    Right Benchmark: 1.360 microsecs
    Right Benchmark: 1.434 microsecs
    Average cputime per call: 1.627154

    Right Benchmark: 1.839 microsecs
    Right Benchmark: 0.902 microsecs
    Right Benchmark: 0.922 microsecs
    Right Benchmark: 0.785 microsecs
    Right Benchmark: 0.696 microsecs
    Right Benchmark: 1.074 microsecs
    Right Benchmark: 0.750 microsecs
    Right Benchmark: 0.649 microsecs
    Right Benchmark: 0.750 microsecs
    Right Benchmark: 0.860 microsecs
    Right Benchmark: 0.494 microsecs
    Right Benchmark: 0.560 microsecs
    Right Benchmark: 0.607 microsecs
    Average cputime per call: 0.837538

    Column 1 shows whether the function returned the right value; the rest is the time it took to return.
    Timed using gethrtime() (made specifically for timing). The first set of tests is quzah's algorithm. The second converts the plate to a number and tests it against the patterns as binary numbers. From these you can see it's about twice as fast.

    Ran 3 tests, with very similar results:
    Average cputime per call: 1.542385
    vs.
    Average cputime per call: 0.828923

    Average cputime per call: 1.478154
    vs.
    Average cputime per call: 0.737077

    Then I switched the order of the calls, so the binary conversion ran first.
    quzah's algorithm sped up to 1.363385 and 1.362769 microsecs
    while the other stayed consistent at 0.808385 and 0.834538 microsecs

    ::edit if you want to see the full program I tested with, I'm glad to post it
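
    For anyone who wants to repeat this kind of measurement without gethrtime(), here is a rough sketch that uses the standard clock() function instead. It is not the program used for the numbers above; avg_usec_per_call and dummy_check are hypothetical names, and the checker is assumed to take a const char * (a function declared with char *, like the parsers in this thread, would need a small wrapper).

```c
#include <time.h>

/* Average CPU time in microseconds for one call of fn(arg),
   measured over `reps` calls with the standard clock() function. */
double avg_usec_per_call(int (*fn)(const char *), const char *arg, long reps)
{
    volatile unsigned sink = 0;  /* keeps the calls from being optimised away */
    clock_t start, end;
    long r;

    if (reps <= 0)
        return 0.0;
    start = clock();
    for (r = 0; r < reps; r++)
        sink += (unsigned)fn(arg);
    end = clock();
    (void)sink;
    return (double)(end - start) * 1e6 / CLOCKS_PER_SEC / (double)reps;
}

/* Hypothetical stand-in for a plate checker, used only to exercise the timer. */
int dummy_check(const char *s)
{
    return s[0] != '\0';
}
```

    clock() is much coarser than gethrtime(), so the repetition count has to be large before the average means anything.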
    Last edited by sl4nted; 12-13-2006 at 08:15 AM.

  11. #41
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    sl4nted, I was thinking about using your code, because in the end I have to discard the spaces due to some problems with the OCR.

    But your system has a small problem. You define the pattern as a byte value, so if the plate is AA 11111 you get 1F, but if it is A 11111 you also get 1F. That is not so bad. But if you enter AAA 11111 it will also say it is a good one, and it isn't. I think this will cause some confusion.

    In the end I think I will use a mix of yours and mine.
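
    The collision described above can be reproduced with a short sketch. The function encode_plate is made up for this illustration: it ignores spaces and turns each digit into a 1 bit and each letter into a 0 bit. The use_sentinel option is not something proposed in this thread; it is one possible workaround, starting the code at 1 so that leading letters (zero bits) still change the value. Inputs are assumed to be short enough to fit in an unsigned.

```c
#include <ctype.h>

/* Encode a plate with spaces ignored: digit -> 1 bit, letter -> 0 bit.
   Returns 0 for any other character. If use_sentinel is nonzero the code
   starts from 1 instead of 0, so leading letters still change the value. */
unsigned encode_plate(const char *s, int use_sentinel)
{
    unsigned code = use_sentinel ? 1u : 0u;

    for (; *s != '\0'; s++) {
        unsigned char ch = (unsigned char)*s;
        if (ch == ' ')
            continue;               /* spaces are discarded */
        code <<= 1;
        if (isdigit(ch))
            code |= 1u;
        else if (!isalpha(ch))
            return 0;               /* neither letter, digit nor space */
    }
    return code;
}
```

    Without the sentinel, A 11111, AA 11111 and AAA 11111 all encode to 0x1F. With it they become 95, 159 and 287, so the bad plate no longer matches a good pattern.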

  12. #42
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Yes, but I think I explained previously that if you use this method, you have to make sure the spaces are in the right places.

    So for your example: A 11111 does equal the correct pattern, but a space is not allowed in the 2nd position, so the function returns 0 as soon as it finds it.
    For AAA 11111, a space is not allowed in the 4th position either,
    only in the 3rd.
    Last edited by sl4nted; 12-13-2006 at 10:58 AM.

  13. #43
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    Code:
    int parse2(char *p)
    {
            int pos, spc, p_val;
            unsigned int i;
    
            i = 0, pos = 0, spc = 0;
    
            for (; *p != '\0'; p++, pos++)
            {
                    if (*p == ' ')
                    {
                            if ((pos == 2) && i == 0)
                            {
                                    spc++;
                                    continue;
                            }
                            else if (((pos == 4) || pos == 7) && i == 15)
                            {       spc++;
                                    continue;
                            }
                            else return 0;
                    }
    
                    i <<= 1;
    
                    if (((p_val = *p - '0') < 10) && p_val >= 0)
                            i += 1;
                    else if (((p_val = *p - 'a') < 26) && p_val >= 0)
                            continue;
                    else if (((p_val = *p - 'A') < 26) && p_val >= 0)
                            continue;
                    else return 0;
            }
    
            switch (i)
            {
                    case 31:        /* AA 11111 */
                    case 120:       /* 1111 AAA */
                            return spc == 1;
                    case 60:        /* AA 1111 AA */
                            return spc == 2;
            }
    
            return 0;
    }
    Works only for the 3 patterns you have described. If you want to add more, you'll have to add a case entry and a line to test for the space positions. If you want it fast, it has to be specific.

  14. #44
    Registered User
    Join Date
    Dec 2006
    Posts
    22
    Sorry, my English is very bad.

    I can't use the spaces because I tried many examples, and in one of them the OCR put a space where there isn't one. So it says that it isn't a Spanish plate, but it is. So I discard spaces.

    I read from the OCR, get the plate, erase the spaces and compare it with the patterns. That's why I thought about using the first code you wrote. But when building the 7 patterns I might have, I realised that different plates give the same pattern. So it won't work properly in some cases.

    But I'm really grateful for your help.
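
    One way the "erase the spaces and compare" approach could be sketched is to keep the shape as a string of 'A' and '1' characters rather than a number, so that different plates never share a pattern. This is only an illustration under that assumption; plate_shapes and is_spanish_shape are invented names, and the shapes are the seven listed earlier in the thread with the spaces removed.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* The seven plate shapes from this thread with the spaces removed. */
static const char *const plate_shapes[] = {
    "1111AAA", "A1111A", "A1111AA", "AA1111A",
    "AA1111AA", "A11111", "AA11111",
};

/* Returns 1 when the plate, ignoring spaces, has one of the shapes above. */
int is_spanish_shape(const char *plate)
{
    char shape[9];          /* longest shape is 8 characters plus '\0' */
    size_t n = 0, x;

    for (; *plate != '\0'; plate++) {
        unsigned char ch = (unsigned char)*plate;
        if (ch == ' ')
            continue;                   /* OCR spaces are unreliable */
        if (n == sizeof shape - 1)
            return 0;                   /* too long to be a plate */
        if (isdigit(ch))
            shape[n++] = '1';
        else if (isalpha(ch))
            shape[n++] = 'A';
        else
            return 0;                   /* unexpected character */
    }
    shape[n] = '\0';

    for (x = 0; x < sizeof plate_shapes / sizeof plate_shapes[0]; x++)
        if (strcmp(shape, plate_shapes[x]) == 0)
            return 1;
    return 0;
}
```

    Anything that returns 0 here would be a candidate for the "not Spanish" list.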

  15. #45
    Registered User
    Join Date
    Nov 2006
    Posts
    176
    I could care less what you use. Milliseconds and microseconds are nothing. Unless your test data is huge, you'll see no difference. If this is going to be speed tested against the rest of your class or something, you should use my method. But if you just want it not to lag when processing, you don't need to worry about that; you'll notice no difference. quzah's is better for ease of updating etc.
