Thread: Enormous Input Test

  1. #31
    Registered User
    Join Date
    Oct 2001
    Posts
    62
    Elysia, gets() is dangerous and atoi() may fail. You're correct on this. I know that. But this discussion is about to find the fastest solution for a given problem. So let's concentrate on that and try to improve the work of the people that posted solutions before.

    You may call it evil, but after all, my replacements of scanf() and atoi() are the most significant improvements that were made yet, aren't they?
    Last edited by rpet; 12-31-2008 at 08:03 AM. Reason: typo

  2. #32
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    No, fgets is fine. If you use fgets instead of gets, there will be no problem. So fgets was one of the most significant improvements.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #33
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by C_ntua
    This code runs as fast as the already posted code (2.18 vs 2.17, the times vary a bit for every test)
    I believe that there is an edge case bug in the code: in the case where the input is 1000000000, the fgets() will read 10 characters in and then insert the null character. However, that leaves the newline in the buffer, thus causing the next read to end prematurely. Ironically, this edge case bug is not present with the gets() version since the newline would be read and discarded.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  4. #34
    Registered User
    Join Date
    May 2006
    Posts
    903
    I'd try multithreading. The only thing I could see that would slow it down is sharing the same vector (i.e. having to wait for the other process to finish its job).

  5. #35
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    This gives 2.10
    Code:
    #include <stdio.h>
    
    int main()
    {
        char b[11];
        unsigned int i, v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
    
        while (gets(b)) {
            for (v = 0, i = 0; b[i]; ++i)
                v = 10 * v + (b[i] - '0');
    
            if ((v / k) * k == v) ++count;
        }
    
        printf("%u", count);
        return 0;
    }

  6. #36
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Again with the gets. How about fgets version?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  7. #37
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    fgets() version gives 2.12 (replacing the b[i] with b[i] != '\n' in the for-loop).
    gets() is just a bit faster, but no big deal

    edit: For easy copy-paste

    Code:
    #include <stdio.h>
    
    int main()
    {
        char b[11];
        unsigned int i, v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
    
        while (fgets(b, 12, stdin)) {
            for (v = 0, i = 0; b[i] != '\n'; ++i)
                v = 10 * v + (b[i] - '0');
            if ((v / k) * k == v) ++count;
        }
        printf("%u", count);
        return 0;
    }

  8. #38
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Now, the question is if bitshifting is not faster unless the compiler optimizes it?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  9. #39
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    Tried also this approach
    Code:
    #include <stdio.h>
    
    int main()
    {
        char b[11];
        unsigned int i, v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
        
        for ( ; n; --n) {
            for(i = 0, v = 0; (b[i] = getchar()) != '\n'; ++i)
                v = 10 * v + b[i] - '0';
            if ((v / k) * k == v) ++count;
        }
    
        printf("%u", count);
        return 0;
    }
    And this:
    Code:
    #include <stdio.h>
    
    int main()
    {
        char b[11];
        unsigned int i, v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
        
        for ( ; (b[0] = getchar()) != EOF; ) {
            v = b[0] - '0';
            for(i = 1; (b[i] = getchar()) != '\n'; ++i)
                v = 10 * v + (b[i] - '0');
            if ((v / k) * k == v) ++count;
        }
    
        printf("%u", count);
        return 0;
    }
    But it gives 2.97 - 3.04 sec

  10. #40
    Registered User
    Join Date
    Jan 2008
    Posts
    182
    Wow... I didn't check this thread since my last post...

    I'll start reading with detail, I just browsed.

    But what about using fread? Would that be faster somehow?

  11. #41
    Registered User
    Join Date
    Jan 2008
    Posts
    182
    Quote Originally Posted by samus250 View Post
    Wow... I didn't check this thread since my last post...

    I'll start reading with detail, I just browsed.

    But what about using fread? Would that be faster somehow?
    Reading with fread char by char is not good... time was 8.29 seconds (and I thought the time limit was 8, but it got accepted anyways).
    Code:
    #include <stdio.h>
    
    int main()
    {
        char b;
        unsigned int i, v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
        
        v = 0;
        while (fread(&b, sizeof(char), 1, stdin)) {
            if(b == '\n') {
                if((v / k) * k == v) ++count;
                v = 0;
                continue;
            }
            
            v = 10 * v + (b - '0');
        }
    
        printf("%u", count);
        return 0;
    }

  12. #42
    Registered User
    Join Date
    Jan 2008
    Posts
    182
    OK I am REALLY puzzled here.........

    I test the program with the example input given on the site... which should output 4.
    Why the heck does it give me 4 when I uncomment the commented printf, and gives me 5 when I comment it out?????////

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int main()
    {
        unsigned int i, v, n, k, chunk_read,  elements_read = 0, bm = 1, count = 0;
        const unsigned int chunk_size = 1024 * 1024;
        const unsigned int buffer_size = 1024 *1024 * 60; // 60MB
        char *buffer = NULL;
        
        // init buffer
        buffer = (char *)malloc(sizeof(char) * buffer_size);
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
    
        // read the input and store to the buffer
        while((chunk_read = fread(buffer, sizeof(char), chunk_size, stdin))) {
            elements_read += chunk_read;
            
        	// if the buffer is running out of space...
            if(elements_read >= buffer_size * bm - 2) {
                ++bm;
                buffer = (char *)realloc(buffer, buffer_size * bm);
            }
        }
    
        // cap with NULL
        buffer[elements_read] = '\0';
        
        for(i = 0; buffer[i] !='\0'; i++) {
            if(buffer[i] == '\n') {
                if((v / k) * k == v) {
                    //printf("%u\n", v);
                    count++;
                }
                v = 0;
                continue;
            }
    
            v = 10 * v + (buffer[i] - '0');
        }
    
        printf("%u", count);
        return 0;
    }
    Edit: with the printf commented out, it gives wrong answer in .07 seconds.
    Last edited by samus250; 12-31-2008 at 11:34 AM.

  13. #43
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by C_ntua View Post
    This gives 2.10
    Code:
    #include <stdio.h>
    
    int main()
    {
        char b[11];
        unsigned int i, v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
    
        while (gets(b)) {
            for (v = 0, i = 0; b[i]; ++i)
                v = 10 * v + (b[i] - '0');
    
            if ((v / k) * k == v) ++count;
        }
    
        printf("%u", count);
        return 0;
    }
    Okay that can't be right. Division and Mod aren't just the same speed, they're the same asm operation as it produces both results at once. So what you've done is add an extra multiply, compare against something non-zero, and you've brought back the if-statement meaning an unpredictable branch again.
    Obviously what I posted didn't have as much effect as I thoguht it would. Maybe the compiler was already eliminating the branch.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  14. #44
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    I'm not sure why everyone's trying to optimize the divisibility test. The claimed purpose of this challenge was to see how fast you can perform I/O. I'd hardly expect the time spent in the division to dominate the time spent reading values from disk.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  15. #45
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Anyone for some extreme loop unrolling?:
    Code:
    #include <stdio.h>
    
    int main()
    {
        char b[11];
        unsigned int v, n, k, count = 0;
    
        scanf("%u %u", &n, &k);
        while (getchar() != '\n');
    
        while (gets(b)) {
    		if (!b[0]) v = 0;
    		else if (!b[1]) v = (b[0]-'0');
    		else if (!b[2]) v = (10*b[1]+b[0] - (10*'0'+'0')); 
    		else if (!b[3]) v = (10*(10*b[2]+b[1])+b[0] - (10*(10*'0'+'0')+'0')); 
    		else if (!b[4]) v = (10*(10*(10*b[3]+b[2])+b[1])+b[0] - (10*(10*(10*
    			'0'+'0')+'0')+'0')); 
    		else if (!b[5]) v = (10*(10*(10*(10*b[4]+b[3])+b[2])+b[1])+b[0] - (10
    			*(10*(10*(10*'0'+'0')+'0')+'0')+'0')); 
    		else if (!b[6]) v = (10*(10*(10*(10*(10*b[5]+b[4])+b[3])+b[2])+b[1])+
    			b[0] - (10*(10*(10*(10*(10*'0'+'0')+'0')+'0')+'0')+'0')); 
    		else if (!b[7]) v = (10*(10*(10*(10*(10*(10*b[6]+b[5])+b[4])+b[3])+b[2])
    			+b[1])+b[0] - (10*(10*(10*(10*(10*(10*'0'+'0')+'0')+'0')+'0')+'0')+'0')); 
    		else if (!b[8]) v = (10*(10*(10*(10*(10*(10*(10*b[7]+b[6])+b[5])+b[4])
    			+b[3])+b[2])+b[1])+b[0] - (10*(10*(10*(10*(10*(10*(10*'0'+'0')+'0')
    			+'0')+'0')+'0')+'0')+'0')); 
    		else if (!b[9]) v = (10*(10*(10*(10*(10*(10*(10*(10*(b[8]-'0')+b[7])+
    			b[6])+b[5])+b[4])+b[3])+b[2])+b[1])+b[0] - (10*(10*(10*(10*(10*(
    			10*(10*'0'+'0')+'0')+'0')+'0')+'0')+'0')+'0')); 
    		else if (!b[10])v = (10*(10*(10*(10*(10*(10*(10*(10*(10*(b[9]-'0')+
    			b[8]-'0')+b[7])+b[6])+b[5])+b[4])+b[3])+b[2])+b[1])+b[0] - (10*
    			(10*(10*(10*(10*(10*(10*'0'+'0')+'0')+'0')+'0')+'0')+'0')+'0')); 
            count += (v % k) == 0;
        }
    
        printf("%u", count);
        return 0;
    }
    Yeah okay so that was a dumb idea.
    Last edited by iMalc; 12-31-2008 at 01:46 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Integer Emulation
    By Elysia in forum C++ Programming
    Replies: 31
    Last Post: 03-18-2008, 01:03 PM
  2. For loop problems, input please.
    By xIcyx in forum C Programming
    Replies: 2
    Last Post: 04-22-2007, 03:54 AM
  3. MSVC Template Constructor/Assignment Errors
    By LuckY in forum Windows Programming
    Replies: 3
    Last Post: 07-22-2005, 02:57 PM
  4. input files in C
    By LWisne in forum C Programming
    Replies: 4
    Last Post: 09-30-2002, 06:24 AM
  5. Segmentation fault with input test file
    By bentles in forum C Programming
    Replies: 20
    Last Post: 04-28-2002, 08:58 PM

Tags for this Thread