design your application to process ONE LINE of data from
the problem statement "at a time".
How do I start writing a program like this?
The nicest way to do this is to use fread() and work one byte at a time, since this is the method that will probably be the most useful now and in the future.
So you write a separate function to do the job like this:
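Give this a file to read on the command line; without one it prints a usage message and exits.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLINE 1024    /* longest line (in bytes) returned in one piece */

/* Read one line from src; returns a malloc'd copy, or NULL when nothing is left. */
char *linein (FILE *src) {
    int count=0;
    char byte, buffer[MAXLINE+1], *ptr;    /* +1 leaves room for the terminator */
    while (fread(&byte,1,1,src)==1) {
        buffer[count]=byte;
        count++;
        if ((byte=='\n') || (count==MAXLINE)) break;    /* end of line, or buffer full */
    }
    buffer[count]='\0';    /* null terminate */
    if (!(count)) return NULL;    /* nothing was read */
    ptr=malloc(strlen(buffer)+1);
    if (!(ptr)) {
        puts("!!OUT OF MEMORY!!");
        return NULL;
    }
    strcpy(ptr,buffer);
    return ptr;
}

int main (int argc, char *argv[]) {
    FILE *in;
    char *ptr;
    if (argc<2 || !(in=fopen(argv[1],"r"))) {    /* no file given, or it cannot be opened */
        puts("usage: give a readable file on the command line");
        return 1;
    }
    while ((ptr=linein(in))!=NULL) {
        printf("%s",ptr);
        free(ptr);
    }
    puts("\n...done");
    fclose(in);
    return 0;
}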
The only issue here is that a line cannot be more than 1024 bytes long; anything longer comes back in pieces. You could make the buffer longer. If a line could be any length at all, the function becomes more complicated, because you must keep calling realloc() to grow the buffer every kilobyte (or whatever length the buffer is).
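For what it's worth, here is a rough sketch of the "any length" variant described above. It keeps the same fread() byte-at-a-time loop but grows the buffer with realloc() when it fills up. The name linein_any and the starting size of 128 are made up for this example, so treat it as one possible way to do it rather than the definitive one.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from src.
 * Returns a malloc'd string (newline kept) that the caller must free,
 * or NULL at end of file or when memory runs out. */
char *linein_any(FILE *src)
{
    size_t cap = 128, len = 0;   /* starting capacity is an arbitrary choice */
    char byte, *buf, *tmp;

    assert(src != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (fread(&byte, 1, 1, src) == 1) {
        if (len + 2 > cap) {     /* need room for this byte plus the '\0' */
            cap *= 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {   /* keep the old pointer so it can be freed */
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[len++] = byte;
        if (byte == '\n')
            break;               /* end of line */
    }

    if (len == 0) {              /* nothing was read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';             /* null terminate */
    return buf;
}
```

It is meant to be called the same way as linein(): loop while the result is not NULL, use the line, then free() it.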
C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1 -- #2 -- #3
cpwiki -- our wiki on sourceforge
Reading a line at a time is not quite the same thing as reading a byte at a time and assembling the line yourself, however.
I'd recommend using fgets() for one-line-at-a-time processing. Examples are in the tutorials: click the "FAQ" link in this forum's top banner to get to the page where they are linked (in the column on the left side).
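As a minimal sketch of the fgets() approach (not taken from the FAQ), the loop below reads one line per iteration and hands it to a per-line function. MAX_LINE, process_line() and process_file() are names invented for this example, and the 1024-byte limit is an assumption; a longer line would arrive in several pieces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024   /* assumed upper bound on line length */

/* Hypothetical stand-in for the real per-line work: it just echoes
 * the line with its number. */
static void process_line(const char *line, size_t number, FILE *dst)
{
    fprintf(dst, "%zu: %s\n", number, line);
}

/* Read src one line at a time with fgets() and pass each line on.
 * Returns the number of lines processed. */
size_t process_file(FILE *src, FILE *dst)
{
    char line[MAX_LINE];
    size_t count = 0;

    assert(src != NULL && dst != NULL);
    while (fgets(line, sizeof line, src) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline, if any */
        process_line(line, ++count, dst);
    }
    return count;
}
```

With a file opened by fopen(), a call such as process_file(in, stdout) would print every line prefixed by its number.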
Last edited by Adak; 03-15-2009 at 05:23 PM.
Today I had to replace my use of GNU getline() somewhere in order to make something POSIX compliant, and I used this model (fread byte by byte to newline), wondering if that would make any difference speed-wise. A pleasant surprise: although I don't do benchmarking, I would say just by observation (e.g. on 10 megs of text, you have to wait a few seconds) that it is noticeably faster than getline(). And for me to even notice this, it is probably something like twice as fast (or maybe I should quit drugs). Don't know about fgets().
It's almost irrelevant anyway, but there you go.
Binary is always faster, since there is no translation into text mode. I'm surprised it's *that* much faster, however.
Maybe you had the version of getline() that prints out the Gettysburg Address at the same time?
I meant the data itself. Since nothing has to be translated to text mode, it's just always faster - has been since BASIC days. But we're talking 5 - 10% difference, here.
The direct writing would account for the rest, I expect.
fscanf() is another little speedster, compared to other input functions. Leaves 'em in the dust!
(I don't expect it to be faster than fread(), however, but I haven't tested these two against each other.)
Quote: "The nicest way to do this is to use fread() and work one byte at a time, since this is the method that will probably be the most useful now and in the future."
To read one byte at a time you have fgetc().
fread() is supposed to be used for reading in bulk.
Using it to read 1 byte at a time is like using a truck to transfer one cap at a time. I cannot see anything nice about this approach.
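To make the distinction concrete, here is a small sketch that counts newlines both ways: once with fgetc() a byte at a time, and once with fread() pulling in a whole chunk and memchr() finding the newlines inside it. The function names and the 4096-byte chunk size are arbitrary choices for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One byte at a time: fgetc() is the stdio call meant for this. */
size_t count_lines_fgetc(FILE *src)
{
    size_t lines = 0;
    int c;                      /* int, not char, so EOF can be told apart */

    assert(src != NULL);
    while ((c = fgetc(src)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* In bulk: fread() fills a whole chunk, memchr() then finds the newlines. */
size_t count_lines_fread(FILE *src)
{
    char chunk[4096];           /* chunk size is an arbitrary choice */
    size_t lines = 0, got;

    assert(src != NULL);
    while ((got = fread(chunk, 1, sizeof chunk, src)) > 0) {
        const char *p = chunk, *end = chunk + got;
        while (p < end && (p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            lines++;
            p++;                /* continue searching after this newline */
        }
    }
    return lines;
}
```

Both functions should return the same count for the same file; which one is faster depends on the C library and the data, so it is worth measuring rather than guessing.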
All problems in computer science can be solved by another level of indirection,
except for the problem of too many layers of indirection.
– David J. Wheeler
I was wrong about this; I decided to actually check (via clock()) before I got rid of the old function, and my implementation of fread() byte by byte to newline is maybe 10-15% slower than getline(). I imagine reading in larger chunks might be faster, if finding the newline wasn't an issue.
Just goes to show how great I think I am ;(
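For anyone who wants to repeat this kind of check, here is a rough sketch of timing a reader with clock(). It is not the code used for the numbers above. time_reader() and count_with_fgets() are hypothetical names, and clock() reports CPU time rather than wall-clock time, so time spent waiting on the disk is not fully reflected.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Sample reader to time: counts fgets() calls, which equals the line
 * count as long as no line exceeds 1023 bytes. Swap in any reader
 * with the same signature. */
size_t count_with_fgets(FILE *src)
{
    char line[1024];
    size_t n = 0;

    assert(src != NULL);
    while (fgets(line, sizeof line, src) != NULL)
        n++;
    return n;
}

/* Hypothetical helper: run reader over src `runs` times from the start
 * of the file and return the average CPU seconds per pass. */
double time_reader(size_t (*reader)(FILE *), FILE *src, int runs)
{
    clock_t start, stop;
    int i;

    assert(reader != NULL && src != NULL && runs > 0);
    start = clock();
    for (i = 0; i < runs; i++) {
        rewind(src);            /* every pass reads the whole file again */
        (void)reader(src);
    }
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC / runs;
}
```

Averaging over several passes smooths out some noise, but the first pass may include the cost of pulling the file into the cache, so comparing two readers on the same warm file is the fairer test.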