Thread: alphabetically sort textfile

  1. #1
    Registered User
    Join Date
    Nov 2012
    Posts
    38

    alphabetically sort textfile

    Heya,

    I need to alphabetically sort a textfile and store the ordered result in another textfile.

    How can this be solved?
    Code:
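    #include <stdio.h>

    #define LENGTH 255

    int main(void)
    {
        FILE *filein;
        FILE *fileout;
        char puffer[LENGTH];

        filein = fopen("namein.txt", "r");
        fileout = fopen("nameout.txt", "a");
        if (filein == NULL || fileout == NULL)
            return 1;   /* a file could not be opened */

        /* copy the input to the output line by line (no sorting yet) */
        while (fgets(puffer, LENGTH, filein))
        {
            fputs(puffer, fileout);
        }

        fclose(filein);
        fclose(fileout);
        return 0;
    }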
    I thought of the following solutions:
    1. idea: store the data in an array and then just use a regular sort algorithm on it.
    1. issue: I don't know the number of entries in advance, so the size of the array is unknown. I would have to read through the whole textfile once to count the lines and then create the array based on that, or is there a better approach?

    2. idea: use a second puffer to read through the textfile and sort on the fly by comparing puffer1/puffer2 and storing the "smallest" line in the new textfile
    2. issue: the sort algorithm needs me to "delete" or flag entries of the first textfile once they have been put into the second one, and I'm not sure that's even possible with the "while fgets puffer" approach.

    Could you suggest some ideas on how best to solve this? :/ maybe with tries or something even, I dunno :X

    thanks in advance

  2. #2
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    You need to copy a file to another file.
    In the link you can see how this can be done, but in your case you need to sort before writing to the output file.

    For sorting, I would suggest quicksort (it is in C++, but actually it is C), qsort or mergesort.

    Of course, you need to have an array big enough to hold your data.

    As a matter of fact, go for your 1st idea and pick a reasonable size for the array (not too small and not too big).
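
A minimal sketch of the qsort() route, assuming the lines are already stored in a fixed two-dimensional array. The names `sort_lines` and `compare_lines` are made up for illustration; `LENGTH` is taken from the first post. Note that strcmp() compares bytes, so uppercase letters sort before lowercase ones in ASCII.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 255   /* same line buffer size as in the first post */

/* qsort() passes the comparator pointers to two array elements; each
   element here is a char[LENGTH] row, so it can be handed to strcmp(). */
static int compare_lines(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Sorts the first `count` rows of `lines` in place (byte-wise order). */
void sort_lines(char lines[][LENGTH], size_t count)
{
    assert(lines != NULL);
    qsort(lines, count, sizeof lines[0], compare_lines);
}
```

After filling the array with fgets(), a call like `sort_lines(lines, count)` would leave the rows in order, ready to be written out with fputs().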
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Your #1 idea was fine, but for external files (not already loaded into a data struct), merge sort seems very natural to use - and it doesn't care what the size of the file is.

    "puffers"?? I've never heard that term used in sorting before. Lost me on that one.
    EDIT: OH! You mean char buffer[]. LOL.

    Read up on mergesort on Wikipedia, and see if it helps clear the haze a bit.
    Last edited by Adak; 03-14-2013 at 09:13 AM.
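
To make the merge sort suggestion a bit more concrete, here is a rough sketch of just the merge step for files: two inputs that are each already sorted are combined into one sorted output. The function name `merge_sorted_files` is hypothetical, and the sketch assumes every line ends with a newline and fits in the `LENGTH` buffer from the first post. A full external merge sort would first split the big file into small sorted chunks and then repeat this step.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LENGTH 255   /* line buffer size, as in the first post */

/* Merge step of an external merge sort: `a` and `b` must each already be
   sorted line by line; the merged, sorted result is written to `out`. */
void merge_sorted_files(FILE *a, FILE *b, FILE *out)
{
    char line_a[LENGTH], line_b[LENGTH];
    char *have_a, *have_b;

    assert(a != NULL && b != NULL && out != NULL);
    have_a = fgets(line_a, LENGTH, a);
    have_b = fgets(line_b, LENGTH, b);

    /* Always emit the smaller of the two current lines. */
    while (have_a && have_b) {
        if (strcmp(line_a, line_b) <= 0) {
            fputs(line_a, out);
            have_a = fgets(line_a, LENGTH, a);
        } else {
            fputs(line_b, out);
            have_b = fgets(line_b, LENGTH, b);
        }
    }
    /* One input is exhausted; copy the rest of the other one. */
    while (have_a) { fputs(line_a, out); have_a = fgets(line_a, LENGTH, a); }
    while (have_b) { fputs(line_b, out); have_b = fgets(line_b, LENGTH, b); }
}
```

Only two lines are held in memory at any time, which is why this approach does not care how large the files are.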

  4. #4
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    I would say that he or she uses the word puffer for buffer. :P (based on the code too).

  5. #5
    Registered User
    Join Date
    Nov 2012
    Posts
    38
    U guys don't know puffer?
    [Attached image: Kartoffelpuffer]


    Anyways, thanks for the info, I will go with idea #1 then.

    I can't think of a solution of the mentioned issue though: how can I set a dynamic size of my array? I mean, my array can't know how many elements it will have to store unless I count the lines in my textfile first (and I want to avoid having to loop through it once just for that). Is there a solution to that?

    Basically the idea is that the number of lines in my textfile should not be limited to a specific, set number. Only the line length is pregiven with 200 characters.

    Thx in advance ^^

  6. #6
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    DID YOU SAY YOUR LINE LENGTHS WERE PREGIVEN AT 200 CHARS?

    A-L-W-A-Y-S 200 chars? I'll bet you could find a relationship between a file with x number of lines:

    x = file_size in bytes / 200 chars, where each char is one byte long.

    (maybe plus a small fudge factor - strictly for "engineering safety") < lol >

    Each line also ends with a newline that the formula doesn't count, and a plain text file has no header to subtract, so for lines that really are always 200 chars this should give you a good approximation of the number of lines in the file. If some lines are shorter, the real count will be higher.

    Test that out, see what you can discover.

  7. #7
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    In standard C you cannot know the size of the file without traversing it. Pick a reasonable size and start coding.

  8. #8
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    fseek() and ftell() are not standard C (library)?

  9. #9
    Registered User
    Join Date
    May 2012
    Posts
    1,066
    Quote Originally Posted by coffee_cup View Post
    I can't think of a solution of the mentioned issue though: how can I set a dynamic size of my array?
    You allocate some initial memory and when you need more you call realloc().
    11.3 Reallocating Memory Blocks

    Bye, Andreas
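
A small sketch of how the realloc() idea could look for this problem, assuming fixed-size rows of `LENGTH` chars as in the first post. The names `Line` and `read_lines` and the starting capacity of 16 are arbitrary choices for illustration. The array doubles whenever it fills up, so the file is read only once and the number of lines never has to be known in advance.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define LENGTH 255

typedef char Line[LENGTH];   /* one fixed-size row per text line */

/* Reads all lines from `in` into a heap array that grows on demand.
   Returns the array (the caller must free() it) and stores the number
   of lines in *count. Returns NULL only if an allocation fails. */
Line *read_lines(FILE *in, size_t *count)
{
    size_t capacity = 16, used = 0;
    Line *lines;

    assert(in != NULL && count != NULL);
    *count = 0;
    lines = malloc(capacity * sizeof *lines);
    if (lines == NULL)
        return NULL;

    while (fgets(lines[used], LENGTH, in)) {
        used++;
        if (used == capacity) {
            /* Grow through a temporary so the old block is not leaked
               if realloc() fails. */
            Line *bigger = realloc(lines, 2 * capacity * sizeof *lines);
            if (bigger == NULL) {
                free(lines);
                return NULL;
            }
            lines = bigger;
            capacity *= 2;
        }
    }
    *count = used;
    return lines;
}
```

Doubling the capacity keeps the number of realloc() calls small even for large files; growing by one row at a time would also work, just with many more reallocations.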

  10. #10
    Registered User
    Join Date
    Nov 2012
    Posts
    38
    Thanks all for the input ^^

    Quote Originally Posted by AndiPersti View Post
    You allocate some initial memory and when you need more you call realloc().
    11.3 Reallocating Memory Blocks

    Bye, Andreas
    Nice, that should do the trick, thx a ton

  11. #11
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    Quote Originally Posted by nonoob View Post
    fseek() and ftell() are not standard C (library)?
    Of course they are. But you need to run through the file once to learn its size (if I am not mistaken). On some systems there are functions that provide information about the file, but these functions are not standard.
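
For reference, running through the file once in plain standard C could look roughly like this. The function name `count_lines` is made up for illustration; it counts newline characters, counts a last line without a trailing newline as well, and rewinds the stream so the file can be read again afterwards.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the lines in `in` by reading it once, then rewinds the stream
   so it can be read again. A final line without '\n' is counted too. */
size_t count_lines(FILE *in)
{
    size_t lines = 0;
    int c, last = '\n';

    assert(in != NULL);
    while ((c = getc(in)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++;          /* unterminated last line */
    rewind(in);
    return lines;
}
```

The result could then be used to size the array before reading the lines in a second pass.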

  12. #12
    Registered User
    Join Date
    May 2009
    Posts
    4,183
    For small (less than 2 GB files), fseek() and ftell() works good on getting file size.
    (Technically the file size needs to fit in signed int variable to work.)

    seek to file end and get the size.
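
A sketch of that seek-and-tell approach, with a hypothetical helper name `file_size`. It assumes a stream opened in binary mode on a platform where seeking to SEEK_END works as expected; the C standard does not guarantee that, and the result has to fit in a long.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the position reported by ftell() after seeking to the end,
   or -1L on failure. The stream is rewound to the start afterwards.
   Assumption: binary stream on a platform that supports SEEK_END. */
long file_size(FILE *in)
{
    long size;

    assert(in != NULL);
    if (fseek(in, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(in);     /* -1L on error, e.g. size does not fit in long */
    rewind(in);
    return size;
}
```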

    Tim S.
    "...a computer is a stupid machine with the ability to do incredibly smart things, while computer programmers are smart people with the ability to do incredibly stupid things. They are,in short, a perfect match.." Bill Bryson

  13. #13
    Stoned Witch Barney McGrew's Avatar
    Join Date
    Oct 2012
    Location
    astaylea
    Posts
    420
    Quote Originally Posted by stahta01 View Post
    For small (less than 2 GB files), fseek() and ftell() works good on getting file size.
    (Technically the file size needs to fit in signed int variable to work.)
    A signed long int, rather. There are a few issues with using fseek and ftell to retrieve the size of a file. Firstly, the value returned by ftell only gives you the number of bytes from the beginning of the file for binary streams, so if you're using a text stream the value needn't represent the number of bytes that the stream has progressed from the beginning. Also, streams with a binary mapping can have multiple null characters appended at the end of them, so seeking to the end of a binary stream may not necessarily give you the position of the end of the file.

  14. #14
    Registered User
    Join Date
    Jun 2011
    Posts
    4,513
    ...seeking to the end of a binary stream may not necessarily give you the position of the end of the file.
    This is very true. It is, in fact, undefined behavior.

    Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.

    - 7.19.3, footnote 228

  15. #15
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    It might not be standard C, but the OS seems like a good place to get the file size.
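
As one example of asking the OS, a POSIX system offers fstat(). This is only a sketch under that assumption: it will not build on platforms without `<sys/stat.h>` and fileno(), and the helper name `os_file_size` is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fileno() and fstat() */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>            /* POSIX, not part of standard C */

/* Returns the size in bytes of the file behind `in` as reported by the
   operating system, or -1 on failure. Data still sitting in the stdio
   buffer of an output stream is not included until it is flushed. */
long long os_file_size(FILE *in)
{
    struct stat info;

    assert(in != NULL);
    if (fstat(fileno(in), &info) != 0)
        return -1;
    return (long long)info.st_size;
}
```

Unlike the seek-based trick, this does not move the file position, but it ties the program to POSIX-like systems.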
