Thread: Help! - FILE - sort alphabetically

  1. #1
    Registered User
    Join Date
    Feb 2014
    Posts
    105

    Help! - FILE - sort alphabetically

    Hello everyone!
    I'm learning programming in C and now I'm having some trouble with files. I searched a lot of tutorials on Google and YouTube, but I really didn't understand how to solve some problems.

    I have to solve an exercise that says:
    In a file I have a million names (they're not ordered alphabetically) and I have to print only the first 100 in alphabetical order. The file is a .txt
    I can only use ONE array.


    I think I have to do this:
    First, I sort the first 100 names alphabetically and I save them in an array (alphabetically).
    Then, I go to name 101 and see whether it's higher or lower than name 100. If it's higher, the array stays as it was. If it's lower, I compare it with name 99. If it's higher than that one, I put the new name in position 100 and discard the name that was there. I can do that with a for loop.

    To compare the names, I think that "strcmp" would be perfect

    Can anyone help me with the code?

    Thank you VERY MUCH,
    Juan



    ps. I hope you understand what I wrote haha
    Last edited by juanjuanjuan; 02-12-2014 at 06:17 PM.

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    If you had code, posted it in a well-indented and formatted manner in [code][/code] tags, and told us specifically where you were having trouble, then yes, we could help you. But we won't write code for you, if that's what you're after.

    Your overall solution (an array of 100, fill it, "bump out" higher names as lower ones fill spots) seems like it would work. Maybe not the fastest or most efficient, but simple and straightforward, a good plan to start with. Also, yes, strcmp is good for comparing names (which are typically strings).

    So give it your best shot, and if you get stuck, post your attempt with specific questions.

  3. #3
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    On a current PC, you probably have enough RAM to read all the millions of names into an array, but my guess is that the assignment wants you to limit the size of the array to 100. As an alternative to starting with 100 names, you could keep track of how many names there are in the array, starting with zero names, then "insert" the names into the array in sorted order as you read the names one at a time. Once the array holds 100 names, you would then just throw away any name that would go past the end of the array (any name that would go into array[100]).
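
    A minimal sketch of that approach, assuming every name fits in 20 chars including the terminating '\0'. The function name insert_sorted, the NAME_LEN constant and the cap parameter are made up for illustration; they are not part of the assignment.

```c
#include <string.h>

#define NAME_LEN 20   /* assumed maximum name length, including the '\0' */

/*
 * Insert `name` into the sorted array `names`, which currently holds
 * `count` entries and has room for `cap` (cap must be at least 1).
 * Returns the new count. A name that would land past the end of a
 * full array is thrown away.
 */
int insert_sorted(char names[][NAME_LEN], int count, int cap, const char *name)
{
    int i;

    if (count == cap) {
        if (strcmp(name, names[cap - 1]) >= 0)
            return count;      /* would go into names[cap]: drop it */
        count = cap - 1;       /* the last entry is about to be pushed out */
    }

    /* slide larger names down one slot, starting from the bottom */
    for (i = count - 1; i >= 0 && strcmp(name, names[i]) < 0; --i)
        strcpy(names[i + 1], names[i]);

    /* copy the new name into the gap, truncating if it is too long */
    strncpy(names[i + 1], name, NAME_LEN - 1);
    names[i + 1][NAME_LEN - 1] = '\0';
    return count + 1;
}
```

    Starting with count = 0 and calling this once for every name read from the file would leave the lowest cap names in the array, already in order, so no separate sorting step is needed.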

  4. #4
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I like your idea Juan. Why not try it out with say, 20 names, and keep the lowest 10 of them, in an array. Work through the details of the code, and then jump it up to the million names, and 100 being kept. You'll want to keep the lowest names in sorted order, of course. So if the name "Aaron" comes along later in the file, you'll want it to rise up "through the ranks" of the names already in the array.

    After you've compared a few hundred names, most of the million names won't make the top 100 list. So if your first test is from the BOTTOM of the top 100 names array, just a single comparison will eliminate many names, immediately.

    In pseudo code:
    Code:
    if(currentName is < lastNameOfArray) {  //this is not code. Use strcmp() to make this comparison in your program.
       i=99;
       while(i >= 0 && currentName < arrayName[i]) {  //stop when you reach the top of the array
          --i;
       }
       currentName should go in arrayName[i+1]
    }
    That kind of logic will be efficient in your program.


    I'm not sure what your problem(s) are.

    There is an excellent tutorial link at the top middle of this forum - maybe check it out to brush up on the basics.
    Last edited by Adak; 02-12-2014 at 08:17 PM.

  5. #5
    Registered User
    Join Date
    Feb 2014
    Posts
    105
    Thanks for your answers!

    Yes, my idea is to do it like you told me. For example, if I have 5 names and I want just the first 3 alphabetically:

    I have: John, Pete, Paul, Roger, Aaron

    First of all, I save in an array the first 3: John, Pete, Paul
    Now, I order those names and I have: John, Paul, Pete
    Now, I have 'Roger'. I compare it with Pete. It's "bigger", so I still have: John, Paul, Pete
    Now, I have 'Aaron'. I compare it with Pete. It's "smaller", so I compare it with Paul. It's also "smaller", so I compare it with John. It's "smaller" again. So I put Aaron in place of John, John in place of Paul, and Paul in place of Pete.

    I don't know much about programming in C. I don't know how to "read" the names that I receive in a .txt or how to compare them. If I have 2 strings, I compare them like this:

    Code:
    char name1[10], name2[10];
    
    printf("Name 1: ");
    scanf("%c", &name1);
    printf("Name 2: ");
    scanf("%c", &name2);
    
    if (strcmp(name1, name2)) == 0
         printf("Same name");
    if (strcmp(name1, name2)) > 0
         printf("%c is bigger than %c", name1, name2);
    if (strcmp(name1, name2)) < 0
         printf("%c is bigger than %c", name2, name1);
    And I also don't know how to save the first 100 names.

    Thanksssssss
    Last edited by juanjuanjuan; 02-12-2014 at 09:59 PM.

  6. #6
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    To save a list of names, you would need a two dimensional array (length and width).

    Instead of name1[10], which is one dimensional, you need names[10][20]. That would give you room for ten names, each at most 19 chars long (the 20th char holds the terminating '\0').

    If you wanted 20 names, you'd need names[20][20].

    Instead of using several if statements, you need only one, as I showed you above. If the name is greater than the last name in the array, you don't need to do anything with it - at all.

    If it is less (you don't have to do anything with the same name, either), then you need to decrease the index to the array, and re-compare the current name, with the name one up in the array.

    P.S. When you print a name, scanf() for a name, or fscanf(), the format you need is %s, instead of %c. %c is for a single char, whereas %s is for a string.

    When you have a million names to deal with, what you DON'T do in your program, is just as important as what you DO.

    You don't know how to use fscanf()? Consult the C tutorial, and it will show you. It works just like scanf(), except that you add the file pointer as the first argument: fscanf(filePointer, "%s", array[i]).
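
    A small sketch of filling a two dimensional array with fscanf(), assuming names contain no spaces and fit in 19 chars. The names read_names and NAME_LEN are hypothetical, and the width in "%19s" has to be kept one less than NAME_LEN by hand.

```c
#include <stdio.h>

#define NAME_LEN 20   /* assumed: 19 chars of name plus the '\0' */

/*
 * Read up to `cap` whitespace-separated names from `fp` into `names`.
 * Returns how many names were actually read.
 */
int read_names(FILE *fp, char names[][NAME_LEN], int cap)
{
    int count = 0;

    /* %19s skips leading whitespace and never stores more than
     * 19 chars plus the terminating '\0', so a row cannot overflow */
    while (count < cap && fscanf(fp, "%19s", names[count]) == 1)
        ++count;
    return count;
}
```

    The file pointer would come from something like fopen("names.txt", "r"), where names.txt is a placeholder file name, and should be checked against NULL before it is used.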

  7. #7
    Registered User
    Join Date
    Feb 2014
    Posts
    105
    Sorry, I wrote the program really badly... this one works:

    Code:
        char name1[20], name2[20];
        int ret;
    
        printf("Name1: ");
        scanf("%19s", name1);   /* no & needed: name1 is already the address of the array; 19 leaves room for '\0' */
        printf("Name2: ");
        scanf("%19s", name2);
        ret = strcmp(name1, name2);
    
        if(ret > 0)
        {
           printf("name1 is higher than name2");
        }
        else if(ret < 0)
        {
           printf("name2 is higher than name1");
        }
        else
        {
           printf("name1 is equal to name2");
        }

  8. #8
    Registered User
    Join Date
    Feb 2014
    Posts
    105
    Thanks for your answer, Adak!

    That's true! I need a bidimensional array, so I just have to "change" where the pointer points, isn't it?

  9. #9
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by juanjuanjuan View Post
    Thanks for your answer, Adak!

    That's true! I need a bidimensional array, so I just have to "change" where the pointer points, isn't it?
    Yes indeed. The char array[10][20] can be printed out with
    Code:
    for(i=0;i<10;i++)
       printf("%s\n",array[i]);
    which is a very handy way to print out strings. Replace the printf() line with fscanf(yourFilePointer, "%s", array[i]) and you can fill the array with a similar syntax. array[i] gives the address of the first char of each name in the two dimensional array.

    Because of the way C works internally, you can also use a large one dimensional array to do this same thing - but it's a bit more difficult to get right, and doesn't make the code as clear.
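
    For comparison, a sketch of the one dimensional version, assuming fixed slots of 20 chars per name. The helper names name_at and print_names, and the NAME_LEN constant, are invented for this example.

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 20   /* assumed slot size: 19 chars of name plus the '\0' */

/* Address of slot i inside a flat buffer made of NAME_LEN-byte slots. */
char *name_at(char *flat, int i)
{
    return flat + (size_t)i * NAME_LEN;
}

/* Print the first `count` names, one per line. */
void print_names(char *flat, int count)
{
    int i;

    for (i = 0; i < count; i++)
        printf("%s\n", name_at(flat, i));
}
```

    With char flat[10 * NAME_LEN], name_at(flat, i) plays the role that array[i] plays in the two dimensional version; the index arithmetic that the compiler does for you there has to be written out by hand here.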

  10. #10
    Registered User
    Join Date
    Feb 2014
    Posts
    105
    OK! I understand that fscanf will be better. I'm going to read the tutorial to see what I have to write in "yourFilePointer". Thanks!

    Can you help me with the other things? Saving the first 100 names would not be a problem, I suppose. I'm having a lot of problems with replacing one name with another one in the array.

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Sure, are you using a while loop? As the while loop moves up the array of top names, each name can be strcpy()'d down to the next index number: array[98] is copied to array[99], then array[97] is strcpy()'d to array[98], and so on. (You don't need to save array[99] first, since that one will simply be overwritten.)

  12. #12
    Registered User
    Join Date
    Feb 2014
    Posts
    105
    Yes, that would be great! while (name[][]!='\0')

    Can you please write some parts of the code? I understand what you're telling me but I'm not able to write the program


    Thanks!

  13. #13
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I already told you, we won't write code for you. That would not only defeat the purpose of you learning, it would constitute academic dishonesty, which could get you in trouble at school, and is ethically wrong. Of course, you would know that and wouldn't ask such a question if you bothered to read the forum rules (which you agreed to having done when you signed up for your account -- why is it so hard for people to read and follow the rules?). Anyways, here they are, again:
    Announcements - C Programming
    and the homework policy:
    Announcements - General Programming Boards

    Now, stop asking us to do your work for you. Write some code and ask us for help on that code.

  14. #14
    Registered User
    Join Date
    Feb 2014
    Posts
    105
    I have already read the rules.

    I don't go to school. I need it to finish some work for University. This is part of a large work I'm doing.

    I'm not able to write the code because I'm not really good at programming haha. I think I have to separate this exercise into 5 parts:
    - Read the .txt that has all the names
    - Make a bidimensional array to put the names sorted alphabetically
    - Sort alphabetically the first 100 names
    - See if any of the other names (from 101 to 1000000) is less than any name in the bidimensional array
    - Printf first 100 names

    I only know how to do number 2 (like Adak told me), 3 (I already put the code but it has to be modified to have a more efficient program, I suppose) and 5 (with printf)

    I don't know how to:
    - Read the .txt that has all the names

    - See if any of the other names (from 101 to 1000000) is less than any name in the bidimensional array


    So I'm asking you if you can help me with those parts of the code. I don't know how to start

    Thanks!
    Juan

  15. #15
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by juanjuanjuan View Post
    I have already read the rules.
    And yet, you asked, twice (three times counting this post), for us to write code for you. Why? You read the rules and know you're not supposed to. I would not make such a big deal if you made any real effort on this. It seems you haven't bothered to read anything about files in C, otherwise you would be able to at least attempt opening/reading/closing, even if it was incorrect.
    Quote Originally Posted by juanjuanjuan View Post
    I don't go to school. I need it to finish some work for University. This is part of a large work I'm doing.
    I fail to see how doing "work for University" is not "school". I don't know what the "large work" is, but this sounds exactly like a homework problem. Unless, of course, this is your job. But if you got hired to program C, either you lied about your credentials, or it's an internship/entry-level position, in which case you should have a mentor who is able to help you with the basics. Still, this is not the type of problem you typically see in a professional environment.
    Quote Originally Posted by juanjuanjuan View Post
    I'm not able to write the code because I'm not really good at programming haha. I think I have to separate this exercise into 5 parts:
    - Read the .txt that has all the names
    - Make a bidimensional array to put the names sorted alphabetically
    - Sort alphabetically the first 100 names
    - See if any of the other names (from 101 to 1000000) is less than any name in the bidimensional array
    - Printf first 100 names

    I only know how to do number 2 (like Adak told me), 3 (I already put the code but it has to be modified to have a more efficient program, I suppose) and 5 (with printf)

    I don't know how to:
    - Read the .txt that has all the names

    - See if any of the other names (from 101 to 1000000) is less than any name in the bidimensional array
    I don't see the humor in you being bad at programming. And us writing it for you is not going to help you get any better. We do have a tutorial on this site. The sections on files and arrays should be of much help. I recommend you read those, along with any textbooks and class notes you have, and other tutorials on C (Google for more). That being said, you have a clear understanding of the problem and the solution, which is a huge first step.
    Quote Originally Posted by juanjuanjuan View Post
    So I'm asking you if you can help me with those parts of the code. I don't know how to start
    For the third and last time, we cannot write code for you. You should make an attempt to learn the material and try to write the code on your own. Make a sincere effort, and we will help. Here are some free hints: you will need fopen and fclose, and I would recommend fgets for reading the names (if there is one name per line). As for sorting, bubble sort, insertion sort and selection sort are all quite easy to understand and implement, and are sufficiently fast for sorting the first 100 names. Wikipedia has great articles on all 3 of them, and example code abounds on the internet. After that, you don't really need to do any sorting. Just find the right place in the list, and slide the rest of the elements down one spot.
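
    As a starting point only, the fopen/fgets/fclose part of those hints might look like the sketch below. It assumes one name per line and lines shorter than 64 chars; the names read_name_line, count_names and LINE_LEN are invented for illustration, and the sketch only counts the names rather than solving the exercise.

```c
#include <stdio.h>
#include <string.h>

#define LINE_LEN 64   /* assumed upper bound on one line, including '\n' and '\0' */

/*
 * Read one line into buf and strip the trailing newline.
 * Returns 1 if a line was read, 0 at end of file or on error.
 */
int read_name_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';   /* fgets keeps the '\n'; cut it off */
    return 1;
}

/*
 * Open the file, count its lines, and close it again.
 * Returns -1 if the file cannot be opened.
 * Note: a line longer than LINE_LEN - 1 chars is read in pieces
 * and would be counted more than once.
 */
long count_names(const char *path)
{
    char line[LINE_LEN];
    long count = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (read_name_line(fp, line, (int)sizeof line))
        ++count;   /* a real program would compare `line` with the array here */
    fclose(fp);
    return count;
}
```

    Stripping the newline matters because fgets stores it in the buffer, and strcmp would otherwise treat "John\n" and "John" as different names.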

    Last Post: 05-17-2006, 03:22 PM