Thread: Parallel arrays, dynamic memory allocation, and sorting arrays

  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    5

    Parallel arrays, dynamic memory allocation, and sorting arrays

    I'm trying to read data from a file into parallel arrays. It's a list of names (first and last), dates, and scores. I then need to sort those arrays by any one of the four things being read in (name, date, score).

    This is what I have so far...

    Code:
     
    #include <stdio.h>
    #include <string.h>
    
    #define NMAX 200
    #define MAXLEN 50
    
    int main(void)
    {
    	int num = 0;
    	int score[NMAX];
    	char lastname[NMAX][MAXLEN];
    	char firstname[NMAX][MAXLEN];
    	char date[NMAX][3];
    
    	FILE *ifile;
    
    	ifile = fopen("data.txt", "r");
    
    	while (fscanf(ifile, "%s %s %s %d", lastname[num], firstname[num],
    		date[num], &score[num]) != EOF) {
    			printf("%5d %-14s %-14s %3s\n", score[num], firstname[num],
    				lastname[num], date[num]);		
    	numop++;
    	}
    	
    	fclose(ifile);
    	
    	return(0);
    }

    1) I imagine I'll want to dynamically allocate memory for my arrays, but I'm confused as to how to do this. Do you scanf the file to see how long it is, then use that value w/ malloc to allocate the array before you actually add values as elements? Or do you constantly make the array bigger/smaller with each element that you add/subtract? Also, I define some macros in the beginning to make my arrays "big enough" when declaring them. When I implement dynamic allocation, will I still need these macros?

    2) In the while loop, lastname, firstname, and date, are all "scanned" in without using an "&", but score requires an "&"? Why is this?

    3) When sorting the data, do I actually want to sort the data--as in switching which values are held at which address for all four arrays? Or would it be more practical to sort an array of indices (or pointers) of the arrays?

    If questions similar to mine have been answered elsewhere, I apologize. If so, please direct me to the proper location.

    Thanks

  2. #2
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    Quote Originally Posted by sbaker1313 View Post
    1) I imagine I'll want to dynamically allocate memory for my arrays, but I'm confused as to how to do this. Do you scanf the file to see how long it is, then use that value w/ malloc to allocate the array before you actually add values as elements? Or do you constantly make the array bigger/smaller with each element that you add/subtract? Also, I define some macros in the beginning to make my arrays "big enough" when declaring them. When I implement dynamic allocation, will I still need these macros?
    There is a function, which I can remember off the top of my head that determines the length of a file. It will overestimate on a windows machine, because newlines in windows are stored as two characters (\r\n) but read by scanf as 1.

    EDIT: But that won't help you here, because you need to know the max length of any string you read in, not the total length. You also need to know the number of elements in the list. The max string length you should just make constant, dealing with overly long string by reporting an error. As for the number of elements, you have to guess at it in the beginning, and scale it if you are about to overflow. Alternatively, you can implement a linked list or tree. I recommend the former.

    You will not need the "big enough" macros.

    2) In the while loop, lastname, firstname, and date, are all "scanned" in without using an "&", but score requires an "&"? Why is this?
    Because loop, lastname, firstname, and date are arrays. Arrays, when passed to a function pass the address of the first element of the array. Thus scanf can place characters read into the array by dereferencing that pointer. For normal variable, you must take the address yourself with the & operator.

    ----

    EDIT: Also, instead of using 3 array, you should use a single array of structs. Sorta like this:
    Code:
    typedef struct{
        char lastname[MAXLEN];
        char firstname[MAXLEN];
        char date[2];
        int score;
    } Person;
    
    Person array[NMAX];
    Except you'll want to make that dynamic.

    Quote Originally Posted by sbaker1313 View Post
    3) When sorting the data, do I actually want to sort the data--as in switching which values are held at which address for all four arrays? Or would it be more practical to sort an array of indices (or pointers) of the arrays?
    Depends. You may as well sort it by one of the criteria. For the others, you can implement an index if you think that the user will frequently switch between sorting criteria. If you use a stuct as mentioned above, with firstname and last name being dynamic (which I don't recommend) then it's not much sacrifice to create a duplicate array.
    Last edited by King Mir; 12-19-2007 at 01:05 AM.
    It is too clear and so it is hard to see.
    A dunce once searched for fire with a lighted lantern.
    Had he known what fire was,
    He could have cooked his rice much sooner.

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    This shows how to use realloc to create an expanding array.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    What is your "date" format - three chars seems awfully short. Even "md" where m is month and d is day would overflow if the m or d is twodigit, never mind a fully-formed date of year, month and day.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Registered User
    Join Date
    Dec 2007
    Posts
    5
    my 'date' format is [yys] where 's' is the season. for instance, 06F would be Fall of 2006. it's based on academic years, so there's no 'summer' to conflict with 'spring'.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by sbaker1313 View Post
    my 'date' format is [yys] where 's' is the season. for instance, 06F would be Fall of 2006. it's based on academic years, so there's no 'summer' to conflict with 'spring'.
    In which case you need 4 chars to store that string, or you won't have enough space for the zero-termination character.
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    A sharper focus on what you want to do first with this program, would be helpful. Do you want to get some practice working with parallel array's w/o an index array? Do you want to work on using an array of structs? Static arrays or dynamic arrays?

    You're getting some good idea's here, but nobody knows HOW you really want to work with this program. There is certainly more than one way to do this. You may want to actually make both a static array version, a dynamic array version (both with parallel arrays), and then change the program to include an index array (with more than 3 array's, as long as you're using parallel arrays (rather than structs), I would recommend an index array.

    Now change it to using just one array, but with structs. Then change that to using a linked list, with those structs.

    There could be file reading, writing, adding records, deleting, editing, sorting, searching, and naturally, a good display, and a main menu function to give the user all these choices.

    Just as a console program, this one flat file project, would be a helluva learning program, all by itself!

  8. #8
    Registered User
    Join Date
    Dec 2007
    Posts
    5
    Wow! Everybody has been incredibly helpful. Thanks so much.

    In response to Adak's question, I guess what I'm looking to do for starters is get some practice working with parallel arrays. I understand that I could also do this with array structs and linked lists and whatnot, but for now, seeings how I'm still pretty new to C, I'd like to work with parallel arrays.

    Any suggestions for sorting (or anything at all really) would be greatly appreciated.

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Sorting can be done in many different ways. The simplest versions are "bubble sort" and "insertion sort". If you search on those, you'll find algorithms for them. Insertion sort is based on the principle that instead of sorting all the data "at the end", you sort it when you add it. E.g. if we have the number 1 and 5 in a list, and add number 3, we place that by adding it BEFORE 5 [shuffling the 5 forward one step]. Adding 2 on to that, we shuffle 3 and 5 forward one step. If we then add 4, it goes between 3 and 5, shuffling 5 forward another step.

    Does that make any sense?

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  10. #10
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Here's another tip for your code.
    When it comes to reading code, you want to make it as readable as possible. This is usually accomplished by following a strict rule with indenting.
    Indent each block once and only once.
    I'll show you what I mean:

    [QUOTE=sbaker1313;699376]
    Code:
     
    #include <stdio.h>
    #include <string.h>
    
    #define NMAX 200
    #define MAXLEN 50
    
    int main(void)
    {
    	int num = 0;
    	int score[NMAX];
    	char lastname[NMAX][MAXLEN];
    	char firstname[NMAX][MAXLEN];
    	char date[NMAX][3];
    
    	FILE *ifile;
    
    	ifile = fopen("data.txt", "r");
    
    	while (fscanf(ifile, "&#37;s %s %s %d", lastname[num], firstname[num], date[num], &score[num]) != EOF) {
    		printf("%5d %-14s %-14s %3s\n", score[num], firstname[num],
    			lastname[num], date[num]);		
    		numop++;
    	}
    	
    	fclose(ifile);
    	
    	return(0);
    }
    I also moved your while to a single row because, well frankly, it was confusing me
    So as you see, the code in the while loop is one more tab than the while itself. That's how indenting should work.
    I do also recommend adding another indent when wrapping your code on multiple lines so you can see that it is actually multi-row. But all the extra rows should only be one more tab than the first line of the multi-line expression.
    Simple rules for easy reading!

    Keep that in mind!
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by sbaker1313 View Post
    Wow! Everybody has been incredibly helpful. Thanks so much.

    In response to Adak's question, I guess what I'm looking to do for starters is get some practice working with parallel arrays. I understand that I could also do this with array structs and linked lists and whatnot, but for now, seeings how I'm still pretty new to C, I'd like to work with parallel arrays.

    Any suggestions for sorting (or anything at all really) would be greatly appreciated.
    This is a basic bubble sort:
    Code:
    void BubbleSort(int a[], int n)  /* n = the size of the array a[] */
    int i, j;
    
    for (i = 0; i < n - 1; i++)      /* not just n! */
       for (j = n - 1; i < j; --j)   /* n - 1 is the last array element, not n */
           if (a[j-1] > a[j])
              swap(&a[j-1], &a[j]);
    So i starts at the bottom of the array, and j starts at the top. They meet in the middle, as the song says.

    If you have less than 1,000 items to be sorted, this sort has almost no run-time difference with faster sorts like Quicksort or Mergesort. It's all a matter of a blink of an eye on modern hardware.

    With larger number of items to be sorted (10,000 or more), the real speed differences will become apparent. Don't use Bubblesort for anything but these small quantities.

    With the index sort, you probably know you compare the REAL items, but whenever a sort is needed, you sort the Index[] array elements, instead of the REAL array elements.

    When a display or printout is needed, you show it/print it, by going through the index array:
    Code:
    for (i = 0; i < N; i++)
       printf("\n %s %s", Lastname[index[i]], Firstname[index[i]])
    Before you do anything else with the index array, (including doing your sorting), you have to give it's it's initial values:

    Code:
    for (i = 0; i < N; i++)
       index[i] = i;
    Easy, but easy to forget or overlook as well.

  12. #12
    Registered User
    Join Date
    Dec 2007
    Posts
    5
    Code:
    #include <stdio.h>
    #include <string.h>
    
    #define MAXOPS 200
    #define MAXNAME 15
    
    void swap(int *index1, int *index2);
    void namesort(int n, int usenum, char list[], int sort[]);
    
    int main(void)
    {
    	int numop, i, j, k;
    	int length1, length2, minlen;
    	int sort[MAXOPS];
    	int score[MAXOPS];
    	char lastname[MAXOPS][MAXNAME];
    	char firstname[MAXOPS][MAXNAME];
    	char date[MAXOPS][4];
    	
    
    	FILE *ifile;
    
    	ifile = fopen("operatives.txt", "r");
    
    	numop = 0;
    	while (fscanf(ifile, "&#37;s %s %s %d", lastname[numop], firstname[numop], 
    			date[numop], &score[numop]) != EOF) {
    		numop++;
    	}
    	
    	for (i = 0; i < numop; i++){
    		sort[i] = i;
    	}
    	
    		
    	/* I'M TRYING TO MAKE A FUNCTION OUT OF THIS... */
    
    	/* for (i = 0; i < numop; i++){
    		for (j = numop - 1; i < j; --j){
    			length1 = strlen(lastname[j-1]);
    			length2 = strlen(lastname[j]);
    			if (length1 <= length2){
    				minlen = length1;
    			}else{
    				minlen = length2;
    			}
    			k = 0;
    			while (k < minlen){
    				if (lastname[sort[j-1]][k] > lastname[sort[j]][k]){
    					swap(&sort[j-1], &sort[j]);
    					k = minlen;
    				}else if (lastname[sort[j-1]][k] == lastname[sort[j]][k]){
    					k++;
    					if (k == minlen && length1 > length2){
    						swap(&sort[j-1], &sort[j]);
    					}
    				}else{
    					k = minlen;
    				}
    			}
    		}
    	} */
    	
    	namesort(numop, numop, lastname[0], &sort[0]);
    			
    	printf("|   FIRST NAME       |   LAST NAME        |  DATE  |   SCORE   |\n");
    	printf("================================================================\n");
    	
    	for (i = 0; i < numop; i++){
    		printf("|   %-17s", firstname[sort[i]]);
    		printf("|   %-17s", lastname[sort[i]]);
    		printf("|   %s  ", date[sort[i]]);
    		printf("|   %4d    |\n", score[sort[i]]);
    	}
    	
    	fclose(ifile);
    	
    	return(0);
    }
    
    
    void namesort(int n, int usenum, char list[], int sort[])
    {
    	int i, j, k;
    	int length1, length2, minlen;
    	for (i = 0; i < usenum; i++){
    		for (j = usenum - 1; i < j; --j){
    			length1 = strlen(list[j-1]);
    			length2 = strlen(list[j]);
    			if (length1 <= length2){
    				minlen = length1;
    			}else{
    				minlen = length2;
    			}
    			k = 0;
    			while (k < minlen){
    				if (list[sort[j-1]][k] > list[sort[j]][k]){
    					swap(&sort[j-1], &sort[j]);
    					k = minlen;
    				}else if (list[sort[j-1]][k] == list[sort[j]][k]){
    					k++;
    					if (k == minlen && length1 > length2){
    						swap(&sort[j-1], &sort[j]);
    					}
    				}else{
    					k = minlen;
    				}
    			}
    		}
    	}
    }
    
    void swap(int *index1, int *index2)
    {
    	int temp = *index2;
    	*index2 = *index1;
    	*index1 = temp;
    }
    Ok, so this is what I've got so far. I'm trying to make a function that will sort the arrays by name. But I'm getting several error messages at compile time--one at the length1/2 =... saying "passing arg of strlen makes pointer from integer without a cast"; then I get more error messages at
    Code:
    if (list[sort[j-1]][k] > list[sort[j]][k]){
    saying "subscript value is neither array nor pointer".

    Can anyone see what I'm doing wrong? My goal is to be able to have multiple sorting techniques. For instance, sort first by name, then by score; or by last name then first name. The idea being that if there's ever a "tie" between two values in the primary sorting method, I could call another function to resolve the tie.

    As always, your help is greatly appreciated!
    Last edited by sbaker1313; 01-01-2008 at 03:10 PM.

  13. #13
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Your main sorts
    Code:
    char lastname[MAXOPS][MAXNAME];
    which is an array of null-terminated strings.
    Your function sorts
    Code:
    char list[]
    which is an array of letters. If you want list to be a list of strings, you need to make it a 2-D array.

  14. #14
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    And when you have a multi-dimensional parameter to a function, you have to specify the sizes of all but the leftmost dimension. For example,
    Code:
    void func1(char array[]);
    void func2(char array[][2]);
    void func3(char array[][2][3]);
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  15. #15
    Registered User
    Join Date
    Dec 2007
    Posts
    5
    Thanks guys. Per your advice, I changed my function definition to...

    Code:
    void namesort(int n, int usenum, char list[][MAXNAME], int sort[])
    but at compile time it says, "conflicting types for 'namesort'". Any thoughts? And how would I call this function?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 26
    Last Post: 06-11-2009, 11:27 AM