Thread: pointer problems

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    3

    pointer problems

    Hello,

    I'm pretty new to C programming, and am working on an assignment that uses the quicksort algorithm adapted to sort strings. I got it working OK except for the fact that it only seems to be looking at the first character in each string. I'm at a loss trying to figure out where the problem is. Anyone have any clues?

    I'm testing it by piping it the contents of a file named "test" that contains the following:

    Code:
    bbb
    ccc
    aaa
    ddd
    I'm trying to get the output to be:
    Code:
    aaa
    bbb
    ccc
    ddd
    Instead, I'm getting:
    Code:
    a
    b
    c
    d
    Here's how I'm executing it:
    Code:
    $ cat test | ./a.out
    Anyway, here's the c code. Sorry for the huge post.
    Code:
    //-----------------------------COMPILER DIRECTIVES------------------------------
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <assert.h>
    
    //-----------------------------FUNCTION PROTOTYPES------------------------------
    void quicksort( char *[], int );
    void swap( char *[], int, int );
    
    //----------------------------------CONSTANTS-----------------------------------
    const long LINE_CAP = 1000000;     // Max number of strings (lines)
    const short STR_CAP = 82;          // Max characters per string +2 for '/0'
    
    //--------------------------------MAIN FUNCTION---------------------------------
    int main() {
    
    // Declare Variables
       long lngLinesRead = 0;          // Keeps track of how many data lines exist
       char *pcLine[LINE_CAP];         // Array of character arrays
       char cLine[STR_CAP];            // Buffer for the array of characters
       int i;                          // Iterator for loops
    
    // Read the char array into the buffer while EOF is not reached
       while( fgets( cLine, STR_CAP, stdin ) != NULL ) {
    
    // Allocate space at the nth pointer in pcLine array
          pcLine[lngLinesRead] = (char *) malloc( sizeof( cLine ) );
    
    // Set the value in the pointer to the value of the character array
          *(pcLine[lngLinesRead]) = *cLine;
    
    // Increment to next line
          lngLinesRead++;
       }
    
    // Sort the lines
       quicksort( pcLine, lngLinesRead );
    
    // Print to standard output
       for( i = 0; i < lngLinesRead; i++ ) {
          printf( "%s\n", pcLine[i] );
          free( pcLine[i] );
       }
    
    // Return error free
       return 0;
    }
    
    //-----------------------------FUNCTION DEFINITIONS-----------------------------
    
    // swap:  interchange v[i] and v[j].
    void swap( char *v[], int i, int j )
    {
    	char *temp;
    
    	temp = v[i];
    	v[i] = v[j];
    	v[j] = temp;
    }
    
    // quicksort: sort v[0]..v[n-1] into increasing order.
    void quicksort( char *v [], int n )
    {
    	int i, last;
    	if( n <= 1 )                             // nothing to do
    		return;
    
    	swap( v, 0, rand() % n );                // move pivot element to v[0]
    	last = 0;
    
    	for ( i = 1; i < n; i++ )                // partition
    		if ( strcmp( v[i], v[0] ) < 0 )
    			swap( v, ++last, i );
    
    	assert(( last >= 0 ) && ( last < n ));
    	swap( v, 0, last );                      //restore pivot
    
    	quicksort( v, last );                    // recursively sort each part
    	quicksort( v+last+1, n-last-1 );
    }
    Thanks for the help, I really appreciate it.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Code:
          *(pcLine[lngLinesRead]) = *cLine;
    shoudl be a strcpy.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Complete Beginner
    Join Date
    Feb 2009
    Posts
    312
    I didn't test my proposal, so this is just guessing.

    Code:
    // Allocate space at the nth pointer in pcLine array
          pcLine[lngLinesRead] = (char *) malloc( sizeof( cLine ) );
    
    // Set the value in the pointer to the value of the character array
          *(pcLine[lngLinesRead]) = *cLine;
    First of all, use malloc(strlen(cLine)). You don't need to allocate more space than you need.

    Second, the next line of code sets pcLine[IngLinesRead][0] to the first character of cLine, which is not what you want. In order to assign string values, you need to use strcpy():

    Code:
    strcpy(pcLine[IngLinesRead], cLine);
    Try that and tell us what happened.

    Greets,
    Philip
    All things begin as source code.
    Source code begins with an empty file.
    -- Tao Te Chip

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    @snafuist & BradC78:
    Code:
    malloc(strlen(cLine)+1)
    Would be a lot better!

    @BradC78
    Code:
       char *pcLine[LINE_CAP];         // Array of character arrays
    This uses up 4MB of stack (assuming 32-bit pointers) - that is quite a lot to put on the stack. Either use a global (not nice, but you probably only need the one!) or use dynamic allocation to create the array in the first place [or don't use 1 million as LINE_CAP!].

    You also do no check IF the file is longer than a million lines - you probably should!

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Complete Beginner
    Join Date
    Feb 2009
    Posts
    312
    Quote Originally Posted by matsp View Post
    @snafuist & BradC78:
    Code:
    malloc(strlen(cLine)+1)
    Would be a lot better!
    /* no comment */

    Arghl,
    Philip
    All things begin as source code.
    Source code begins with an empty file.
    -- Tao Te Chip

  6. #6
    Registered User
    Join Date
    Feb 2009
    Posts
    3
    strcpy did the trick! Thanks everyone. I'm also going to incorporate the other suggestions. Thanks for the help, it's much appreciated!

  7. #7
    Complete Beginner
    Join Date
    Feb 2009
    Posts
    312
    Your code is ok. Here are some more remarks:

    1)
    You're using comments, that's good. But you are using way too many comments: try to narrow down your comments to why you did it, not how. The code itself already explains the latter to full detail. If your code isn't obvious, make it so. On a general note, if you have more comments than code (and your name is not Donald E. Knuth), there's something wrong.

    2)
    "lngLinesRead" is a funky name, but you will only have to remember it for the next 10 lines of code. "size" will do just fine and is not the least more difficult to understand. The same applies to cLine: what's wrong with simply calling it "tmp"? I'm not interested in its type and/or purpose if it gets used only on two lines. Generally speaking, the length of a variable name and the size of its scope are directly proportional.

    3)
    Similar arguments hold for type encoding in the variable name (the so called Hungarian Notation). This is bad for at least two reasons: first, a good function is no longer than 25 lines (all inclusive) and doesn't have more than maybe 4-5 local variables. Every programmer can easily handle this all by himself without having to look up the type every time he encounters another variable, especially given that one variable is called "i" and another "tmp".
    Second, the type of a variable may change in the future. Now consider a global variable in a project with several million lines of code. Do you want to adjust every occurrence of that variable to reflect its new type? But it doesn't need to be so big. Having to inspect 25 lines of code because a variable changed its type is already annoying enough.

    4)
    For array subscripts, use variables of type size_t. Although int is guaranteed to hold values which are large enough to express all meaningful array indices, size_t is better because it makes the purpose of the variable more clear to the reader, and furthermore makes the compiler issue a warning if you try to assign negative values to it.

    5)
    You don't need to cast the return value of malloc(). void* is guaranteed to cast implicitly to the desired type.

    6)
    A function should do one thing well. Your main() function does three things, only one of which has its own function. Do functions for reading and output as well.

    7)
    Check the return value of malloc(). If there's not enough free memory available, you're in trouble.


    If you follow these proposals, your main() will look like this:

    Code:
    /* read lines from stdin */
    size_t read_s(char **lines)
    {
            size_t size = 0;
            char tmp[STR_CAP];
    
            while(fgets(tmp, STR_CAP, stdin) != NULL) {
                    lines[size] = malloc(strlen(tmp)+1);   // check for NULL
                    strcpy(lines[size++], tmp);
            }
    
            return size;
    }
    
    
    /* print array of strings */
    void print_s(char **lines, size_t size)
    {
            size_t i;
    
            for(i = 0; i < size; i++)
                    puts(lines[i]);
    }
    
    
    /* free array of strings */
    void free_s(char **lines, size_t size)
    {
            size_t i;
    
            for(i = 0; i < size; i++)
                    free(lines[i]);
    }
    
    
    int main()
    {
            size_t sz = 0;
            char *lines[LINE_CAP];
      
            sz = read_s(lines);
    
            quicksort(lines, sz);
    
            print_s(lines, sz);
            free_s(lines, sz);
    
            return 0;
    }
    Now it's clean and simple, there's no need for any further comments, and even my mother could understand it. On the other hand, it may not work because I didn't try to compile it.

    And yes, I'm having too much free time.

    Greets,
    Philip
    Last edited by Snafuist; 02-18-2009 at 07:01 PM.
    All things begin as source code.
    Source code begins with an empty file.
    -- Tao Te Chip

  8. #8
    Registered User
    Join Date
    Feb 2009
    Posts
    278
    A couple comments on Philips list here...

    1. I think for beginning programmers, there can be no such thing as too many comments (provided they are relative of course!) Especially considering that a good programmer fleshes out the algorithm in words before coding. When he goes to do his next program, or one later in the year or next year, he might find valuable insight into his methodology by reading through his many comments.

    2. Using meaningful variable names should never be discouraged. Nothing worse, in my opinion, than variables named line, temp, tmp, size, x, y, count, etc. size of what exactly? Yes, it should be possible to figure out what the meaning of the vague name is by its context, but why make that necessary. Like my comment for #1, he might appreciate his thoroughness in the future. That being said, I, personally, DO use variables such as x, y, etc, but only as incrementers in for loops as long as I'm not using the variable in any other way (such as an array index). I also use temp, but I usually do temp_int, iTemp, temp_mystruct etc, just for clarity sake.

    3. I use so-called Hungarian Notation in my programs. It makes a lot of things easier to understand (especially if you use such vague variable names as temp, tmp, etc.) without having to scroll up and down. Secondly, if you have a global variable that you need to change the type of, there's a handy feature of most editors called the replace all. Replacing every iTemp with fTemp a very simple feat. On top of that is the fact that global variables should be avoided in the first place.

    3b. Again, in my opinion, there is nothing worse than a bunch of two and three line functions in a program. In the code you provided, the three little functions might be handy to put in a seperate file because you might use them in multiple programs. However there are lots of examples of where splitting up a 75 line function would make it harder to understand rather than easier, not to mention a pain in the ass to keep scrolling up and down when you need to refer to one. I think saying that a good function is limited to 25 lines is too strict, especially for main. I see all the time where people put just one function call inside their main. What's the point of that? I do agree that a function should do one thing MOST times, but some things go together well enough (and always seem to follow one another) and can (and should) be put in the same function.

    4. I don't do this myself, but it is a good suggestion.

    5. It is generally good practice to do the casting, even though it is not required. I think making it explicit is good for two reasons. First, it makes it certain to the reader exactly what you did, and second, and more importantly, it's a double check for you as you code so you are getting the right kind of pointer. If you cast a malloc to a char and you assign it to a variable declared as a pointer to a structure you get an error. if you don't explicitly do this you don't get an error and then you will get at best junk, at worst, seg faults.

    6. See 3b.

    7. Agreed. However I don't see in the code you posted where you did that.

    Overall, I know what you mean with most of these things, but I don't think they are things that should be discouraged.

  9. #9
    Registered User
    Join Date
    Feb 2009
    Posts
    278
    I just re-read my post. I wasn't trying to sound hostile

  10. #10
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    Quote Originally Posted by Bladactania
    I think for beginning programmers, there can be no such thing as too many comments (provided they are relative of course!)
    I do not see why there should be a double standard, one for beginners and another for more experienced programmers. If the beginners want to stop being beginners, they have to adopt better practices that might not be typical of beginners who do not know better.

    Quote Originally Posted by Bladactania
    Especially considering that a good programmer fleshes out the algorithm in words before coding.
    Restating what the code obviously does is redundant and not the same as describing an algorithm.

    Using meaningful variable names should never be discouraged. Nothing worse, in my opinion, than variables named line, temp, tmp, size, x, y, count, etc. size of what exactly? Yes, it should be possible to figure out what the meaning of the vague name is by its context, but why make that necessary.
    I generally prefer descriptive variable names, but it depends on context, and if you write functions that do one thing and do it well, the context becomes more obvious. As such, the names that you cited as the worst possible certainly are not.

    On the other hand, I agree with your objection to Snafuist's renaming (with the caveat that I dislike Hungarian notation, so agree with Snafuist on that point). Although Snafuist's own choice of variable names are not bad, I do not see any reason not to use line instead of tmp (though some object to variable names that differ by plurality) and line_count instead of size and sz.

    Quote Originally Posted by Bladactania
    Again, in my opinion, there is nothing worse than a bunch of two and three line functions in a program.
    A 900+ line function is worse, in my opinion. Apparently some people here have seen co-workers write even longer functions, but that is the longest that I have seen written by people I know. I certainly would expect some one liners in the implementation of a library interface, e.g., for a getter function.

    Quote Originally Posted by Bladactania
    However there are lots of examples of where splitting up a 75 line function would make it harder to understand rather than easier, not to mention a pain in the ass to keep scrolling up and down when you need to refer to one.
    I do not recall ever seeing a case where appropriately splitting up a function of such length into smaller, well named, functions that do one thing and do it well made it harder to understand the code. There are certainly more examples where splitting up a 75 line function would make it easier to understand the code.

    The scrolling objection does not make sense to me: you do not need to scroll very often within the same function because everything fits into the screen, and when you do scroll to refer to another function, you can easily see which function's implementation you are looking at.

    Quote Originally Posted by Bladactania
    It is generally good practice to do the casting, even though it is not required.
    There are pros and cons. The main disadvantage that is commonly cited is that in C, functions that are not declared are assumed by the compiler as having an int return type. Casting the return type of malloc() would then hide the fact that <stdlib.h> was not included, and thus result in an error that occurs at run time.

    On the other hand, if you want your C code to be compilable by a C++ compiler, the cast becomes necessary since it is necessary in C++.

    Quote Originally Posted by Bladactania
    First, it makes it certain to the reader exactly what you did, and second, and more importantly, it's a double check for you as you code so you are getting the right kind of pointer.
    If you write the code correctly, that will not be a problem, e.g.
    Code:
    p = malloc(sizeof(*p) * n);
    No matter what the type of the pointer p, malloc() will allocate the correct number of elements n, or return a null pointer.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Problems with passing an array of structures by pointer
    By raptor1770 in forum C Programming
    Replies: 9
    Last Post: 11-29-2008, 11:01 AM
  2. Replies: 2
    Last Post: 07-11-2008, 07:39 AM
  3. Following CTools
    By EstateMatt in forum C Programming
    Replies: 5
    Last Post: 06-26-2008, 10:10 AM
  4. Request for comments
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 15
    Last Post: 01-02-2004, 10:33 AM
  5. base class pointer problems
    By ... in forum C++ Programming
    Replies: 3
    Last Post: 11-16-2003, 11:27 PM