![]() |
| | #1 |
| Registered User Join Date: Feb 2009
Posts: 3
| pointer problems I'm pretty new to C programming, and am working on an assignment that uses the quicksort algorithm adapted to sort strings. I got it working OK except for the fact that it only seems to be looking at the first character in each string. I'm at a loss trying to figure out where the problem is. Anyone have any clues? I'm testing it by piping it the contents of a file named "test" that contains the following: Code: bbb ccc aaa ddd Code: aaa bbb ccc ddd Code: a b c d Code: $ cat test | ./a.out Code: //-----------------------------COMPILER DIRECTIVES------------------------------
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
//-----------------------------FUNCTION PROTOTYPES------------------------------
void quicksort( char *[], int );
void swap( char *[], int, int );
//----------------------------------CONSTANTS-----------------------------------
const long LINE_CAP = 1000000; // Max number of strings (lines)
const short STR_CAP = 82; // Max characters per string +2 for '/0'
//--------------------------------MAIN FUNCTION---------------------------------
int main() {
// Declare Variables
long lngLinesRead = 0; // Keeps track of how many data lines exist
char *pcLine[LINE_CAP]; // Array of character arrays
char cLine[STR_CAP]; // Buffer for the array of characters
int i; // Iterator for loops
// Read the char array into the buffer while EOF is not reached
while( fgets( cLine, STR_CAP, stdin ) != NULL ) {
// Allocate space at the nth pointer in pcLine array
pcLine[lngLinesRead] = (char *) malloc( sizeof( cLine ) );
// Set the value in the pointer to the value of the character array
*(pcLine[lngLinesRead]) = *cLine;
// Increment to next line
lngLinesRead++;
}
// Sort the lines
quicksort( pcLine, lngLinesRead );
// Print to standard output
for( i = 0; i < lngLinesRead; i++ ) {
printf( "%s\n", pcLine[i] );
free( pcLine[i] );
}
// Return error free
return 0;
}
//-----------------------------FUNCTION DEFINITIONS-----------------------------
// swap: interchange v[i] and v[j].
void swap( char *v[], int i, int j )
{
char *temp;
temp = v[i];
v[i] = v[j];
v[j] = temp;
}
// quicksort: sort v[0]..v[n-1] into increasing order.
void quicksort( char *v [], int n )
{
int i, last;
if( n <= 1 ) // nothing to do
return;
swap( v, 0, rand() % n ); // move pivot element to v[0]
last = 0;
for ( i = 1; i < n; i++ ) // partition
if ( strcmp( v[i], v[0] ) < 0 )
swap( v, ++last, i );
assert(( last >= 0 ) && ( last < n ));
swap( v, 0, last ); //restore pivot
quicksort( v, last ); // recursively sort each part
quicksort( v+last+1, n-last-1 );
}
|
| BradC78 is offline | |
| | #2 |
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| Code: *(pcLine[lngLinesRead]) = *cLine; -- Mats
__________________ Compilers can produce warnings - make the compiler programmers happy: Use them! Please don't PM me for help - and no, I don't do help over instant messengers. |
| matsp is offline | |
| | #3 |
| Complete Beginner Join Date: Feb 2009
Posts: 312
| I didn't test my proposal, so this is just guessing. Code: // Allocate space at the nth pointer in pcLine array
pcLine[lngLinesRead] = (char *) malloc( sizeof( cLine ) );
// Set the value in the pointer to the value of the character array
*(pcLine[lngLinesRead]) = *cLine;
Second, the next line of code sets pcLine[IngLinesRead][0] to the first character of cLine, which is not what you want. In order to assign string values, you need to use strcpy(): Code: strcpy(pcLine[IngLinesRead], cLine); Greets, Philip
__________________ All things begin as source code. Source code begins with an empty file. -- Tao Te Chip |
| Snafuist is offline | |
| | #4 |
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| @snafuist & BradC78: Code: malloc(strlen(cLine)+1) @BradC78 Code: char *pcLine[LINE_CAP]; // Array of character arrays You also do no check IF the file is longer than a million lines - you probably should! -- Mats
__________________ Compilers can produce warnings - make the compiler programmers happy: Use them! Please don't PM me for help - and no, I don't do help over instant messengers. |
| matsp is offline | |
| | #5 | |
| Complete Beginner Join Date: Feb 2009
Posts: 312
| Quote:
Arghl, Philip
__________________ All things begin as source code. Source code begins with an empty file. -- Tao Te Chip | |
| Snafuist is offline | |
| | #6 |
| Registered User Join Date: Feb 2009
Posts: 3
| strcpy did the trick! Thanks everyone. I'm also going to incorporate the other suggestions. Thanks for the help, it's much appreciated! |
| BradC78 is offline | |
| | #7 |
| Complete Beginner Join Date: Feb 2009
Posts: 312
| Your code is ok. Here are some more remarks: 1) You're using comments, that's good. But you are using way too many comments: try to narrow down your comments to why you did it, not how. The code itself already explains the latter to full detail. If your code isn't obvious, make it so. On a general note, if you have more comments than code (and your name is not Donald E. Knuth), there's something wrong. 2) "lngLinesRead" is a funky name, but you will only have to remember it for the next 10 lines of code. "size" will do just fine and is not the least more difficult to understand. The same applies to cLine: what's wrong with simply calling it "tmp"? I'm not interested in its type and/or purpose if it gets used only on two lines. Generally speaking, the length of a variable name and the size of its scope are directly proportional. 3) Similar arguments hold for type encoding in the variable name (the so called Hungarian Notation). This is bad for at least two reasons: first, a good function is no longer than 25 lines (all inclusive) and doesn't have more than maybe 4-5 local variables. Every programmer can easily handle this all by himself without having to look up the type every time he encounters another variable, especially given that one variable is called "i" and another "tmp". Second, the type of a variable may change in the future. Now consider a global variable in a project with several million lines of code. Do you want to adjust every occurrence of that variable to reflect its new type? But it doesn't need to be so big. Having to inspect 25 lines of code because a variable changed its type is already annoying enough. 4) For array subscripts, use variables of type size_t. Although int is guaranteed to hold values which are large enough to express all meaningful array indices, size_t is better because it makes the purpose of the variable more clear to the reader, and furthermore makes the compiler issue a warning if you try to assign negative values to it. 5) You don't need to cast the return value of malloc(). void* is guaranteed to cast implicitly to the desired type. 6) A function should do one thing well. Your main() function does three things, only one of which has its own function. Do functions for reading and output as well. 7) Check the return value of malloc(). If there's not enough free memory available, you're in trouble. If you follow these proposals, your main() will look like this: Code: /* read lines from stdin */
size_t read_s(char **lines)
{
size_t size = 0;
char tmp[STR_CAP];
while(fgets(tmp, STR_CAP, stdin) != NULL) {
lines[size] = malloc(strlen(tmp)+1); // check for NULL
strcpy(lines[size++], tmp);
}
return size;
}
/* print array of strings */
void print_s(char **lines, size_t size)
{
size_t i;
for(i = 0; i < size; i++)
puts(lines[i]);
}
/* free array of strings */
void free_s(char **lines, size_t size)
{
size_t i;
for(i = 0; i < size; i++)
free(lines[i]);
}
int main()
{
size_t sz = 0;
char *lines[LINE_CAP];
sz = read_s(lines);
quicksort(lines, sz);
print_s(lines, sz);
free_s(lines, sz);
return 0;
}
And yes, I'm having too much free time. Greets, Philip
__________________ All things begin as source code. Source code begins with an empty file. -- Tao Te Chip Last edited by Snafuist; 02-18-2009 at 07:01 PM. |
| Snafuist is offline | |
| | #8 |
| Registered User Join Date: Feb 2009
Posts: 278
| A couple comments on Philips list here... 1. I think for beginning programmers, there can be no such thing as too many comments (provided they are relative of course!) Especially considering that a good programmer fleshes out the algorithm in words before coding. When he goes to do his next program, or one later in the year or next year, he might find valuable insight into his methodology by reading through his many comments. 2. Using meaningful variable names should never be discouraged. Nothing worse, in my opinion, than variables named line, temp, tmp, size, x, y, count, etc. size of what exactly? Yes, it should be possible to figure out what the meaning of the vague name is by its context, but why make that necessary. Like my comment for #1, he might appreciate his thoroughness in the future. That being said, I, personally, DO use variables such as x, y, etc, but only as incrementers in for loops as long as I'm not using the variable in any other way (such as an array index). I also use temp, but I usually do temp_int, iTemp, temp_mystruct etc, just for clarity sake. 3. I use so-called Hungarian Notation in my programs. It makes a lot of things easier to understand (especially if you use such vague variable names as temp, tmp, etc.) without having to scroll up and down. Secondly, if you have a global variable that you need to change the type of, there's a handy feature of most editors called the replace all. Replacing every iTemp with fTemp a very simple feat. On top of that is the fact that global variables should be avoided in the first place. 3b. Again, in my opinion, there is nothing worse than a bunch of two and three line functions in a program. In the code you provided, the three little functions might be handy to put in a seperate file because you might use them in multiple programs. However there are lots of examples of where splitting up a 75 line function would make it harder to understand rather than easier, not to mention a pain in the ass to keep scrolling up and down when you need to refer to one. I think saying that a good function is limited to 25 lines is too strict, especially for main. I see all the time where people put just one function call inside their main. What's the point of that? I do agree that a function should do one thing MOST times, but some things go together well enough (and always seem to follow one another) and can (and should) be put in the same function. 4. I don't do this myself, but it is a good suggestion. 5. It is generally good practice to do the casting, even though it is not required. I think making it explicit is good for two reasons. First, it makes it certain to the reader exactly what you did, and second, and more importantly, it's a double check for you as you code so you are getting the right kind of pointer. If you cast a malloc to a char and you assign it to a variable declared as a pointer to a structure you get an error. if you don't explicitly do this you don't get an error and then you will get at best junk, at worst, seg faults. 6. See 3b. 7. Agreed. However I don't see in the code you posted where you did that. Overall, I know what you mean with most of these things, but I don't think they are things that should be discouraged. |
| Bladactania is offline | |
| | #9 |
| Registered User Join Date: Feb 2009
Posts: 278
| I just re-read my post. I wasn't trying to sound hostile |
| Bladactania is offline | |
| | #10 | |||||||
| C++ Witch Join Date: Oct 2003 Location: Singapore
Posts: 11,319
| Quote:
Quote:
Quote:
On the other hand, I agree with your objection to Snafuist's renaming (with the caveat that I dislike Hungarian notation, so agree with Snafuist on that point). Although Snafuist's own choice of variable names are not bad, I do not see any reason not to use line instead of tmp (though some object to variable names that differ by plurality) and line_count instead of size and sz. Quote:
Quote:
The scrolling objection does not make sense to me: you do not need to scroll very often within the same function because everything fits into the screen, and when you do scroll to refer to another function, you can easily see which function's implementation you are looking at. Quote:
On the other hand, if you want your C code to be compilable by a C++ compiler, the cast becomes necessary since it is necessary in C++. Quote:
Code: p = malloc(sizeof(*p) * n);
__________________ C + C++ Compiler: MinGW port of GCC Build + Version Control System: SCons + Bazaar Look up a C/C++ Reference and learn How To Ask Questions The Smart Way | |||||||
| laserlight is offline | |
![]() |
| Thread Tools | |
| Display Modes | |
|
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Problems with passing an array of structures by pointer | raptor1770 | C Programming | 9 | 11-29-2008 11:01 AM |
| Dynamic array of structures containing yet another dynamic array of structures | innqubus | C Programming | 2 | 07-11-2008 07:39 AM |
| Following CTools | EstateMatt | C Programming | 5 | 06-26-2008 10:10 AM |
| Request for comments | Prelude | A Brief History of Cprogramming.com | 15 | 01-02-2004 10:33 AM |
| base class pointer problems | ... | C++ Programming | 3 | 11-16-2003 11:27 PM |