Thread: Superfluous characters after string -- pointer issues?

  1. #1
    Registered User
    Join Date
    Jan 2011
    Posts
    9

    Superfluous characters after string -- pointer issues?

    I hope this isn't the wrong forum! I'm aware that I'm using some non-ANSI-C functions, but my problem seems to be with simple C strings and pointers, so I'm posting it here.

    I wrote a simple program to sort torrent files by tracker. It parses the files for the first http tracker, creates a subdirectory named after the tracker, then moves (renames) the input file into that directory.

    The program works in two modes: directory mode, which scans the top level of a directory for torrent files, and file mode, which accepts a list of files in the command line (designed for working with shell globbing).

    Everything seemed to be working fine until this morning, when I decided to modify the file mode to work using absolute paths, as well as its previous function of using relative paths.

    I immediately noticed that it now worked only on the first iteration: beginning with the second file, it didn't work in absolute path mode. Each time running the program would move one file, and all the others would fail.

    I began putting some printf statements in to assist with debugging, and I immediately determined that calling printf() on a string (including a string literal) caused the problem to go away, at least most of the time. Very strange, I thought.

    After a little more playing I discovered that, beginning with the second iteration, the string I was using to represent the path of the file to to the final "/" (equivalent to unix dirname) had one or more superfluous characters. Which character (or characters) appear seems to depend on the length of the original filename (or argv), and possibly its content.

    Now, I can't think of any reason why a simple printf() call would affect other, unrelated code, unless I have incurred some memory referencing problem. Granted, the print statement doesn't resolve the problem in all instances, but in many cases the program will run to completion on a whole set of torrent files, where it would have failed on the same set of files without that printf() statement.

    The program encounters no runtime or compile errors, as I run no checks on the success or failure of rename (on the assumption that, if my code is correct, rename will not ordinarily fail). The files, beginning with the second, simply fail to be moved, as the string used to represent their destination has superfluous characters and thus cannot math the created directory. If using torrent files from more than one tracker, I assume that mkdir would also fail after the first iteration, again without any checks.

    I was hoping that someone might be able to take a look at my code and tell me what I'm doing wrong. I've highlighted in red the section of code where the problem occurs.

    Incidentally, the commented-out printf() call in the highlighted if-statement is the printf() call which seems to have such a great effect on end result of my string creation.

    Code:
    int main (int argc, const char * argv[]) {
    	if (argc <= 1) {
    		printf("Error: No input file.");
    		return 1;
    	}
    	// check if first char of second arg is "-"
    	if (argv[1][0] == '-') {
    		if (strlen(argv[1]) > 2) {
    			printf("Error: Invalid flag.");
    			return 1;
    		}
    
    		switch (argv[1][1]) {
    			case 'd': scanDirectory(argv[2]); return 0;
    			default: printf("Error: Unrecognized flag."); return 1;
    					  
    		}
    	}
    
    	/* at this point we're in file mode, so we treat each argument as a file in
    	 * a for loop and go from there. */
    	int j;
    	for (j = 1; j < argc; j++) {
    		FILE *fp;
    		char buffer[kMaxBufferSize];
            
    		if ( (fp = fopen(argv[j], "r")) == NULL ) {
    			printf("Error: File not found: %s\n", argv[j]);
    			continue;
    		}
    		if ( fgets(buffer, kMaxBufferSize, fp) == NULL ) {
    			if ( feof(fp) )
    				printf("Error: End of file reached: %s\n", argv[j]);
    			else
    				printf("Unknown error occured on file: %s\n", argv[j]);
    			continue;
    		}
    		fclose(fp);
    		char *trackerName;
    	    if ( (trackerName = parseTracker(buffer)) == NULL) {
    		    printf("HTTP tracker not found in file: %s\n", argv[j]);
    		    free(trackerName);
    		    continue;
    	    }
    
    		if ( strrchr(argv[j], '/') != NULL) {
    			char *baseName = strrchr(argv[j], '/');
    			//printf("%s\n",baseName);
    			int pathLen = (strlen(argv[j]) - strlen(baseName));
    			char dest[strlen(argv[j]) + strlen(trackerName)];
    			strncpy(dest, argv[j], pathLen);
    			printf("%s\n", dest);
    			strncat(dest, "/", 1);
    			strncat(dest, trackerName, strlen(trackerName));
    			mkdir(dest, 0764);
    			strncat(dest, baseName, strlen(baseName));
    			rename(argv[j], dest);
    		}
    		else {
    			mkdir(trackerName, 0764);
    
    			char dest[ (strlen(trackerName)) + (strlen(argv[j])) + 1 ];
    			strcpy(dest, trackerName);
    			strncat(dest, "/", 1);
    			strncat(dest, argv[j], strlen(argv[j]) );
    			rename(argv[j], dest);
    		}
    
    		// last thing before the loop exits
    		free(trackerName);
    	}
        return 0;
    }
    Last edited by marshaul; 01-03-2011 at 01:26 PM.

  2. #2
    Registered User
    Join Date
    Jan 2011
    Posts
    9
    Here is the rest of the code in case anybody wants to compile it to see what I'm talking about:

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <glob.h>
    #include <sys/stat.h>
    
    #define kMaxBufferSize 250 // this ought to be enough to get the tracker name 
    
    char* parseTracker(char *buffer);
    void scanDirectory(const char *dirName);
    
    int main (int argc, const char * argv[]) {
    ...
    }
    
    char* parseTracker(char *buffer) {
    	char *pNameStart;
    	char *pNameEnd;
    
    	if ( (pNameStart = strstr(buffer, "http://")) == NULL ) {
    		return NULL;
    	}
    	pNameStart += 7; // http:// is 7 characters long
    	
    	char *colon, *slash;
    	colon = strchr(pNameStart, ':');
    	slash = strchr(pNameStart, '/');
    
    	if (slash < colon)
    		pNameEnd = slash;
    	else
    		pNameEnd = colon;
    
    	int nameLen = pNameEnd - pNameStart;
    	
    	char *trackerName;
    	trackerName = malloc(nameLen);
    	strncpy(trackerName, pNameStart, nameLen);
    	trackerName[nameLen] = 0;
    	return trackerName;
    }
    
    void scanDirectory(const char *dirName) {
        char path[strlen(dirName) + 2];
    	strcpy(path, dirName);
    	strncat(path, "/*.torrent", 2);
    
    	glob_t data;
    
    	switch( glob(path, GLOB_NOSORT, NULL, &data ) ) {
    		case 0: break;
    		case GLOB_NOSPACE: printf( "Out of memory\n" ); break;
    		case GLOB_ABORTED: printf( "Reading error\n" ); break;
    		case GLOB_NOMATCH: printf( "No files found\n" ); break;
    		default: break;
    	}
    
    	int i;
    	for(i=0; i<data.gl_pathc; i++) {
    		FILE *fp;
    		char buffer[kMaxBufferSize];
    		
            fp = fopen(data.gl_pathv[i], "r");
    
    		if ( fgets(buffer, kMaxBufferSize, fp) == NULL ) {
    			if ( feof(fp) )
    				printf("Error: End of file %s reached.", data.gl_pathv[i]);
    			else
    				printf("Unknown error occured on file: %s", data.gl_pathv[i]);
    			continue;
    		}
    		fclose(fp);
    		char *trackerName; 
    		if ( (trackerName = parseTracker(buffer)) == NULL) {
    			printf("HTTP tracker not found in file: %s\n", data.gl_pathv[i]);
    			free(trackerName);
    			continue;
    		}
    
    		char *fileName = data.gl_pathv[i] + (strlen(dirName));
    		char dest[ (strlen(path)) + (strlen(fileName)) + (strlen(trackerName)) +
    				1 ]; 
    		strcpy(dest, dirName);
    		strncat(dest, "/", 1);
    		strncat(dest, trackerName, strlen(trackerName));
    		mkdir(dest, 0764);
    		strncat(dest, fileName, strlen(fileName));
    
    		rename(data.gl_pathv[i], dest);
    
    		free(trackerName);
    	}
    
    	globfree( &data );
    }

  3. #3
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    A string is only a string if it ends in \0. dest is probably therefore at least one character too short to actually hold the thing it's supposed to, which means the "string" doesn't stop.

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Code:
    		char dest[ (strlen(path)) + (strlen(fileName)) + (strlen(trackerName)) +
    				1 ]; 
    		strcpy(dest, dirName);
    		strncat(dest, "/", 1);
    		strncat(dest, trackerName, strlen(trackerName));
    		mkdir(dest, 0764);
    		strncat(dest, fileName, strlen(fileName));
    The +1 either counts the \0, or it counts the "/".
    Either way, you need both of them.

    Also, don't use strncpy() / strncat().
    Neither add a \0 in the right place, and you already "know" you have enough space (calculation permitting).

    I'm assuming the final strcat begins with a / to generate a proper pathname.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Jan 2011
    Posts
    9
    Here is a sample output (stdout) of the current build (with the first of two printf statements in that if-statement commented out):


    Code:
    /tmp/test
    /tmp/test&
    /tmp/test&
    /tmp/test&
    /tmp/test&
    /tmp/test&
    /tmp/test??
    /tmp/test?_?
    If I uncomment the first printf (which prints the basename of the current file), you'll see that the output of "dest" is appropriate for everything but the last iteration, where the filename includes an underscore.

    Code:
    /2F0D6C7DA767A6320B1CEED46DDD06C43EA2E454.torrent
    /tmp/test
    /462AFF632FC67EE4129881A5B4AADEA24ED0682E.torrent
    /tmp/test
    /4F0E06898AF593D5EBB456C2C8C1D24100005733.torrent
    /tmp/test
    /8CFCE7DFA2F8E1DAEDA1304F3ACFA123F7691E18.torrent
    /tmp/test
    /A56E6B7F41923259980FB1C6FDD413798DC998EA.torrent
    /tmp/test
    /A56E6B7F41923259980FB1C6FDD413798DC998EA_copy.torrent
    /tmp/testSq??
    /temp.torrent
    /tmp/test

    Edit: Woah, two responses already. Thanks, folks.

  6. #6
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Your string dest probably doesn't get a trailing null appended...
    It also looks like it may not be big enough to hold the entire pathname and tracker name.

    You would probably be a lot further ahead to create a global buffer above main and re-use it instead of trying to recreated it every time through the loop... Moreover since you do recreate the array every time, you're probably orphaning a whole mess of them and leaking memory like crazy in the process.

    Code:
    #include <this and that>
    
    char dest[1024];
    
    int main (...
    
    // interveining code...
    
    
    if ( strrchr(argv[j], '/') != NULL) {
           memset(dest,0,1024);
          // construct target path in dest here.
    }

  7. #7
    Registered User
    Join Date
    Jan 2011
    Posts
    9
    Quote Originally Posted by Salem View Post
    Code:
    		char dest[ (strlen(path)) + (strlen(fileName)) + (strlen(trackerName)) +
    				1 ]; 
    		strcpy(dest, dirName);
    		strncat(dest, "/", 1);
    		strncat(dest, trackerName, strlen(trackerName));
    		mkdir(dest, 0764);
    		strncat(dest, fileName, strlen(fileName));
    The +1 either counts the \0, or it counts the "/".
    Either way, you need both of them.

    Also, don't use strncpy() / strncat().
    Neither add a \0 in the right place, and you already "know" you have enough space (calculation permitting).

    I'm assuming the final strcat begins with a / to generate a proper pathname.
    Well, adding the +1 to the other dest didn't help, nor did changing the +1s to +2. :P

    The final strncat does begin with a /, as my "baseName" starts with the final slash in the given path (unlike the actual unix command basename).

    IF I shouldn't use strncpy or strncat, what should I use?

  8. #8
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    You should really be using strcat. For instance,
    Code:
    		strcpy(dest, dirName);
    		strncat(dest, "/", 1);
    		strncat(dest, trackerName, strlen(trackerName));
    It can only be sheer luck for your third strncat to even have a chance of working -- the point of strncat is that it starts where the string currently ends, i.e. at the \0 end-of-string marker. But your second strncat deliberately destroyed the \0 end-of-string marker without putting a new one in....

  9. #9
    Registered User
    Join Date
    Jan 2011
    Posts
    9
    Quote Originally Posted by CommonTater View Post
    Your string dest probably doesn't get a trailing null appended...
    Should I do that manually? If so, should I do so before both the mkdir and the rename commands, both of which use dest (with different values) as arguments?


    It also looks like it may not be big enough to hold the entire pathname and tracker name.

    You would probably be a lot further ahead to create a global buffer above main and re-use it instead of trying to recreated it every time through the loop... Moreover since you do recreate the array every time, you're probably orphaning a whole mess of them and leaking memory like crazy in the process.

    Code:
    #include <this and that>
    
    char dest[1024];
    
    int main (...
    
    // interveining code...
    
    
    if ( strrchr(argv[j], '/') != NULL) {
           memset(dest,0,1024);
          // construct target path in dest here.
    }
    Thanks for the tip, I will probably incorporate this change once I've got the rest of my problems solved. Although I doubt this is causing problems just yet, as I'm only testing with a small number of torrents.
    Last edited by marshaul; 01-03-2011 at 01:45 PM.

  10. #10
    Registered User
    Join Date
    Jan 2011
    Posts
    9
    Quote Originally Posted by tabstop View Post
    You should really be using strcat. For instance,
    Code:
    		strcpy(dest, dirName);
    		strncat(dest, "/", 1);
    		strncat(dest, trackerName, strlen(trackerName));
    It can only be sheer luck for your third strncat to even have a chance of working -- the point of strncat is that it starts where the string currently ends, i.e. at the \0 end-of-string marker. But your second strncat deliberately destroyed the \0 end-of-string marker without putting a new one in....
    Got it...

    So does strcat, I take it, append the null character at the end, when strncat does not?

    Thanks.

  11. #11
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    [QUOTE=marshaul;993372]
    Quote Originally Posted by CommonTater View Post
    Your string dest probably doesn't get a trailing null appended...
    [//QUOTE]
    Should I do that manually? If so, should I do so before both the mkdir and the rename commands, both of which use dest (with different values) as arguments?
    Re-using variables can lead to all kinds of strangness... You may want to consider using two separate buffers... for the purpose.

    Thanks for the tip, I will probably incorporate this change once I've got the rest of my problems solved. Although I doubt this is causing problems just yet, as I'm only testing with a small number of torrents.
    Ummm... the way you are doing it right now is the problem...

  12. #12
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    There is nothing wrong with your locally declared array of variable length, so long as the length is adequate.

    It certainly isn't a memory leak situation.

    Providing your length calculation is correct, strcpy() the first string, then strcat() the rest will be fine.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  13. #13
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by Salem View Post
    There is nothing wrong with your locally declared array of variable length, so long as the length is adequate.

    It certainly isn't a memory leak situation.

    Providing your length calculation is correct, strcpy() the first string, then strcat() the rest will be fine.
    He's recreating the same variable over and over inside a loop...

    Code:
    for(i=0; i<data.gl_pathc; i++) {
    //...
    
    char dest[ (strlen(path)) + (strlen(fileName)) + (strlen(trackerName)) +1 ]; 
    
    //...
    }
    How can this not leak memory?
    When the function exits only the last copy is indexed and can be destroyed.

  14. #14
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by CommonTater View Post
    He's recreating the same variable over and over inside a loop...

    Code:
    for(i=0; i<data.gl_pathc; i++) {
    //...
    
    char dest[ (strlen(path)) + (strlen(fileName)) + (strlen(trackerName)) +1 ]; 
    
    //...
    }
    How can this not leak memory?
    When the function exits only the last copy is indexed and can be destroyed.

    Because they're static variables and consequently get destroyed every time through the loop.

  15. #15
    Registered User
    Join Date
    Jan 2011
    Posts
    9
    Edit: next page...

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Polymorphism and generic lists
    By Shibby3 in forum C# Programming
    Replies: 9
    Last Post: 07-26-2010, 05:27 AM
  2. Unable to compare string with 'getter' returned string.
    By Swerve in forum C++ Programming
    Replies: 2
    Last Post: 10-30-2009, 05:56 PM
  3. char Handling, probably typical newbie stuff
    By Neolyth in forum C Programming
    Replies: 16
    Last Post: 06-21-2009, 04:05 AM
  4. Replies: 8
    Last Post: 10-17-2005, 07:01 PM
  5. pointers
    By InvariantLoop in forum C Programming
    Replies: 13
    Last Post: 02-04-2005, 09:32 AM