-
char pointers and arrays
While working on a simple little utility, I came across something I don't quite understand.
Why is it that
Code:
char str[] = "first,second,third";
works for strtok()
but
Code:
char *str = "first,second,third";
ends in a segfault? Even after tracing the stack, I can't figure out why for the life of me.
Now, it's been a while since I took a C class, and I've been programming professionally for years, but to the best of my knowledge
char *blah = "this is a const char *";
and char blah[] = "this is a const char*";
were exactly the same thing.
The examples were compiled on a Linux x86 system.
Anyway, here is the sample code:
Code:
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char str[] = "first,second,third";
    char *result = NULL;
    char *ptr = str;

    printf("%s\n", str);
    result = strtok(ptr, ",");
    while (result)
    {
        printf("%s\n", result);
        result = strtok(NULL, ",");
    }
    return 0;
}
the above WORKS
Code:
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *str = "first,second,third";
    char *result = NULL;
    char *ptr = str;

    printf("%s\n", str);
    result = strtok(ptr, ",");
    while (result)
    {
        printf("%s\n", result);
        result = strtok(NULL, ",");
    }
    return 0;
}
the above results in a segmentation fault
-
1. You cannot modify a string literal like "hello"; doing so results in undefined behaviour.
e.g. char *p = "crash";
*p = 'C'; // undefined behaviour!
2. strtok() modifies the string it is given: it overwrites each delimiter it finds with '\0'. RTFM for more details.
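To make the two points above concrete, here is a rough sketch of one way to tokenize text that may be a string literal: copy it into writable memory first and run strtok() on the copy. The helper name count_tokens is made up for illustration and is not from the original post.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `text` without modifying it.
 * strtok() writes '\0' bytes into its input, so we tokenize a heap copy
 * instead of the (possibly read-only) original. */
int count_tokens(const char *text, const char *delims)
{
    assert(text != NULL && delims != NULL);

    size_t len = strlen(text) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;                   /* allocation failed */
    memcpy(copy, text, len);

    int count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

Calling count_tokens("first,second,third", ",") from main should return 3, and the literal itself is never written to.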
-
While looking at the assembly I learned how the compiler arranges the two forms.
For char *
the string is placed in static memory (i.e. .LC0 .string "blahblah")
whereas
char []
puts the array on the stack and copies the characters in a word (4 bytes) at a time.
Code:
char *
.LC0:
.string "first,second,third"
...
movl $.LC0, 20(%esp)
Code:
char []
movl $1936877926, 21(%esp)
movl $1702046836, 25(%esp)
movl $1684959075, 29(%esp)
movl $1768453164, 33(%esp)
movw $25714, 37(%esp)
So the answer I came up with, in simplified form, is that with char *blah = "words" the characters live in static, read-only memory and cannot be modified, while with char [] the characters are copied into an array on the stack that can be modified. It is similar to the old const char * vs char * distinction.
I can't believe it took disassembling the program for me to figure out the obvious.
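The same difference can be seen from C without reading the assembly. Below is a small sketch (the function names are invented for illustration): sizeof shows that the array form holds its own copy of the characters while the pointer form is only a pointer, and writing to the array copy is legal.

```c
#include <stddef.h>
#include <string.h>

/* Array form: the array holds its own copy of the characters. */
size_t array_form_size(void)
{
    char str[] = "first,second,third";
    return sizeof str;                  /* 18 characters plus the '\0' */
}

/* Pointer form: just a pointer, whatever the length of the literal. */
size_t pointer_form_size(void)
{
    const char *str = "first,second,third";
    return sizeof str;                  /* same as sizeof(char *) */
}

/* Writes to the array copy, then hands the result back through `out`. */
void capitalize_array_copy(char *out, size_t out_size)
{
    char str[] = "first,second,third";  /* writable copy on the stack */
    str[0] = 'F';                       /* fine: we own this memory */

    if (out_size == 0)
        return;
    strncpy(out, str, out_size - 1);
    out[out_size - 1] = '\0';           /* strncpy may not terminate */
}
```

The same write through a char * that points at the literal would be undefined behaviour, which is what the segfault was.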
-
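Save yourself trouble: declare every pointer that holds a string literal as const char *. Then the compiler will flag the mistake next time (gcc warns in C that the const qualifier is discarded, and -Werror turns that into a hard error) instead of the program crashing at run time.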
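A minimal sketch of that advice, with made-up function names: the literal is held through a const char *, read-only functions such as strcspn() still work on it, and the strtok() call that used to crash is left commented out because the compiler would complain about it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the first comma-separated field.
 * strcspn() only reads its arguments, so a const pointer is enough. */
size_t first_field_length(const char *s)
{
    return strcspn(s, ",");
}

size_t const_literal_demo(void)
{
    const char *str = "first,second,third";  /* literal held as const */

    /* strtok(str, ","); */
    /* Uncommenting the line above makes the compiler complain that the
     * 'const' qualifier is discarded, so the bug shows up at build time. */

    return first_field_length(str);
}
```

For the string from the original post, const_literal_demo() returns 5, the length of "first".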