Thread: Segmentation fault

  1. #1
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89

    Segmentation fault

    I'm not really sure what happens here. The problem is that I get a segmentation fault at the last wprintf at line 66 in my code below.

    I'm obviously missing something that is probably obvious for real programmers, but what?
    Here's my code so far. Main starts at line 43 and the relevant function at line 29. The other functions are not important for this thread.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <locale.h>
    #include <string.h>
    #include <wchar.h>
    
    
    #define VER "0.1"
    #define PROGNAME "SomeProgram"
    
    
    const int WideMode=1;
    
    
    void UserErrorExit(void) {
        fputs("Incorrect number of parameters.\n", stderr);
        fputs("Read the manual!\n\n", stderr);
        fprintf(stderr,"man %s\n", PROGNAME);
        exit(EXIT_FAILURE);
    }
    
    
    void MemoryErrorExit(void) {
        fprintf(stderr, "Oops, seems like we are out of memory\n");
        exit(EXIT_FAILURE);
    }
    
    
    void *WMalloc(char **mbs) {
        size_t Len=mbstowcs(NULL, *mbs, 0);
    
    
        wchar_t *newArray=malloc((Len+1)*sizeof(wchar_t));
        if (newArray==NULL)
            MemoryErrorExit();
    
    
        Len=mbstowcs(newArray, *mbs, Len);
        return (newArray);
    }
    
    
    int main(int argc, char *argv[])
    {
        setlocale(LC_ALL, ""); // Use system's locale.
        fwide(stdout, WideMode); // Set standard output to wide character mode.
    
    
    //    Check user input. ——————————————————————————————————————————————————————————
        if(argc<2 || argc>3)
            UserErrorExit();
        if(argc==2) {
            if(strcmp(argv[1],"--version")==0) {
                wprintf(L"Version: %s\n", VER);
                return EXIT_SUCCESS;
            }
            else
                UserErrorExit();
        }
    //    ————————————————————————————————————————————————————————————————————————————
    
    
        wchar_t *MyString;
        MyString=WMalloc(&argv[1]);
        
        wprintf(L"%s\n", *MyString);
        free(MyString);
        return EXIT_SUCCESS;
    }
    Compile:
    Code:
    gcc -Wall -Wextra -std=gnu99 "${Name}.c" -o "${Name}"
    (The Name variable contains the program name)
    What I'm trying to do:
    The user enters the program name followed by two text strings.
    One of those (at the moment) is supposed to be copied to a wide character string variable, ”MyString” using the WMalloc function (which by the way maybe needs another name – first it was only supposed to allocate memory for the string, then I enhanced its functionality and I should maybe have changed its name accordingly).

    I guess the problem is the ”MyString=WMalloc(&argv[1]);” line (line 64), but I'm not sure how to make it right. My head is just spinning…
    Running:
    Code:
    $ ./MallocInFunctionTest Johnny Rosenberg
    Segmenteringsfel (minnesutskrift skapad)
    $
    The error message is in the language of my locale and means something like ”Segmentation fault (memory dump created)”.

    gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
    Operating system: Ubuntu 14.04
    Last edited by guraknugen; 04-26-2015 at 10:59 AM.

  2. #2
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,662
    Sooner or later, you need to learn how to use a debugger. It can tell you right off the bat why something crashes.
    Code:
    $ gcc -g -Wall -Wextra -std=gnu99 foo.c
    $ gdb -q ./a.out 
    Reading symbols from /home/sc/Documents/a.out...done.
    (gdb) run foo bar
    Starting program: /home/sc/Documents/a.out foo bar
    
    Program received signal SIGSEGV, Segmentation fault.
    0x00007ffff7a98cf4 in _IO_vfwprintf (s=<optimized out>, format=<optimized out>, ap=<optimized out>) at vfprintf.c:1629
    1629    vfprintf.c: No such file or directory.
            in vfprintf.c
    (gdb) bt
    #0  0x00007ffff7a98cf4 in _IO_vfwprintf (s=<optimized out>, format=<optimized out>, ap=<optimized out>) at vfprintf.c:1629
    #1  0x00007ffff7aae049 in __wprintf (format=<optimized out>) at wprintf.c:34
    #2  0x00000000004009b1 in main (argc=3, argv=0x7fffffffe0c8) at foo.c:66
    (gdb) frame 2
    #2  0x00000000004009b1 in main (argc=3, argv=0x7fffffffe0c8) at foo.c:66
    66          wprintf(L"%s\n", *MyString);
    (gdb) print *MyString
    $1 = 102 L'f'
    Now, using your knowledge of printf calls, does your parameter look to be the right thing to be passing?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Quote Originally Posted by Salem View Post
    Sooner or later, you need to learn how to use a debugger. It can tell you right off the bat why something crashes.
    Code:
    $ gcc -g -Wall -Wextra -std=gnu99 foo.c
    $ gdb -q ./a.out 
    Reading symbols from /home/sc/Documents/a.out...done.
    (gdb) run foo bar
    Starting program: /home/sc/Documents/a.out foo bar
    
    Program received signal SIGSEGV, Segmentation fault.
    0x00007ffff7a98cf4 in _IO_vfwprintf (s=<optimized out>, format=<optimized out>, ap=<optimized out>) at vfprintf.c:1629
    1629    vfprintf.c: No such file or directory.
            in vfprintf.c
    (gdb) bt
    #0  0x00007ffff7a98cf4 in _IO_vfwprintf (s=<optimized out>, format=<optimized out>, ap=<optimized out>) at vfprintf.c:1629
    #1  0x00007ffff7aae049 in __wprintf (format=<optimized out>) at wprintf.c:34
    #2  0x00000000004009b1 in main (argc=3, argv=0x7fffffffe0c8) at foo.c:66
    (gdb) frame 2
    #2  0x00000000004009b1 in main (argc=3, argv=0x7fffffffe0c8) at foo.c:66
    66          wprintf(L"%s\n", *MyString);
    (gdb) print *MyString
    $1 = 102 L'f'
    Now, using your knowledge of printf calls, does your parameter look to be the right thing to be passing?
    Thank you for replying, I appreciate it very much.

    Well, I already knew that I sent the wrong thing to the wprintf. My intention was that *MyString should contain, in your case, ”foo”, which it obviously doesn't, so where I actually fail here is to figure out why. I suspect that I use the mbstowcs() function wrong, but I fail to figure out how I should use it in this specific case.
    Code:
    void *WMalloc(char **mbs) {
        size_t Len=mbstowcs(NULL, *mbs, 0);
    
    
        wchar_t *newArray=malloc((Len+1)*sizeof(wchar_t));
        if (newArray==NULL)
            MemoryErrorExit();
    
    
        Len=mbstowcs(newArray, *mbs, Len);
        return (newArray);
    }
    In my example, where argv[1]=”Johnny”, Len seems to be correctly calculated, it's 6.
    The sizeof thing also seems to work: (Len+1)*sizeof(wchar_t)=28 in my case. Seems to be what I'm looking for.
    However, ”Len=mbstowcs(newArray, *mbs, Len);” seems to fail, and I seem to fail to figure out why. *newArray seems to end up as a single ”J” in my example run, not as ”Johnny”.


    Last edited by guraknugen; 04-26-2015 at 11:49 AM.

  4. #4
    Registered User
    Join Date
    Sep 2014
    Posts
    364
    I have not checked the full code, but the format string for wprintf is wrong.
    Code:
    …
        wprintf(L"%s\n", *MyString);
    …
    First, you don't need to dereference the pointer.
    The format %s (s = string) is for character-arrays (aka strings or multi-byte-strings).
    For arrays with wide characters you should use %ls (ls = long strings).
    The line should be:
    Code:
    …
        wprintf(L"%ls\n", MyString);
    …
    If you know that your string has a maximum length, you can also do the following:
    Code:
    …
    #define MAX_LEN 1024
    …
        wchar_t MyString[MAX_LEN];
        swprintf(MyString, MAX_LEN, L"%s", argv[1]); /* wide format string; %s because argv[1] is a char string */
        wprintf(L"%ls\n", MyString);
    …
    With this solution, you don't need to allocate memory and can't forget to free it at end.
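    A minimal sketch of the pointer-versus-value difference, assuming two hypothetical helpers (copy_wide and first_wide_char are not from the thread). MyString is a pointer to the first wide character, while *MyString is only that character's code, which is why passing it where a string pointer is expected can crash. swprintf is used so the result can be inspected instead of printed.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: formats the wide string ws into dst using %ls.
 * Returns the number of wide characters written (excluding the
 * terminator), or a negative value on error or truncation. */
int copy_wide(wchar_t *dst, size_t cap, const wchar_t *ws)
{
    /* Pass the pointer itself (ws), not *ws: %ls expects a wchar_t pointer. */
    return swprintf(dst, cap, L"%ls", ws);
}

/* *ws is a single wchar_t value (a character code), not a string. */
wchar_t first_wide_char(const wchar_t *ws)
{
    return *ws;
}
```

    With L"foo" as input, first_wide_char gives L'f', the same single value gdb printed for *MyString earlier in the thread.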
    Last edited by WoodSTokk; 04-26-2015 at 08:51 PM.
    Other have classes, we are class

  5. #5
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Quote Originally Posted by WoodSTokk View Post
    I have not checked the full code, but the format string for wprintf is wrong.
    Code:
    …
        wprintf(L"%s\n", *MyString);
    …
    First, you don't need to dereference the pointer.
    The format %s (s = string) is for character-arrays (aka strings or multi-byte-strings).
    For arrays with wide characters you should use %ls (ls = long strings).
    The line should be:
    Code:
    …
        wprintf(L"%ls\n", MyString);
    …
    If you know that your string has a maximum length, you can also do the following:
    Code:
    …
    #define MAX_LEN 1024
    …
        wchar_t MyString[MAX_LEN];
        swprintf(MyString, MAX_LEN, L"%s", argv[1]); /* wide format string; %s because argv[1] is a char string */
        wprintf(L"%ls\n", MyString);
    …
    With this solution, you don't need to allocate memory and can't forget to free it at end.
    Thanks for your reply. I'll give it another shot soon (quite busy this evening with other stuff).
    Since I get the original string from argv[1] in this case, I don't know the maximum length of the string; who knows what the user (myself in most cases, I guess…) will type… Unless there is a maximum length of what can be typed as one line in Bash, I guess there is… But this is also for learning, and allocating and freeing memory seems to be worth learning and get used to in the long run, I think…

  6. #6
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Quote Originally Posted by WoodSTokk View Post
    I have not checked the full code, but the format string for wprintf is wrong.
    Code:
    …
        wprintf(L"%s\n", *MyString);
    …
    First, you don't need to dereference the pointer.
    The format %s (s = string) is for character-arrays (aka strings or multi-byte-strings).
    For arrays with wide characters you should use %ls (ls = long strings).
    The line should be:
    Code:
    …
        wprintf(L"%ls\n", MyString);
    …
    I didn't have time to test anything yet, but if the above is true, then this is not… I quote:
    You must tell printf to look for multibyte characters by adding the l: %ls.

    (If you happen to be using wprintf, on the other hand, you can simply use %s and it will natively treat all strings as wide character strings.)

    For some reason this site is telling me that I need to lengthen this message with at least 4 characters, so here they are, and a few more… Well, I don't mind. If it asks for nonsense, I'll write nonsense… Done. I'm happy, the site is happy… :P
    Last edited by guraknugen; 04-30-2015 at 02:11 PM.

  7. #7
    Registered User
    Join Date
    Sep 2014
    Posts
    364
    'char' stands for character and is always 1 byte.
    A multibyte string can contain characters that need more than 1 byte to encode, but it is still stored byte by byte.
    There is no problem putting a multibyte character inside a char array, and the format for char arrays is '%s'.
    But a wide character usually takes more than 1 byte. I'm working on amd64 here, and a wide character is 4 bytes long.
    Because amd64 is little-endian, an uppercase letter 'A' will be stored as '0x41 0x00 0x00 0x00' in wide format (wchar_t).
    If you use '%s' as the printf format for a wide character string, it will mostly show only the first character of the string,
    because '%s' treats the argument byte by byte, and the second byte of a wide character is 0x00 if the character is a standard ASCII character.
    So we need a format that lets the functions handle wide characters as well, and that is '%ls'.
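    The byte-by-byte effect described above can be sketched like this. bytes_before_first_zero is a hypothetical helper, not something from the thread; it only counts what a byte-oriented reader would see if it were handed a wide string by mistake.

```c
#include <string.h>
#include <wchar.h>

/* Counts how many bytes a byte-oriented reader would see before the first
 * 0x00 byte when it is given a wide string by mistake. */
size_t bytes_before_first_zero(const wchar_t *ws)
{
    /* Inspecting an object through a char pointer is allowed in C, and the
     * wide terminator L'\0' guarantees a zero byte inside the array. */
    return strlen((const char *)ws);
}
```

    On a little-endian system with a 4-byte wchar_t, bytes_before_first_zero(L"ABC") gives 1, matching the "only the first character" behaviour. The exact value depends on byte order and on sizeof(wchar_t).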

    For clarification:
    Code:
    #include <stdio.h>
    #include <wchar.h>
    
    int main (void)
    {
        char single_byte_string[] = "Text as string or multi byte string.\n";
        wchar_t wide_char_string[] = L"Text as wide character string.\n";
    
    // This function takes a character array and writes characters out.
        printf("%s", single_byte_string);
    
    // This function takes a wide character array and writes characters out.
        printf("%ls", wide_char_string);
    
    // This function takes a character array and writes wide characters out.
    //   wprintf(L"%s", single_byte_string);
    
    // This function takes a wide character array and writes wide characters out.
    //   wprintf(L"%ls", wide_char_string);
    
        return 0;
    }
    You can use only printf or only wprintf on a given stream, because a stream can have only one orientation.
    If you do not set the stream orientation, the first output operation will set it automatically.
    The 'printf' function shows you that it can output both char arrays and wide arrays.
    Comment out the 'printf' lines and uncomment the 'wprintf' lines and you will see that
    the 'wprintf' function can also output both char arrays and wide arrays.
    But in both cases, you must use the right conversion specifier for the argument.
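    A small sketch of how the first output operation fixes a stream's orientation. stream_orientation is a hypothetical helper name; fwide() with mode 0 is the standard way to query without changing anything.

```c
#include <stdio.h>
#include <wchar.h>

/* Returns +1 if fp is wide-oriented, -1 if it is byte-oriented,
 * and 0 if it has no orientation yet. */
int stream_orientation(FILE *fp)
{
    int r = fwide(fp, 0);   /* mode 0 only queries; it does not change fp */
    return (r > 0) - (r < 0);
}
```

    Called on a freshly opened stream this should give 0; after a printf-family call on that stream it should give -1, and after a wprintf-family call +1.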
    Last edited by WoodSTokk; 04-30-2015 at 04:45 PM.

  8. #8
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by guraknugen View Post
    Code:
    void *WMalloc(char **mbs) {
        size_t Len=mbstowcs(NULL, *mbs, 0);
    
        wchar_t *newArray=malloc((Len+1)*sizeof(wchar_t));
        if (newArray==NULL)
            MemoryErrorExit();
    
        Len=mbstowcs(newArray, *mbs, Len);
        return (newArray);
    }
    The last parameter to mbstowcs() is the buffer size, i.e. includes the space for the string-terminating '\0' that mbstowcs will add (and is not counted in the returned number of wide characters). Therefore, the latter mbstowcs() call should use Len+1 too.

    Using a pointer to a string pointer makes no sense, either, as has been already mentioned. Your function should look more like
    Code:
    wchar_t *wide_string(const char *const s)
    {
        wchar_t *w;
        size_t n, c;
    
        if (!s) {
            /* s is NULL. Set errno? */
            return NULL;
        }
    
        n = mbstowcs(NULL, s, 0);
        if (n == (size_t)-1) {
            /* Invalid character in s. Set errno? */
            return NULL;
        }
    
        w = malloc((n + 1) * sizeof w[0]);
        if (w == NULL) {
            /* Out of memory. Set errno? */
            return NULL;
        }
    
        if (n > 0) {
            c = mbstowcs(w, s, n + 1);
            if (c != s) {
                free(w);
                /* s or locale was modified unexpectedly. Set errno? */
                return NULL;
            }
        } else
            w[0] = L'\0';
    
        return w;
    }
    The above is also safe against NULL inputs; it then returns NULL also.


    The only difference between wprintf() and printf(), or between fwscanf() and fscanf(), is that the wide versions take a wide format string. The type specifiers for the parameters do not change. %c and %s refer to arrays and strings of type char, and %lc and %ls refer to arrays and strings of type wchar_t.

    So, if standard output is in wide mode, you'll want to use
    Code:
    wprintf(L"%s\n", argv[1]);
    wprintf(L"%ls\n", MyWideString);
    but if standard output is in byte oriented mode, you'll want to use
    Code:
    printf("%s\n", argv[1]);
    printf("%ls\n", MyWideString);
    See how the only difference is in the function names, and whether the format string is narrow or wide? The conversion specifiers for the parameters are always the same!

    You should also check the return value of fwide(), and not just assume it worked. It is just one more if statement, but may come in handy for later users who use it to mess with something weirder -- say, like interposing dynamic libraries that print something to standard output or standard error, and thus force them into byte-oriented mode.
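    One possible shape for that check, as a sketch only. try_set_wide is a hypothetical name; the one real call is fwide(), whose return value reports the orientation the stream has after the call.

```c
#include <stdio.h>
#include <wchar.h>

/* Tries to make fp wide-oriented. Returns 1 on success, 0 if the stream
 * ended up byte-oriented (or still without orientation). */
int try_set_wide(FILE *fp)
{
    /* fwide() cannot change an orientation that is already set, so the
     * request is silently ignored if something already wrote bytes to fp. */
    return fwide(fp, 1) > 0;
}
```

    In main this could be used as: if (!try_set_wide(stdout)) { fputs("stdout is not wide-oriented\n", stderr); return EXIT_FAILURE; }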

    If using above wide_string(), then try something like
    Code:
        wchar_t *mywidestring;
    
        mywidestring = wide_string(argv[1]);
        if (!mywidestring) {
            /* If you set errno in wide_string, you could check it here. */
            wprintf(L"Cannot convert \"%s\" to a wide string!\n", argv[1]);
            return EXIT_FAILURE;
        }
    
        wprintf(L"Narrow string \"%s\" converted to wide string L\"%ls\".\n", argv[1], mywidestring);
    
        free(mywidestring);
        return EXIT_SUCCESS;

  9. #9
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Quote Originally Posted by WoodSTokk View Post
    'char' stands for character and is always 1 byte.
    A multibyte string can contain characters that need more than 1 byte to encode, but it is still stored byte by byte.
    There is no problem putting a multibyte character inside a char array, and the format for char arrays is '%s'.
    But a wide character usually takes more than 1 byte. I'm working on amd64 here, and a wide character is 4 bytes long.
    Because amd64 is little-endian, an uppercase letter 'A' will be stored as '0x41 0x00 0x00 0x00' in wide format (wchar_t).
    If you use '%s' as the printf format for a wide character string, it will mostly show only the first character of the string,
    because '%s' treats the argument byte by byte, and the second byte of a wide character is 0x00 if the character is a standard ASCII character.
    So we need a format that lets the functions handle wide characters as well, and that is '%ls'.
    Yes, I know all that, but what I was trying to say is that someone stated that printf and wprintf don't use the same letters, that is, that %s in wprintf does the same thing as %ls in printf. I don't know if this is true, because I haven't tried it yet; I probably will this afternoon, though. The article I linked to is a tutorial on this site, so I figured that what it says is true. If not, I suggest that someone edit it, maybe the author himself…

    Quote Originally Posted by WoodSTokk View Post
    For clarification:
    Code:
    #include <stdio.h>
    #include <wchar.h>
    
    int main (void)
    {
        char single_byte_string[] = "Text as string or multi byte string.\n";
        wchar_t wide_char_string[] = L"Text as wide character string.\n";
    
    // This function takes a character array and writes characters out.
        printf("%s", single_byte_string);
    
    // This function takes a wide character array and writes characters out.
        printf("%ls", wide_char_string);
    
    // This function takes a character array and writes wide characters out.
    //   wprintf(L"%s", single_byte_string);
    
    // This function takes a wide character array and writes wide characters out.
    //   wprintf(L"%ls", wide_char_string);
    
        return 0;
    }
    You can use only printf or only wprintf on a given stream, because a stream can have only one orientation.
    If you do not set the stream orientation, the first output operation will set it automatically.
    That is indeed useful information for me, thank you.

    Quote Originally Posted by WoodSTokk View Post
    The 'printf' function shows you that it can output both char arrays and wide arrays.
    Comment out the 'printf' lines and uncomment the 'wprintf' lines and you will see that
    the 'wprintf' function can also output both char arrays and wide arrays.
    But in both cases, you must use the right conversion specifier for the argument.
    Thanks!

  10. #10
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Quote Originally Posted by Nominal Animal View Post
    The last parameter to mbstowcs() is the buffer size, i.e. includes the space for the string-terminating '\0' that mbstowcs will add (and is not counted in the returned number of wide characters). Therefore, the latter mbstowcs() call should use Len+1 too.
    Yes, you're right. I didn't think of that.
    Quote Originally Posted by Nominal Animal View Post
    Using a pointer to a string pointer makes no sense, either, as has been already mentioned. Your function should look more like
    Code:
    wchar_t *wide_string(const char *const s)
    {
        wchar_t *w;
        size_t n, c;
    
        if (!s) {
            /* s is NULL. Set errno? */
            return NULL;
        }
    
        n = mbstowcs(NULL, s, 0);
        if (n == (size_t)-1) {
            /* Invalid character in s. Set errno? */
            return NULL;
        }
    
        w = malloc((n + 1) * sizeof w[0]);
        if (w == NULL) {
            /* Out of memory. Set errno? */
            return NULL;
        }
    
        if (n > 0) {
            c = mbstowcs(w, s, n + 1);
            if (c != s) {
                free(w);
                /* s or locale was modified unexpectedly. Set errno? */
                return NULL;
            }
        } else
            w[0] = L'\0';
    
        return w;
    }
    The above is also safe against NULL inputs; it then returns NULL also.
    Thanks, seems to be some nice details in there that I can learn from. When I studied C in the 1980's, wide strings didn't exist, at least we never learned about it… On the other hand it was a very long time ago and I have never used C since then until recently, so I may have just forgotten about it…
    Quote Originally Posted by Nominal Animal View Post
    The only difference between wprintf() and printf(), or between fwscanf() and fscanf(), is that the wide versions take a wide format string. The type specifiers for the parameters do not change. %c and %s refer to arrays and strings of type char, and %lc and %ls refer to arrays and strings of type wchar_t.
    Thanks, I didn't realise that. So obviously the tutorial I linked to on this site is wrong then. Maybe someone should edit it.
    Quote Originally Posted by Nominal Animal View Post
    So, if standard output is in wide mode, you'll want to use
    Code:
    wprintf(L"%s\n", argv[1]);
    wprintf(L"%ls\n", MyWideString);
    but if standard output is in byte oriented mode, you'll want to use
    Code:
    printf("%s\n", argv[1]);
    printf("%ls\n", MyWideString);
    See how the only difference is in the function names, and whether the format string is narrow or wide? The conversion specifiers for the parameters are always the same!

    You should also check the return value of fwide(), and not just assume it worked. It is just one more if statement, but may come in handy for later users who use it to mess with something weirder -- say, like interposing dynamic libraries that print something to standard output or standard error, and thus force them into byte-oriented mode.
    But you are also saying that if I use printf() rather than wprintf(), I don't need fwide() at all?
    Quote Originally Posted by Nominal Animal View Post

    If using above wide_string(), then try something like
    Code:
        wchar_t *mywidestring;
    
        mywidestring = wide_string(argv[1]);
        if (!mywidestring) {
            /* If you set errno in wide_string, you could check it here. */
            wprintf(L"Cannot convert \"%s\" to a wide string!\n", argv[1]);
            return EXIT_FAILURE;
        }
    
        wprintf(L"Narrow string \"%s\" converted to wide string L\"%ls\".\n", argv[1], mywidestring);
    
        free(mywidestring);
        return EXIT_SUCCESS;
    Thanks, I'll experiment more with this to get a better feeling for it.

  11. #11
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Quote Originally Posted by Nominal Animal View Post
    Using a pointer to a string pointer makes no sense, either, as has been already mentioned.
    I think I remember why I did that now… I think my original idea was to send the whole argv to the function and let the function convert relevant parts of it to wide string, but then I abandoned that idea and forgot to do all necessary changes. Anyway, thanks everyone for pointing it out.

  12. #12
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by guraknugen View Post
    But you are also saying that if I use printf() rather than wprintf(), I don't need fwide() at all?
    It depends on whether you wish to support locales with wide character sets, or not.

    Code:
    printf("%ls\n", wide_string);
    does internally convert wide_string to multibyte representation (similar to wcstombs()), and
    Code:
    wprintf(L"%s\n", char_string);
    converts char_string to wide representation (similar to mbstowcs()).
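    Both conversions can be observed without touching stdout by formatting into buffers. This is only a sketch: to_narrow and to_wide are hypothetical helper names, while snprintf and swprintf are the standard buffer-writing counterparts of printf and wprintf.

```c
#include <stdio.h>
#include <wchar.h>

/* Wide -> multibyte: a narrow format with %ls converts the argument. */
int to_narrow(char *dst, size_t cap, const wchar_t *ws)
{
    return snprintf(dst, cap, "%ls", ws);
}

/* Multibyte -> wide: a wide format with %s converts the argument. */
int to_wide(wchar_t *dst, size_t cap, const char *s)
{
    return swprintf(dst, cap, L"%s", s);
}
```

    For non-ASCII text the result depends on the current locale, so setlocale(LC_ALL, "") would normally be called first, as in the thread's main().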

    As far as I know, Windows is the only one with wide character locales, and it uses UTF-16 for all of them. It's the only situation where fwide(stream,1) is used in practice, I think.

    In Linux, there are no wide character locales. It's not a bad thing, since everyone really should be using UTF-8 anyway. Since you're using Linux, you're almost certainly already using it.

    UTF-8 is demonstrably better than UTF-16, as UTF-8 is byte order neutral (no endianness, and specifically no need for that darned byte-order-mark, BOM, that some Windows applications stick at the beginning of text files). UTF-8 can naturally represent all of Unicode, whereas most UTF-16 implementations do not handle the surrogate-pair case (one character stored as two 16-bit code units) correctly (number of characters in a string and such cases). Only in some very specific non-Latin text cases does UTF-8 representation require more memory than UTF-16, and even then the difference is at most 50% (that is, a UTF-8 string requires at most 1.5 times the bytes the same string in UTF-16 requires).
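    The size comparison can be checked per code point with a short sketch. utf8_bytes and utf16_bytes are hypothetical helpers written for this illustration; the ranges follow the standard UTF-8 and UTF-16 encoding rules.

```c
#include <stdint.h>

/* Bytes needed to encode one Unicode code point in UTF-8.
 * Returns 0 for values that cannot be encoded (surrogates, > U+10FFFF). */
int utf8_bytes(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Bytes needed in UTF-16: 2, or 4 when a surrogate pair is required. */
int utf16_bytes(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000) return 2;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}
```

    The worst case for UTF-8 is the range U+0800 to U+FFFF (for example Japanese kana), where it needs 3 bytes against 2 for UTF-16, which is where the factor 1.5 comes from.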

    There are some programs that may wish to use a wide character set for input/output, but I'd basically use iconv() and fread() for reading and fwrite() for writing, instead of the string operations. That way all internal processing can use UTF-8, which should support all the possible character codes -- definitely all those you could use in a wide character locale anyway. Besides, you can then use any character set known to the system, not just those available as part of a locale.

    However.

    If you want your code to be portable, fwide(stdin/out/err, 1) and wprintf()/fwprintf() will let you do that. Just remember that the C99 standard does not say that wchar_t codes are Unicode code points. You can rely on iswspace(), iswalpha(), towlower(), towupper(), wcscmp(), wcschr() and so on to compare and examine wide strings (case insensitively if you first convert both to lower or upper case), but you cannot rely on any specific wchar_t value to mean a specific (Unicode) character, because they may not.
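    As an illustration of the case-insensitive comparison mentioned above, here is a minimal sketch. wcs_equal_nocase is a hypothetical helper; it relies only on towlower() from <wctype.h> and never on specific wchar_t values.

```c
#include <wchar.h>
#include <wctype.h>

/* Returns 1 if a and b are equal when case is ignored, 0 otherwise.
 * Which characters count as upper or lower case depends on the locale. */
int wcs_equal_nocase(const wchar_t *a, const wchar_t *b)
{
    while (*a != L'\0' && *b != L'\0') {
        if (towlower((wint_t)*a) != towlower((wint_t)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* both strings must end at the same position */
}
```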

  13. #13
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Okay, thanks. Again very useful information.

    I am currently experimenting with this to correct a small, simple program that I wrote. The program counts characters and looks for characters in strings. It fails if the user enters things like Japanese and Cyrillic characters, and since I'm going to use it for such things I need to correct my program.

    So the input is going to be whatever the user feels like, but the output is only going to be an integer number (unless the user uses the --version flag or enters the wrong number of arguments), nothing else.
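    For the counting part, a minimal sketch of one way to do it. count_characters is a hypothetical name, and the sketch relies on the POSIX behaviour of mbstowcs() with a NULL destination, the same behaviour the code earlier in this thread depends on.

```c
#include <stdlib.h>

/* Number of characters (not bytes) in the multibyte string s under the
 * current locale, or (size_t)-1 if s contains an invalid sequence. */
size_t count_characters(const char *s)
{
    /* With a NULL destination nothing is stored; only the count is returned. */
    return mbstowcs(NULL, s, 0);
}
```

    For Japanese or Cyrillic input to be counted per character, setlocale(LC_ALL, "") has to be called first and the locale has to match the terminal's encoding.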

  14. #14
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    My test program works fine now after following a selected amount of advice above… Made a few changes and everything works. After replacing %s with %ls and replacing ”*MyString” with ”MyString”, the program started to output the expected strings. I then dropped ”wprintf” altogether, and also ”fwide”. They didn't seem to be necessary in my case, and neither is portability.

    My string conversion function is just about the same as before, I will change it later. So far I only dereferenced it one step and changed ”main” accordingly. Main also prints both arguments now, not only argv[1]. This is the code at the moment:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <locale.h>
    #include <string.h>
    #include <wchar.h>
    
    
    #define VER "0.2"
    #define PROGNAME "CountCharacters"
    
    
    void UserErrorExit(void) {
        fputs("Incorrect number of parameters.\n", stderr);
        fputs("Read the damn manual!\n\n", stderr);
        fprintf(stderr,"man %s\n", PROGNAME);
        exit(EXIT_FAILURE);
    }
    
    
    void MemoryErrorExit(void) {
        fprintf(stderr, "Oops, looks like we ran out of memory here…\n");
        exit(EXIT_FAILURE);
    }
    
    
    wchar_t *ToWide(const char *const mbs) {
        size_t Len=mbstowcs(NULL, mbs, 0)+1;
    
    
        wchar_t *newArray=malloc((Len)*sizeof(wchar_t));
        if (newArray==NULL)
            MemoryErrorExit();
    
    
        Len=mbstowcs(newArray, mbs, Len)+1;
        return (newArray);
    }
    
    
    int main(int argc, char *argv[])
    {
        setlocale(LC_ALL, ""); // Use system's locale.
    
    
    //    Check user input. ——————————————————————————————————————————————————————————
        if(argc<2 || argc>3)
            UserErrorExit();
        if(argc==2) {
            if(strcmp(argv[1],"--version")==0) {
                printf("Version: %s\n", VER);
                return EXIT_SUCCESS;
            }
            else
                UserErrorExit();
        }
    //    ————————————————————————————————————————————————————————————————————————————
    
    
        for(int i=1; i<=2; i++) {
            wchar_t *Wide=ToWide(argv[i]);
            printf("%ls\n", Wide);
            free(Wide); // Release the buffer allocated by ToWide().
        }
        return EXIT_SUCCESS;
    }
    Anyway, after a very quick study of the better (I suppose) function given earlier in this thread, I have a few beginner's questions about it, maybe somewhat off topic, but still…

    Here's the function that I have questions about:
    Code:
    wchar_t *wide_string(const char *const s){
        wchar_t *w;
        size_t n, c;
    
    
        if (!s) {
            /* s is NULL. Set errno? */
            return NULL;
        }
    
    
        n=mbstowcs(NULL, s, 0);
        if (n==(size_t)-1) {
            /* Invalid character in s. Set errno? */
            return NULL;
        }
    
    
        w=malloc((n+1)*sizeof w[0]);
        if (w==NULL) {
            /* Out of memory. Set errno? */
            return NULL;
        }
    
    
        if (n>0) {
            c=mbstowcs(w, s, n+1);
            if (c!=s) {
                free(w);
                /* s or locale was modified unexpectedly. Set errno? */
                return NULL;
            }
        } else
            w[0]=L'\0';
    
    
        return w;
    }
    The very first line:
    Code:
    wchar_t *wide_string(const char *const s)
    What's the difference between the two ”const”? I guess that the first is to prevent that the variable (or pointer in this case) is changed, but what's the other one for?
    And why ”*const s”, why not ”const *s”? I tried both and both worked, as it seemed…
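    A sketch of what each const protects, using a hypothetical function (length_until is not from the thread). The first const makes the characters read-only through s; the second makes the pointer s itself read-only. If the variant tried was ”const char const *s”, then as far as I know both qualifiers apply to char and the pointer stays modifiable, which would explain why it still compiled.

```c
#include <stddef.h>

/* Hypothetical example: length of s up to (not including) stop. */
size_t length_until(const char *const s, char stop)
{
    const char *p = s;   /* p itself is not const, so it may move */

    /* *s = 'x';    error: the characters are const (first const)  */
    /* s = p + 1;   error: the pointer s is const (second const)   */
    while (*p != '\0' && *p != stop)
        p++;             /* allowed: only what p points to is const */

    return (size_t)(p - s);
}
```

    For the caller only the first const matters, since s is a copy of the caller's pointer anyway; the second one just stops the function body from reassigning s by accident.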

    Another line:
    Code:
    if (n==(size_t)-1)
    I've been thinking about this one a little. Seems like it's checking if n==-1 after converting -1 to the size_t type, is that right? And is it really necessary to do that conversion? Shouldn't the compiler do that automatically?
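    A small sketch of that comparison, with a hypothetical helper name. mbstowcs() returns a size_t, which is unsigned, so it reports failure as (size_t)-1, the largest value a size_t can hold. Writing n == -1 would normally give the same result through implicit conversion, but a compiler run with -Wextra may warn about comparing signed and unsigned values, so the cast makes the intent explicit.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if n is the error value used by mbstowcs(), 0 otherwise. */
int is_conversion_error(size_t n)
{
    /* (size_t)-1 wraps around to SIZE_MAX; no valid length is that large. */
    return n == (size_t)-1;
}
```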

    Well, that was just about it, I think…

    Again, thank you all very much for all help.

  15. #15
    Registered User
    Join Date
    Feb 2013
    Location
    Sweden
    Posts
    89
    Oh, sorry, one more question:
    Code:
    if(n > 0) {
        c = mbstowcs(w, s, n + 1);
        if(c != s) {
            free(w);
            /* s or locale was modified unexpectedly. Set errno? */
            return NULL;
        } else
            w[0] = L'\0';
    }
    mbstowcs returns the length of the string, right? So c should be a number.
    s is a pointer to the mbs string, right?
    So what does ”if(c != s)” actually do and why? I seriously don't understand that one, sorry. Or maybe it's just too late in the evening…

    Could it be that it's a typo? ”if(c != n)” seems to make more sense, doesn't it?

    Edit: My compiler, gcc, complained:
    warning: comparison between pointer and integer [enabled by default]
    if (c != s) {
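    Assuming it is indeed a typo for c != n, a sketch of the function with that comparison changed might look like this. wide_string_checked is a hypothetical name, the errno comments are left out, and the NULL-destination call relies on POSIX behaviour just as the original does.

```c
#include <stdlib.h>
#include <wchar.h>

/* Converts the multibyte string s to a newly allocated wide string.
 * Returns NULL on NULL input, invalid input, or allocation failure.
 * The caller must free() the result. */
wchar_t *wide_string_checked(const char *s)
{
    wchar_t *w;
    size_t n, c;

    if (s == NULL)
        return NULL;

    n = mbstowcs(NULL, s, 0);          /* length without the terminator */
    if (n == (size_t)-1)
        return NULL;                   /* invalid multibyte sequence */

    w = malloc((n + 1) * sizeof w[0]);
    if (w == NULL)
        return NULL;

    c = mbstowcs(w, s, n + 1);         /* n + 1 leaves room for L'\0' */
    if (c != n) {                      /* compare two counts, not c and s */
        free(w);
        return NULL;
    }
    return w;
}
```

    The separate n > 0 branch is dropped here because converting an empty string with room for one wide character already stores the terminator and returns 0.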
    Last edited by guraknugen; 05-01-2015 at 02:14 PM.
