Thread: mbtowc() - in CODEBLOCKS using minGW ???

  1. #1
    Registered User
    Join Date
    Nov 2023
    Posts
    2

    Hi All,

    I have been trying to get my head around this for DAYS now...

    I'm using Windows 10 and Codeblocks IDE with MinGW compiler.

    I'm told MinGW has some peculiarities to do with UTF8, Unicode and MultiByte characters?

    I start by converting a char32_t Thai Character to Multibyte using c32rtomb(). That goes fine. It doesn't output to the console correctly but if I copy + paste it into notepad it displays absolutely fine.

    HOWEVER, when I then try to convert that multibyte character to wchar_t I get anomalous strings, or (null) strings, or access violations or whatever, depending on how I attempt to output it using printf()...

    Here is the code:

    Code:
    char32_t tester = U'ฐ'; //a Thai letter

    int returned = NULL;
    std::mbstate_t state {};
    char output[MB_LEN_MAX] {};

    returned = c32rtomb(output, tester, &state);

    printf("[c32rtomb()]: Character Returned is: %s \n\n", output); //Displays incorrectly in console
                                                                    //but copies to clipboard fine...

    wchar_t receiver;

    mbtowc(NULL, NULL, 0); //Reset the function

    auto resultant = mbtowc(&receiver, output, MB_LEN_MAX);

    printf("[mbtowc()]: Byte-length of returned character is: %u \n", resultant);

    printf("[mbtowc()]: Data returned is: %d \n", receiver);

    wprintf(L"[mbtowc()]: Character Returned is: %c \n", receiver);

    - The value of resultant is always 1 byte. Shouldn't the wchar_t character be larger than 1 byte?
    - Am I using the wrong symbol (%c) in wprintf() to output the character?
    - Anything else you can see going wrong?

    I have asked this in other places but I don't think most people run the code themselves because the answers I received don't work.

    I would particularly appreciate it if people tried this in Codeblocks IDE with MinGW compiler, since I've heard it's fiddly there but it's the only IDE I have access to right now.

    Many thanks for your assistance and appreciation for your time!

  2. #2
    Registered User
    Join Date
    Dec 2017
    Posts
    1,652
    "but I don't think most people run the code themselves"

    Maybe you should post a runnable program instead of just part of it.
    Anyway, I believe you need to set the locale, something like this:

    Code:
    #include <iostream>
    #include <cuchar>
    #include <climits>
    #include <cstdio>   // perror
    #include <cstdlib>
    #include <clocale>
    using namespace std;
     
    int main() {
      setlocale(LC_ALL, "en_US.utf8");
     
      char32_t tester = U'ฐ'; //a Thai letter
      mbstate_t state {};
      char output[MB_LEN_MAX] {};
     
      size_t returned = c32rtomb(output, tester, &state);
      if (returned == size_t(-1)) {
        perror("c32rtomb");
        exit(EXIT_FAILURE);
      }
     
      cout << "bytes: " << returned << '\n';
      cout << "character: " << output << '\n';
     
      return 0;
    }
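
    A locale-independent cross-check may help here. If the console shows garbage but Notepad shows the right letter, the bytes are probably fine and only the console display is off. The sketch below hand-encodes a code point as UTF-8 and hex-dumps it, so the output of c32rtomb() can be compared byte for byte. encode_utf8 and hex_bytes are hypothetical helper names, not part of the code above or of any library.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: encodes one code point as UTF-8 by hand, without
// consulting the current locale. Returns an empty string for surrogates
// and for values above U+10FFFF, which are not valid Unicode scalar values.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // surrogate range
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    assert(out.size() <= 4);  // UTF-8 never needs more than 4 bytes
    return out;
}

// Hypothetical helper: formats bytes as space-separated upper-case hex.
std::string hex_bytes(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

    For U+0E10 (ฐ), hex_bytes(encode_utf8(0x0E10)) gives E0 B8 90, so in a UTF-8 locale c32rtomb() should report 3 bytes and fill output with exactly those values.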


    Last edited by john.c; 11-03-2023 at 02:09 PM.
    And that's the world in a nutshell, an appropriate receptacle.

  3. #3
    Registered User
    Join Date
    Nov 2023
    Posts
    2
    Thanks for your reply @John.c

    I have redesigned the code to use a wchar_t[MB_CUR_MAX] array instead of the wchar_t to receive the output from mbtowc().

    Code:
    wchar_t bufferGo[MB_CUR_MAX];
    
    int length = mbtowc(nullptr, 0, 0); //Reset the function
    
    length = mblen(output, returned);
    
    auto resultant = mbtowc(bufferGo, output, length);
    
    printf("Decimal value of returned character is: %u \n", bufferGo);
    printf("Hexadecimal value of returned character is: %lx \n", bufferGo);
    printf("Returned character is: %ls \n", bufferGo);
    I have noticed a few things:

    - The decimal value from mbtowc() is 16 bytes less than the decimal value outputted by c32rtomb().
    - No character is printed when calling printf("%ls", bufferGo).

    What can I do to output the same character from mbtowc() as from c32rtomb()?

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,669
    Again, as john.c said, post WHOLE programs.
    Stripping off the include files and main serves no purpose except to make it way more difficult for us to bother helping.

    We like copy/paste/compile/run.
    If we have to start guessing all the things you've removed, it either doesn't happen (you lose) or everyone's collective time is wasted trying to guess all the bits you hid.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Dec 2017
    Posts
    1,652
    Yeah, I don't feel like playing "complete the snippet" this time, especially when it doesn't really make sense. What does "16 bytes less" mean? Why are you printing the address of the array?
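
    For reference, a minimal sketch of the decoding step, assuming a UTF-8 locale has already been selected successfully with setlocale() (which a MinGW build may or may not support). The return value of mbtowc() or mbrtowc() is the number of bytes consumed from the input, not the size of the wchar_t. In the default "C" locale each byte typically counts as one character, which would explain a result of 1. decode_first is a hypothetical helper name.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: decodes the first multibyte character of `bytes`
// using the current LC_CTYPE locale. Returns the number of bytes consumed,
// or 0 if the input is invalid, incomplete, or starts with a null byte.
// `out` is only written when a character was decoded.
std::size_t decode_first(const char* bytes, std::size_t len, wchar_t& out) {
    assert(bytes != nullptr);
    std::mbstate_t state{};  // fresh conversion state for every call
    wchar_t wc = L'\0';
    const std::size_t n = std::mbrtowc(&wc, bytes, len, &state);
    if (n == static_cast<std::size_t>(-1) ||   // invalid sequence
        n == static_cast<std::size_t>(-2)) {   // incomplete sequence
        return 0;
    }
    if (n > 0) out = wc;
    return n;
}
```

    To print the result, pass the decoded value rather than the array, for example printf("U+%04lX\n", (unsigned long)bufferGo[0]). Passing bufferGo itself to %u or %lx prints (part of) its address.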

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,669

  7. #7
    Registered User
    Join Date
    Dec 2017
    Posts
    1,652
    Arianax, you just need to post a complete program demonstrating what you are trying to do.
    The codeguru guys don't seem to understand what's happening.
    I know exactly what's happening.
    It's actually something quite technical that doesn't come up too often, so it's understandable that they are stumped.

  8. #8
    Registered User
    Join Date
    Dec 2017
    Posts
    1,652
    Anyway, if anyone is interested in the solution to the actual topic of this thread, it has to do with the little-known stream property called "orientation". It starts out unset and becomes either "narrow" or "wide" after the first use of the stream. The only way to reset it is to close and reopen the stream.
    Code:
    #include <cstdio>   // freopen
    #include <iostream>
    #include <clocale>
     
    int main() {
      setlocale(LC_ALL, "en_US.utf8");
     
      std::cout << "hello\n"; // stdout is now narrow
      std::wcout << L"ฐ\n";   // so this will not print
     
      std::wcerr << L"\nฐ\n"; // stderr is now wide
      std::cerr << "hello\n"; // so this will not print
     
      // closing and reopening a stream will unset its orientation
      freopen("/dev/stdout", "w", stdout); // this is for linux
      std::wcout << L"\nฐ\n"; // stdout is now wide
      std::cout << "hello\n"; // so this will not print
    }
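
    The orientation of a C stream can also be inspected directly with fwide(). A small sketch, assuming the iostreams in the example above are synchronized with stdout and stderr (the default); Orientation and orientation_of are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>

// Hypothetical names, for illustration only.
enum class Orientation { Unset, Narrow, Wide };

// Reports the current orientation of a C stream without changing it.
// fwide() with mode 0 only queries; a negative result means byte-oriented
// (narrow), a positive result means wide-oriented, zero means not yet set.
Orientation orientation_of(std::FILE* stream) {
    assert(stream != nullptr);
    const int o = std::fwide(stream, 0);
    if (o < 0) return Orientation::Narrow;
    if (o > 0) return Orientation::Wide;
    return Orientation::Unset;
}
```

    Calling orientation_of(stdout) before and after the first printf() or wprintf() shows which way the stream went. That is also why mixing printf() and wprintf() on stdout, as in the first post, is likely to lose one of the two outputs.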
