Thread: Is it possible to open a file which has international chars in its name?

  1. #1
    Registered User
    Join Date
    Feb 2012
    Posts
    29

    Is it possible to open a file which has international chars in its name?

    Is it possible to open a file which has international characters in its name (for example, Turkish characters) using the standard C++ library, instead of a framework's API?

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    No. The problem is a bit complicated to explain in detail, but you run into all sorts of differences between systems. The standard guarantees that you can open files with narrow strings (wide-string overloads are a vendor extension), but is mum on what the encoding of the file names should be. Thus, making some sort of assumption (like passing in a UTF8 file name) is not going to be portable, if it works at all.

  4. #4
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,739
    Expanding on what whiteflags said: in cases like these, where you want to use non-English characters in a file's name, you need to provide a way to convert from your encoding of the name to the native encoding of the current system. And that's assuming the system converted the name the same way you would... It's quite a mess, which is why most programs stick with English/Latin letters and numbers only (plus maybe an underscore and/or minus sign).
    Devoted my life to programming...

  5. #5
    Registered User
    Join Date
    Feb 2012
    Posts
    29
    Thanks for your answers.

  6. #6
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    If you're trying to make a portable app, then you need to go into the aforementioned details. If you're just trying to open some files on your own system, then just try it. For example, this works for me (on Linux).
    Code:
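    #include <fstream>
    #include <iostream>
    #include <string>

    int main() {
      // The literal is stored in whatever encoding the source file uses;
      // this works on Linux when the source is saved as UTF-8.
      std::ifstream f("ģĤĥ");
      if (!f) {
        std::cerr << "error opening file\n";
        return 1;
      }

      // Read and echo the first line of the file.
      std::string s;
      std::getline(f, s);
      std::cout << s << '\n';

      return 0;
    }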

  7. #7
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    The problem is most likely going to boil down to whether your system supports UTF8 in its interfaces that take char* parameters. Linux does. Windows does not.
    There are wide versions, e.g. std::wifstream, but they come with their own problems and are tricky to use right.
    In short, it's really platform specific and the best way to do it is just to use platform specific APIs because the standard guarantees nothing in this particular topic. You can have different behaviour between compilers and platforms.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  8. #8
    Registered User
    Join Date
    Feb 2012
    Posts
    29
    Thanks for your answers.

    algorism, yes, you are right: opening a file with international characters in its name using standard C++ functions worked well, as you said (at least on my system). I thought it wasn't working because I had let the user select a file through C++ Builder's API and then tried to open the selected file with standard C++ functions, which failed whenever the name contained international characters. So I concluded that standard C++ functions couldn't open such files. There is still another problem, though: standard C++ functions don't seem able to read international characters inside files (at least when opened in text mode).

    As people suggested, it seems the best way is to use platform-specific API functions, for portability.
    Last edited by Awareness; 04-08-2017 at 02:38 PM.

  9. #9
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by Awareness View Post
    There is still another problem, though: standard C++ functions don't seem able to read international characters inside files (at least when opened in text mode).
    They may or may not. It depends on your character set. If it's UTF8 or ANSI, there shouldn't be any problems. But if you're using UTF16 or something else, you may get problems with \r and \n characters. If you just use UTF8 everywhere in your files and internally, you will bypass this problem. But it doesn't solve international characters in the filename itself.

  10. #10
    Registered User
    Join Date
    Feb 2012
    Posts
    29
    Thanks Elysia. How can I use UTF-8 in C++?

  11. #11
    [](){}(); manasij7479's Avatar
    Join Date
    Feb 2011
    Location
    *nullptr
    Posts
    2,657
    Quote Originally Posted by Awareness View Post
    Thanks Elysia. How can I use UTF-8 in C++?
    Ah, you could probably write a book on this!

  12. #12
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    The best way is to keep all your files in UTF8 and to keep all your strings internally as UTF8 too. This is probably a bit tricky, since there is no native way to do it. It depends on your compiler: whether it can read a source file encoded in UTF8 in such a way that international characters in string literals are kept intact. Another way is to save the source in some other encoding that your compiler supports (e.g. UTF16), so that international characters are stored correctly, and convert the strings to UTF8 at runtime.

    When reading or writing, use narrow streams. Do not use wide streams (e.g. std::wifstream). When interfacing with the platform API on Windows, convert to UTF16. Linux works natively with UTF8. For other platforms, you need to check whether they accept Unicode, and how. Use std::string, not std::wstring. All string algorithms work natively with UTF8. Opening files with international characters in their names will be problematic unless you use the platform API. Avoid files with international characters in their names.

    This should mostly work. You might find some edge cases, though. But that's life since C++ has such bad Unicode support.

  13. #13
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Quote Originally Posted by Elysia
    All string algorithms work natively with UTF8. [...] This should mostly work. You might find some edge cases, though. But that's life since C++ has such bad Unicode support.
    They work, as long as you don't mind ending up in the middle of a code point. It's a little more than an edge case.

  14. #14
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Well, depends on the algorithm, I guess. I can see sort() messing up the string. I can see a reverse algorithm also messing up the string. Good point. Didn't think of that. I've rarely run algorithms on UTF8 strings.
    If you need to use algorithms on UTF8 strings, I would recommend you get a UTF8 library for C++ from the web. There are some intuitive ones out there that provide a u8string similar to std::string.

  15. #15
    Registered User
    Join Date
    Feb 2012
    Posts
    29
    Thanks for your answers.
