Getting the ASCII equivalent of a Unicode string
I would like to get a matching 7-bit ASCII string for a Unicode string.
Example: I would like to have "Lÿdia" transformed into "Lydia".
I'm aware that this would be a best guess and irreversible. However, my target only accepts ASCII, so I have to fit Unicode text into it in the best-looking way possible.
I tried narrowing the string and widening it again with the classic locale, which is the locale I need in the end. It doesn't transform characters it doesn't know, though; it just replaces them with blanks. No surprise, really. Is there any way to transform those characters, or do I have to create a huge wchar_t-to-char lookup table myself?
Code:
#include <cstdlib>   // system
#include <string>
#include <locale>
#include <iostream>

// Widen each char to wchar_t using the ctype facet of the given locale.
std::wstring widen( const std::string& s, const std::locale& loc = std::locale() )
{
    std::wstring out;
    out.reserve( s.size() );
    const std::ctype<wchar_t>& f = std::use_facet<std::ctype<wchar_t> >(loc);
    for( std::string::size_type i = 0 ; i < s.size() ; ++i )
    {
        out.push_back( f.widen( s[i] ) );
    }
    return out;
}

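// Narrow each wchar_t to char using the ctype facet of the given locale.
std::string narrow( const std::wstring& s, const std::locale& loc = std::locale() )
{
    std::string out;
    out.reserve( s.size() );
    const std::ctype<wchar_t>& f = std::use_facet<std::ctype<wchar_t> >(loc);
    for( std::wstring::size_type i = 0 ; i < s.size() ; ++i )
    {
        // Standard narrow() needs an explicit fallback for characters
        // it cannot represent; a blank is used here.
        out.push_back( f.narrow( s[i], ' ' ) );
    }
    return out;
}
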
// Round-trip through the classic "C" locale to drop non-ASCII characters.
std::wstring asciify( const std::wstring& text )
{
    std::string norm = narrow( text, std::locale::classic() );
    return widen( norm, std::locale::classic() );
}

int main()
{
    std::wstring s = L"Lÿdia";
    std::wcout << asciify( s ) << std::endl;
    system( "pause" );
    return 0;
}
The solution doesn't have to be standard C++ only; MFC would be OK, too. But STL would be cool. And no, I don't have Boost.
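
For reference, the hand-written lookup table mentioned above might look roughly like the sketch below. It is only an illustration of the fallback approach, not a complete solution: the names `AsciiFallback`, `kFallbacks`, and `asciify_lookup` are made up here, the table covers just a handful of Latin-1 letters, and anything it doesn't know becomes a placeholder character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical table entry: one Unicode code point and its ASCII stand-in.
struct AsciiFallback
{
    wchar_t from;
    char    to;
};

// Illustrative subset only; a real table would need many more entries.
static const AsciiFallback kFallbacks[] =
{
    { 0x00E0, 'a' },  // a with grave
    { 0x00E9, 'e' },  // e with acute
    { 0x00F1, 'n' },  // n with tilde
    { 0x00FC, 'u' },  // u with diaeresis
    { 0x00FF, 'y' }   // y with diaeresis
};

// Returns an ASCII-only copy of text: 7-bit characters pass through,
// table hits are replaced, and everything else becomes `unknown`.
std::string asciify_lookup( const std::wstring& text, char unknown = '?' )
{
    std::string out;
    out.reserve( text.size() );
    for( std::wstring::size_type i = 0 ; i < text.size() ; ++i )
    {
        const wchar_t c = text[i];
        // Cast first, because wchar_t may be signed on some platforms.
        if( static_cast<unsigned long>( c ) < 0x80 )
        {
            out.push_back( static_cast<char>( c ) );
            continue;
        }
        char repl = unknown;
        const std::size_t n = sizeof kFallbacks / sizeof kFallbacks[0];
        for( std::size_t k = 0 ; k < n ; ++k )
        {
            if( kFallbacks[k].from == c )
            {
                repl = kFallbacks[k].to;
                break;
            }
        }
        out.push_back( repl );
    }
    return out;
}
```

With this table, `asciify_lookup( L"L\u00FFdia" )` yields `"Lydia"`. The linear search is fine for a few entries; a sorted table or `std::map` would scale better, which is exactly the maintenance burden I'd like to avoid.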