Thread: C syntax, underscore

  1. #1
    Registered User
    Join Date
    Jun 2009
    Posts
    56

    C syntax, underscore

    Hi all,

    I'm reading some C functions, but I don't understand some of the notation and syntax, for example:

    Code:
    fputs (_("Filesystem    Type"), stdout);
    It is not clear to me why the developer uses the underscore after "fputs (".
    Can anyone give me a tip, please?
    Thanks

    D.

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    You are looking at either a function call or the use of a function-style macro. It is probably shorthand to enable the string to be converted appropriately for the different languages supported by the program.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Sep 2008
    Posts
    200
    Quote Originally Posted by Dedalus View Post
    Hi all,

    I'm reading some C functions, but I don't understand some of the notation and syntax, for example:

    Code:
    fputs (_("Filesystem    Type"), stdout);
    It is not clear to me why the developer uses the underscore after "fputs (".
    Can anyone give me a tip, please?
    Thanks

    D.
    _ will be defined as a macro somewhere - I can't remember where I've seen it before. Either search for the definition or pass -E to gcc to see what it actually expands to (look up -E in the man page if you're not familiar with it first).

  4. #4
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    I've seen it before, but it's the dumbest thing ever coded imho.

    See, it might make sense at first sight, especially to English speakers. But once you know a language other than English, you should realize that it makes no sense to use it this way.
    Here's why: quite commonly, a sentence that is reused as-is in English needs slightly different wording in each of those contexts in other languages.
    This might mean that the same string is passed to the _ function (or macro) in two different contexts, so that one of them ends up horribly mistranslated.
    But not just that, it also makes the translator's job hard. Let's say we get a random word: "comment". Now how is it used? In "1 comment"? As "give comment"? As "Person 1 and 2 comment:"? Good luck finding out, translator: it might even be a mix of several of those.

    It gets worse when the result is mixed with words or names appended or prepended to it: in other languages they may only make sense at the other end of the sentence. Or what about plurals? Some languages use different grammar rules for plurals. I think I've heard of a language (I think it was Chinese) that would not use the plural for "0" as most languages do.

    Yes, I've seen ugly mis-translations because of this. In reality, it just doesn't give enough flexibility to actually make your program multi-lingual, even though it tries to.

    Well, that is how I've always seen it used. Maybe there are proper ways to use it, but I doubt it. There are better ways; I coded one myself once as a proof of concept, but it had to use a certain "scripting-like" construct to allow things like plural forms to be specified.

    I bet there must be a library to do that, though, but I don't know any.

  5. #5
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by JohnGraham View Post
    _ will be defined as a macro somewhere - I can't remember where I've seen it before. Either search for the definition or pass -E to gcc to see what it actually expands to (look up -E in the man page if you're not familiar with it first).
    From the way that's set up, it looks like the underscore might be used as a substitute for the _TEXT macro that is commonly used to mark Unicode text in string constants.

    But no matter its purpose, it's a really bad idea, since the underscore is very commonly used as a space in constants and even variables (e.g. MAX_PATH in Windows) and would be translated there as well, causing a gazillion compiler errors.

    In the history of bad ideas that one probably ranks in the top 100.

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by CommonTater
    But no matter its purpose, it's a really bad idea, since the underscore is very commonly used as a space in constants and even variables (e.g. MAX_PATH in Windows) and would be translated there as well, causing a gazillion compiler errors.
    I don't understand your reasoning. How would the underscore be translated in those cases too? If you're thinking of it as a function-style macro, then you probably forgot that the relatively dumb macro replacement is not that dumb: it is done with respect to tokens, and _ is an entirely different token from MAX_PATH.

  7. #7
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    Quote Originally Posted by EVOEx View Post
    I've seen it before, but it's the dumbest thing ever coded imho.

    See, it might make sense at first sight, especially to English speakers. But once you know a language other than English, you should realize that it makes no sense to use it this way.
    Here's why: quite commonly, a sentence that is reused as-is in English needs slightly different wording in each of those contexts in other languages.
    This might mean that the same string is passed to the _ function (or macro) in two different contexts, so that one of them ends up horribly mistranslated.
    But not just that, it also makes the translator's job hard. Let's say we get a random word: "comment". Now how is it used? In "1 comment"? As "give comment"? As "Person 1 and 2 comment:"? Good luck finding out, translator: it might even be a mix of several of those.
    While the _ macro could be used in any number of ways, the typical (in free software) use is as a call to the gettext() function, which does not do translation as you seem to be implying. It's not a simple “look each word up in the dictionary” scheme: gettext() looks up strings in a message catalogue so that proper translations can be done. As an example, CUPS has a translation of the string "Unknown printer-op-policy \"%s\"." into French as "Paramètre printer-op-policy « %s » inconnu." This is clearly not a simple mechanical translation: The quote marks have changed, the order of words has changed, and a new word has been added.

    The downside to the gettext() method is that each and every string has to be translated by hand. This means that adding a new string (or modifying one) requires a new translation by a human being. But this is also its upside: as a result, translations that sound right in the native language can be used. Whether they do depends on the ability of the translator, of course.

    The only time your concern would matter is if you are trying to print out a single word (or small phrase) that, depending on context, has wildly different meanings. While possible, this is rare, and can be worked around easily enough. Your example of “comment” is not susceptible; at least, not in the examples used: _("1 comment") would look up a different translation than _("give comment").

    gettext() is not perfect, but it works well in practice.

  8. #8
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by EVOEx View Post
    I've seen it before, but it's the dumbest thing ever coded imho.

    See, it might make sense at first sight, especially to English speakers. But once you know a language other than English, you should realize that it makes no sense to use it this way.
    Here's why: quite commonly, a sentence that is reused as-is in English needs slightly different wording in each of those contexts in other languages.
    This might mean that the same string is passed to the _ function (or macro) in two different contexts, so that one of them ends up horribly mistranslated.
    But not just that, it also makes the translator's job hard. Let's say we get a random word: "comment". Now how is it used? In "1 comment"? As "give comment"? As "Person 1 and 2 comment:"? Good luck finding out, translator: it might even be a mix of several of those.
    Code:
    #define _(string) TranslateString(string, __FILE__, __LINE__)
    Problem solved, except in the incredibly improbable case where the same string appears twice on the same line of code and requires different translations.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  9. #9
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by cas View Post
    While the _ macro could be used in any number of ways, the typical (in free software) use is as a call to the gettext() function, which does not do translation as you seem to be implying. It's not a simple “look each word up in the dictionary” scheme: gettext() looks up strings in a message catalogue so that proper translations can be done. As an example, CUPS has a translation of the string "Unknown printer-op-policy \"%s\"." into French as "Paramètre printer-op-policy « %s » inconnu." This is clearly not a simple mechanical translation: The quote marks have changed, the order of words has changed, and a new word has been added.

    The downside to the gettext() method is that each and every string has to be translated by hand. This means that adding a new string (or modifying one) requires a new translation by a human being. But this is also its upside: as a result, translations that sound right in the native language can be used. Whether they do depends on the ability of the translator, of course.

    The only time your concern would matter is if you are trying to print out a single word (or small phrase) that, depending on context, has wildly different meanings. While possible, this is rare, and can be worked around easily enough. Your example of “comment” is not susceptible; at least, not in the examples used: _("1 comment") would look up a different translation than _("give comment").

    gettext() is not perfect, but it works well in practice.
    I know how _ (or, rather, gettext) works. I was referring to that: I've seen many translation errors because of this function. Yes, the format string style would solve some problems, even though not natively supported in gettext.

    The problem I'm referring to is more when a variable value would be combined with a certain string. The plural form is one of the most obvious. We can't use "%d comments" or "%d comment"; it depends on the value of '%d'. Well, okay, even that has a solution for gettext I believe (I looked it up, and it seems to support even more advanced plural forms) :P.

    Actually, I believe that using format strings in the files as you say, together with the way it supports plural forms, might solve all of my complaints: for example, if you used "%s comment" and "comment" as two separate strings, instead of prepending the value directly to "comment".

    Well, I think I'll have to eat my words. The gettext function is acceptable when used correctly. Unfortunately, I've seen it used in the "bad" ways so commonly.

  10. #10
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by laserlight View Post
    I don't understand your reasoning. How would the underscore be translated in those cases too? If you're thinking of it as a function-style macro, then you probably forgot that the relatively dumb macro replacement is not that dumb: it is done with respect to tokens, and _ is an entirely different token from MAX_PATH.
    Code:
    #define _ L
    What does that do to MAX_PATH ?

  11. #11
    Registered User
    Join Date
    Jul 2007
    Posts
    131
    Quote Originally Posted by CommonTater View Post
    Code:
    #define _ L
    What does that do to MAX_PATH ?
    Nothing. Why would it? cpp would be horrible to use if it touched things at a smaller scale than tokens.

  12. #12
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by fronty View Post
    Nothing. Why would it? cpp would be horrible to use if it touched things at a smaller scale than tokens.
    C not C++

  13. #13
    Registered User
    Join Date
    Jul 2007
    Posts
    131
    Quote Originally Posted by CommonTater View Post
    C not C++
    What would that mean? To my knowledge, the preprocessor works the same way in C and C++, and in both languages working at the sub-token level would be insane.

  14. #14
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by fronty View Post
    What would that mean? To my knowledge, the preprocessor works the same way in C and C++, and in both languages working at the sub-token level would be insane.
    Did you actually try it?

  15. #15
    Registered User
    Join Date
    Jul 2007
    Posts
    131
    Quote Originally Posted by CommonTater View Post
    Did you actually try it?
    No, I didn't. I didn't have to.

    EDIT: Ah, now I think I see where you got that C++ thing from. cpp means the C preprocessor, not C++.
    Last edited by fronty; 10-26-2010 at 04:04 PM.
