Syntax Highlighting for English?


  1. #31
    Lean Mean Coding Machine KONI's Avatar
    Join Date
    Mar 2007
    Location
    Luxembourg, Europe
    Posts
    444
    Let's throw a bit of theory into this, shall we:

    Computational processing of textual data

    Interesting parts:

    Part-of-Speech tagging (Brill, HMM)

    Part-of-Speech tagging (HMM ctnd.)

    Then I recommend "Parsing, formal grammars" and "Stochastic Parsing", as well as "Classification - Visualization". As long as we don't have the same theoretical background, there's no use discussing anything. Personally, I took computational processing of textual data and "natural language processing" (audio) and some opinions/beliefs in this thread amuse me.
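For anyone who has not met HMM tagging before, here is a minimal sketch of the decoding step (Viterbi). Everything specific in it is an assumption made up for illustration: the four-tag set, the START/TRANS/EMIT probabilities, and the `viterbi` function name are not taken from the linked notes. A real tagger would estimate those numbers from a tagged corpus.

```python
# Toy HMM part-of-speech tagger: Viterbi decoding over hand-made probabilities.
# All numbers below are illustrative assumptions, not corpus estimates.

TAGS = ["PRON", "VERB", "DET", "NOUN"]

# Probability that a sentence starts with each tag.
START = {"PRON": 0.6, "VERB": 0.1, "DET": 0.2, "NOUN": 0.1}

# TRANS[prev][tag]: probability that `tag` follows `prev`.
TRANS = {
    "PRON": {"PRON": 0.05, "VERB": 0.8, "DET": 0.05, "NOUN": 0.1},
    "VERB": {"PRON": 0.1, "VERB": 0.05, "DET": 0.6, "NOUN": 0.25},
    "DET": {"PRON": 0.01, "VERB": 0.04, "DET": 0.01, "NOUN": 0.94},
    "NOUN": {"PRON": 0.1, "VERB": 0.5, "DET": 0.1, "NOUN": 0.3},
}

# EMIT[tag][word]: probability that `tag` produces `word`.
EMIT = {
    "PRON": {"he": 0.9},
    "VERB": {"hits": 0.6, "code": 0.3},
    "DET": {"a": 0.5, "the": 0.5},
    "NOUN": {"hits": 0.2, "code": 0.3, "grounder": 0.3},
}

UNSEEN = 1e-6  # tiny probability for a word a tag was never seen to emit


def viterbi(words):
    """Return the most probable tag sequence for `words` under the toy model."""
    words = [w.lower() for w in words]
    if not words:
        return []
    # best[i][tag] = (probability of the best path ending in `tag`, previous tag)
    best = [{t: (START[t] * EMIT[t].get(words[0], UNSEEN), None) for t in TAGS}]
    for word in words[1:]:
        column = {}
        for tag in TAGS:
            prob, prev = max((best[-1][p][0] * TRANS[p][tag], p) for p in TAGS)
            column[tag] = (prob * EMIT[tag].get(word, UNSEEN), prev)
        best.append(column)
    # Trace back from the most probable final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


print(viterbi(["He", "hits", "a", "grounder"]))
```

The point of the model is that "hits" could be a noun or a verb on its own, and the transition probabilities (a pronoun is usually followed by a verb) are what tip the decision.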

  2. #32
    Mayor of Awesometown Govtcheez's Avatar
    Join Date
    Aug 2001
    Location
    MI
    Posts
    8,825
    Quote Originally Posted by KONI View Post
    some opinions/beliefs in this thread amuse me.
    How's about you and brewbuck drop a little science and enlighten us? That's certainly more productive than saying "I'm right hahaha"

  3. #33
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,211
    This idea is interesting.

  4. #34
    S Sang-drax's Avatar
    Join Date
    May 2002
    Location
    Göteborg, Sweden
    Posts
    2,072
    brewbuck, if you encountered a language you've never seen before, for example Hindi, then I am sure it would be impossible for you to figure out which words are nouns. You would probably not even recognize the characters used.

    It is impossible for you to determine where the nouns are -- the information just isn't there.

    But! After syntax highlighting using a human or an extensive database of words and grammar it would be very easy for you to see the nouns -- they would be bold. There's your extra information.
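To make the "nouns would be bold" step concrete, here is a minimal sketch. It assumes the tagging has already been done by some outside source (a human or a database) and arrives as (word, tag) pairs. The `highlight_nouns` name, the `NOUN` label, and the asterisks standing in for bold are all assumptions for illustration.

```python
def highlight_nouns(tagged_words, tag="NOUN"):
    """Wrap every word whose tag equals `tag` in asterisks (a stand-in for bold)."""
    return " ".join(
        f"*{word}*" if word_tag == tag else word
        for word, word_tag in tagged_words
    )


# The tags are supplied from outside; the raw words alone do not carry them.
tagged = [("He", "PRON"), ("hits", "VERB"), ("a", "DET"), ("grounder", "NOUN")]
print(highlight_nouns(tagged))
```

The function itself knows nothing about English. All the knowledge lives in the tags it is handed.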

    The problem here is that you don't seem to acknowledge the external database/human experience as additional information. I don't really understand why, but obviously you must have some deep insight in information theory that I don't. I skipped that course. Feel free to enlighten me with arguments other than what subjects I don't understand.
    Last edited by Sang-drax : Tomorrow at 02:21 AM. Reason: Time travelling

  5. #35
    Registered User
    Join Date
    May 2006
    Posts
    903
    I don't get how that could ever be helpful... When I'm reading, I don't give a damn if a word is a verb or a noun, I just naturally know it. English isn't even my native language... Awful idea..

  6. #36
    CSharpener vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,484
    Quote Originally Posted by Desolation View Post
    I don't get how that could ever be helpful... When I'm reading, I don't give a damn if a word is a verb or a noun, I just naturally know it. English isn't even my native language... Awful idea..
    Sometimes, to understand a sentence correctly, you HAVE to know if the word is a noun or a verb - it can greatly change the meaning of the sentence... Not all people can do it easily with a foreign language... especially if it is English, which does not bother to distinguish verbs and nouns with some additional parts of the word...
    The first 90% of a project takes 90% of the time,
    the last 10% takes the other 90% of the time.

  7. #37
    Lean Mean Coding Machine KONI's Avatar
    Join Date
    Mar 2007
    Location
    Luxembourg, Europe
    Posts
    444
    You're completely forgetting the fact that natural language is by definition ambiguous.
    Sometimes writers don't want you to know if something is a verb or a noun and they make full use of the two very different meanings the phrase would have to express a more subtle message.

    Go literature ! \o/

  8. #38
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    I agree that it's interesting, but it doesn't really have any use other than trivial ones.

  9. #39
    S Sang-drax's Avatar
    Join Date
    May 2002
    Location
    Göteborg, Sweden
    Posts
    2,072
    Quote Originally Posted by KONI View Post
    You're completely forgetting the fact that natural language is by definition ambiguous.
    Sometimes writers don't want you to know if something is a verb or a noun and they make full use of the two very different meanings the phrase would have to express a more subtle message.
    Yeah, but seriously, how common is that?
    Last edited by Sang-drax : Tomorrow at 02:21 AM. Reason: Time travelling

  10. #40
    Moderately Rabid Decrypt's Avatar
    Join Date
    Feb 2005
    Location
    Milwaukee, WI, USA
    Posts
    300
    All the time, if you consider slang and colloquialisms (sp?)

    "He hits a grounder to short."
    "I need to code for awhile if I'm ever going to finish this project."

    First of all, "grounder" isn't even a "proper" English word (I think). Second, "short" is usually an adjective; here it's a noun. The word "code" in the second sentence is usually a noun, but, as we all know, it's used as a verb quite often. Using a simple dictionary and assigning each word to a color would never work. One of the things I always hear about English is that it's a hard language to learn because it has so many "exceptions to the rule." Because of that, this is going to be an immense project.
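A minimal sketch of why the one-tag-per-word dictionary falls over on these two sentences. The `LEXICON` contents and the `dictionary_tag` name are invented for illustration; each word simply gets its "usual" part of speech.

```python
# One fixed tag per word: the "simple dictionary" approach (illustrative entries).
LEXICON = {
    "he": "PRON", "i": "PRON",
    "hits": "VERB", "need": "VERB",
    "a": "DET",
    "to": "PREP", "for": "PREP",
    "short": "ADJ",  # the usual use, wrong for the baseball sentence
    "code": "NOUN",  # the usual use, wrong for "I need to code"
}


def dictionary_tag(sentence):
    """Tag each word by lookup alone; words missing from LEXICON get 'UNKNOWN'."""
    words = sentence.lower().rstrip(".").split()
    return [(w, LEXICON.get(w, "UNKNOWN")) for w in words]


print(dictionary_tag("He hits a grounder to short."))
print(dictionary_tag("I need to code"))
```

With these entries "short" comes back as an adjective, "code" as a noun, and "grounder" is not found at all, which are the three failures described above.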

    You may have to implement it by looking at the general form, instead. By looking at the sentence structure, I think you can determine that "hits" is a verb because "He" is a pronoun. If "hits" is a verb, then "grounder" must be a noun, since "a" is a determiner (pretty sure, anyway). If all of this is true, we can assume that "short" is a noun, since "to" is a preposition.
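The chain of reasoning above can be sketched roughly as follows. This is only a sketch under assumptions: the `CLOSED_CLASS` word list, the `NEXT_TAG` table, and the `structure_tag` name are hypothetical, and the three rules are just the ones used in the "He hits a grounder to short" walk-through.

```python
# Only closed-class ("function") words are listed; everything else is inferred.
CLOSED_CLASS = {
    "he": "PRON", "she": "PRON", "i": "PRON",
    "a": "DET", "the": "DET",
    "to": "PREP", "for": "PREP",
}

# Tag of the previous word -> tag guessed for an unlisted word that follows it.
NEXT_TAG = {"PRON": "VERB", "DET": "NOUN", "PREP": "NOUN"}


def structure_tag(sentence):
    """Tag known function words directly and guess the rest from the previous tag."""
    words = sentence.lower().rstrip(".").split()
    tagged = []
    prev = None
    for word in words:
        tag = CLOSED_CLASS.get(word) or NEXT_TAG.get(prev, "UNKNOWN")
        tagged.append((word, tag))
        prev = tag
    return tagged


print(structure_tag("He hits a grounder to short."))
```

Note that the same three rules would tag "code" in "I need to code" as a noun, because nothing here tells the preposition "to" apart from the "to" of an infinitive.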

    I did not really double check to make sure all of that is correct, but you get the idea. Instead of focusing on what each word means, you'd probably have to start with the underlying structure, and determine where the determiners, prepositions, and the like are. Then you can build up to the nouns and verbs. Probably. Maybe.

    Since there is a set of rules for grammar, it can probably be done. It will be no small task, though. As an ESL teaching tool, it might be pretty useful, actually, since it focuses on the abstract sentence structure, instead of something like "this word is a noun, except in the following cases..."
    There is a difference between tedious and difficult.

  11. #41
    aoeuhtns
    Join Date
    Jul 2005
    Posts
    581
    Quote Originally Posted by Decrypt View Post
    You may have to implement it by looking at the general form, instead. By looking at the sentence structure, I think you can determine that "hits" is a verb because "He" is a pronoun. If "hits" is a verb, then "grounder" must be a noun, since "a" is a determiner (pretty sure, anyway). If all of this is true, we can assume that "short" is a noun, since "to" is a preposition.
    No, there "to short" may also be a verb.
    There are 10 types of people in this world, those who cringed when reading the beginning of this sentence and those who salivated to how superior they are for understanding something as simple as binary.

  12. #42
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,758
    Quote Originally Posted by Rashakil Fol View Post
    No, there "to short" may also be a verb.
    No, Decrypt's point about American English is correct. In baseball, the shortstop is sometimes called short (especially by the announcers). In the sentence "He hits a grounder to short," the shortstop is the indirect object. Provided an ESL student has any inkling of what those are, he should be able to understand it is a noun. But this is where the syntax highlighter could come in and clear things up, because to short is not a verb; to shorten is.

    Ah, how baseball has impacted American culture....
    Last edited by whiteflags; 06-02-2007 at 12:28 PM. Reason: indirect object, excuse me. :o

  13. #43
    aoeuhtns
    Join Date
    Jul 2005
    Posts
    581
    Quote Originally Posted by citizen View Post
    to short is not a verb; to shorten is.
    Play with electricity much? Or stocks?

  14. #44
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,758
    Fair enough. Still, there's context to wrestle with and I don't think that particular sentence makes sense in any other way. I could be wrong, but it doesn't go against his real point anyway. The machine would have to be able to understand context to work perfectly, and I think he knows that.

  15. #45
    Moderately Rabid Decrypt's Avatar
    Join Date
    Feb 2005
    Location
    Milwaukee, WI, USA
    Posts
    300
    No, there "to short" may also be a verb.
    On its own it can be, but, in context, I'm not sure that's the case. If we've determined that the sentence so far is <pronoun> <verb> <determiner> <noun>, I'm not sure that a verb in its general form "to short" makes sense, assuming complete sentences. You're right, the sentence could just as easily read "He sells a stock to short," but the sentence is incomplete; another noun is required, I think: "He sells a stock to short it." Due to that, I think something like the above analysis would work. This is English we're talking about, though, so nearly anything is possible.

    The idea is that, if this highlighter is to work, I think that it'd have to use general grammatical rules instead of a strict set of uses for each word as mentioned above. However, to start, you'd have to have some set of words whose use is iron-clad, and, in the end, you'd probably have to use both a set of grammatical rules and a database of words and their uses to implement it properly.
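A rough sketch of that combination: a small set of fixed words, a database of possible uses for everything else, and grammar rules that pick among the candidates. All of the data is made up for illustration (`FIXED`, `USES`, `PREFER`, the `INF` tag for the infinitive "to", and the `hybrid_tag` name are assumptions), and the rules only look one tag back, so this is nowhere near a full grammar.

```python
# Words whose use is treated as fixed (the "iron-clad" set).
FIXED = {"he": "PRON", "i": "PRON", "it": "PRON", "a": "DET", "the": "DET"}

# Database of possible uses for the remaining words.
USES = {
    "hits": {"VERB", "NOUN"},
    "need": {"VERB", "NOUN"},
    "grounder": {"NOUN"},
    "short": {"ADJ", "NOUN", "VERB"},
    "code": {"NOUN", "VERB"},
    "to": {"PREP", "INF"},  # preposition vs. infinitive marker
}

# Grammar rules: previous tag -> tags preferred for the next word, best first.
PREFER = {
    None: ["PRON", "DET", "NOUN"],
    "PRON": ["VERB"],
    "VERB": ["DET", "INF", "NOUN"],
    "DET": ["NOUN", "ADJ"],
    "NOUN": ["PREP", "VERB"],
    "PREP": ["NOUN", "DET"],
    "INF": ["VERB"],
}


def hybrid_tag(sentence):
    """Pick, for each word, the candidate use the grammar rules prefer in context."""
    words = sentence.lower().rstrip(".").split()
    tagged = []
    prev = None
    for word in words:
        if word in FIXED:
            tag = FIXED[word]
        else:
            candidates = USES.get(word, set())
            preferred = [t for t in PREFER.get(prev, []) if t in candidates]
            if preferred:
                tag = preferred[0]
            elif candidates:
                tag = sorted(candidates)[0]  # no rule applies: arbitrary but stable
            else:
                tag = "UNKNOWN"
        tagged.append((word, tag))
        prev = tag
    return tagged


print(hybrid_tag("He hits a grounder to short."))
print(hybrid_tag("I need to code"))
```

With this data "short" comes out as a noun after the preposition "to", while "code" comes out as a verb after the infinitive "to". Neither the word database nor the rules could make that call alone.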
