Thread: Proper nouns

  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    932

    Proper nouns

    I would like to uppercase proper nouns in a file.

    So I thought I would make a list of proper nouns in a file b.txt, open both a.txt and b.txt, and compare each word in a.txt against the list in b.txt.
    Am I on the right track with this?
    Wouldn't it slow down the program significantly?

    Thanks!

  2. #2
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Yes, it would. You should load the dictionary into an in-memory data structure (e.g. a hash map or a trie) and do the lookup there. File I/O is simply too slow, unless you memory-map the file and it already has a good internal structure, but that's advanced stuff.
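
A minimal sketch of the hash-based variant of this advice, assuming the word list holds one proper noun per line. The names `load_dictionary` and `is_listed` are made up for illustration, and the function takes any `std::istream` so that it works both with a `std::ifstream` opened on b.txt and with an in-memory stream.

```cpp
#include <istream>
#include <sstream>   // only needed for the std::istringstream in the usage note
#include <string>
#include <unordered_set>

// Read one word per line into a hash set that lives in memory.
// (Hypothetical helper; the thread does not prescribe this layout.)
std::unordered_set<std::string> load_dictionary(std::istream& in) {
    std::unordered_set<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (!line.empty()) words.insert(line);
    }
    return words;
}

// Average-case constant-time lookup; the file is no longer touched.
bool is_listed(const std::unordered_set<std::string>& dict,
               const std::string& word) {
    return dict.count(word) != 0;
}
```

The file is read once, for example with `std::ifstream file("b.txt"); auto dict = load_dictionary(file);`, and every later check is a plain call such as `is_listed(dict, "bo")`. For a quick experiment without a file, a `std::istringstream` can stand in for it.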

    The main problem is that there's overlap between proper nouns and other words. How do you decide which one is meant?
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  3. #3
    Registered User
    Join Date
    Dec 2007
    Posts
    932
    Do you mean an overlap like the one between the proper noun Bo and the word bowling?

    If that is what you mean, they are not the same word.

    Or do you mean proper nouns that are also ordinary words, like bill?

    Yes, that would indeed cause a problem.

    In that case, the only thing I can think of is making a dictionary that contains only proper nouns with no other meaning.
    Last edited by Ducky; 12-30-2007 at 01:20 PM.

  4. #4
    Registered User
    Join Date
    Dec 2007
    Posts
    11
    To get around that you could try to use some kind of grammar-checking code, but that would be really complicated and slow.

  5. #5
    Registered User
    Join Date
    Dec 2007
    Posts
    932
    Ah yes, I think I see what you mean; that's a good idea, actually.

    I think you mean checking whether there is a verb next to it, or something like that.

    Yes, that could be pretty hard to implement.

  6. #6
    Registered User
    Join Date
    Dec 2007
    Posts
    11
    In all honesty, I'm not the best person to ask for help; I'm really new to programming. It just struck me that programs such as word processors can somehow tell whether a word is a proper noun, but I have no idea how.

  7. #7
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Bill is one example.

    OK, another problem: what about invented names? Can you recognize Jingizu, A'sua, and all that stuff from fantasy and science fiction novels? Or even simple foreign names, like Günther?

    Language analysis is extremely difficult, precisely because language is so ambiguous.

  8. #8
    Registered User
    Join Date
    Dec 2007
    Posts
    11
    Is this program scanning a text document and then highlighting the proper nouns? Because you could scan for all words beginning with a capital letter, and then filter out things that are not proper nouns, like sentence starts and titles (Mr, Mrs, Doc).
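
The capital-letter heuristic from this post could be sketched roughly as follows. The function name `capitalized_candidates` and the sentence-boundary rule (a token ending in `.`, `!` or `?`) are assumptions made for the example, and the approach only helps when the input text is already capitalized. It also misses any proper noun that happens to start a sentence.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect capitalized words that are neither sentence starts nor titles.
// (Illustrative sketch only; ASCII text is assumed.)
std::vector<std::string> capitalized_candidates(const std::string& text) {
    static const std::set<std::string> titles = {"Mr", "Mrs", "Doc"};
    std::vector<std::string> found;
    std::istringstream in(text);
    std::string token;
    bool sentence_start = true;
    while (in >> token) {
        bool ends_sentence = false;
        // Strip trailing punctuation and remember sentence terminators.
        while (!token.empty() &&
               std::ispunct(static_cast<unsigned char>(token.back()))) {
            char c = token.back();
            if (c == '.' || c == '!' || c == '?') ends_sentence = true;
            token.pop_back();
        }
        // Strip leading punctuation such as opening quotes.
        while (!token.empty() &&
               std::ispunct(static_cast<unsigned char>(token.front()))) {
            token.erase(token.begin());
        }
        if (token.empty()) continue;  // pure punctuation: keep current state
        bool is_title = titles.count(token) != 0;
        bool capital = std::isupper(static_cast<unsigned char>(token[0])) != 0;
        if (capital && !sentence_start && !is_title) found.push_back(token);
        // The dot after a title such as "Mr." does not end a sentence.
        sentence_start = ends_sentence && !is_title;
    }
    return found;
}
```

For the input `Mr. Smith went to Paris. He met Bill.` this yields Smith, Paris and Bill, while "Mr" and the sentence-initial "He" are filtered out.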

  9. #9
    Registered User
    Join Date
    Dec 2007
    Posts
    932
    @CornedBee
    I think the user of the program could supply a list of proper nouns in the language of the file that he intends to modify, and he could add invented names to that list.

    @yblad
    No, actually the program is meant to capitalize proper nouns.

  10. #10
    Registered User
    Join Date
    Dec 2007
    Posts
    11
    Sorry, that was a typo; I meant uppercase them. Letting users add their own names sounds like a good idea.

  11. #11
    Registered User
    Join Date
    Dec 2007
    Posts
    932
    OK, I'm starting to realize that in C++ the same thing can have different names.
    So there is the way we beginners learn things, and there is the way you "gods" talk.
    Now, by any chance, when CornedBee tells me to "load the dictionary into an in-memory data structure", could that mean using pointers to each word of the dictionary?
    Because I can't seem to find a tutorial on in-memory data structures.
    Last edited by Ducky; 12-31-2007 at 10:46 AM.
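
On the pointer question: an "in-memory data structure" is simply a container that lives in RAM, and the standard containers manage their own storage, so no hand-written pointers are required. The sketch below uses a sorted `std::vector<std::string>` with `std::binary_search` as one simple option. This is a swapped-in alternative to the hash map or trie named earlier in the thread, and the names `load_sorted_words` and `contains` are invented for the example.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>   // only needed for the std::istringstream in the usage note
#include <string>
#include <vector>

// Read whitespace-separated words into a vector and sort it once.
std::vector<std::string> load_sorted_words(std::istream& in) {
    std::vector<std::string> words;
    std::string word;
    while (in >> word) words.push_back(word);
    std::sort(words.begin(), words.end());
    return words;
}

// Logarithmic-time lookup; requires the vector to be sorted.
bool contains(const std::vector<std::string>& sorted_words,
              const std::string& word) {
    return std::binary_search(sorted_words.begin(), sorted_words.end(), word);
}
```

The vector owns every string it holds, so the words stay valid for as long as the vector does. The stream can be a `std::ifstream` opened on the word list or a `std::istringstream` for testing.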

  12. #12
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote: originally posted by CornedBee
    Or even simple foreign names, like Günther?
    Günther isn't really a foreign name where I live. I guess that would depend on where in the US or the world you reside. If you are doing language analysis, most researchers have had good luck using multilayer perceptrons.

  13. #13
    Registered User
    Join Date
    Dec 2007
    Posts
    932
    Thanks for the advice, but I don't think I'm going to go that far.
    Right now I'm just looking into how to load a file into memory.
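
One common way to load a whole file into memory is to copy the stream into a `std::string` with `std::istreambuf_iterator`. The sketch below pairs that with a whole-word pass that capitalizes the first letter of every listed name. The helper names `read_all` and `capitalize_listed` are hypothetical, "uppercase" is read here as "capitalize the first letter", and only ASCII letters are handled, so a name like Günther would need locale-aware or Unicode-aware code.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>   // only needed for the std::istringstream in the usage note
#include <string>
#include <unordered_set>

// Copy the entire contents of a stream into one string.
std::string read_all(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Capitalize the first letter of each whole word found in `names`.
// Whole-word matching keeps "bo" from matching inside "bowling".
std::string capitalize_listed(std::string text,
                              const std::unordered_set<std::string>& names) {
    std::size_t i = 0;
    while (i < text.size()) {
        if (!std::isalpha(static_cast<unsigned char>(text[i]))) {
            ++i;
            continue;
        }
        std::size_t start = i;
        while (i < text.size() &&
               std::isalpha(static_cast<unsigned char>(text[i]))) {
            ++i;
        }
        if (names.count(text.substr(start, i - start)) != 0) {
            text[start] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(text[start])));
        }
    }
    return text;
}
```

With a real file the call would look like `std::ifstream file("a.txt"); std::string text = read_all(file);`, which additionally needs `<fstream>`. Ambiguous words such as "bill" are still capitalized everywhere, which is exactly the overlap problem raised earlier in the thread.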
