Thread: Two things - Seach for name exact or simialr to user input, and name recognition

  1. #1
    Registered User
    Join Date
    Aug 2007
    Posts
    22

    Two things - Seach for name exact or simialr to user input, and name recognition

    Hey,

    I have an assignment for school, not too hard just stuck on a couple of things.

    One the teacher ecpects us to search for a name the user inputs and match it to the closest match, or couple of matches i guess, never done this, so just a little confused on how to do it.
    And the other thing is he gave us a txt file to read and write info to, such as fname sname pnumber etc, anyway we have to input the text and store it in the array, which is not hard but we are to expect that the first name and surname can both include white space, for example two first names and one surname or any other combo, the thing is we cant make any markers in the txt file such as a _ between the names which we get rid of in output. Any ideas would be helpful

    Thnx in advance.

  2. #2
    The larch
    Join Date
    May 2006
    Posts
    3,573
    So, what are the couple of things you are stuck with? What do you have so far?

    Neither looks like a trivial task. The second seems quite impossible (if there are no rules, I can't see how a program should sort out where the surname starts).
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  3. #3
    Registered User
    Join Date
    Aug 2007
    Posts
    22
    Well ive started the code now going on the assumption that the person only has one first name and one surname, just so i could finish the rest of the functions but my problem is the algorithm, i just cant think of a way, i dont want any code, im just not sure about the logic behing both issues:

    1)searching for a name equal too or similar to user input
    2) input from txt without any markers etc, assuming there can be white space in both first and surname.

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    if you have NO markers, there is no possibility to determine what is a first name or last name. It's just not possible:
    Peter James Osboure 1234567890

    What is first name and last name of that? It's not possible to tell the difference between "Peter James" "Osbourne" and "Peter" "James Osbourne".

    I would query the teachers thoughts on this, because it is either not well thought out, or you have misunderstood something.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Registered User
    Join Date
    Aug 2007
    Posts
    22
    thats what i thought but im waiting for his reply, anyway any ideas about the searching for similar names ?

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    any ideas about the searching for similar names ?
    You could search the Web for "soundex", though I am not sure if such an algorithm is overkill and your teacher just wants you to search for matching substrings.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #7
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    >1)searching for a name equal too or similar to user input

    You could look up (and implement) edit distance. Then you just find the record with the smallest edit distance to the query.

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Aside from soundex, you can do "least difference matching", which would involve counting how many characters in a row matches, and display the "most number of matching in a row". Or how large a proportion (e.g. a percentage) of the characters that match the search-term.

    These are both fairly simple algorithms to use, so shouldn't be too difficult to implement.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    Registered User
    Join Date
    Aug 2007
    Posts
    22
    Thanks Heaps for help, that should keep me going

  10. #10
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    >Or how large a proportion (e.g. a percentage) of the characters that match the search-term.

    Becareful of this one, %100 precision says nothing about recall. (i.e. the entire alphabet as a query string will have a %100 match to every name, but likely isn't close to any of them).

    Matching the longest prefix is probably the best simple thing to do, from there I'd try computing edit distance.

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Yes, single chars and percentage are not a good working model. But then editing distance or soundex aren't going to be HUGELY better when the user enters a single letter - sure they will be better, but the results (on a large phone directory) for entering a single letter will still be hopelessly useless.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  12. #12
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    If the struct with the first name and last name are saved in "struct" format, there is no issue with determining what is the first or last name.

    If a struct is not used, similarly, you could use fixed length fields, like the fname is 30 bytes and sname is 40 bytes.

    As for a fuzzy match when searching, you could keep stats like how many characters matched, how many characters that matched were in order, etc.

    Todd
    Mainframe assembler programmer by trade. C coder when I can.

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Since the input file is a TEXT file, you should not store structs in it. But using fixed width would of course be a possibility for storing first and last name.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

Popular pages Recent additions subscribe to a feed