Thread: It's Alive! I've Built a Monster!

    I am thinking about putting my life's work out as freeware, and was wondering whether anybody would be interested in using it.

    I once spent far too long locked up in a dark little room (around 3 years, to be precise) where I built a monster. My monster is an NLP Lexical Reference API and a large amount of English language content I amassed from free sources around the internet.

    Applications for my API would include chatbots, document retrieval, translation, document categorisation & comparison, summarisation, spell checking, word games etc. I suspect this kind of thing will be big in a few years. I know that M$ & Kodak are pouring money into it. My API works with VC++ 6 and BCB, and I believe it will work with Bloodshed (though I haven't tested it). But beware, it's not for the faint-hearted.

    I am asking for interest because it would take me a couple of weekends to get everything ready & update my site. I put a huge amount of effort into my monster & would like to see it go somewhere, and get some feedback on what people think. But if there's little interest, I'll leave it until I have more time on my hands.

    If you want to get a feel for what I've done, you can have a look at my site (address below). There is a shareware application, called Tara Lexical Desktop, which I am (as of this moment) making freeware. This application was built with my API. You are all welcome to it, and feedback, good, bad or ugly, is requested.

    As well as putting my API (which is fully documented) out as freeware, I plan to put the content data (lexical files) out as freeware too.

    So post me a reply if you would be interested, or could have a use for this kind of thing.

    I grant all those in possession of this notice the following:

    Your Software Activation Code is as follows:


    You can unlock Tara Lexi Desktop immediately by entering this string into the application using the 'Update License' option under the Tools menu. Ensure you keep this Activation Code safe in case you need to re-install the software.

    Alpha India 5 Ltd grants you the right to install and use the product on an UNLIMITED number of computers, provided such computers are located at a single physical site. Please refer to the License Agreement which accompanies the software installation for further details.

    Andy Thomas
    AKA Davros

    I've just downloaded your program and it looks very nice.
    The search function is the best I've seen anywhere. Near-match really helps to find the correct spelling.

    One thing: Why am I unable to select text from the definition window to the right?

    Have you used a program called Babylon? It is a dictionary that is invoked automatically when the user holds Shift and clicks the right mouse button, and it translates the highlighted word (from any program).

    As for using your API, I don't have any projects going on that require such functions.
    How much time did you have?

    More time than me! God damn 56k....anyway I got the full version and I must say..

    Nice job. Will be useful.
    Such is life.

    holy crap dood, did you copy+paste all those words off the net?? i bet you went through a couple of keyboards pushing all those ctrl c and p buttons, damn.

    Looks to be fairly interesting, and it handles well in the standard tree of Wine.

    I shudder to think about the time it took to compile that list...

    Also, you mentioned freeware...

    Have you looked into the GPL or the AFPL? Much better for distribution than a freeware or public domain license.

    Wow thats an ugly mug...

    Peer into the forehead of Davros...


    Hey thanks for the posts. Here are my replies:

    >One thing: Why am I unable to select text from the definition window to the right?

    The window on the right isn't an 'edit control'. And the 'hot track' synonyms are my own construction. While it won't let you select text from the definition directly, it will let you copy the definition to clipboard (see Edit menu).

    >More time than me! God damn 56k....anyway I got the full version and I must say.

    Yeh I know what you mean. I uploaded the whole lot with a dialup. It took around 8 hours.

    >holy crap dood, did you copy+paste all those words off the net??

    Not quite, but my efforts were not insignificant. What I did was design my own lexicon structure, and use disparate source data to populate it. The two principal sources of data were WordNet & Moby, which are both freely available. However, I would be hurt if people thought I simply pasted these into one file.

    Moby is good at listing a huge number of word spellings; and while it attempts to categorise words as noun/verb/adjective etc., many words are incorrectly categorised and it doesn't seem to have a really consistent approach. WordNet provides some really useful content, including definitions, antonyms, hypernyms etc. However, WordNet only contains nouns, verbs, adjectives and adverbs. Also, WordNet only lists base word forms, so it would list the word 'heavy', but not 'heavier' or 'heaviest'.

    So in building the data, I matched such words with their base form using morphology, that is, rules about how words are inflected. If you look up the word 'heaviest' in Tara, you will find the definition 'adj. (superlative of 'heavy')', while all synonyms, antonyms etc. will be of superlative form, again found by morphology. 'Tables' is there also as the plural form of 'table'. However, 'Londons' is not there because London is a proper noun and has no plural form. See what I mean? To do all this properly took me three years. I also added several hundred word definitions by hand.
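
    To give a rough idea of what matching an inflected word to its base form can look like, here is a minimal C++ sketch. It is not the Tara API or the actual method used to build the data: the names PartOfSpeech, Lexicon, SuffixRule, Analysis, findBaseForm, describe and sampleLexicon are all made up for illustration, the suffix rules are a tiny hand-picked set, and the lexicon holds three words. It only shows the general idea of stripping a suffix and accepting the result when it is a known base form of the right part of speech.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical part-of-speech tags for this sketch.
enum class PartOfSpeech { CommonNoun, ProperNoun, Adjective };

// A toy lexicon that lists base forms only.
typedef std::map<std::string, PartOfSpeech> Lexicon;

// One suffix rule: strip `suffix`, append `replacement`, and accept the
// result only if it is a base form whose part of speech is `appliesTo`.
struct SuffixRule {
    std::string suffix;
    std::string replacement;
    PartOfSpeech appliesTo;
    std::string relation;
};

// Result of a successful match: the base form and how the word relates to it.
struct Analysis {
    std::string base;
    std::string relation;
};

// True when `word` ends with `suffix` and has at least one character before it.
static bool endsWith(const std::string& word, const std::string& suffix) {
    return word.size() > suffix.size() &&
           word.compare(word.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns true and fills `out` when `word` is a recognised inflection of a
// base form in `lexicon`. Rules are tried in order, longest suffix first.
bool findBaseForm(const std::string& word, const Lexicon& lexicon, Analysis& out) {
    static const std::vector<SuffixRule> rules = {
        {"iest", "y", PartOfSpeech::Adjective,  "superlative of"},
        {"ier",  "y", PartOfSpeech::Adjective,  "comparative of"},
        {"est",  "",  PartOfSpeech::Adjective,  "superlative of"},
        {"er",   "",  PartOfSpeech::Adjective,  "comparative of"},
        {"es",   "",  PartOfSpeech::CommonNoun, "plural of"},
        {"s",    "",  PartOfSpeech::CommonNoun, "plural of"},
    };
    for (const SuffixRule& rule : rules) {
        if (!endsWith(word, rule.suffix)) continue;
        const std::string candidate =
            word.substr(0, word.size() - rule.suffix.size()) + rule.replacement;
        Lexicon::const_iterator it = lexicon.find(candidate);
        // Proper nouns never match the plural rules, so 'Londons' is rejected.
        if (it != lexicon.end() && it->second == rule.appliesTo) {
            out.base = candidate;
            out.relation = rule.relation;
            return true;
        }
    }
    return false;
}

// Builds a short derived definition, or an empty string if nothing matched.
std::string describe(const std::string& word, const Lexicon& lexicon) {
    Analysis a;
    if (!findBaseForm(word, lexicon, a)) return "";
    return "(" + a.relation + " '" + a.base + "')";
}

// Three base forms, just enough to exercise the examples from the post.
Lexicon sampleLexicon() {
    Lexicon lex = {
        {"heavy",  PartOfSpeech::Adjective},
        {"table",  PartOfSpeech::CommonNoun},
        {"London", PartOfSpeech::ProperNoun},
    };
    return lex;
}
```

    With the sample lexicon, describe("heaviest", lex) gives (superlative of 'heavy'), describe("tables", lex) gives (plural of 'table'), and describe("Londons", lex) gives an empty string. Real English needs far more rules plus an exception list for irregular forms, which this sketch leaves out.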

    >Have you looked into the GPL or the AFPL? Much better for distribution than a freeware or public domain license.

    Yes. Thanks for the suggestion. I use the term 'freeware' loosely at the moment. While I do want to put everything out in an open way, I still need to look closely at the licensing.

    >Wow thats an ugly mug. Peer into the forehead of Davros...

    Thanks for that.

    There are a lot of really powerful things you can do with software if you have access to a large stock of language data. Companies charge high prices for enterprise solutions which rely on such data. While there is a lot of language data freely available out there, it is disparate & in many cases you will need a PhD in lexicography just to break the ice if you want to understand it. I wanted to bring all this together under one roof, and present it to programmers in a much more accessible way.

    Edit: I am planning to do this in a non-commercial way. So please ignore any commercial references on my site, which I'll be updating in the near future.

    This is what I would like to achieve.

    Let me know what you think.



    All work & no play makes Davros a dull boy.
