Thread: IDEA: Text Summarization

  1. #1
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640

    IDEA: Text Summarization

    I sat in on a master's thesis seminar that inspired this idea. The problem is to create a summary of a given document. The trade-off is accuracy versus readability.

    Example problem:

    Write a program that will create a 50-word summary of a 300-word document. Summaries should be judged in 2 categories: Accuracy and Human Readability.

    Extreme solutions:

    1) Take the top 50 words based on frequency count. The accuracy rating will be high; human readability will be low.

    2) Take the first 'n' sentences, up to 50 words. Human readability will be high, but many important points in the document will be left out, resulting in a low accuracy score.
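
    The two extremes above could be sketched roughly like this in Python. The function names, the regex tokenizer, and the naive sentence splitter are all assumptions made for illustration; the seminar did not prescribe any of them.

```python
import re
from collections import Counter

def top_words_summary(text, limit=50):
    """Extreme 1: the `limit` most frequent words, most frequent first."""
    # Crude tokenizer (an assumption): lowercase runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(word for word, _ in Counter(words).most_common(limit))

def lead_summary(text, limit=50):
    """Extreme 2: leading sentences, stopping before `limit` words is exceeded."""
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chosen, used = [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if used + n > limit:
            break
        chosen.append(sentence)
        used += n
    return " ".join(chosen)
```

    Note that the first function will happily fill the summary with words like "the" and "a" unless common words are filtered out, which is left out here to keep the sketch short.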

    The objective is to find a balance somewhere: a summary that encompasses the most information while still being human readable.
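
    One possible middle ground (just a sketch, not the method from the seminar) is to keep whole sentences so the result stays readable, but choose them by how many frequent words they carry. The scoring rule here, average word frequency per sentence, and the name balanced_summary are assumptions made for illustration.

```python
import re
from collections import Counter

def balanced_summary(text, limit=50):
    """Pick whole sentences by average word frequency, within a word budget."""
    # Naive sentence splitter and tokenizer; both are simplifying assumptions.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    freq = Counter(word for tokens in tokenized for word in tokens)

    def score(i):
        tokens = tokenized[i]
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    # Greedily take the best-scoring sentences that still fit the budget.
    ranked = sorted(range(len(sentences)), key=score, reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n = len(sentences[i].split())
        if used + n <= limit:
            chosen.append(i)
            used += n
    # Emit in original document order so the summary reads naturally.
    return " ".join(sentences[i] for i in sorted(chosen))
```

    Averaging rather than summing keeps long sentences from winning on length alone; whether that actually scores better with human judges is an open question.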

    I thought this might be a good contest idea, as anyone who can read a file into a program can participate. Newbs can use simple sentence or word selection algorithms, while more advanced programmers can dip into areas of NLP (natural language processing) or anything else they can think of.

    Just a thought...

  2. #2
    Pursuing knowledge confuted's Avatar
    Join Date
    Jun 2002
    Posts
    1,916
    Good God that sounds complicated. How long are you expecting this contest to take?
    Away.

  3. #3
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    Originally posted by blackrat364
    Good God that sounds complicated. How long are you expecting this contest to take?
    It's really not that complicated. You could write a 5-minute program that just selects a few sentences, and it could possibly perform better than someone's 2-week implementation of some nasty NLP-based word selection algorithm.

    You basically just need to count some words, decide which words/sentences to take, and write them to a file. The only thing that might take a long time is judging. I volunteer blackrat for that task lol
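
    To show how little plumbing an entry needs, here is a rough Python sketch of the read, select, write loop. Everything named here (summarize_file, first_words, the select parameter) is hypothetical; the selector is a deliberately dumb placeholder that an actual entry would swap for its own word or sentence picker.

```python
from pathlib import Path

def first_words(text, limit):
    """Placeholder selector: just the first `limit` words of the document."""
    return " ".join(text.split()[:limit])

def summarize_file(src, dst, select=first_words, limit=50):
    """Read `src`, run the selector, enforce the word limit, write `dst`."""
    text = Path(src).read_text(encoding="utf-8")
    summary = select(text, limit)
    # Reject entries that go over the word budget (an assumed contest rule).
    if len(summary.split()) > limit:
        raise ValueError("summary exceeds the word limit")
    Path(dst).write_text(summary + "\n", encoding="utf-8")
    return summary
```

    Usage would be something like summarize_file("document.txt", "summary.txt"), with both file names made up for the example.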
