Thread: Compress and uncompress? Please help....

  1. #1
    Registered User
    Join Date
    Apr 2008
    Posts
    57

    Question Compress and uncompress? Please help....

    I don't understand what is being asked.
    I don't want the code, I just want to know how to go about it and where to look for examples of this.

    A simple scheme for creating a compressed version of a text file can be used for files which contain no digit characters. The compression scheme requires making a list of the words in the uncompressed file. When a non-alphabetic character is encountered in the uncompressed file, it is copied directly into the compressed file. When a word is encountered in the uncompressed file, it is copied directly into the compressed file only if this is the first occurrence of the word. In that case, the word is put at the front of the list. If it is not the first occurrence, the word is not copied to the compressed file. Instead, its position in the list is copied into the compressed file and the word is moved to the front of the list. The numbering of list positions begins at 1.

    Write a program that takes a compressed file as input and generates a reproduction of the original uncompressed file as output. You can assume that no word contains more than 50 characters and that the original uncompressed file contains no digit characters.

    For the purposes of this problem, a word is defined to be a maximal sequence of upper- and lower-case letters. Words are case-sensitive - the word abc is not the same as the word Abc. For example,

    x-ray, two words: x and ray
    that's it, three words: that and is it

    There is a 100 word upper limit on the number of different words in the input file. The end of the input file is signified by the number 0 on a line by itself. The terminating 0 merely indicates the end of the input and should not be part of the output produced by your program.

    Sample Input

    Dear Sally,

    Please, please do it--1 would 4
    Mary very, 1 much. And 4 6
    8 everything in 5's power to make
    14 pay off for you.

    -- Thank 2 18 18--
    0

    Sample Output

    Dear Sally,

    Please, please do it--it would please
    Mary very, very much. And Mary would
    do everything in Mary's power to make
    it pay off for you.

    -- Thank you very much--

  2. #2
    Registered User
    Join Date
    Jan 2008
    Posts
    28
    uhhh. Is this all the info you were given with this problem? There must be something missing....

    Based on the rules from your explanation, this is what the 'Sample Input' should actually look like:

    "Please, please do it--4 would 1 Mary very, 7 much. And 6 5 3 everything in 6's power to make 4 pay off for you."

    Did you copy this verbatim?

    -Feen
    Last edited by Feengur; 03-02-2010 at 03:29 PM. Reason: spelin' mestake

  3. #3
    Registered User
    Join Date
    Jan 2010
    Location
    Germany, Hannover
    Posts
    15
    well, interesting task
    I don't understand what is being asked.
    what is being asked is clear:
    Write a program that takes a compressed file as input and generates a reproduction of the original uncompressed file as output.
    how to do this ? first u have to read the input file line by line. then u have to process each line char by char and collect chars into a word , until a non alphabetical char is met (not a-z, A-Z, maybe digit or line feed)
    with the given boundaries ( max. no of words, max length of each word ) i would suggest the use of a char array ( e.g.: char word[100] [50] ), so u can solve the problem easier without the use of pointers.
    ( the description however seems to favour a solution with the use of a linked pointer list "the word is put at the front of the list..." )
    each word u recognize u put into the array. if u parse a number ( don't forget it can have 2 digits ) get that word from ur array word list and write it to the output file.
    u have to calculate the correct index, it should be somthing like ~ max-index + 1 - number

    if u recognize a char, that is no letter nor digit - write it to the output file.
    if u get the "0", ur done. close output and input file...

    happy coding
    Last edited by kermitaner; 03-02-2010 at 02:36 PM.

  4. #4
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Based on the problem statement, the contents of the compressed input file make no sense.
    Concur with Feengur, tho' Feengur's list positions are off by 2 as "Dear Sally" got overlooked.

  5. #5
    Registered User
    Join Date
    Jan 2008
    Posts
    28
    woops!

    I missed the 'Thank you' part as well....

  6. #6
    Registered User
    Join Date
    Apr 2008
    Posts
    57
    ok, I am sorry, how would I get the index. Feengur and itCbitC, how are you figuring this out?

  7. #7
    Registered User
    Join Date
    Jan 2010
    Posts
    412
    Actually the input does make sense.
    Using this part as example.
    Code:
    Please, please do it--1 would 4
    Mary very, 1 much. And 4 6
    Code:
    Please, please do it--1 would 4
    For the 1, look 1 word back ("it")
    For the 4, look 4 words back ("please") and then move "please" to the front. The input word list is now
    { Please, do, it, would, please }

    Output text:
    Please, please do it--it would please

    Code:
    Mary very, 1 much. And 4 6
    For the 1, look 1 word back ("very")
    For the 4, look 4 words back ("Mary") and then move "Mary" to the front. The input word list is now
    { Please, do, it, would, please, very, much, And, Mary }

    For the 6, look 6 words back ("would") and then move "would" to then front. Input word list is now
    { Please, do, it, please, very, much, And, Mary, would }

    Output text:
    -- Mary very, very much. And Mary would

    Although these lines
    Code:
    14 pay off for you.
    -- Thank 2 18 18--
    seems wrong. Unless I'm missing a word somewhere they should be
    13 pay off for you.
    -- Thank 2 17 17--[/CODE]

  8. #8
    Registered User
    Join Date
    Mar 2010
    Posts
    3
    okay... everyone here is very close but not quite there... this is supposed to be read from a file... and whenever a word is read it is added to the front of the array (it , do , please) is what we get when we hit the first number... which calls the first value of the array we have... meaning that it would be used in ones place and MOVED to the front of the array ( would it do please ) would be when we hit the 4 which would cause us to use the word please as the number 4 (fourth spot in the array)... which would leave us with (please would it do)... so on so forth...
    Last edited by sk8er2017; 03-02-2010 at 08:58 PM.

  9. #9
    Registered User
    Join Date
    Apr 2008
    Posts
    57
    this is the part I don't understand. do you count the numbers as words to count back?

  10. #10
    Registered User
    Join Date
    Mar 2010
    Posts
    3
    Quote Originally Posted by patso View Post
    this is the part I don't understand. do you count the numbers as words to count back?
    im just going to say no... think of it as an array and when you read another word you put it to the front... *reading it backwards but it doesn't matter* ... then whatever number you encounter is the word in the array (1 being first word)... that clear it up?

  11. #11
    Registered User
    Join Date
    Apr 2008
    Posts
    57
    i believe so

  12. #12
    Registered User
    Join Date
    Jan 2008
    Posts
    28
    Your explanation makes sense sk8er, but the problem itself does not......

    I guess I overlooked the line that stated that a new occurrence of a word would be added to the front of the list.

    Why would you ever want to do it this way?

    @patso
    Is this for a class? If so, what is the name of the class and what 'level' is it?

    -Feen

  13. #13
    Registered User
    Join Date
    Apr 2008
    Posts
    57
    it is for an intro to algorithms course. It is first year second semester

  14. #14
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    IMO, the o/p needs an array of char* pointers, each pointing to a unique word while the array index denotes the position of the word, the first of its kind, in the input text. As list positions start at 1, the zeroth (first) element of the array needs to point somewhere, and one way would be to point to the last word seen. That perhaps may resolve the confusion caused by "...and the word is moved to the front of the list", tho' really the o/p should clarify that.

  15. #15
    Registered User
    Join Date
    Mar 2010
    Posts
    3
    yeah the problem doesn't make much sence in my opinion either ... its just a question that our professor used from International ACM Programming Contest ... and we are really supposed to be doing it the way i explained it but he kinda did it a different way today in a lab... like read the file one line at a time putting it into a char array then reading the array one char at a time putting each char into another variable word which is then dumped into a wordlist (that i was talking about) whenever you hit a delimiter... this will generate the wordlist necessary for the numbers... then whenever you hit a number while checking the chars you would do the same (read until a space (for the case of 18 which has two numbers)) then replace it with whichever word from the wordlist we have created.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Using zlib to compress traffic
    By StanleyC in forum Networking/Device Communication
    Replies: 1
    Last Post: 08-12-2004, 02:18 PM