File Compression

**Gr3g** · 04-16-2002

Hey,

I’m going to write a text compression program, and I’ve tried to keep it relatively simple. Here is the algorithm I will use. Please give me some feedback on whether the algorithm is logical, and if you have ideas for increasing compression without significantly increasing complexity, please reply.

To aid the explanation, I’m going to use a simple example text that I want to compress: "hello the cat is on the big wall."

1. Read a word from file.

2. If the word is in the library, replace it with the number that corresponds to the word.

3. If the word is not in the library, put it in the library and replace the word with the number that corresponds to its position in the library.

4. Repeat 1-3 until file is completely read.
When the previous steps are finished, the output will look something like this:

hello the cat is on big wall (note: word position defines the number used to represent the word)
1 2 3 4 5 2 6 7

5. Go through and look for numbers that are used only once. When one is found, delete its word from the list and replace the number in the output with the actual word.

6. Find the most common number and move the word that corresponds to that number to the beginning of the word list. For example, if "the" is the most common word and appears 100 times but is represented by the number "11", then changing its number to "1" will halve the number of characters required to represent "the".
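For anyone who wants to experiment, here is a minimal Python sketch of the six steps above. It is only one possible reading of the algorithm, under a few assumptions: words are separated by whitespace (so punctuation stays attached, as in "wall."), and the output is kept in memory as a list in which integers are word numbers and strings are literal words. The function names `encode`, `optimize`, and `decode` are made up for illustration. Step 6 is also generalized slightly: instead of moving only the single most common word to the front, every remaining word is numbered in order of frequency.

```python
from collections import Counter


def encode(text):
    """Steps 1-4: replace each word with its 1-based position in the library."""
    library = []  # word list; a word's position defines its number
    index = {}    # word -> number, for fast lookup
    codes = []
    for word in text.split():
        if word not in index:
            library.append(word)
            index[word] = len(library)
        codes.append(index[word])
    return library, codes


def optimize(library, codes):
    """Steps 5-6: inline words used once, renumber the rest by frequency."""
    counts = Counter(codes)
    # Step 5: only numbers used more than once stay in the library.
    # Step 6 (generalized): most frequent first, ties by original position.
    repeated = sorted((n for n in counts if counts[n] > 1),
                      key=lambda n: (-counts[n], n))
    renumber = {old: new for new, old in enumerate(repeated, start=1)}
    new_library = [library[old - 1] for old in repeated]
    # Integers are word numbers; strings are literal words put back in place.
    stream = [renumber[n] if n in renumber else library[n - 1] for n in codes]
    return new_library, stream


def decode(library, stream):
    """Rebuild the text; original whitespace is assumed to be single spaces."""
    return " ".join(library[t - 1] if isinstance(t, int) else t
                    for t in stream)


if __name__ == "__main__":
    library, codes = encode("hello the cat is on the big wall.")
    print(codes)  # [1, 2, 3, 4, 5, 2, 6, 7]
    library, stream = optimize(library, codes)
    print(library, stream)
    print(decode(library, stream))
```

One thing the sketch sidesteps: once the mixed stream is written to a file, a literal word that happens to look like a number (say "42") can no longer be told apart from a word number, so a real file format would need some kind of marker or escape for literals.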

Thank you,
Gr3g

**Prelude** · 04-16-2002

It looks good to me. What do you plan to do when your word list becomes very large, though? Perhaps if you were compressing Knuth's The Art of Computer Programming series. *grin*

-Prelude
