I'm going to write a text compression program, and I've tried to keep it relatively simple. Here is the algorithm I will use. Please give me some feedback on whether the algorithm is logical, and reply if you have additional ideas on how to increase compression without increasing complexity significantly.
To aid in explanation, I'm going to use a simple pretend text that I want to compress: "hello the cat is on the big wall."
1. Read a word from file.
2. If the word is in the library, replace it with the number that corresponds to that word.
3. If the word is not in the library, put it in the library and replace the word with the number that corresponds to it in the library.
4. Repeat 1-3 until file is completely read.
When the previous steps are finished, the output will look something like:
Quote:
hello the cat is on big wall (note: word position defines number used to represent word)
1 2 3 4 5 2 6 7
5. Go through and look for numbers that are used only one time. When one is found, delete that word from the list and replace the representing number with the actual word.
6. Find the most common number and move the word that corresponds to that number to the beginning of the word list. For example, if "the" is the most common word and it is used 100 times but is represented by the number "11", then changing the representing number to "1" will halve the number of characters required for saying "the" (100 characters instead of 200).
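
To make the steps concrete, here is a rough Python sketch of how they might fit together. It is only one possible reading of the algorithm, and it makes several assumptions that are not part of the description above: words are split on whitespace (so "wall." keeps its period), step 6 is generalised from moving one word to sorting every remaining word by frequency, and numbers and literal words are told apart by their Python type rather than by any on-disk format. The names `encode`, `refine`, and `decode` are made up for this illustration.

```python
from collections import Counter


def encode(text):
    """Steps 1-4: replace each word with its 1-based position in the library."""
    library = []    # words in first-seen order
    number_of = {}  # word -> number, for fast lookup
    codes = []
    for word in text.split():  # assumption: words are whitespace-separated
        if word not in number_of:
            library.append(word)
            number_of[word] = len(library)
        codes.append(number_of[word])
    return library, codes


def refine(library, codes):
    """Steps 5-6: inline single-use words, renumber the rest by frequency."""
    counts = Counter(codes)
    # Keep only numbers used more than once, most frequent first.
    # Assumption: step 6 is applied to the whole list, not just one word.
    kept = [old for old, count in counts.most_common() if count > 1]
    new_number = {old: new for new, old in enumerate(kept, start=1)}
    new_library = [library[old - 1] for old in kept]
    # Single-use numbers become literal words (str); the rest stay numbers (int).
    tokens = [new_number.get(old, library[old - 1]) for old in codes]
    return new_library, tokens


def decode(library, tokens):
    """Rebuild the text; original spacing is not preserved in this sketch."""
    words = (library[t - 1] if isinstance(t, int) else t for t in tokens)
    return " ".join(words)


library, codes = encode("hello the cat is on the big wall.")
print(codes)                # [1, 2, 3, 4, 5, 2, 6, 7]
library, tokens = refine(library, codes)
print(library)              # ['the']
print(tokens)               # ['hello', 1, 'cat', 'is', 'on', 1, 'big', 'wall.']
print(decode(library, tokens))
```

One thing the sketch leaves open is how the output file would distinguish a number that stands for a word from a literal word that happens to be made of digits. A real file format would need some marker or escape for that case.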