Thread: Implementing RLE

  1. #1
    and the hat of copycat stevesmithx's Avatar
    Join Date
    Sep 2007
    Posts
    587

    Implementing RLE

    Hi all,
    I implemented RLE using Turbo C.
    However there are a few limitations with the code.
    It compresses 256-bit color images and monochrome bitmap images well.
    However fails with most other bmp files.
    Is this normal?
    Also please give suggestions about the code.
    Code:
    #include <stdio.h>
    #include <conio.h>
    #include <string.h>
    #include <stdlib.h>
    int main(void){
    	FILE *fp,*fq,*fr;
    	unsigned long int t1,t2,i=1,t;
    	int repeat;
    	char fname[30],fname1[30],str[10]="",choice;
    	while(1){
    		clrscr();
    		fflush(stdin);
    		printf("Do you want to Compress or Decompress[C/D]\n");
    		scanf("%c",&choice);
    		switch(choice){
    		case 'C':
    			printf("Enter the name of the file to be compressed\n");
    			scanf("%s",fname);
    			fp=fopen(fname,"rb");
    			printf("Enter the name of the output file\n");
    			scanf("%s",fname1);
    			fq=fopen(fname1,"wb");
    			t1=fgetc(fp);
    			t2=t1;
    			while((t2=fgetc(fp))!=EOF){
    				if(t1==t2){
    					i++;
    				}
    				else{
    					fputc((char)t1,fq);
    					fprintf(fq,"%d",i);
    					fputc('#',fq);
    					i=1;
    					t1=t2;
    				}
    			}
    			fputc((char)t1,fq);
    			fprintf(fq,"%d",i);
    			fputc('#',fq);
    			fseek(fp,0,SEEK_END);
    			fseek(fq,0,SEEK_END);
    			printf("\nFile Compression Succesful\n");
    			printf("\nSize of original file(%s) is %ld bytes\n",fname,ftell(fp));
    			printf("\nSize of Compressed file(%s) is %ld bytes\n",fname1,ftell(fq));
    			fclose(fp);
    			fclose(fq);
    			getch();
    			break;
    		case 'D':
    			printf("Enter the name of the compressed file\n");
    			scanf("%s",fname);
    			printf("Enter the name of the output file\n");
    			scanf("%s",fname1);
    			fp=fopen(fname,"rb");
    			fr=fopen(fname1,"wb");
    			while((t2=fgetc(fp))!=EOF){
    				i=0;
    				while((t1=getc(fp))!='#'){
    					str[i]=t1;
    					i++;
    				}
    				str[i]='\0';
    				repeat=atoi(str);
    				for(i=1;i<=repeat;i++){
    					fprintf(fr,"%c",t2);
    				}
    			}
    			fseek(fp,0,SEEK_END);
    			fseek(fr,0,SEEK_END);
    			printf("\nFile Decompression Succesful\n");
    			printf("\nSize of Compressed file(%s) is %ld bytes\n",fname,ftell(fp));
    			printf("\nSize of Decompressed file(%s) is %ld bytes\n",fname1,ftell(fr));
    			fclose(fp);
    			fclose(fr);
    			getch();
    			break;
    		default:
    			exit(0);
    		}
    	}
    	getch();
    	return 0;
    }
    Thanks in advance.
    Not everything that can be counted counts, and not everything that counts can be counted
    - Albert Einstein.


    No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.
    - Herbert Mayer

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Describe "fails" - does it crash, or is it just not compressing very well?

    On a 24BPP image, you probably want to read each pixel and then tell how many repeats you have, rather than looking at individual bytes, since typically, the individual R, G and B values are not the same, but several pixels may have the same R, G and B values.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    and the hat of copycat stevesmithx's Avatar
    Join Date
    Sep 2007
    Posts
    587
    Thank you very much for replying.
    Describe "fails" - does it crash, or is it just not compressing very well?
    it is not compressing well.
    Result o/p becomes larger.
    Not everything that can be counted counts, and not everything that counts can be counted
    - Albert Einstein.


    No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.
    - Herbert Mayer

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by stevesmithx View Post
    Thank you very much for replying.

    it is not compressing well.
    Result o/p becomes larger.
    So, perhaps that's related to what I've just commented regarding 24BPP images - that the Red, Green and Blue components are often different from their fellow components, (e.g. a red(dish) pixel may be 200, 50, 30 of R, G and B respectively.), so your "byte by byte" compression won't do that very well.

    Another thing you may want to do is to check that the sequence of RLE-encryption is actually shorter than the original sequence.

    And finally, your code will fail if the colour of a pixel happens to be 0x23 ('#') - you need some way of "hiding" # in the output file.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Registered User
    Join Date
    Feb 2008
    Posts
    146
    Hello Sir,

    I also worked a little on RLE and soon realized that it does not produce impressive compression..and I was just wondering instead of fetching bytes or pixels and comparing it with the next byte or pixel..if we directly encode runs of binary 0's & 1's..it might give very good compression..Am I right by any chance and if yes then is there any way of addressing bits??

    Thanks
    Edesign

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by edesign View Post
    Hello Sir,

    I also worked a little on RLE and soon realized that it does not produce impressive compression..and I was just wondering instead of fetching bytes or pixels and comparing it with the next byte or pixel..if we directly encode runs of binary 0's & 1's..it might give very good compression..Am I right by any chance and if yes then is there any way of addressing bits??

    Thanks
    Edesign
    Indeed, RLE is best when you only have a few colours to choose from [e.g. 2, 4, 16 or 256 colour modes], since that gives a much higher likelyhood of "the next pixel is the same as the previous one".

    I doubt that looking at it bitwise will help much - you'll still not have many consecutive bits that are similar enough to allow much compression.

    If your images are photographs or similar, then using JPEG is much more efficient (at the cost of loosing some of the original content).

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  7. #7
    Registered User
    Join Date
    Feb 2008
    Posts
    146
    hello Sir,

    True for raw images that they'll have a lot of redundancy but what for the other data.(apart from images)....I know about lossless and lossy modes of JPEG compression...but they are too tough to implement, includes everything huffman & RLE and toughest part I guess will be the quantization matrix.. (may be I can think of doing that as my minor project after improving my programming skills... ) Also I want to know how the professional compression softwares like winzip & winace etc. work..?? Do they distinguish between file types (i.e. it it is an image do something else something) or they just use 1 standard technique for all files?

    Thanks
    Edesign

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    zip and it's relatives use a mixture of techniques, including huffman and RLE, and it compresses on a byte basis (or blocks of bytes), but the output is bit-based. They do not care one iota what the name/type of the file is, they just try the different methods on the file. If nothing works, they just store the file [and if the file-size is small enough, there isn't even an attempt to compress it]. JPEG files usually compress a very small amount, mostly because of compression of the table data that JPEG itself generates [likewise, if you zip a zip-file, it sometimes can compress the directory information, so if you have a zip-file with many small files, it can compress that by a few percent].

    The zip source code is open source, so you can find that on some web-site somewhere (or included in Linux).

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    and the hat of copycat stevesmithx's Avatar
    Join Date
    Sep 2007
    Posts
    587
    Thanks again matsp.
    I found an interesting compression software here which shows unbelievably high compression ratio.
    Not everything that can be counted counts, and not everything that counts can be counted
    - Albert Einstein.


    No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.
    - Herbert Mayer

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    RLE is a poor choice for 24-bit color. There are 16.7 million possible colors. What are the chances that you'll have a run of all the same color? Not high.

    If your image is 24 bit RGB, but just happens to contain only a few colors, then it might work. But in general, not.

    A better approach for compressing 24-bit color is what PNG does, which is a combination of filtering and prediction.

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by stevesmithx View Post
    Thanks again matsp.
    I found an interesting compression software here which shows unbelievably high compression ratio.
    Where do you get this word "unbelievable?" From that page? The best performance listed in that chart is only 5% or so better than RAR. I hardly call that "unbelievable."

  12. #12
    and the hat of copycat stevesmithx's Avatar
    Join Date
    Sep 2007
    Posts
    587
    Thanks for ur suggestion brewbuck.
    Where do you get this word "unbelievable?" From that page? The best performance listed in that chart is only 5&#37; or so better than RAR. I hardly call that "unbelievable."
    >I used the s/w and found it unbelievable.It shrank a 300MB avi file to 30.8MB.
    However the time taken is very large as mentioned in the site.
    It took something like 1.5 hrs on my 3Ghz CPU
    Not everything that can be counted counts, and not everything that counts can be counted
    - Albert Einstein.


    No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.
    - Herbert Mayer

  13. #13
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by stevesmithx View Post
    Thanks for ur suggestion brewbuck.

    >I used the s/w and found it unbelievable.It shrank a 300MB avi file to 30.8MB.
    However the time taken is very large as mentioned in the site.
    It took something like 1.5 hrs on my 3Ghz CPU
    That's downright ridiculous, and I almost don't believe it. Did you decompress it and play it?

    There is a theoretical limit to how much a piece of data can be compressed, which is given by its true entropy. The problem is that it is often impossible to calculate what the true entropy is. The point, though, is that data can't be compressed arbitrarily small, no matter how smart your algorithm is. Compressing an AVI by 90&#37; either indicates that the compression algorithm is a lie (in which case, the decompressed result would be garbage), or that the original codec for that AVI file was a piece of garbage.

    Also, requiring hours to compress a piece of data makes it unusable. How fast is the decompression time in comparison?

    Data compression seems to attract the same kind of freakos who invent "perpetual motion machines." I'm not saying you are one, but I think this other guy might be.
    Last edited by brewbuck; 04-03-2008 at 12:01 PM.

  14. #14
    and the hat of copycat stevesmithx's Avatar
    Join Date
    Sep 2007
    Posts
    587
    That's downright ridiculous, and I almost don't believe it. Did you decompress it and play it?

    There is a theoretical limit to how much a piece of data can be compressed, which is given by its true entropy. The problem is that it is often impossible to calculate what the true entropy is. The point, though, is that data can't be compressed arbitrarily small, no matter how smart your algorithm is. Compressing an AVI by 90&#37; either indicates that the compression algorithm is a lie (in which case, the decompressed result would be garbage), or that the original codec for that AVI file was a piece of garbage.

    Also, requiring hours to compress a piece of data makes it unusable. How fast is the decompression time in comparison?

    Data compression seems to attract the same kind of freakos who invent "perpetual motion machines." I'm not saying you are one, but I think this other guy might be.
    Of course it is hard to believe that(that should be the reason for them advertising it as "unbelievable" on their site.).But it is true.
    if you want to give yourself a try,
    Click here to download the s/w itself.(exe)
    and
    Click
    here to download its source.(kgb format).
    the source file (compressed kgb format) size is around 1.28MB
    whereas the decompressed size is around 13.8MB

    and yes the decompressed files are working!
    Not everything that can be counted counts, and not everything that counts can be counted
    - Albert Einstein.


    No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.
    - Herbert Mayer

  15. #15
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by stevesmithx View Post
    Of course it is hard to believe that(that should be the reason for them advertising it as "unbelievable" on their site.).But it is true.
    if you want to give yourself a try,
    Click here to download the s/w itself.(exe)
    and
    Click
    here to download its source.(kgb format).
    the source file (compressed kgb format) size is around 1.28MB
    whereas the decompressed size is around 13.8MB

    and yes the decompressed files are working!
    It is very unlikely that an already compressed AVI file will compress by 90% without loss of information.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Implementing TLVs
    By brooksbp in forum Networking/Device Communication
    Replies: 1
    Last Post: 07-30-2008, 01:41 PM
  2. Implementing shared/exclusive locking (readers/writer) lock
    By ruj.sabya in forum Linux Programming
    Replies: 0
    Last Post: 05-08-2008, 12:06 AM
  3. Replies: 16
    Last Post: 01-10-2008, 11:38 PM
  4. Useless RLE?
    By VirtualAce in forum Game Programming
    Replies: 17
    Last Post: 03-08-2005, 04:16 PM
  5. Implementing Mathematical Concepts in Code
    By MisterWonderful in forum Tech Board
    Replies: 6
    Last Post: 03-08-2004, 07:44 AM