Thread: Gzip encoded data from HTTP

  1. #1
    Registered User
    Join Date
    Jan 2012
    Posts
    3

    Gzip encoded data from HTTP

    Hi. I'm having a problem with a web site sending data gzip compressed even when told not to (Transfer-encoding: identity). So I looked up how gzip is compressed (RFC 1951 & 1952) and wrote an inflate function. It compiles and runs, and the first 1000-2000 bytes seem to decompress fine, then errors start to occur. It's taken 3 weeks to bash this into my head and turn it into C code. I'm hoping someone with some experience can see a logic error I may have made.
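
    For readers following along without the attachments: before any inflating happens, the gzip wrapper described in RFC 1952 has to be skipped. The sketch below is not taken from the attached gzip.cpp; gzip_payload_offset and the flag names are made up for illustration. It assumes the whole response body is already in one buffer and only locates the first DEFLATE byte.

```c
#include <stddef.h>
#include <stdint.h>

/* FLG bits from RFC 1952, section 2.3.1 */
enum { GZ_FHCRC = 0x02, GZ_FEXTRA = 0x04, GZ_FNAME = 0x08, GZ_FCOMMENT = 0x10 };

/* Hypothetical helper: returns the offset of the first DEFLATE byte
   inside a gzip member, or -1 if the header is malformed or truncated. */
long gzip_payload_offset(const uint8_t *buf, size_t len)
{
    /* Fixed 10-byte header: ID1, ID2, CM, FLG, MTIME(4), XFL, OS */
    if (len < 10 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
        return -1;

    uint8_t flg = buf[3];
    size_t pos = 10;

    if (flg & GZ_FEXTRA) {              /* 2-byte little-endian length + data */
        if (len - pos < 2)
            return -1;
        size_t xlen = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
        pos += 2;
        if (len - pos < xlen)
            return -1;
        pos += xlen;
    }
    if (flg & GZ_FNAME) {               /* zero-terminated file name */
        while (pos < len && buf[pos] != 0)
            pos++;
        if (pos >= len)
            return -1;
        pos++;                          /* skip the terminator */
    }
    if (flg & GZ_FCOMMENT) {            /* zero-terminated comment */
        while (pos < len && buf[pos] != 0)
            pos++;
        if (pos >= len)
            return -1;
        pos++;
    }
    if (flg & GZ_FHCRC) {               /* 2-byte header CRC */
        if (len - pos < 2)
            return -1;
        pos += 2;
    }
    return (long)pos;
}
```

    A caller would hand buf + offset to the inflate routine; the last 8 bytes of the member (CRC32 and ISIZE) are not part of the DEFLATE stream.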

    gzip.h
    gzip.cpp

    Sorry this is in the networking forum; it's part of an RSS downloading program.

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Do you have a particular reason not to use zlib (zlib Home Site)?

    > I'm hoping some one with some experience can see a logic error I may have made.
    A test wrapper "main()" and an example data file which causes the problem would help.
    Then we could just run the code in a debugger and start looking (have you run the code in the debugger?)

  3. #3
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    I'm concerned you've spent a long time re-inventing the wheel. Based on your use of MessageBoxA, you seem to be using Windows, which allows you to use WinInet for HTTP transport which in turn will do the decoding for you. You might consider investigating that.

  4. #4
    Registered User
    Join Date
    Dec 2011
    Posts
    795
    > which allows you to use WinInet for HTTP transport which in turn will do the decoding for you.

    Am I the only one concerned that:
    a) it's not portable,
    b) you have no idea how much overhead this API adds, or how much memory it may waste,
    c) it doesn't leave any of the low-level understanding and fine-tuning up to the user, and finally,
    d) its features are limited to what MS wants to limit them to, not what the language can do?

  5. #5
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    If you're using Windows (and he is, based on the use of MessageBoxA, which to the best of my knowledge is a Windows API call and therefore non-portable), why not avail yourself of the facilities made available to you by the Win32 SDK?

  6. #6
    Registered User
    Join Date
    Jan 2012
    Posts
    3

    response

    Salem:
    I've actually used the gzip.org source to try and figure this thing out, and I could have changed the input from files to a buffer, but I got interested in how it works, and the best way to learn is to do it.

    As to running it in a debugger, I've spent plenty of time watching the data decompress. That's how I know the first <1000 bytes are decompressing correctly. Once the data starts to get over that size, errors start appearing.

    wrapper
    wrapper.cpp

    I did have to change gzip.c a little to get it to compile:
    change #include <stdafx.h> to #include <stdio.h>,
    and remove the call to MessageBoxA(); it's only there for error checking.

    data
    The site won't let me upload a data file.
    https://docs.google.com/open?id=0Bxj...IxMWY0YzY4YWIx

    Rags:
    Same thing, doing it to learn not just to get it done.

    Memcpy:
    Nice points, but Rags is kind of right. It's already a Win32 app, so there is a bunch of overhead anyway.
    Last edited by Michael Colvin; 01-17-2012 at 05:00 AM.

  7. #7
    Registered User
    Join Date
    Jan 2012
    Posts
    3
    Howdy guys. If you're still following, I solved my problem. In the GetBit function v is declared short (2 bytes) and BitBuffer is declared int (4 bytes). So "G_BitBuffer|=v;" is converting G_BitBuffer to a 2 byte element. There is a case where the bits will exceed 16 and be lost. You never need more than 13 at one time, but if you're at 12 and need 13 you pull the next char (8 bits), running over and losing your top bits.
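
    The GetBit code is not quoted in the thread, so what follows is only a sketch of the overflow described above, assuming an LSB-first bit reader of the kind RFC 1951 calls for. BitReader, get_bits, refill_narrow and refill_wide are hypothetical names. One caveat: in C, G_BitBuffer |= v promotes v to int and does not narrow G_BitBuffer, so in this sketch the bits are lost where the shifted byte is stored in a 16-bit variable, which may or may not be exactly what the original code did.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical LSB-first bit reader with a 32-bit accumulator. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
    uint32_t bit_buffer;   /* unread bits, lowest bit is next */
    unsigned bit_count;    /* number of valid bits in bit_buffer */
} BitReader;

/* Returns the next n bits (n <= 16). Reads past the end yield zero bits. */
uint32_t get_bits(BitReader *r, unsigned n)
{
    while (r->bit_count < n) {
        uint32_t v = (r->pos < r->len) ? r->data[r->pos++] : 0;
        r->bit_buffer |= v << r->bit_count;   /* shift done in 32 bits */
        r->bit_count += 8;
    }
    uint32_t out = r->bit_buffer & ((1u << n) - 1u);
    r->bit_buffer >>= n;
    r->bit_count -= n;
    return out;
}

/* The failure mode: the shifted byte is held in a 16-bit variable,
   so anything above bit 15 is discarded before the OR. */
uint32_t refill_narrow(uint32_t buffer, unsigned count, uint8_t next)
{
    uint16_t v = (uint16_t)((uint32_t)next << count);
    return buffer | v;
}

/* The fix: keep the shifted byte in a 32-bit variable. */
uint32_t refill_wide(uint32_t buffer, unsigned count, uint8_t next)
{
    uint32_t v = (uint32_t)next << count;
    return buffer | v;
}
```

    With 12 bits already buffered and a next byte of 0xFF, refill_narrow(0x0ABC, 12, 0xFF) gives 0xFABC (only 4 of the 8 new bits survive), while refill_wide gives 0xFFABC. That matches the "at 12 and need 13" case above.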

    So this is one thing I've wondered about for a long time: why wasn't C designed with fixed-size data types? A short is only defined as being at least as large as a char and no larger than an int. Visual C allows for __int8, __int16, __int32, __int64, but I don't think that is portable. Do you guys know why, or have a link where I can find out?

  8. #8
    Registered User
    Join Date
    Dec 2011
    Posts
    795
    C is designed with fixed-size data types, or at least the potential to implement them. The best way is to use <stdint.h>, which deals with the byte size problems (types follow the standard "uint8_t, int8_t, uint16_t, ..."). Or, you could implement them yourself by doing #ifdef for the architecture and system, and if it's not defined, define them.
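
    As a small sketch of what <stdint.h> buys you (it arrived with C99, and the exact-width types are provided wherever the platform has matching integer widths): read_le32 below is a hypothetical helper, not something from the posted code, that reads a 32-bit little-endian field such as the CRC32 or ISIZE value in the gzip trailer (RFC 1952) without depending on the size of short, int or long.

```c
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit little-endian value from 4 bytes.
   Every operand is widened to uint32_t before shifting, so no bits are lost
   regardless of how wide int happens to be. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

    For the bytes 78 56 34 12 this returns 0x12345678.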

    The automatic conversion thing really shouldn't be an issue, but if it is, typecast.
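
    One case where the automatic conversion can matter, sketched under the assumption that the 16-bit variable is signed (as a plain short is): when a negative short is ORed into a wider buffer, it is sign-extended first, so the upper bits of the buffer all become 1. The function names here are hypothetical.

```c
#include <stdint.h>

/* Same conversion that buffer |= v performs when v is a signed 16-bit value:
   v is sign-extended before the OR. */
uint32_t or_in_signed(uint32_t buffer, int16_t v)
{
    return buffer | (uint32_t)v;
}

/* Casting to uint16_t first keeps only the low 16 bits, with no sign extension. */
uint32_t or_in_masked(uint32_t buffer, int16_t v)
{
    return buffer | (uint32_t)(uint16_t)v;
}
```

    With v = INT16_MIN (bit pattern 0x8000), or_in_signed(0, v) yields 0xFFFF8000 while or_in_masked(0, v) yields 0x00008000.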
