Thread: Downloading ZIP

  1. #1
    Registered User
    Join Date
    Dec 2008
    Posts
    104

    Downloading ZIP

    Hello,

    In my application, I need to download a ZIP file from an HTTP server. I send the server a GET request (along with the other required headers, i.e. the Host header, etc.), then read from the socket in a loop and, on each iteration, write the data I just read to a file (with a .zip extension) using the fwrite() function.

    When I try to extract the file, it says: "No archives found". But if I download the same ZIP from my browser, I can extract the files perfectly.

    My question is: if I am reading ALL of the data from a ZIP file and writing it to another ZIP file, why should it not work?

    PS. I know you might think it is a problem with the connecting/reading/writing, but it is not, because when I request a file other than a file with the zip-compression, it works fine.

    Thank you,
    abraham2119

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Are both files the same size? If so, are the contents the same?

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Quote Originally Posted by abraham2119
    PS. I know you might think it is a problem with the connecting/reading/writing, but it is not, because when I request a file other than a file with the zip-compression, it works fine.
    It's a problem with reading/writing.

    Are you taking into account encodings, such as chunked? Are you opening your file in binary mode, if applicable? And could you post some code, so that I can actually analyze it instead of taking shots in the dark?

    Additionally, check out libcurl. It's a library that does HTTP and takes the pain out of it for you. You might be able to get something working on your own, but I guarantee you that you'll miss a corner case like chunked or gzip'd transfers, or an IIS server sending random 100 Continue responses.
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  4. #4
    Registered User
    Join Date
    Dec 2008
    Posts
    104
    Quote Originally Posted by matsp View Post
    Are both files the same size? If so, are the content the same?

    --
    Mats
    No, they are not the same size, nor do they have the same content. That is why I posted here: what could the reason be, if I am reading all of the data from the buffer?

    Quote Originally Posted by Cactus_Hugger View Post
    It's a problem with reading/writing.

    Are you taking into account encodings, such as chunked? Are you opening your file in binary mode, if applicable? And could you post some code, so that I can actually analyze it instead of taking shots in the dark?

    Additionally, check out libcurl. It's a library that does HTTP and takes the pain out of it for you. You might be able to get something working on your own, but I guarantee you that you'll miss a corner case like chunked or gzip'd transfers, or an IIS server sending random 100 Continue responses.
    Thank you very much for your response.

    No, I am not taking chunked encodings into consideration. (Can you please link me?)
    No, I am not opening my file in binary mode.

    And, here is my code:
    Code:
    	FILE *pfile = fopen(file, "w");
    	char buff[8000];
            char writebuff[8000];
    	sock s;
    	int first = 0;
    
    	startup();
    	opensock(&s);
    	sockconnect(&s, "<my_host>", 80);
    
    	write(&s, "GET /<my_file>.zip HTTP/1.1\r\n");
    	write(&s, "HOST: <my_host>\r\n");
    	write(&s, "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)\r\n");
    	write(&s, "Connection: Close\r\n\r\n"); int i = 0;
    
    	
    	while(1) {
    		memset(buff, 0, 7999);
    
    		if ((i = read(&s, buff, 7999)) > 0) {
    			printf("READ: %d\n", i);
    			memset(writebuff, 0, 7999);
    			if (!first) {
    				char secbuff[8000];
    				substring(buff, secbuff, indexofstr(buff, "\r\n\r\n", 1) + 4, strlen(buff));
    				strncpy(writebuff, secbuff,strlen(secbuff) + 1); 
    			}
    
    			else {
    				strncpy(writebuff, buff,strlen(buff) + 1); 
    			}
    
    			fwrite(writebuff, sizeof(char), i, pfile);
    			fflush(pfile);
    		}
    
    		else { break; }
    
    		if (!first) { printf("%s", buff); first = 1; }
    	}
    
    	fclose(pfile);
    	free(pfile);
    
    	closesocket((SOCKET) s); printf("\n\n\nDONE");
    Sorry for the sloppy coding, I was getting frustrated. :P

    I do not think it is necessary to show you my string functions or socket functions.

    Thanks again.
    Last edited by abraham2119; 05-05-2009 at 04:34 PM.

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Using strlen() on binary data is probably not going to work...
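
    To make that concrete, here is a small sketch of what strlen() and strncpy() do to bytes shaped like the start of a ZIP file. The `sample` bytes and the three function names are made up for illustration; they are not taken from the code posted above.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative bytes shaped like the start of a ZIP local file header:
   the "PK\3\4" signature followed by small little-endian fields.
   (Assumed sample data, not read from a real archive.) */
static const unsigned char sample[10] = {
    'P', 'K', 0x03, 0x04, 0x14, 0x00, 0x00, 0x00, 0x08, 0x00
};

/* What the string functions believe the payload length is. */
size_t length_seen_by_strlen(void)
{
    return strlen((const char *)sample);    /* stops at the first 0x00 */
}

/* Copy the bytes with strncpy(); returns 1 if the copy is intact. */
int strncpy_copy_is_intact(void)
{
    unsigned char out[sizeof sample];
    /* strncpy() stops copying at the first 0x00 and zero-fills the rest,
       so the 0x08 near the end of the sample is lost. */
    strncpy((char *)out, (const char *)sample, sizeof sample);
    return memcmp(out, sample, sizeof sample) == 0;
}

/* Copy the bytes with memcpy(); returns 1 if the copy is intact. */
int memcpy_copy_is_intact(void)
{
    unsigned char out[sizeof sample];
    memcpy(out, sample, sizeof sample);     /* copies all bytes, zeros included */
    return memcmp(out, sample, sizeof sample) == 0;
}
```

    For binary data, the count returned by the read call is the only length worth trusting, and memcpy() (or writing straight from the read buffer) is the safe way to move it around.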

    --
    Mats

  6. #6
    Registered User
    Join Date
    Dec 2008
    Posts
    104
    Quote Originally Posted by matsp View Post
    Using strlen() on binary data is probably not going to work...

    --
    Mats
    I edited it:
    Code:
    fwrite(writebuff, sizeof(char), i, pfile)
    'i' being the number of bytes read from the buffer.

  7. #7
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Also, you open the file for writing, but you should open it in binary mode.

    The header stripping isn't great either. You can't be sure the header ends within the first packet you receive...

  8. #8
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Code:
    fwrite(writebuff, sizeof(char), strlen(writebuff), pfile);
    If your buffer has embedded \0's (nuls), then strlen() is going to give you the wrong value. A zip file will almost certainly have a 0 byte.

    Code:
    free(pfile);
    You do not need to free a FILE *. fclose() handles this for you.

    You need to fopen() with "wb" - write binary. Otherwise, some operating systems may translate things like "\n" into "\r\n" (namely Windows) and corrupt your ZIP.
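
    A minimal sketch of writing raw bytes in binary mode and checking that nothing was added or dropped. `save_bytes` and `file_size` are hypothetical helper names, not part of the posted code. On systems that do not translate newlines, "w" and "wb" behave the same, so the difference only shows up on platforms such as Windows.

```c
#include <stdio.h>

/* Write n raw bytes to path. Returns 0 on success, -1 on failure. */
int save_bytes(const char *path, const unsigned char *data, size_t n)
{
    FILE *fp = fopen(path, "wb");       /* "b": no newline translation */
    if (fp == NULL)
        return -1;

    size_t written = fwrite(data, 1, n, fp);
    int close_failed = (fclose(fp) != 0);   /* fclose() also releases the FILE */

    return (written == n && !close_failed) ? 0 : -1;
}

/* Size of a file in bytes, or -1 on error. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
```

    Comparing `file_size()` of the saved file against the number of body bytes received (or against Content-Length, when the server sends it) is a quick way to spot this kind of corruption.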

    Finally, your reading loop... you _need_ to check for error codes. read() is not guaranteed to give you the amount of data you asked for (or even any data at all).
    Your buffer management is haphazard. Why do you zero only 7999 bytes of the 8000? You never touch the last byte, and read() can fill the other 7999. If read() does fill all 7999 bytes and that last byte is not zero (very likely, since it is uninitialized), then the strlen() you call on the buffer runs off its end, and you risk a buffer overflow.
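
    A sketch of a copy loop that checks every return value and writes exactly the bytes it received. The signature of the read() wrapper in the posted code is not shown, so this uses a hypothetical `read_fn` callback instead; `mem_source` and `mem_read` are an in-memory stand-in for the socket so the loop can be tried without a network.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical reader: fills buf with up to cap bytes and returns the count,
   0 at end of stream, or a negative value on error (similar to recv()). */
typedef int (*read_fn)(void *ctx, unsigned char *buf, int cap);

/* Copy everything the reader produces into out.
   Returns the number of bytes written, or -1 on a read or write error. */
long copy_stream(read_fn reader, void *ctx, FILE *out)
{
    unsigned char buf[8000];
    long total = 0;

    for (;;) {
        int got = reader(ctx, buf, (int)sizeof buf);
        if (got < 0)
            return -1;                  /* read error */
        if (got == 0)
            return total;               /* end of stream */
        /* Write exactly what arrived: no memset, no strlen, no "+ 1". */
        if (fwrite(buf, 1, (size_t)got, out) != (size_t)got)
            return -1;                  /* write error */
        total += got;
    }
}

/* Stand-in for a socket: hands out a memory buffer a few bytes at a time. */
struct mem_source {
    const unsigned char *data;
    size_t len;
    size_t pos;
    int step;                           /* maximum bytes per read */
};

int mem_read(void *ctx, unsigned char *buf, int cap)
{
    struct mem_source *m = ctx;
    size_t n = m->len - m->pos;
    if (n > (size_t)m->step)
        n = (size_t)m->step;
    if (n > (size_t)cap)
        n = (size_t)cap;
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return (int)n;
}
```

    Because the loop never treats the buffer as a string, there is nothing to zero beforehand and no terminator to worry about.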

    Allocate a buffer. Read data into that buffer until you've found the header. Once you have found the header, write everything that is not header to your file. Then continue reading the socket again, and just write to file.
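
    Those steps could be sketched roughly like this. `http_body_sink`, `sink_init`, `sink_feed` and the `HEADER_MAX` limit are names and choices invented for the example; the caller is assumed to pass in each block of bytes together with the count its read call returned.

```c
#include <stdio.h>
#include <string.h>

#define HEADER_MAX 16384   /* assumed bound on header size plus one read */

struct http_body_sink {
    unsigned char pending[HEADER_MAX];  /* bytes seen before the header ended */
    size_t pending_len;
    int header_done;
    FILE *out;                          /* opened by the caller with "wb" */
};

void sink_init(struct http_body_sink *s, FILE *out)
{
    s->pending_len = 0;
    s->header_done = 0;
    s->out = out;
}

/* Offset just past the first "\r\n\r\n" in buf, or 0 if it is not there. */
static size_t find_header_end(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return 0;
}

/* Feed the n bytes returned by one read. Returns 0 on success, -1 on error. */
int sink_feed(struct http_body_sink *s, const unsigned char *data, size_t n)
{
    if (s->header_done)                 /* past the header: write straight through */
        return fwrite(data, 1, n, s->out) == n ? 0 : -1;

    if (n > HEADER_MAX - s->pending_len)
        return -1;                      /* header larger than this sketch allows */
    memcpy(s->pending + s->pending_len, data, n);
    s->pending_len += n;

    size_t body_start = find_header_end(s->pending, s->pending_len);
    if (body_start == 0)
        return 0;                       /* header incomplete, keep reading */

    s->header_done = 1;
    size_t body_len = s->pending_len - body_start;
    return fwrite(s->pending + body_start, 1, body_len, s->out) == body_len ? 0 : -1;
}
```

    The search runs over the accumulated bytes rather than over a single read, so it still works when the blank line that ends the header is split across two reads.
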
    Ignore chunked-encoding for the moment... check your headers. If you see "Transfer-Encoding: chunked" in your headers, then you're getting chunked data and subsequently have to worry, but one step at a time...
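
    Checking the headers for that field could look something like the sketch below. `header_says_chunked` is a hypothetical helper; it takes the header bytes and their length, compares case-insensitively (HTTP field names are not case-sensitive), and only detects chunked transfers, it does not decode them.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive test: do the len bytes at buf start with the string word? */
static int starts_with_nocase(const char *buf, size_t len, const char *word)
{
    for (size_t i = 0; word[i] != '\0'; i++) {
        if (i >= len)
            return 0;
        if (tolower((unsigned char)buf[i]) != tolower((unsigned char)word[i]))
            return 0;
    }
    return 1;
}

/* Returns 1 if the header block has a Transfer-Encoding field whose value
   mentions "chunked", 0 otherwise. header need not be NUL-terminated. */
int header_says_chunked(const char *header, size_t len)
{
    static const char field[] = "transfer-encoding:";
    size_t line = 0;                            /* start of the current line */

    while (line < len) {
        size_t end = line;
        while (end < len && header[end] != '\n')
            end++;                              /* end sits on '\n' or at len */

        if (starts_with_nocase(header + line, end - line, field)) {
            /* Scan the field value for the word "chunked". */
            for (size_t i = line + sizeof field - 1; i < end; i++)
                if (starts_with_nocase(header + i, end - i, "chunked"))
                    return 1;
        }
        line = end + 1;                         /* move to the next line */
    }
    return 0;
}
```

    If this returns 1, the bytes after the header are a series of length-prefixed chunks rather than the raw file, so writing them out as-is will not produce a valid ZIP.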

  9. #9
    Registered User
    Join Date
    Dec 2008
    Posts
    104
    The problem is still occurring.

    Here is my code:
    Code:
    	FILE *pfile = fopen(file, "wb");
    	char buff[8000];
            char writebuff[8000];
    	sock s;
    	int first = 0;
    	int total = 0;
    
    	startup();
    	opensock(&s);
    	sockconnect(&s, "<my_host>", 80);
    
    	write(&s, "GET /<my_file>.zip HTTP/1.1\r\n");
    	write(&s, "HOST:<my_host>\r\n");
    	write(&s, "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)\r\n");
    	write(&s, "Connection: Close\r\n\r\n"); int i = 0;
    
    	
    	while(1) {
    		total += i;
    		memset(buff, 0, 8000);
    
    		if ((i = read(&s, buff, 8000)) > 0) {
    			printf("READ: %d\n", i);
    			memset(writebuff, 0, 8000);
    			if (!first) {
    				char secbuff[8000];
    				substring(buff, secbuff, indexofstr(buff, "\r\n\r\n", 1) + 4, i);
    				strncpy(writebuff, secbuff, (i - (indexofstr(buff, "\r\n\r\n", 1) + 4)) + 1); 
    			}
    
    			else {
    				strncpy(writebuff, buff,i + 1); 
    			}
    
    			if (!first) { fwrite(writebuff, sizeof(char),(i - (indexofstr(buff, "\r\n\r\n", 1) + 4)) + 1, pfile); printf("WROTE: %d out of: %d\n", (i - (indexofstr(buff, "\r\n\r\n", 1) + 4)) + 1, i);} 
    			else { fwrite(writebuff, sizeof(char), i + 1, pfile); }
    			fflush(pfile);
    		}
    
    		else { break; }
    
    		if (!first) { printf("%s", buff); first = 1; }
    	}
    
    	fclose(pfile);
    
    	closesocket((SOCKET) s); 
            printf("\n\n\nTotal bytes read: %d\n", total);
    Last edited by abraham2119; 05-05-2009 at 05:18 PM.
