C Board  

Old 07-09-2009, 06:02 AM   #1
Registered User
 
Join Date: Jan 2006
Location: Latvia
Posts: 96
recv() returns 0 the second time

Hello, I want to loop through a message board and read the HTML of the topics using Winsock. The first iteration of the loop works flawlessly and I receive the response with no problems. On subsequent iterations, however, recv() returns 0 immediately without waiting for any data to arrive. Here's my code:
Code:
for(set<string>::iterator it = topicIDs.begin(); it != topicIDs.end(); it++)
{
    int iResult = 0;
    string sendbuf =
        "GET /phpBB3/viewtopic.php?f=7&t=" + *it + " HTTP/1.1\r\n"
        "Host: <valid host>\r\n"
        "\r\n";
    //a valid request is generated here

    iResult = send(client, sendbuf.c_str(), sendbuf.length(), 0);
    if(iResult == SOCKET_ERROR)
    {
        Error("send failed");
    }
    //nothing wrong with this send(), it returns the number of bytes

    char recvbuf[512];
    memset(recvbuf, 0, 512);

    do {
        iResult = recv(client, recvbuf, 512, 0);
        if ( iResult > 0 )
        {
            HTML += (recvbuf);
            //HTML is an std::string object
        }
        else if ( iResult == 0 )
            Error("Connection closed, recv done\n");
            //returns 0 immediately on the second iteration of the loop
        else
            Error("recv failed with error: ");

    } while( iResult > 0 );

    Process(HTML);
    //random processing of the code

}
errno is 0 after each call to recv(). What could be the fault? Thanks.
Posted by Overlord
Old 07-09-2009, 06:38 AM   #2
Registered User
 
Join Date: Oct 2008
Posts: 452
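I expect that is because the HTTP server finished sending its response and then closed the connection. HTTP/1.0 servers close the connection after each request by default. HTTP/1.1 connections are persistent by default, but the server is still free to close them (it normally announces this with a "Connection: close" header). If you don't want that to happen you can send a "Connection: Keep-Alive" header, even though there's still no guarantee the server won't close the connection.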

So either reconnect for each request or add a "Connection: Keep-Alive" and reconnect only when the server closed the connection.
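To make the second option concrete, here is a minimal sketch of building the request with an explicit Connection header. The function name buildGetRequest and its parameters are made up for illustration; the host and path are whatever you already use.

```cpp
#include <string>

// Build a minimal HTTP/1.1 GET request.
// `host` and `path` are caller-supplied placeholders (hypothetical names).
// `keepAlive` only *asks* the server to keep the connection open;
// the server may still close it after the response.
std::string buildGetRequest(const std::string& host,
                            const std::string& path,
                            bool keepAlive)
{
    std::string req = "GET " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += keepAlive ? "Connection: Keep-Alive\r\n" : "Connection: close\r\n";
    req += "\r\n"; // blank line ends the header block
    return req;
}
```

Even with keepAlive set, a recv() that returns 0 still means the server closed the connection, so the loop needs to reconnect in that case.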

Furthermore, your read routine is wrong. You assume you have read all the data only once the server closes the connection or an error occurs. That isn't always true; the response can be complete well before that happens. You'll have to parse the data as you receive it and use the headers to work out where the body ends (mostly Content-Length, but there's also chunked transfer encoding, and maybe even a few others). Only then can you process the body.

All in all, it's not a trivial task, and it depends on how standards-compliant you want to be. You can usually get away with handling only Content-Length, though I usually implement chunked transfer encoding as well when I need to do such a task.

My advice: read up on the protocol a bit more.
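As a rough sketch of the Content-Length case only: accumulate everything recv() gives you into one std::string and, after each recv(), check whether the response is complete. The helper names (toLower, contentLength, responseComplete) are made up here, and the sketch does not handle chunked encoding or responses without a Content-Length.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Lower-case copy; HTTP header names are case-insensitive.
static std::string toLower(std::string s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// Returns the Content-Length value found in a header block, or -1 if absent.
// A non-numeric value is read as 0 (strtol behaviour); fine for a sketch.
long contentLength(const std::string& headers)
{
    const std::string lower = toLower(headers);
    const std::string key = "content-length:";
    std::string::size_type pos = 0;
    while ((pos = lower.find(key, pos)) != std::string::npos) {
        // Only accept the key at the start of a line.
        if (pos == 0 || lower[pos - 1] == '\n')
            return std::strtol(lower.c_str() + pos + key.size(), nullptr, 10);
        pos += key.size();
    }
    return -1;
}

// True once `response` holds the whole header block plus at least
// Content-Length bytes of body. Returns false if there is no
// Content-Length, because then another rule is needed to find the end.
bool responseComplete(const std::string& response)
{
    const std::string::size_type headerEnd = response.find("\r\n\r\n");
    if (headerEnd == std::string::npos)
        return false; // headers not fully received yet
    const long len = contentLength(response.substr(0, headerEnd + 2));
    if (len < 0)
        return false;
    const std::string::size_type bodyBytes = response.size() - (headerEnd + 4);
    return bodyBytes >= static_cast<std::string::size_type>(len);
}
```

In the receive loop you would stop calling recv() as soon as responseComplete() returns true, instead of waiting for recv() to return 0.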
Posted by EVOEx
Old 07-09-2009, 07:43 AM   #3
Registered User
 
Join Date: Jan 2006
Location: Latvia
Posts: 96
First of all thanks for helping me with my problem. I now understand why that is happening - I'm just waiting for a connection to close before parsing the data. Now it would be easy to fix that if the server I'm sending the requests to had a 'content-length' header, but unfortunately it does not. Do you know of a way to find the length of the content any other way?

Thanks again.
Posted by Overlord
Old 07-09-2009, 08:39 AM   #4
Registered User
 
Join Date: Oct 2008
Posts: 452
Quote:
Originally Posted by Overlord
First of all thanks for helping me with my problem. I now understand why that is happening - I'm just waiting for a connection to close before parsing the data. Now it would be easy to fix that if the server I'm sending the requests to had a 'content-length' header, but unfortunately it does not. Do you know of a way to find the length of the content any other way?

Thanks again.
Hmmm I'm not sure if the protocol supports a header saying "don't transfer chunked". I expect there must be some way, though. But chunked transfer encoding isn't too hard to implement. Each chunk is a line containing a hexadecimal number N, followed by N bytes of data and a CRLF; this repeats until a chunk with N = 0 arrives. Concatenate the data of all the chunks, leaving out the chunk-size lines, and you have the response body.
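A minimal sketch of that loop, assuming the whole chunked body is already in memory (no incremental parsing). decodeChunked is a made-up name; chunk extensions after ';' are skipped, and any trailer headers after the last chunk are ignored.

```cpp
#include <cstdlib>
#include <string>

// Decodes a complete chunked body from `in` into `out`.
// Returns false if the input is truncated or malformed.
// Note: strtoul is lenient (it accepts leading whitespace and a "0x"
// prefix), which a strict parser would reject.
bool decodeChunked(const std::string& in, std::string& out)
{
    std::string::size_type pos = 0;
    out.clear();
    for (;;) {
        // Chunk-size line: hex digits, optionally followed by ";extension".
        const std::string::size_type lineEnd = in.find("\r\n", pos);
        if (lineEnd == std::string::npos)
            return false;
        const std::string sizeLine = in.substr(pos, lineEnd - pos);
        char* end = nullptr;
        const unsigned long n = std::strtoul(sizeLine.c_str(), &end, 16);
        if (end == sizeLine.c_str())
            return false; // no hex digits at all
        pos = lineEnd + 2;

        if (n == 0)
            return true; // last chunk; trailers (if any) are ignored

        // Need n data bytes plus the CRLF that terminates the chunk.
        const std::string::size_type remaining = in.size() - pos;
        if (n > remaining || remaining - n < 2)
            return false;
        out.append(in, pos, n);
        if (in.compare(pos + n, 2, "\r\n") != 0)
            return false;
        pos += n + 2;
    }
}
```

For example, the body "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n" decodes to "Hello, world".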
Posted by EVOEx
Old 07-09-2009, 10:34 AM   #5
Registered User
 
Join Date: Sep 2004
Location: California
Posts: 2,845
Quote:
Hmmm I'm not sure if the protocol supports a header saying "don't transfer chunked". I expect there must be some way, though.
Chunked encoding can be sent to the client even if it is not specified in the accept header. I don't think there is any way to tell the server you don't accept chunked encoding.
Posted by bithub
Old 07-09-2009, 04:57 PM   #6
Registered User
 
Join Date: Oct 2008
Posts: 452
Quote:
Originally Posted by bithub
Chunked encoding can be sent to the client even if it is not specified in the accept header. I don't think there is any way to tell the server you don't accept chunked encoding.
I think you're right. From the RFC (http://www.faqs.org/rfcs/rfc2616.html):

Quote:
Originally Posted by RFC2616
All HTTP/1.1 applications that receive entities MUST accept the
"chunked" transfer-coding (section 3.6), thus allowing this mechanism
to be used for messages when the message length cannot be determined
in advance.
So, no choice other than to implement it...
Posted by EVOEx
Old 07-09-2009, 07:20 PM   #7
int x = *((int *) NULL);
 
Join Date: Jul 2003
Location: Banks of the River Styx
Posts: 891
Code:
HTML += (recvbuf);
Buffer overrun here. recv() does not null-terminate what it writes, so if it fills all 512 bytes of that buffer and none of them are '\0' (which is likely with HTTP), operator+= will read past the end of recvbuf, because std::string assumes recvbuf is a null-terminated C string. Instead, explicitly call a std::string constructor and pass it the length:
Code:
HTML += std::string(recvbuf, return_value_of_recv);
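The same idea wrapped in a small helper, as a sketch. appendReceived is a made-up name; the guard simply ignores the 0 and negative values recv() can return.

```cpp
#include <string>

// Append exactly `received` bytes from `buf` to `html`.
// `received` is meant to be the return value of recv(); values <= 0
// (connection closed or error) append nothing.
void appendReceived(std::string& html, const char* buf, int received)
{
    if (received > 0)
        html.append(buf, static_cast<std::string::size_type>(received));
}
```

Because the length is passed explicitly, bytes after an embedded '\0' are kept, and nothing beyond the bytes recv() actually wrote is read.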
Quote:
Now it would be easy to fix that if the server I'm sending the requests to had a 'content-length' header, but unfortunately it does not. Do you know of a way to find the length of the content any other way?
The server is required to send a Content-Length header unless (a) it sends the Connection: close header, in which case Content-Length is optional, or (b) it sends a Transfer-Encoding header, in which case sending a Content-Length is forbidden. If Connection: close is sent without a Content-Length, then the body is implicitly all the data following the headers, up to the point where the server closes the connection.
(In reality the rules are a bit more complex... the RFC is your friend.)
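Those rules could be sketched roughly like this. The enum and function names are made up, and the sketch deliberately ignores the extra cases in the RFC (responses that never have a body, such as replies to HEAD or 204/304 status codes).

```cpp
#include <cctype>
#include <string>

enum BodyFraming { FramingChunked, FramingContentLength, FramingUntilClose };

// Lower-case copy; header names and the "chunked" token are case-insensitive.
static std::string lowerCopy(std::string s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// Returns the lower-cased value of header `name` (given in lower case,
// without the colon), or an empty string if the header is missing.
// Assumes `headers` starts with the status line, so every header line
// is preceded by CRLF.
static std::string headerValue(const std::string& headers, const std::string& name)
{
    const std::string lower = lowerCopy(headers);
    const std::string key = "\r\n" + name + ":";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos)
        return "";
    pos += key.size();
    std::string::size_type end = lower.find("\r\n", pos);
    if (end == std::string::npos)
        end = lower.size();
    while (pos < end && (lower[pos] == ' ' || lower[pos] == '\t'))
        ++pos; // skip optional whitespace after the colon
    return lower.substr(pos, end - pos);
}

// Decide how the end of the body is signalled.
// Transfer-Encoding: chunked wins over Content-Length; with neither,
// the body runs until the server closes the connection.
BodyFraming bodyFraming(const std::string& headers)
{
    if (headerValue(headers, "transfer-encoding").find("chunked") != std::string::npos)
        return FramingChunked;
    if (!headerValue(headers, "content-length").empty())
        return FramingContentLength;
    return FramingUntilClose;
}
```

Only in the FramingUntilClose case is it correct to keep calling recv() until it returns 0.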

Quote:
So, no choice other than to implement it...
Um, you could also stop re-inventing the wheel, and use something like libcurl.
__________________
long time; /* know C? */
Unprecedented performance: Nothing ever ran this slow before.
Any sufficiently advanced bug is indistinguishable from a feature.
Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
The best way to accelerate an IBM is at 9.8 m/s/s.
recursion (re - cur' - zhun) n. 1. (see recursion)
Posted by Cactus_Hugger
Old 07-10-2009, 04:09 AM   #8
Registered User
 
Join Date: Jan 2006
Location: Latvia
Posts: 96
Thank you all for the replies. I tried to implement chunked encoding, but it was a mess and would probably be much slower than libcurl's implementation. I guess I'll stick with libcurl, since my objective is not worth all this fiddling with HTTP. Thank you for the advice.

EDIT:

libcurl throws a lot of linker errors when I compile my application, although I have built the library and linked it to my project:

Code:
1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_cleanup referenced in function _main
1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_perform referenced in function _main
1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_setopt referenced in function _main
1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_init referenced in function _main

Last edited by Overlord; 07-10-2009 at 09:12 AM.
Posted by Overlord