Thread: recv() returns 0 the second time

  1. #1
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102

    Hello, I want to loop through a message board and read the HTML of each topic using Winsock. The first iteration of the loop works flawlessly: I receive the response with no problems. On every later iteration, however, recv() returns 0 immediately, without waiting for any data to arrive. Here's my code:
    Code:
    for (set<string>::iterator it = topicIDs.begin(); it != topicIDs.end(); it++)
    {
        int iResult = 0;

        // a valid request is generated here
        string sendbuf =
            "GET /phpBB3/viewtopic.php?f=7&t=" + *it + " HTTP/1.1\r\n"
            "Host: <valid host>\r\n"
            "\r\n";

        // nothing wrong with this send(), it returns the number of bytes
        iResult = send(client, sendbuf.c_str(), sendbuf.length(), 0);
        if (iResult == SOCKET_ERROR)
        {
            Error("send failed");
        }

        char recvbuf[512];
        memset(recvbuf, 0, 512);

        do {
            iResult = recv(client, recvbuf, 512, 0);
            if (iResult > 0)
            {
                HTML += (recvbuf);    // HTML is an std::string object
            }
            else if (iResult == 0)
                Error("Connection closed, recv done\n");    // returns 0 immediately on the second iteration of the loop
            else
                Error("recv failed with error: ");

        } while (iResult > 0);

        Process(HTML);    // random processing of the code
    }
    errno is 0 after each call to recv(). What could be the fault? Thanks.

  2. #2
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
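    I expect that is because the HTTP server closed the connection once it had finished sending its response; recv() returning 0 means the other side has closed the connection. HTTP/1.0 closes the connection after each response by default. HTTP/1.1 connections are persistent by default, but the server is still free to close them, for example by answering with "Connection: close". You can ask for the connection to stay open by sending a "Connection: Keep-Alive" header, even though there's still no guarantee the server won't close the connection.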

    So either reconnect for each request or add a "Connection: Keep-Alive" and reconnect only when the server closed the connection.
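    A minimal sketch of the request from the first post with an explicit Connection header added. build_request and its parameters are hypothetical names, not part of the original code, and the host is passed in because the thread does not give it.

```cpp
#include <string>

// Hypothetical helper: builds the GET request from the first post and adds
// a Connection header that asks the server to keep the socket open (or not).
std::string build_request(const std::string& host,
                          const std::string& topic_id,
                          bool keep_alive)
{
    std::string request =
        "GET /phpBB3/viewtopic.php?f=7&t=" + topic_id + " HTTP/1.1\r\n";
    request += "Host: " + host + "\r\n";
    request += keep_alive ? "Connection: keep-alive\r\n"
                          : "Connection: close\r\n";
    request += "\r\n";    // blank line ends the header block
    return request;
}
```

    The server can still close the connection whenever it likes, so the caller has to be ready to reconnect when recv() returns 0.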

    Furthermore, your read routine is wrong. It treats the response as complete only once the server closes the connection or an error occurs. That isn't always true: on a connection that stays open, the response can be complete long before either happens. You have to parse the data as you receive it and use the headers to work out where the response ends (mostly Content-Length, but there's also chunked transfer encoding, and maybe even a few others). Only then can you parse the body.
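    As a rough sketch of the Content-Length case: append everything recv() delivers to one std::string and, after each read, check whether the headers plus the announced number of body bytes have arrived. The helper names (find_body_start, get_header, response_complete) are made up for illustration, and chunked responses are deliberately not handled here.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Offset of the first body byte, or npos while the blank line that ends
// the headers has not arrived yet.
std::size_t find_body_start(const std::string& response)
{
    std::size_t pos = response.find("\r\n\r\n");
    return pos == std::string::npos ? std::string::npos : pos + 4;
}

static std::string to_lower(std::string s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// Case-insensitive lookup of one header in a status line + header block.
// On success stores the value, trimmed of spaces and tabs, in `value`.
bool get_header(const std::string& headers, const std::string& name,
                std::string& value)
{
    const std::string wanted = to_lower(name);
    std::size_t line_start = headers.find("\r\n");    // skip the status line
    while (line_start != std::string::npos) {
        line_start += 2;
        std::size_t line_end = headers.find("\r\n", line_start);
        if (line_end == std::string::npos)
            line_end = headers.size();
        const std::string line = headers.substr(line_start, line_end - line_start);
        if (line.empty())
            break;                                    // end of the header block
        const std::size_t colon = line.find(':');
        if (colon != std::string::npos && to_lower(line.substr(0, colon)) == wanted) {
            const std::size_t first = line.find_first_not_of(" \t", colon + 1);
            const std::size_t last = line.find_last_not_of(" \t");
            value = (first == std::string::npos)
                        ? std::string()
                        : line.substr(first, last - first + 1);
            return true;
        }
        line_start = headers.find("\r\n", line_start);
    }
    return false;
}

// True once `response` holds the complete headers and at least
// Content-Length body bytes. Returns false when there is no Content-Length,
// so other framing (chunked, read-until-close) needs separate handling.
bool response_complete(const std::string& response)
{
    const std::size_t body = find_body_start(response);
    if (body == std::string::npos)
        return false;
    std::string value;
    if (!get_header(response.substr(0, body), "Content-Length", value))
        return false;
    const unsigned long length = std::strtoul(value.c_str(), 0, 10);
    return response.size() - body >= length;
}
```

    In the receive loop this would replace the "wait for recv() to return 0" condition: stop reading as soon as response_complete() is true.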

    All in all, it's not a trivial task, and it depends on how standards-compliant you want to be. You can usually get away with handling only Content-Length, even though I usually also implement chunked transfer encoding when I need to do such a task.

    My advice: read up on the protocol a bit more.

  3. #3
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102
    First of all thanks for helping me with my problem. I now understand why that is happening - I'm just waiting for a connection to close before parsing the data. Now it would be easy to fix that if the server I'm sending the requests to had a 'content-length' header, but unfortunately it does not. Do you know of a way to find the length of the content any other way?

    Thanks again.

  4. #4
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by Overlord
    First of all thanks for helping me with my problem. I now understand why that is happening - I'm just waiting for a connection to close before parsing the data. Now it would be easy to fix that if the server I'm sending the requests to had a 'content-length' header, but unfortunately it does not. Do you know of a way to find the length of the content any other way?

    Thanks again.
    Hmmm I'm not sure if the protocol supports a header saying "don't transfer chunked". I expect there must be some way, though. But chunked transfer encoding isn't too hard to implement. Each chunk is a line with a hexadecimal number N, followed by N bytes of data and a CRLF, repeated until N = 0. The data of all these chunks, concatenated with the size lines left out, forms the response body.
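    A minimal sketch of that decoding loop, assuming the whole chunked body (everything after the blank line that ends the headers) is already in memory. decode_chunked is a hypothetical name; chunk extensions and trailers are simply ignored rather than parsed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Decodes a complete chunked body into `out`.
// Returns false if the data is malformed or not all there yet; in that case
// `out` may hold the chunks decoded so far.
bool decode_chunked(const std::string& body, std::string& out)
{
    std::size_t pos = 0;
    out.clear();
    for (;;) {
        // The size line: hexadecimal N, optionally followed by ";extension".
        const std::size_t line_end = body.find("\r\n", pos);
        if (line_end == std::string::npos)
            return false;                       // size line incomplete
        const std::string size_line = body.substr(pos, line_end - pos);
        char* end = 0;
        // strtoul stops at the first non-hex character, so extensions are skipped.
        const unsigned long size = std::strtoul(size_line.c_str(), &end, 16);
        if (end == size_line.c_str())
            return false;                       // no hex digits at all
        pos = line_end + 2;

        if (size == 0)
            return true;                        // last chunk; trailers ignored

        // Need N data bytes plus the CRLF that follows them.
        const std::size_t available = body.size() - pos;
        if (size > available || available - size < 2)
            return false;
        out.append(body, pos, size);
        if (body.compare(pos + size, 2, "\r\n") != 0)
            return false;                       // chunk not terminated by CRLF
        pos += size + 2;
    }
}
```

    For example, the body "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n" decodes to "Wikipedia".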

  5. #5
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    Quote:
    Hmmm I'm not sure if the protocol supports a header saying "don't transfer chunked". I expect there must be some way, though.
    Chunked encoding can be sent to the client even if it is not specified in the accept header. I don't think there is any way to tell the server you don't accept chunked encoding.

  6. #6
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Quote Originally Posted by bithub
    Chunked encoding can be sent to the client even if it is not specified in the accept header. I don't think there is any way to tell the server you don't accept chunked encoding.
    I think you're right. From the RFC (http://www.faqs.org/rfcs/rfc2616.html):

    Quote Originally Posted by RFC2616
    All HTTP/1.1 applications that receive entities MUST accept the
    "chunked" transfer-coding (section 3.6), thus allowing this mechanism
    to be used for messages when the message length cannot be determined
    in advance.
    So, no choice other than to implement it...

  7. #7
    int x = *((int *) NULL); Cactus_Hugger
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Code:
    HTML += (recvbuf);
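    Buffer overrun here. If recv() actually fills all 512 bytes of that buffer, and none of them are '\0' (which is likely, with HTTP), then operator+= reads past the end of the buffer, because std::string assumes recvbuf is null-terminated. A shorter read is no safer: the append only stops at the first '\0', so it can pick up stale bytes left over from an earlier recv(). Instead, explicitly call a std::string constructor, and pass it the length: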
    Code:
    HTML += std::string(recvbuf, return_value_of_recv);
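    The same idea as a small sketch that skips the temporary string. append_received is a hypothetical helper; `received` stands for the positive value that recv() returned.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends exactly `received` bytes of `buf` to `html`.
// Nothing here relies on a terminating '\0', so a completely full buffer,
// stale bytes from an earlier read, and embedded '\0' bytes are all handled.
void append_received(std::string& html, const char* buf, int received)
{
    assert(received > 0);    // call this only on the iResult > 0 branch
    html.append(buf, static_cast<std::size_t>(received));
}
```

    In the loop from the first post this would be called as append_received(HTML, recvbuf, iResult).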
    Quote:
    Now it would be easy to fix that if the server I'm sending the requests to had a 'content-length' header, but unfortunately it does not. Do you know of a way to find the length of the content any other way?
    The server is required to send a Content-Length header unless (a) it sends the Connection: close header, in which case Content-Length is optional, or (b) it sends a Transfer-Encoding header, in which case sending a Content-Length is forbidden. If Connection: close is sent without a Content-Length, then the body is implicitly all the data following the headers, up to the point where the server closes the connection.
    (In reality the rules are a bit more complex... the RFC is your friend.)
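    Those rules can be sketched as a small decision function. BodyFraming and choose_framing are hypothetical names; the sketch assumes the headers are already parsed, that the Transfer-Encoding value is lower-cased (empty if the header is absent), and it ignores responses that never carry a body.

```cpp
#include <string>

enum BodyFraming {
    FRAMING_CHUNKED,          // decode chunks until the zero-size chunk
    FRAMING_CONTENT_LENGTH,   // read exactly Content-Length bytes
    FRAMING_UNTIL_CLOSE       // read until recv() returns 0
};

// Picks how to find the end of the response body.
// `transfer_encoding` is the lower-cased header value, or "" if absent.
BodyFraming choose_framing(const std::string& transfer_encoding,
                           bool has_content_length)
{
    // Transfer-Encoding wins over Content-Length when both are present.
    if (!transfer_encoding.empty() && transfer_encoding != "identity")
        return FRAMING_CHUNKED;
    if (has_content_length)
        return FRAMING_CONTENT_LENGTH;
    // Neither header: the body runs until the server closes the connection.
    return FRAMING_UNTIL_CLOSE;
}
```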

    Quote:
    So, no choice other than to implement it...
    Um, you could also stop re-inventing the wheel, and use something like libcurl.

  8. #8
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102
    Thank you all for the replies. I tried to implement chunked encoding, but it was a mess to do and would probably be much slower than libcurl. I guess I'll stick with libcurl then, since my objective is not worth all this wrestling with HTTP. Thank you for the advice.

    EDIT:

    libcurl gives a lot of linker errors when I build my application, although I have built the library and linked it to my project:

    Code:
    1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_cleanup referenced in function _main
    1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_perform referenced in function _main
    1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_setopt referenced in function _main
    1>script.obj : error LNK2019: unresolved external symbol __imp__curl_easy_init referenced in function _main
    Last edited by Overlord; 07-10-2009 at 09:12 AM.
