Thread: Proxy

  1. #1
    Registered User
    Join Date
    Aug 2012
    Posts
    13

    Proxy

    I have a program that creates two sockets: one for the browser and the other connecting to the server given in the header. As long as I modify the header option "Content-Encoded: " from 'gzip,deflate' to 'true', the program works correctly and a proper webpage is displayed in the browser. I've been having a hard time lately trying to figure out what's wrong with my program when it receives gzip-encoded messages. Nothing interrupts either socket. All the receiving functions are buffered to 1000, yet at times I'll only get messages of 10-500 bytes. The server quits after 3-10 replies. The browser quits responding after that because the message was incomplete. Any ideas what might be causing this issue?

  2. #2
    Registered User
    Join Date
    May 2012
    Location
    Arizona, USA
    Posts
    948
    It could be many things, like how you are multiplexing between reading from each socket, but we would need to see your code before we can say for sure.

  3. #3
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    Quote Originally Posted by christop View Post
    It could be many things, like how you are multiplexing between reading from each socket, but we would need to see your code before we can say for sure.
    // More stuff behind here to fetch browser response which should only be a header.

    Code:
    
    while ( true )
            {
                MSG = "";
                
                if ( DEBUG1 ) {
                    counter++;
                    printf( " - Entering Exchange #%d\n\n", counter );
                    fprintf( DEBUG_FP, " - Entering Exchange #%d\n\n", counter );
                }
                
                FD_1 = GET_FD( SERVER );
                FD_2 = GET_FD( CLIENT );
                
                FD_ZERO(&READ_FDS);
                FD_SET(FD_1, &READ_FDS);
                FD_SET(FD_2, &READ_FDS);
                
                MAX_FD = ( FD_1  > FD_2 ) ? FD_1 : FD_2; 
                
                if ( select( MAX_FD+1, &READ_FDS, NULL, NULL, &waitd ) <= 0 )
                {
                    if ( DEBUG1 )
                    {
                        printf( "%c[1;91m - ERROR: Could not select...%c[0m\n", 27, 27 );
                        fprintf( DEBUG_FP, " - ERROR: Could not select...\n" );
                    }
                    
                    return true;
                }
    
                if ( FD_ISSET ( FD_1, &READ_FDS ) )
                {
                    if ( DEBUG2 )
                    {
                        printf ( "\n - The browser wants to talk...\n" );
                        fprintf ( DEBUG_FP, "\n - The browser wants to talk...\n" );
                    }
                    
                    break;
                } else {
                    FD_CLR( FD_1, &READ_FDS );  // Not sure if this was pointless...?
                }
                
                if ( FD_ISSET ( FD_2, &READ_FDS ) )
                {   
                    if ( DEBUG2 )
                    {
                        printf( "\n - The web server wants to talk...\n" );
                        fprintf( DEBUG_FP, "\n - The web server wants to talk...\n" );
                    }
                    
                    memset ( message, 0, BUFFER );
                    
                    if ( RECV ( CLIENT, message, BUFFER) == false )
                    {
                        return true;
                    }
                    
                    MSG = message;
                    
                    if ( GZIP_ENCODE == true )
                    {
                        DATA += MSG;
                    }
                    
                    //we're dealing with the server's body - send it to the browser    
                    if ( DEBUG2 ) {
                        printf( "\n - Says this (%d):\n\n", (int)MSG.length() );    
                        fprintf( DEBUG_FP, "\n - Says this (%d):\n\n", (int)MSG.length() );  
                        
                        if ( OUTPUT )
                        {
                            printf( "%s\n", (char*)MSG.c_str() );
                            fprintf( DEBUG_FP, "%s\n", (char*)MSG.c_str() );
                        }
                    }
                    
                    if ( !SEND ( SERVER, message ) )
                    {
                        break;
                    }
                    
                    if ( DEBUG2 )
                    {
                        printf( "\n - Wrote to browser...\n\n" );
                        fprintf( DEBUG_FP, "\n - Wrote to browser...\n\n" );
                    }
                }
                else
                {
                    FD_CLR( FD_2, &READ_FDS ); // Same thing here...
                }
            }
    }
    
    Yep, so there it is. It's partial, but I'm only concerned with the fd_set, which might be the problem, or maybe the encoding and how the program handles that. I'm just not sure.
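
    One thing worth ruling out in a loop like the one above, sketched here under assumptions: on Linux, select() may overwrite the timeout struct it is given, so a waitd that is set once and reused can shrink between iterations, and a return value of 0 means the timeout expired rather than an error. The helper name wait_readable and its parameters are hypothetical, not taken from the posted program.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: wait until fd_a or fd_b becomes readable.
// Returns the select() result: >0 ready count, 0 timeout, <0 error.
int wait_readable(int fd_a, int fd_b, int timeout_sec, fd_set* ready) {
    FD_ZERO(ready);
    FD_SET(fd_a, ready);
    FD_SET(fd_b, ready);
    int max_fd = (fd_a > fd_b) ? fd_a : fd_b;

    // Build the timeout on every call; Linux select() may overwrite it.
    struct timeval tv;
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    return select(max_fd + 1, ready, nullptr, nullptr, &tv);
}
```

    With this shape the caller can treat 0 as "nothing to read yet, go round again" and only bail out on a negative result.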

  4. #4
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    I think part of the problem is that the gzip compressed data contains NULL characters. Would NULL chars create a problem in a char* or string?

  5. #5
    Registered User
    Join Date
    Mar 2010
    Posts
    583
    Quote Originally Posted by jsx-27 View Post
    I think part of the problem is that the gzip compressed data contains NULL characters. Would NULL chars create a problem in a char* or string?
    Yes, I think you're on the right track there - it could cause problems. A char* buffer is just a buffer of bytes and is absolutely fine when used with the socket functions, but you mustn't try to treat it like a string at all if it has NULL chars in it. Looking at your code, I'm pretty sure your "MSG" variable is a C++ string. Trying to initialise a string object from a char* with NULLs in it will not work well: it'll stop at the first NULL, and report the length only up until that point.

    Code like:
    Code:
      DATA += MSG;
    will append only what ended up in MSG; since MSG was built from a char*, that is just the bytes before the first NULL in the receive buffer, and everything after it is silently lost.

    I would guess that your RECV function returns the number of bytes read -- that'd be a more reliable way to find out how much you've read.
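
    To make the difference concrete, here is a minimal sketch, assuming the byte count from recv() is available. Both helper names are made up for illustration; the point is only that std::string keeps embedded '\0' bytes when it is given an explicit length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy exactly n received bytes into a std::string.
std::string bytes_to_string(const char* buf, std::size_t n) {
    return std::string(buf, n);  // length-based, embedded '\0' bytes are kept
}

// What `MSG = message;` effectively does: stops at the first '\0'.
std::string cstr_to_string(const char* buf) {
    return std::string(buf);
}
```

    For a buffer holding the five bytes 0x1f 0x8b 0x00 'A' 'B', the second helper yields a string of length 2 while the first yields length 5.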

  6. #6
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    Ok, let's just discard "DATA"; it has no official use in the program as of yet. As for the recv() message function, I print off all data that has been received and it shows up the same as the string. This has gotten very frustrating, so I decided to use tcpdump to monitor what was going on. It seems as if data is being sent, but the socket isn't receiving it. The device is receiving the information and then routing it to the socket. So then I figured that maybe it depends on the socket options or which kind of socket I'm using.

    Either way, regular html data works fine, goes through and fills up each buffer, all the way up to 1000 bytes. GZIP data, nope, no go.

  7. #7
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    Actually, just reading up on things, it turns out I should probably use "unsigned char*" instead of "char*"? That or using wchar_t*, but I'm lost as far as that goes.
    Last edited by jsx-27; 08-17-2012 at 09:49 PM.

  8. #8
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Are you assuming that the RECV call will receive the whole packet? The select call only guarantees that there is something in the receive buffer already, but that could be as much as the whole thing, or as little as just one byte. You generally need to loop over the RECV calls until you find the end of packet, or receive enough bytes, or time-out etc.
    What size is the html data you are receiving and what size is the gzip data?
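
    A rough sketch of that kind of loop, assuming the caller already knows how many bytes it wants (for example from Content-Length). The name recv_exact is hypothetical, and a real proxy would probably also want a timeout.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: keep calling recv() until `want` bytes arrived,
// the peer closed the connection, or an error occurred.
// Returns the number of bytes stored in buf, or -1 on error.
ssize_t recv_exact(int fd, char* buf, std::size_t want) {
    std::size_t got = 0;
    while (got < want) {
        ssize_t n = recv(fd, buf + got, want - got, 0);
        if (n > 0) {
            got += static_cast<std::size_t>(n);
        } else if (n == 0) {
            break;                    // orderly shutdown by the peer
        } else if (errno != EINTR) {
            return -1;                // real error; errno says which
        }
    }
    return static_cast<ssize_t>(got);
}
```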

    If you have some code that works (regular html), and something that doesn't (gzip data) then please post either both of them, or at least the parts that are different.

    Make sure you don't try to tell anyone not to look in a particular area, e.g. DATA. If that's not used or not related to the problem, then people will figure that out on their own. Whatever the problem is, it's highly likely that it isn't where you expect it to be (or you would have solved it already). Forcing others to only look at the same things will prevent them from solving it also. Let people solve it on their own.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  9. #9
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    Would there be any difference using poll(), because I've switched to that. Also, I've tried that looping method, and it turns out that I end up getting receiving errors. Then it has to go back to poll() or select(), and then the same issue...

    Say I put a while loop for receiving web server messages:

    Then I get this,

    HTTP/1.1 200 OK
    Server: Apache
    Cache-Control: private
    Content-Type: text/html;charset=UTF-8
    Content-Encoding: gzip
    Content-Length: 11094
    Vary: Accept-Encoding
    Date: Sat, 18 Aug 2012 04:31:10 GMT
    Connection: keep-alive
    Set-Cookie: flyout=; Domain=.ask.com; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
    Set-Cookie: WRUID=; Domain=.ask.com; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
    Set-Cookie: __qca=; Domain=.ask.com; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/

    ã

    - Wrote to browser...


    - The web server wants to talk...

    - Says this (112):

    1‰ïéƒø®
    „^í˘≥¥»/√ NìÎoÁˆ!±áPiíT!§L≤ÿe*t˜fi)(¡±g-\¡:˜BØU[œ'ôwù¥æπ:]Oh¿>”(ì´tEƒÏ≠2üã@ú–Ω_¡ΩãrgqÉ¥wMe

    - Wrote to browser...


    - The web server wants to talk...

    - Says this (101):

    òı…)ÿ⁄ÿ‹‚ ¨∫PÙÕJº∞G†È≤˙pòÓ]{ë‚äÃÿ㟆
    ‚›N¯t≤ø¶lêZÉ ùàçCg∆ÛÈÕn}4ç˝4÷rZæhZ…rOhèjZq5ËÇ›˝7è

    - Wrote to browser...


    - The web server wants to talk...

    - Says this (71):

    •Å•T©„êÙ`Æf2Í=”7på*©R+·ÌøŸı˜¿y‘·ãÄ|ùêOT≈åÔsL¯~†CÒÁ °·≠flièß
    4\í^|

    - Wrote to browser...


    - The web server wants to talk...

    - Says this (116):

    ^#‚/(L_wzÚaà≥\Kq¢Ñ™ô©±ElÆQ/%ko*.üÕ7bjM∆.ú6ô°¯(r!ŸF4-Ö%S(ö∏K)iP0¿Æ˝\ö
    uü± ≤xñ∂èvŸß
    -Ú`·ók…›—/Ò']˝ªh˘f±ÄÙ

    - Wrote to browser...


    - The web server wants to talk...

    - Says this (4):

    P` S

    - Wrote to browser...


    - The web server wants to talk...

    - Says this (628):

    ΩEŸ´Œ_öd∫¿…ÍQ»¸hàΩy Ï—‚„b^”1ü”KüÕã‹&¸YW˘ºpx=˙G˜ï9û¬Ì˜ÆÌÒ?∫fl
    N∫ùîB±›È≤⁄]s?v£ûߢs|§˜˚É£A∑≈)⁄y‹NH'~¥ŸÕ=QXy?¡®é耇≠◊jÜô@Pƒ¬S ÄV'™åŸóʯãÑŸ4S≤`#s:–K¥ìıú1ùɧɩµú ∂÷ôábëœì%>√ÆáNûa™pã8Ü3ÔpYLÂw∞ûqŸ ∫RµÜ¶Ûé‘|7!<£Zè∞fi6Ó< m?À^BëÜ#ózÙ´®†o}_ fiKQèêmR¿∫m¬î≈3ZÕ(ıd=¥á£˙N¢¬ Æç≠TÌ.ÀÊÉAÍ;·>e´}áÅ´„∞LùN'

    µ>`$•%ÜT1¨P ÈQ¡©ΩH

    «⁄∆]X3Ñ"ùgâŸ˘yÒ;‰Ô≈•Öˆ˚HCÙ’èÕ’cÌ IÊfl
    bµÜµHklGn©⁄‹ñ’C›~ À¸
    -¸N˛çæx84Ü>h>òBgJq®flÕ>
    °nX,ç—â‡j0Nˇuà!÷>AÔ¡AmK™·PéU|/˜‚
    °•≈Sî˘É|2ó/5P¶î:|/©€QÍ◊é€oÚ»Tòw›mµn:G≠Îv¡m®s9θ…Ç2â±(ú6'fl±òÀ†‚ªF¸ØD! ¡ê¬,õ[+ï¥òú=¶¶µ≥ûfl>^Ùê® Q∑
    ß5æÃåº&
    Last edited by jsx-27; 08-17-2012 at 10:33 PM.

  10. #10
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Okay, but how are you determining that you have received the whole packet?
    You need to explain what "receiving errors" means; What are they? Show code?
    It quite possibly might be that what you are seeing is normal, or you might just be trying to now receive bytes beyond what was actually sent.
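
    For a response like the one logged above, one way to decide when everything has arrived is to find the blank line that ends the header and add the Content-Length value to it. The sketch below assumes exactly that and nothing more: the function name is hypothetical, the header match is case-sensitive, and chunked responses are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: given the bytes received so far, work out the total
// size of the response (header + body). Returns 0 if the header is not
// complete yet or carries no Content-Length line.
std::size_t expected_response_size(const std::string& data) {
    const std::string::size_type header_end = data.find("\r\n\r\n");
    if (header_end == std::string::npos) return 0;

    const std::string key = "Content-Length:";
    const std::string::size_type pos = data.find(key);
    if (pos == std::string::npos || pos > header_end) return 0;

    // strtoul skips the leading space and stops at the '\r'.
    const unsigned long body =
        std::strtoul(data.c_str() + pos + key.size(), nullptr, 10);
    return header_end + 4 + static_cast<std::size_t>(body);
}
```

    Receiving can then continue until the accumulated byte count reaches that total, instead of guessing from the size of any single recv().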

  11. #11
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    Code:
    bool RECV_MSG ( SOCKET_DATA* PIPE, char *message, int BUF ) {
        
        ssize_t status;
        
        status = recv ( PIPE->m_sock, message, BUF, 0 );
        
        if ( status < 0 ) {
            printf ("%c[1;91m - ERROR #%d: Cannot recieve message!%c[0m\n", 27, errno, 27);
            return false;
        } else if ( status == 0 ) {
            printf ("%c[1;95m - WARNING #%d: Could not read from socket.%c[0m\n", 27, errno, 27);
            return false;
        } else {
            return true;
        }
    }

  12. #12
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    So that log I showed you terminated automatically, so then I set the option SO_NOSIGPIPE. That fixed that issue, but I'm still running into problems.

  13. #13
    Registered User
    Join Date
    Mar 2010
    Posts
    583
    If you do this:

    Code:
    bool RECV_MSG ( SOCKET_DATA* PIPE, char *message, int BUF ) {
         
        ssize_t len;
         
        len = recv ( PIPE->m_sock, message, BUF, 0 );
         
        if ( len < 0 ) {
            printf ("%c[1;91m - ERROR #%d: Cannot recieve message!%c[0m\n", 27, errno, 27);
            return false;
        } else if ( len == 0 ) {
            printf ("%c[1;95m - WARNING #%d: Could not read from socket.%c[0m\n", 27, errno, 27);
            return false;
        } else {
            printf("recv received %zd bytes\n", len);
            return true;
        }
    }
    Does the value you see there match up with the places where you've tried to treat the data as a string? I still think converting it to a string is a bad idea for non-text data.

    How is SEND implemented? Are you sending just the bytes read, or the entire buffer including the uninitialised parts?
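
    Since SEND has not been posted, the following is only a guess at what a length-aware version might look like. The name send_all is hypothetical; the idea is to pass the byte count from recv() straight through and to retry when send() accepts fewer bytes than requested.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: forward exactly `len` bytes, retrying on partial sends.
// Returns true only if every byte was handed to the kernel.
bool send_all(int fd, const char* buf, std::size_t len) {
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            return false;
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```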

    It seems to me you have a few possible failure points. You could be receiving the data fine and just not sending it on correctly. If it were me trying to debug this, I'd try to isolate small chunks of the code and verify/debug them. I'd probably stop trying to send data on to the browser, and just save the received gzip to a binary file. Then see if it's a valid gzip archive. If not, how does it differ from what you intended to send? Bigger, smaller, corrupted, missing stuff...? If the socket has stopped receiving stuff, you need to work out why -- errno will probably help. What is the error when the connection fails?
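
    A small sketch of that debugging step, with made-up helper names: one function appends the raw body bytes to a file opened in binary mode, the other does a quick sanity check on the first bytes. A gzip stream begins with 0x1f 0x8b, followed by the compression method byte, which is 8 for deflate. This is only a first check, not a full validation of the archive.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: append raw bytes to a file opened in binary mode.
bool append_bytes(const std::string& path, const char* buf, std::size_t n) {
    std::ofstream out(path, std::ios::binary | std::ios::app);
    out.write(buf, static_cast<std::streamsize>(n));
    return out.good();
}

// Hypothetical helper: does the body start with the gzip magic bytes
// 0x1f 0x8b and the deflate method byte 0x08?
bool looks_like_gzip(const unsigned char* body, std::size_t n) {
    return n >= 3 && body[0] == 0x1f && body[1] == 0x8b && body[2] == 0x08;
}
```

    If the saved file passes this check but a gzip tool still rejects it, comparing its size against Content-Length should show whether bytes went missing.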

  14. #14
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    If you look up top I do mention null... it's very funny because most char* strings end with a '\0'. If I use fwrite instead of printf, then I get a much different view of what's going on (fwrite continues to write after '\0'), for example:

    Code:
     - Says this (485):
    
    HTTP/1.1 200 OK
    Server: Apache
    Cache-Control: private
    Content-Type: text/html;charset=UTF-8
    Content-Encoding: gzip
    Content-Length: 11097
    Vary: Accept-Encoding
    Date: Sat, 18 Aug 2012 05:35:38 GMT
    Connection: keep-alive
    Set-Cookie: flyout=; Domain=.ask.com; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
    Set-Cookie: WRUID=; Domain=.ask.com; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
    Set-Cookie: __qca=; Domain=.ask.com; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
    
    rÌSWÿÅÕ]Í‘j/fihD≥Ç¿;¨’&ìIu“™rPª~[ª∑Ç딿∆—W=Hµ¨öÅ©u∑é±Ñ‹èWú‰Äit:’ZãñU˚ΩLÌ>5XèÛa’‡£Z≥fi®◊˙Ω˘&|ê4·s>ı,œÁ7∏#˚Ü≈F¥ñme&çL€êò˚”∏óFÕÑNp
                                                                                                                                 åö]ŸÏxƒJ∞ÖŒÓB{|¢ùs7`n†SèiƒPøN¥Ä›5◊1,Í
                                                                                                                                                                      úÑA_?–jKQ¿OGp>wRê\Æ¥ºÂ•O#Z–D6
                                                                                                                                                                                                   Ï¿a›ÁôQíˇêó∂kí¿b‰'F]B·¡s÷∑]9ìÇ>˘Ö˚&$€¸∏¶Ä NTO
    E«váƒgŒâ&,ÓF∞”àÂ≥Bq–¿6™¢è§óD«/Ü√C≥÷ßclQÖ?Ò(í°Ã∆îÍ'ò:LXåkt‚7⁄˚≠›ö!DÕ
                                                                       -Ø
    _4ÇÕü¸=b¶MO4Í8ir.AHæÌi∑tL’SflXØ‘ìÄÒáˇXÚáˇ>Íáˇåjáˇ{jáˇhjáˇXjáˇCdáˇ4dáˇ&dáˇû_áˇ<_áˇ&_ᡂVᡬVᡕVáˇóZáˇÉZáˇpZáˇ~Váˇ^VáˇAVáˇVáˇ˙UᡛUáˇSáˇSᡢRáˇ√aáˇ¥aᡶaáˇÍJáˇê‰íáˇ∂êáˇ≥áˇBÖáˇ
    áˇ~ïᡲíáˇpp
                G•
                  áˇêáˇ∆©    áˇÿ¿áˇ#≥áˇÑòᡧ¿áˇ˙ëá
    
    
    The count shown is 485, but clearly this isn't the case; it's around 1500 chars. The send and receive functions are working as intended; like I said, I have been able to download web pages that aren't gzipped perfectly. In those cases there will be very few '\0' terminating characters. Since I'm dealing with gzip-compressed data, there are going to be more '\0' null chars.

    I think I understand perfectly what you're saying, smokyangel. The send function is buffered to the size of the message char, which is 485, but as you can see up top, that's not the case.

  15. #15
    Registered User
    Join Date
    Aug 2012
    Posts
    13
    The tricky part now is to count how many chars are stored in message with all those terminating characters. Since I know my buffer has been limited to just 1000, I made sure to send out 1000. It's working now with some websites, but others are kind of skewed.
    Last edited by jsx-27; 08-18-2012 at 12:23 AM.
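
    Sending a fixed 1000 bytes pads short reads with whatever is left in the buffer, which may be why some sites come out skewed. The count does not have to be recovered from the buffer contents, because recv() already returns it. The sketch below assumes two connected sockets and forwards only the bytes that actually arrived; relay_chunk is a hypothetical name, and interrupted calls (EINTR) are not retried here.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read one chunk from `from` and forward exactly the
// bytes that arrived to `to`. Returns the byte count forwarded, 0 when the
// peer closed the connection, or -1 on error.
ssize_t relay_chunk(int from, int to) {
    char buf[1000];                       // same size as the thread's buffer
    ssize_t n = recv(from, buf, sizeof buf, 0);
    if (n <= 0) return n;                 // 0 = closed, -1 = error

    ssize_t off = 0;
    while (off < n) {                     // send() may accept fewer bytes
        ssize_t s = send(to, buf + off, static_cast<std::size_t>(n - off), 0);
        if (s < 0) return -1;
        off += s;
    }
    return n;
}
```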
