Thread: fetching html pages in C. output only shows html source sometimes

  1. #1
    Registered User
    Join Date
    Nov 2009
    Posts
    16

    fetching html pages in C. output only shows html source sometimes

    My output usually is something like:
    Second Half New Way: /finance
    First Half: Google
    Google = 74.125.47.10574.125.47.10674.125.47.14774.125.47.9974.125.47.10374.125.47.104

    GET /finance HTTP/1.1
    Host: Google

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=UTF-8
    Date: Fri, 04 Nov 2011 02:02:48 GMT
    Expires: Fri, 04 Nov 2011 02:02:48 GMT
    Cache-Control: private, max-age=0
    X-Content-Type-Options: nosniff
    X-Frame-Options: SAMEORIGIN
    X-XSS-Protection: 1; mode=block
    Server: GSE
    Transfer-Encoding: chunked


    but every now and then the program actually spits back some of the page's html. how come i only get the html source some of the time?


    Code:
    /* 
     * File:   webClient.c
     * Author: Adam
     *
     * Created on November 3, 2011, 1:26 AM
     */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <strings.h>    /* bzero, bcopy */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>  /* inet_ntoa */
    #include <netdb.h>
    
    /*
     * 
     */
    int main(int argc, char** argv) {
        
        char arg[500];
        char firstHalf[500];
        char secondHalf[500];
        char request[1000];
        struct hostent *server;
        struct sockaddr_in serveraddr;
        int port = 80;
        int tcpSocket;
        
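        /* Need exactly one "host/path" argument that fits in arg. */
        if (argc < 2 || strlen(argv[1]) >= sizeof(arg))
        {
            fprintf(stderr, "usage: %s host/path\n", argv[0]);
            return (EXIT_FAILURE);
        }
        strcpy(arg, argv[1]);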
        
        /* Split "host/path" at the first '/'. With no '/', request the root. */
        size_t i;
        size_t len = strlen(arg);
        for (i = 0; i < len; i++)
        {
            if (arg[i] == '/')
                break;
        }

        strncpy(firstHalf, arg, i);
        firstHalf[i] = '\0';

        if (i < len)
            strcpy(secondHalf, arg + i);
        else
            strcpy(secondHalf, "/");
    
        
        printf("\nSecond Half New Way: %s", secondHalf);
        
        printf("\nFirst Half: %s", firstHalf);
        
        tcpSocket = socket(AF_INET, SOCK_STREAM, 0);
        
        if (tcpSocket < 0)
            printf("\nError opening socket");
        
        server = gethostbyname(firstHalf);
        
        if (server == NULL)
        {
            printf("gethostbyname() failed\n");
            return (EXIT_FAILURE);  /* server is dereferenced further down */
        }
        else
        {
            printf("\n%s = ", server->h_name);
            unsigned int j = 0;
            while (server -> h_addr_list[j] != NULL)
            {
                printf("%s", inet_ntoa(*(struct in_addr*)(server -> h_addr_list[j])));
                j++;
            }
        }
        
        printf("\n");
    
        bzero((char *) &serveraddr, sizeof(serveraddr));
        serveraddr.sin_family = AF_INET;
    
        bcopy((char *)server->h_addr, (char *)&serveraddr.sin_addr.s_addr, server->h_length);
        
        serveraddr.sin_port = htons(port);
        
        if (connect(tcpSocket, (struct sockaddr *) &serveraddr, sizeof(serveraddr)) < 0)
            printf("\nError Connecting");
      
        bzero(request, 1000);
    
        sprintf(request, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", secondHalf, firstHalf);
        
        printf("\n%s", request);
        
        if (send(tcpSocket, request, strlen(request), 0) < 0)
            fprintf(stderr, "Error with send()");
        
        bzero(request, 1000);
        
        recv(tcpSocket, request, 999, 0);
        printf("%s", request);
        
        close(tcpSocket);
        
        return (EXIT_SUCCESS);
    }

  2. #2
    Registered User
    Join Date
    Nov 2009
    Posts
    16
    i solved this by calling recv() in a loop, printing on every iteration and making sure i put a null terminator at the end of the buffer after every recv:

    Code:
    do {
        iResult = recv(tcpSocket, request, 999, 0);

        if (iResult > 0)
        {
            request[iResult] = '\0';    /* recv() does not null-terminate */
            printf("%s", request);
            printf("\nBytes received: %d", iResult);
        }
        else if (iResult == 0)
            printf("\nConnection closed");
        else
            perror("recv failed");
    } while (iResult > 0);
    also i needed to create a timeout so the loop would eventually exit:

    Code:
        struct timeval tv;
        tv.tv_sec = 2;
        tv.tv_usec = 0;   /* set this too, otherwise it holds an indeterminate value */
        setsockopt(tcpSocket, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
    however this solution doesn't exit loop in cygwin/windows environment like it does in ubuntu. anyone know why?
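
A minimal sketch of the receive loop from this post, wrapped in a helper. `read_all` and `build_request` are hypothetical names, not part of the original program. The `Connection: close` header is an assumption added here: a server that honors it closes the socket once the response is sent, so recv() returns 0 and the loop can end without relying on the timeout.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: read from fd until the peer closes the connection
 * or recv() fails, writing each chunk to out.
 * Returns the total number of bytes read, or -1 on error. */
long read_all(int fd, FILE *out)
{
    char buf[1000];
    long total = 0;
    ssize_t n;

    /* Leave one byte free so every chunk can be null-terminated. */
    while ((n = recv(fd, buf, sizeof(buf) - 1, 0)) > 0) {
        buf[n] = '\0';
        fputs(buf, out);
        total += n;
    }
    if (n < 0) {
        perror("recv failed");
        return -1;
    }
    return total;   /* n == 0: the peer closed the connection */
}

/* Hypothetical helper: build a GET request that asks the server to close
 * the connection after responding.
 * Returns 0 on success, -1 if the request does not fit in dst. */
int build_request(char *dst, size_t dstsz, const char *host, const char *path)
{
    int n = snprintf(dst, dstsz,
                     "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= dstsz) ? -1 : 0;
}
```

In the original program this would replace the single recv() call: build the request with `build_request(request, sizeof(request), firstHalf, secondHalf)`, send it, then call `read_all(tcpSocket, stdout)`. Because fputs() stops at the first zero byte, this sketch only suits text responses.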

  3. #3
    Salem
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Nov 2009
    Posts
    16
    i dont see the problem in it really. realistically there are some people on this board who aren't on dreamincode. and realistically there are some people on dreamincode that aren't on this board. so why not post in both and get answers from more people? i have posted the same question in multiple forums many times where i got answers in all of them but only one of them from one forum was the correct answer that actually helped me. therefore if i had only chosen one arbitrarily and that one i had chosen ended up being one without the helpful answer then i would have been set back. it's only logical to post in as many as possible to get the most helpful answer as quickly as possible. it's all about quickly furthering the knowledge of the human race, not setting arbitrary limits on slowing the gaining of knowledge down because you're not 'supposed to' post in more than one forum because of a pointless social standard.

  5. #5
    Salem
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    I'm going to assume you didn't actually bother to read the ask smart questions FAQ, and therefore continue to believe that you're "the special one" who deserves all kinds of special attention from as much of the internet as possible.

    "pointless social standard"
    ROFLMAO - you're an idiot.
    Do you go round ........ting on pavements or indulge in other kinds of anti-social behaviour because you don't agree with whatever "social standard" appears to be in place.

    It wouldn't surprise me in the least - some sections of modern youth have a vastly exaggerated sense of "me me me me me" self-importance.

  6. #6
    Registered User
    Join Date
    Nov 2009
    Posts
    16
    Quote: Originally posted by Salem
    I'm going to assume you didn't actually bother to read the ask smart questions FAQ, and therefore continue to believe that you're "the special one" who deserves all kinds of special attention from as much of the internet as possible.

    "pointless social standard"
    ROFLMAO - you're an idiot.
    Do you go round ........ting on pavements or indulge in other kinds of anti-social behaviour because you don't agree with whatever "social standard" appears to be in place.

    It wouldn't surprise me in the least - some sections of modern youth have a vastly exaggerated sense of "me me me me me" self-importance.
    well when making a decision that can benefit yourself and affect other people in a 'negative' way, everyone weighs the scales differently. for example if there existed a situation where i could gain $10 but 200 people in the world had to have a finger chopped off i would most certainly not choose to have the $10, as i assume you wouldn't either. likewise if there existed a situation where i could gain 1 million dollars if only 1 person had to go through the pain of having their finger pricked, i would most certainly take the 1 million dollars, as i am sure you would also. my point is that there are different degrees of what a seeker has to gain and its negative effect on society. in the case of "........ting on pavements" as you put it i would have nothing to gain while other people would have to look at and smell something disgusting. in this case i would most certainly not choose to **it on a pavement. if a situation existed where i could post a question on 3 different forums to increase my chances of finding a correct answer about 3 fold, thereby saving myself hours of time, and the negative side effect of my gain is a few people clicking on my question and then hitting the back button after they realized they didn't want to answer it or that it had been posted before (a waste of about 4 seconds of your time IF you didn't recognize it was the same question by the title) then sure, i'll save myself a few hours at the expense of a few seconds of a couple of other people's time. sorry if i seemed hostile in my previous post. i do like debating though so please continue.

    EDIT: i should add that i only do multiple forum posts after i have done extensive googling and can't find a solution. i also think that my philosophy would be ok if it were adopted by everyone. i certainly wouldn't be mad at someone if they did what i was doing and i became 'victim' to the 'negative' side effect of having to hit my back button.

    EDIT2: i should also add that i believe my way of multiple forum posts is actually more beneficial to future seekers of knowledge that i have already found. i do make sure i always go back and post a solution to all questions i ask so that people in the future will see how i dealt with my problem. and the fact that there will be multiple forums out there with the same problem will give the future person more of a chance to find one of them through a search engine, and they will also see the varied responses from different users on different forums where one of the responses that didn't help me could be what they are looking for.
    Last edited by c++guy; 11-04-2011 at 02:48 AM.

  7. #7
    Registered User
    Join Date
    Mar 2008
    Posts
    18
    why don't you guys forget about who has to do what and focus on the code problem please

  8. #8
    Registered User
    Join Date
    Nov 2009
    Posts
    16
    fair enough. so anyone know why setsockopt() isn't having the same functionality in cygwin when im on windows as it is when i run it in ubuntu?
