Thread: Linux Socket Nightmare

  1. #1
    Registered User
    Join Date
    May 2012
    Location
    Manitoba, Canada
    Posts
    12

    I'm out of ideas. I've been trying to get this whole socket thing working for a couple of days now and I haven't even made it past the basics. It seems the recv() function leaves garbage data in the read buffer, which obviously can't be right, so there's some problem on my end. It also reads differing amounts at a time; uncomment the read bytes line and watch that puppy go. It tends to loop from 1024 bytes down to around 415, and it does that until the last read. Does it have something to do with HTTP chunks? I read the tutorial found here: Beej's Guide to Network Programming. I'm sure there's some glaring problem somewhere; if anyone has any experience with this, hopefully they'll find it right away. I'm willing to take critique or advice on anything else as well, I'm trying to get better.

    Code:
    #include <netdb.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #define BUFFER_SIZE 1025
    int mexit(char *message);
    int get_host(char *host, char *url);
    int get_path(char *path, char *url, int pathPosition);
    int find_substring(char *string, char *substring);
    int
    main(int argc, char *argv[])
    {
        char *host, length[20], *path, readBuffer[BUFFER_SIZE], request[250];
        int c, cLength, contentLengthEnd, headerEnd, i, pathPosition, readBytes, sent, sock;
        struct addrinfo hints, *res;
        if(argc != 2) {
            printf("Usage: %s [FILE]\n", argv[0]);
            return 1;
        }
        memset(&hints, '\0', sizeof(hints));
        hints.ai_family = AF_INET;
        hints.ai_socktype = SOCK_STREAM;
        if((host = (char *)malloc(sizeof(argv[1]))) == NULL)
            mexit("Error: Failed to allocate host memory!\n");
        pathPosition = get_host(host, argv[1]);
        if((path = (char *)malloc(sizeof(argv[1]))) == NULL)
            mexit("Error: Failed to allocate path memory!\n");
        get_path(path, argv[1], pathPosition);
        if((getaddrinfo(host, "80", &hints, &res)) != 0)
            mexit("Error: Unable to get host information!\n");
        if((sock = socket(res->ai_family, res->ai_socktype, res->ai_protocol)) < 0)
            mexit("Error: Unable to create socket!\n");
        if((connect(sock, res->ai_addr, res->ai_addrlen)) != 0)
            mexit("Error: Failed to connect socket to host!\n");
        snprintf(request, sizeof(request), "GET %s HTTP/1.1\nHost: %s\nUser-Agent: Firefox\n\n", path, host);
        if((sent = send(sock, request, sizeof(request), 0)) <= 0)
            mexit("Error: Failed to send request header!\n");
        readBytes = recv(sock, readBuffer, 1024, 0);
        readBuffer[readBytes+1] = '\0';
        headerEnd = find_substring(readBuffer, "\r\n\r\n") + 4;
        contentLengthEnd = find_substring(readBuffer, "Content-Length:") + 15;
        for(i = 0; (c = readBuffer[contentLengthEnd + i]) != '\n'; i++)
            length[i] = c;
        length[i] = '\0';
        cLength = atoi(length);
        while((readBytes = recv(sock, readBuffer, 1024, 0)) != 0) {
            //printf("\nRead Bytes: %d\n", readBytes);
            readBuffer[readBytes+1] = '\0';
            printf("%s", readBuffer);
        }
        free(host);
        free(path);
        freeaddrinfo(res);
        close(socket);
        return 0;
    }
    int
    mexit(char *message)
    {
        printf("%s\n");
        exit(1);
    }
    int
    get_host(char *host, char *url)
    {
        int c, i, j = 0;
        for(i = 7; (c = url[i]) != '/'; i++)
            host[j++] = c;
        host[j] = '\0';
        return i;
    }
    int
    get_path(char *path, char *url, int pathPosition)
    {
        int c, i, j = 0;
        for(i = pathPosition; (c = url[i]) != '\0'; i++)
            path[j++] = c;
        path[j] = '\0';
        return 0;
    }
    int
    find_substring(char *string, char *substring)
    {
        int i, j = 0, length = 0, loc = 0, longest = 0;
        for(i = 0; i < strlen(string); i++)
            if(string[i] == substring[j]) {
                while(string[i++] == substring[j++])
                    length++;
                if(length >= longest) {
                    longest = length;
                    loc = i - length;
                }
                length = 0;
                j = 0;
            }
        return loc;
    }

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Nearly all of those socket-related functions set errno when they fail, to a specific code that describes why they failed. (getaddrinfo() is the exception: it returns its own error code, which gai_strerror() translates.) Use strerror(errno) or perror() to generate messages that explain what went wrong; it will help you troubleshoot.
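    A minimal sketch of that errno pattern, assuming a Linux/POSIX system. The helper name close_or_report is made up for illustration; it wraps close(), but the same shape applies to socket(), connect(), send() and recv().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: tries to close fd and, on failure, prints the
 * reason taken from errno. Returns 0 on success or the errno value. */
int close_or_report(int fd)
{
    if (close(fd) == -1) {
        int saved = errno;  /* save it first: later calls may overwrite errno */
        fprintf(stderr, "close(%d) failed: %s\n", fd, strerror(saved));
        return saved;
    }
    return 0;
}
```

    Calling close_or_report(-1) should report a "bad file descriptor" style message, since -1 is never a valid descriptor. perror("close") would print much the same text with less control over the format.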

    Code:
    sizeof(argv[1])
    This is not what you want to malloc. argv is an array of char *, so sizeof(argv[1]) is sizeof(char *). Try malloc(strlen(argv[1] + 1)). The +1 is to leave room for the null byte.

    You need to set readBuffer[readBytes] = '\0', not readBytes + 1. Remember, arrays start at 0 in C, so if you read 4 bytes, they are buf[0] through buf[3], and buf[4] is where you want your null. Also, you never check readBytes to see if recv actually worked. If it fails, readBytes will be -1, and you need to print an appropriate error message and possibly exit.
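    Roughly what that could look like as a read loop. This is only a sketch: drain_socket is a made-up name, and it assumes the peer closes the connection when it is done sending. Note that recv() returning fewer bytes than you asked for is normal on a TCP stream, so differing read sizes are not an error by themselves.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

#define BUFFER_SIZE 1024

/* Hypothetical helper: reads from sock until the peer closes the
 * connection, writing each chunk to out as text.
 * Returns the total number of bytes read, or -1 if recv() fails. */
long drain_socket(int sock, FILE *out)
{
    char buf[BUFFER_SIZE + 1];  /* one extra byte for the terminator */
    long total = 0;
    ssize_t n;

    /* recv() may return anywhere from 1 to BUFFER_SIZE bytes per call */
    while ((n = recv(sock, buf, BUFFER_SIZE, 0)) > 0) {
        buf[n] = '\0';          /* bytes are buf[0]..buf[n-1], so the null goes at buf[n] */
        fputs(buf, out);        /* text only; binary data would need fwrite() */
        total += n;
    }
    if (n < 0) {                /* -1 means an error, 0 means the peer closed */
        perror("recv");
        return -1;
    }
    return total;
}
```

    The buffer is declared one byte larger than the amount passed to recv(), so writing buf[n] stays in bounds even when a full BUFFER_SIZE bytes arrive.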

    Instead of your complicated find_substr function, can you just use strstr? Something like strstr(string, substring) - string should give you the index of the substring.

    Lastly, socket is a function name, sock is the file descriptor of your socket. Try close(sock) instead of close(socket). If you didn't get a compiler warning for that, crank up the warning settings. Try adding -Wall to your gcc command.

    That's all I got at a quick glance, try that out and see how it goes.

  3. #3
    Registered User
    Join Date
    May 2012
    Location
    Manitoba, Canada
    Posts
    12
    Quote Originally Posted by anduril462
    This is not what you want to malloc. argv is an array of char *, so sizeof(argv[1]) is sizeof(char *). Try malloc(strlen(argv[1] + 1)). The +1 is to leave room for the null byte.
    This is the type of stuff I'm looking for. It's so obvious now that you mention it. My brain was so wrapped up in trying to get the socket working that stuff like this flies right by me. I had problems with this earlier and ended up hard-coding it, but managed to (somehow) get it working before posting the code this time. Thanks for pointing it out.

    Quote Originally Posted by anduril462
    You need to set readBuffer[readBytes] = '\0', not readBytes + 1. Remember, arrays start at 0 in C, so if you read 4 bytes, they are buf[0] through buf[3], and buf[4] is where you want your null.
    I've known about this for so long and still have trouble with it, it's embarrassing.

    Quote Originally Posted by anduril462
    Also, you never check readBytes to see if recv actually worked. If it fails, readBytes will be -1, and you need to print an appropriate error message and possibly exit.
    Yeah, I probably should do that. You know, because it seems like I'm getting errors, maybe I should check for them.

    Quote Originally Posted by anduril462
    Instead of your complicated find_substr function, can you just use strstr? Something like strstr(string, substring) - string should give you the index of the substring.
    I figured there must have been a standard function for this already, the one I'm using is my own makeshift one. I've read about strstr() before but saw that it returned a character pointer whereas I believe I wanted the integer position. You seem to know what you're talking about, I'll look into it.

    Quote Originally Posted by anduril462
    Lastly, socket is a function name, sock is the file descriptor of your socket. Try close(sock) instead of close(socket). If you didn't get a compiler warning for that, crank up the warning settings. Try adding -Wall to your gcc command.
    Yeah, all the warnings are off at the moment; it probably wouldn't even let me compile this abomination otherwise. You're correct on that point as well, it should be sock, not socket.

    I'll make the changes you proposed and report back. Thanks for your input, anduril462!

  4. #4
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by Kazansky
    I figured there must have been a standard function for this already, the one I'm using is my own makeshift one. I've read about strstr() before but saw that it returned a character pointer whereas I believe I wanted the integer position. You seem to know what you're talking about, I'll look into it.
    I just realized that it might not be clear from my post, but that is a subtraction, not a dash. Subtracting two pointers in C (they must point to the same type, and into the same array) gives you the number of elements between those two pointers, regardless of how big or small those elements are (the compiler divides by sizeof for you). Imagine a big array that spans those two addresses. Pointer subtraction gives you the difference between those array indexes.

    So to get the index of the substring, just do:
    Code:
    strstr(string, substring) - string
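    To illustrate the "divides by sizeof for you" part, here is a small sketch. The function names are made up for this example, and both pointers are assumed to point into the same array.

```c
#include <stddef.h>

/* Hypothetical helpers: both return how many elements lie between
 * base and elem. The result counts elements, not bytes, so the element
 * size does not matter. Both pointers must refer to the same array. */
ptrdiff_t int_index(const int *base, const int *elem)
{
    return elem - base;
}

ptrdiff_t double_index(const double *base, const double *elem)
{
    return elem - base;
}
```

    With int ints[10] and double doubles[10], both int_index(ints, &ints[7]) and double_index(doubles, &doubles[7]) give 7, even though a double is usually larger than an int. The result type of pointer subtraction is ptrdiff_t, from <stddef.h>.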

  5. #5
    oogabooga
    Join Date
    Jan 2008
    Posts
    2,808
    You should probably test for NULL (which strstr returns if the substring isn't found):
    Code:
    char   *p;
    size_t  i;
    
    p = strstr(string, substring);
    if (!p) {
        // substring not found.
        // Do something reasonable.
    }
    else {
        i = p - string;
    }
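    Applied to the Content-Length lookup from the original post, the same NULL check might look like the sketch below. The function name parse_content_length is hypothetical, and the match is case-sensitive, which is a simplification since HTTP header names are case-insensitive.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: looks for "Content-Length:" in a null-terminated
 * header block and returns its value, or -1 if the header is missing
 * or is not followed by a number. */
long parse_content_length(const char *headers)
{
    static const char key[] = "Content-Length:";
    const char *p = strstr(headers, key);
    char *end;
    long value;

    if (p == NULL)
        return -1;                /* header not present */

    p += sizeof(key) - 1;         /* step past the header name */
    value = strtol(p, &end, 10);  /* strtol skips leading spaces itself */
    if (end == p || value < 0)
        return -1;                /* no digits found, or a negative value */
    return value;
}
```

    Unlike atoi(), strtol() reports where parsing stopped, which makes it possible to tell "no number here" apart from a real value of 0.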
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

  6. #6
    Registered User
    Join Date
    May 2012
    Location
    Manitoba, Canada
    Posts
    12
    Quote Originally Posted by anduril462
    So to get the index of the substring, just do:
    Code:
    strstr(string, substring) - string
    This is pretty neat; it's a lot nicer than my old method.

    Quote Originally Posted by oogabooga
    You should probably test for NULL (if substring isn't found)
    I will.

    One small problem here in my get_path() function (same as I had before), I believe it has something to do with returning from the function.

    Code:
    int
    get_path(char *path, char *url, int pathPosition)
    {
        int c, i, j = 0;
        for(i = pathPosition; (c = url[i]) != '\0'; i++)
            path[j++] = c;
        printf("Path: %s\n", path);
        path[j] = '\0';
        printf("Path: %s\n", path);
        return 0;
    }
    The first print of path shows the right path with some garbage at the end (as you'd expect), and the second print shows the correct null-terminated string. But when I print path again back in the main function, it appears to have an 'l' character appended. Of course, if the path is wrong the whole thing doesn't work. Do you know what would cause an 'l' character to be appended when it returns to main? I may be able to figure it out. If I do, I'll report back.

  7. #7
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Post your updated code. It's probably related to how you malloc space for path or how you call this.

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I may know why this isn't working. I gave you bad code:
    Code:
    malloc(strlen(argv[1] + 1));
    That +1 inside the inner parentheses causes it to start at the second character. You want
    Code:
    malloc(strlen(argv[1]) + 1)
    That was causing a buffer overflow error. The allocation was two bytes too small for a full copy of the string, so its last bytes would land in memory you didn't own, and you would get undefined behavior.
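    For reference, a minimal sketch of the corrected allocation wrapped in a helper. The name duplicate_string is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL if malloc fails.
 * The caller must free() the result. */
char *duplicate_string(const char *src)
{
    /* strlen(src) + 1: every character plus the terminating null byte.
     * strlen(src + 1) would instead measure from the second character,
     * giving strlen(src) - 1 for a non-empty string. */
    size_t size = strlen(src) + 1;
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, src, size);  /* copies the null byte too */
    return copy;
}
```

    For a three-character string such as "abc", strlen(s) + 1 is 4 while strlen(s + 1) is 2, which is where the two missing bytes come from.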

  9. #9
    Registered User
    Join Date
    May 2012
    Location
    Manitoba, Canada
    Posts
    12
    Here's the updated code, not finished. It turns out the function return isn't what's causing it: the path changes somewhere between "Main Path 1" and "Main Path 2", even though get_path() isn't called at all in between.

    Code:
    #include <netdb.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>
    #define HTTP_PORT "80"
    #define BUFFER_SIZE 1024
    int get_host(char *host, char *url);
    int get_path(char *path, char *url, int pathPosition);
    int find_substring(char *string, char *substring);
    int
    main(int argc, char *argv[])
    {
        char *host, length[20], *path, readBuffer[BUFFER_SIZE], *p, request[250];
        int c, cLength, contentLengthEnd, headerEnd, i, pathPosition, readBytes, sent, sock, status;
        struct addrinfo hints, *res;
        if(argc != 2) {
            printf("Usage: %s [FILE]\n", argv[0]);
            return 1;
        }
        memset(&hints, '\0', sizeof(hints));
        hints.ai_family = AF_INET;
        hints.ai_socktype = SOCK_STREAM;
        if((host = (char *)malloc(strlen(argv[1]) + 1)) == NULL) {
            fprintf(stderr, "Error: Failed to allocate host memory.\n");
            return 1;
        }
        pathPosition = get_host(host, argv[1]);
        if((path = (char *)malloc(sizeof(argv[1]) + 1)) == NULL) {
            fprintf(stderr, "Error: Failed to allocate path memory.\n");
            return 1;
        }
        get_path(path, argv[1], pathPosition);
        printf("Main Path 1: %s\n", path);
        if((status = getaddrinfo(host, HTTP_PORT, &hints, &res)) != 0) {
            fprintf(stderr, "getaddrinfo() encountered an error: %s\n", gai_strerror(status));
            return 1;
        }
        if((sock = socket(res->ai_family, res->ai_socktype, res->ai_protocol)) < 0) {
            perror("socket() encountered an error");
            return 1;
        }
        if((connect(sock, res->ai_addr, res->ai_addrlen)) != 0) {
            perror("connect() encountered an error");
            return 1;
        }
        snprintf(request, sizeof(request), "GET %s HTTP/1.1\nHost: %s\nUser-Agent: Firefox\n\n", path, host);
        if((sent = send(sock, request, sizeof(request), 0)) <= 0) {
            perror("send() encountered an error");
            return 1;
        }
        if((readBytes = recv(sock, readBuffer, 1024, 0)) < 0) {
            perror("recv() encountered an error");
            return 1;
        }
        printf("Main Path 2: %s\n", path);
        readBuffer[readBytes] = '\0';
        if((p = strstr(readBuffer, "\r\n\r\n")) == NULL) {
            fprintf(stderr, "Error: Couldn't find substring!\n");
            return 1;
        }
        headerEnd = p - readBuffer;
        if((p = strstr(readBuffer, "Content-Length:")) == NULL) {
            fprintf(stderr, "Error: Couldn't find substring!\n");
            return 1;
        }
        cLength = p - readBuffer;
        printf("Header End: %d\n", headerEnd);
        printf("Content Length End: %d\n", contentLengthEnd);
        for(i = 0; (c = readBuffer[contentLengthEnd + i]) != '\n'; i++)
            length[i] = c;
        length[i] = '\0';
        cLength = atoi(length);
        printf("Content Length: %d\n", cLength);
        free(host);
        free(path);
        freeaddrinfo(res);
        close(sock);
        return 0;
    }
    int
    get_host(char *host, char *url)
    {
        int c, i, j = 0;
        for(i = 7; (c = url[i]) != '/'; i++)
            host[j++] = c;
        host[j] = '\0';
        return i;
    }
    int
    get_path(char *path, char *url, int pathPosition)
    {
        int c, i, j = 0;
        for(i = pathPosition; (c = url[i]) != '\0'; i++)
            path[j++] = c;
        printf("Function Path 1: %s\n", path);
        path[j] = '\0';
        printf("Function Path 2: %s\n", path);
        return 0;
    }
    Sorry it takes me so long to reply. This is my first time writing this kind of code, and it's all a bit strange and confusing when random letters start popping up.

  10. #10
    Registered User
    Join Date
    May 2012
    Location
    Manitoba, Canada
    Posts
    12
    So yeah, I found out I can do what I want with libcurl; apparently it was made specifically for this purpose... That's a little embarrassing. On one hand, I spent like three days trying to learn sockets and didn't really get anywhere, but on the other hand, now I don't have to worry about the disaster I was trying to make. Oh well, it was a learning experience. Thanks for helping, you two. I just wanted to post this to say my "problem" has been solved.
