Hostname to IP Address for Fx bookmarks

This is a discussion on Hostname to IP Address for Fx bookmarks within the C Programming forums, part of the General Programming Boards category; Here is a program to convert a Firefox bookmarks file from using hostnames, requiring DNS lookups, to IP addresses retrieved ...

  1. #1
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9

    Hostname to IP Address for Fx bookmarks

    Here is a program to convert a Firefox bookmarks file from using hostnames, requiring DNS lookups, to IP addresses retrieved by the use of ping. This may be helpful in the case of a web site's being pulled from the DNS servers, at least, as I understand DNS.
    I'm rather new to all of this, as you'll see if you glance at the source code (around 200 well styled lines). I'm hoping a few people will find this useful and, too, that some experienced programmers can point out problems and make suggestions.

    Web Site
    Main Download Page
    Source

  2. #2
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9
    Fooey. Would a moderator kindly move this thread from C++ to C?

  3. #3
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,344
    A couple of problems on first reading:

    1. You call malloc a lot, but never call free() to release that memory when you're done.

    2. You never call pclose() on all the handles you create with popen(). This is more serious, as file handles tend to be in short supply, and you will soon run out of them.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
    I support http://www.ukip.org/ as the first necessary step to a free Europe.

  4. #4
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9
    Thanks. I'm still a little fuzzy on those, so I'll work on them. It's hard to know sometimes when it's safe to use free, but I think I can figure it out. I believe I can call pclose(fp) right after I get the string from it like this
    Code:
    fgets(t, SSIZE + 21, fp);
    pclose(fp);

  5. #5
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,326
    Not a bad little exercise. Program seems to work fine (only did a few tests), but according to your README, some URLs get garbled. Can you give some examples? As far as my advice, it's mostly suggestions and style stuff:

    Code:
    typedef char *String;
    My personal preference is to not mask pointerness with a typedef, especially for something trivial like this. Just use char *. Any halfway decent programmer will understand you mean it as a string. Besides, you don't always want to allocate memory via malloc for a string (see below).

    Code:
    s = (String)malloc(SSIZE * sizeof(char));
    Don't cast malloc.

    Code:
      String ping_call(String);/* converts hostname to IP address with ping */
      String dig_call(String);/* converts hostname to IP address with dig +short */
      String lookformore(String);/* after HREF, look for ICON_URI and SHORTCUTURL */
    Don't place protocols inside functions. It's bad style and makes code hard to read. Put them at the top of the file if they're only used in that .c file (in which case they should be static functions), or put them in a header file if they are to be used outside that .c file.

    Code:
    /* hopefully this switch statement will save useless overhead in calling good_URI_scheme */
    This is probably pre-optimization, which is the root of all evil. You're trading a function call for some comparisons and jumps, the difference in number of CPU cycles is negligible. It probably doesn't speed anything up by a measurable amount and it makes your code harder to read and maintain. This certainly isn't your bottleneck either (which is where you should focus your optimization efforts if you need it). Your bottleneck would most likely be the popen/ping/dig calls. Everything else is plenty fast.

    Code:
    int good_URI_scheme(String s)
    {
        if ( strlen(s) < 6 )
            return(0);
        if ( strncmp(s, "http://", 7) == 0 )
            return(1);
        else if ( strncmp(s, "https://", 8) == 0 )
            return(2);
        else if ( strncmp(s, "ftp://", 6) == 0 )
            return(3);
        else
            return(0);/* unlisted URI scheme */
    }
    The strlen only saves you time in the off chance somebody has a bookmark entry with less than 6 chars. For every other case, it costs you extra cycles (if you're that concerned with speed). Let the strncmp calls handle it. Also, if there is (or will be) significance to the return value of good_URI_scheme, you should avoid returning magic numbers like 1, 2 and 3. Instead, return a symbolic constant, like URI_SCHEME_HTTP (you can declare a const int or #define it).

    Code:
    SSIZE + 21
    More magic juju. Avoid cryptic statements like this and replace these with appropriate constants like PING_RESULT_LEN and DIG_RESULT_LEN.

    Code:
    String t = (String)malloc((SSIZE + 21) * sizeof(char));
    If you're going to have fixed-length strings, just use a char t[PING_RESULT_LEN+1]; so you don't have to worry about malloc failing or remembering to free the memory when you're done.

  6. #6
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Actually, there are many sites out there you cannot ping.

    If you look into the sockets api you will find gethostbyname() which does an actual DNS lookup and also returns secondary names and IPs for you.

  7. #7
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9
    Thanks for your help, Salem, anduril462, and CommonTater. I've learned a lot already.
    1. You call malloc a lot, but never call free() to release that memory when you're done.
    I see now I was over liberal with the malloc() calls and reduced them to 2, but I couldn't find but one place for free().
    2. You never call pclose() on all the handles you create with popen(). This is more serious, as file handles tend to be in short supply, and you will soon run out of them.
    Fixed.
    according to your README, some URLs get garbled. Can you give some examples?
    The problem seems to have disappeared. Some question marks backed by diamonds were finding themselves in odd places. I don't have the file anymore to show, though.
    My personal preference is to not mask pointerness with a typedef, especially for something trivial like this. Just use char *. Any halfway decent programmer will understand you mean it as a string.
    Fixed. Would you mask it for multiple indirection?
    Besides, you don't always want to allocate memory via malloc for a string (see below).
    The only other option I can think of is to make it static, but with malloc only the calling function and the called function are messing with the strings created.
    Don't cast malloc.
    Fixed. I've heard that before, but now I think I understand it.
    Don't place protocols inside functions. It's bad style and makes code hard to read. Put them at the top of the file if they're only used in that .c file (in which case they should be static functions), or put them in a header file if they are to be used outside that .c file.
    Fixed. Good idea.
    I took care of the optimization stuff, too. Is there anything to do about the ping calls? It's actually the network that's slow, right?
    Avoid cryptic statements like this and replace these with appropriate constants like PING_RESULT_LEN and DIG_RESULT_LEN.
    Fixed. Excellent improvement. It's much easier to read, now.

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,326
    Quote Originally Posted by htoip View Post
    Fixed. Would you mask it for multiple indirection?
    By pointerness, I meant indirection. This is mostly just my personal preference and a matter of readability/comprehension for me, but I prefer not to use typedefs that cover any indirection or array dimensions. I use it largely for structures and enums.
    The only other option I can think of is to make it static, but with malloc only the calling function and the called function are messing with the strings created.
    Static variables have their own set of problems, which will definitely surface given your recursive approach. Looking at convert, ping_call and dig_call more closely, you malloc memory, never use what you malloc'ed then lose the pointer to it so you can't free it. You're never actually using the memory malloc'ed for u in any of those functions. Instead, I would do the following:
    Code:
    char *convert(char *s)
    {
        char u[SSIZE+1];
        ...
        ping_call(s, u, sizeof(u));  // pass in the string to be filled with the IP, and it's size
    }
    char *ping_call(char *s, char *u, int u_len)
    {
        char *p;
        char t[PING_RESULT_LEN+1];
        ...
        fgets(t, sizeof(t) - 1, fp);
        p = strtok(t, "()");
        p = strtok(NULL, "()");
        ...
        strncpy(u, p, u_len);
        fix_ip(u);
    
        return u;
    }
    Fixed. Good idea.
    I took care of the optimization stuff, too. Is there anything to do about the ping calls? It's actually the network that's slow, right?
    Yes, that will be the single slowest point. The part of the bottleneck you can control is the use of popen/pclose. Those functions take a considerable amount of time compared to the rest of your program which is just string manipulation. The best option is the gethostbyname function that Tater suggested, which may even be what ping is doing under the hood anyways.

  9. #9
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9
    I forgot about return parameters. I implemented your suggestion and I plan to upload my current build soon.
    I spent a few hours reading about and testing gethostbyname(), but I don't understand how to get an IP address from it.

  10. #10
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by htoip View Post
    I took care of the optimization stuff, too. Is there anything to do about the ping calls? It's actually the network that's slow, right?
    No... it's Ping that's slow.

    As I mentioned before using Ping to discover IP addresses is one step removed from the actual task you are trying to perform. When you call Ping (apparently as a shell command) the Ping program itself has to do a "gethostbyname" DNS lookup internally before it can send the ICMP Ping packet to the host. Since Ping is not really about returning an IP --it's a connectivity test-- you end up waiting for retries upon retries as it times the connection or waits eternally for it to fail.

    A simple, and direct call to DNS would be over in a couple of milliseconds. A bookmark list of several thousand entries should process through in just a few seconds. Doing it with Ping would take an hour or more...

    gethostbyname Function (Windows)

  11. #11
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by htoip View Post
    I forgot about return parameters. I implemented your suggestion and I plan to upload my current build soon.
    I spent a few hours reading about and testing gethostbyname(), but I don't understand how to get an IP address from it.
    Here's a simple proggy to demonstrate...

    Code:
    #include <winsock2.h>
    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #pragma comment(lib, "ws2_32.lib")  // for linking
    
    
    
    
    
    int main(void)
      { char      url[100];   // user entered address
        HOSTENT*  hostent;    // host entry data
        WSADATA   wsadata;   // WSA init data
        IN_ADDR   addr;          //  IP address
        // intialize WSA Sockets
        if (WSAStartup(MAKEWORD(2,0),&wsadata))
          { puts("WSA Startup Error\n");
            exit(1); }
     
        // get URL from user
        puts("Enter the URL : "); 
        scanf("%s", url);
        puts("\n");
    
        // get HOSTENT data
        hostent = gethostbyname(url);
        if (hostent == NULL)
          { puts("Unresolvable host name\n");
            exit (1); }
    
        // extract main IP address     
        addr.S_un.S_addr = *(ULONG*)hostent->h_addr_list[0]; 
        printf("IP Address is : %s", inet_ntoa(addr));
        puts("\n\n");    
    
    
        // release WSA 
        WSACleanup();
    
        return 0; }
    Last edited by CommonTater; 12-23-2010 at 04:25 PM.

  12. #12
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9
    Quote Originally Posted by CommonTater View Post
    Here's a simple proggy to demonstrate...

    Code:
    #include <winsock2.h>
    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #pragma comment(lib, "ws2_32.lib")  // for linking
    
    
    
    
    
    int main(void)
      { char      url[100];   // user entered address
        HOSTENT*  hostent;    // host entry data
        WSADATA   wsadata;   // WSA init data
        IN_ADDR   addr;          //  IP address
        // intialize WSA Sockets
        if (WSAStartup(MAKEWORD(2,0),&wsadata))
          { puts("WSA Startup Error\n");
            exit(1); }
     
        // get URL from user
        puts("Enter the URL : "); 
        scanf("%s", url);
        puts("\n");
    
        // get HOSTENT data
        hostent = gethostbyname(url);
        if (hostent == NULL)
          { puts("Unresolvable host name\n");
            exit (1); }
    
        // extract main IP address     
        addr.S_un.S_addr = *(ULONG*)hostent->h_addr_list[0]; 
        printf("IP Address is : %s", inet_ntoa(addr));
        puts("\n\n");    
    
    
        // release WSA 
        WSACleanup();
    
        return 0; }
    Excellent, Tater. I got my program working using gethostbyname() and it's much faster. It seems to be Windows only, so my GNU/Linux build is still slow.
    Windows build
    Windows source

  13. #13
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,326
    Quote Originally Posted by htoip View Post
    It seems to be Windows only, so my GNU/Linux build is still slow
    No, it exists in Linux as well, but it just requires a different set of headers. Do "man gethostbyname" on your linux box, it will show you what headers you need. You'll probably need some #ifdef WIN32 or #ifdef LINUX statements around the different headers and any other code that is Windows/Linux specific.

  14. #14
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by htoip View Post
    Excellent, Tater. I got my program working using gethostbyname() and it's much faster. It seems to be Windows only, so my GNU/Linux build is still slow.
    Windows build
    Windows source
    Ok... couple of things..

    I see you are calling WSAStartup() and WSACleanup() every time you go through your function... This is incorrect usage. You would be a lot further ahead to call WSAStartup once at the very beginning of your program and then call WSACleanup() once right before you exit. All these functions do is load and unload the winsock library for you.

    Second, I see your response to WSAStartup errors is that you simply return the host name... The error at that point is actually that you can't load the DLL, which you want to know about long before you start looping through URLs. You would be more clever to return the existing url as the result of a failure in gethostbyname().

    Third you may want to add a temporary printf statement to show you the IP that comes back in your u variable... I think you'll find that inet_ntoa() formats it correctly for you so your fix_ip call is unnecessary.
    Last edited by CommonTater; 12-30-2010 at 03:03 PM.

  15. #15
    Registered User htoip's Avatar
    Join Date
    Dec 2010
    Posts
    9
    I fixed those problems, CommonTater. You were absolutely right about the speed problem. It feels blazing fast, now.
    anduril462, gethostbyname is deprecated, according to its man page, so I used getaddrinfo() and getnameinfo(). Maybe I don't need both, but that's the only way I could figure to get the IP address into a string.

Page 1 of 2 12 LastLast
Popular pages Recent additions subscribe to a feed

Similar Threads

  1. strstr problem
    By Kempelen in forum C Programming
    Replies: 2
    Last Post: 09-08-2009, 03:46 AM
  2. how to get hostname and IP address of a machine
    By Chandana in forum Networking/Device Communication
    Replies: 4
    Last Post: 08-17-2009, 11:56 AM
  3. DX - CreateDevice - D3DERR_INVALIDCALL
    By Tonto in forum Game Programming
    Replies: 3
    Last Post: 12-01-2006, 06:17 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21