Domain name from URL


  1. #1
    Registered User
    Join Date
    Nov 2007
    Posts
    2

    Question Domain name from URL

    Hello,

    I'm fairly new to C programming and I am looking for a method to extract the domain name from a web page URL.

    Example: the user provides http://www.google.com/index.html

    I want to extract google.com from this string.

    I know I can do this using regular expressions, but C doesn't seem to support regex. Could someone give me an idea of how to handle this?

    Thanks,
    Kyle

  2. #2
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Get a regex library or parse it yourself.

    One possible solution:

    Code:
    char domain[64];
    
    sscanf("http://google.com/index.html", "http://%[^/]", domain);
    Or something.
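
    For the "regex library" route: on POSIX systems, <regex.h> provides regcomp() and regexec(). The following is only a sketch under that assumption. The helper name extract_host_regex and the pattern are made up for illustration and are not from this thread.

    ```c
    #include <regex.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: copies the host part of url into out (outsz bytes).
     * Returns 0 on success, -1 if the URL does not match or out is too small.
     * Assumes a POSIX system that provides <regex.h>. */
    int extract_host_regex(const char *url, char *out, size_t outsz)
    {
        regex_t re;
        regmatch_t m[2];   /* m[0] = whole match, m[1] = first capture group */
        int rc = -1;

        /* A scheme, then "://", then capture everything up to '/' or ':' */
        if (regcomp(&re, "^[A-Za-z][A-Za-z0-9+.-]*://([^/:]+)", REG_EXTENDED) != 0)
            return -1;

        if (regexec(&re, url, 2, m, 0) == 0) {
            size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
            if (len < outsz) {
                memcpy(out, url + m[1].rm_so, len);
                out[len] = '\0';
                rc = 0;
            }
        }
        regfree(&re);
        return rc;
    }
    ```

    For http://www.google.com/index.html this stores www.google.com. Dropping a leading "www." to get google.com would be a separate step.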

  3. #3
    Registered User
    Join Date
    Nov 2007
    Posts
    2
    Thanks a ton, this will work perfectly. Could you explain the format string "http://%[^/]" for me? I looked at the man pages for sscanf, but I think your format is a bit more advanced. This is a very powerful tool and I would love to be able to use it.

    Thanks again,
    Kyle

  4. #4
    Registered User ssharish2005's Avatar
    Join Date
    Sep 2005
    Location
    Cambridge, UK
    Posts
    1,682
    What that specifies is this:

    Code:
    http://google.com/index.html
    http:// ==> Match these literal characters against the input string, but don't store them in the domain string
    %[^/] ==> Read every character that is not '/' and store what was read in the domain string
    So sscanf reads "http://", matches it, and discards it. It then keeps reading until it hits a '/'. At that point the scanset no longer matches and the conversion stops.
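
    The same idea as a small function. This is a sketch, not the code posted above: the helper name host_from_url is made up, a field width (%63) is added so sscanf cannot write past a 64-byte buffer, and the return value is checked, since sscanf returns the number of conversions it stored.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: host must point to at least 64 bytes.
     * Returns 1 if a host was stored, 0 otherwise. */
    int host_from_url(const char *url, char *host)
    {
        /* "http://"  literal text that must match; nothing is stored
         * %63[^/]    up to 63 characters that are not '/', stored in host
         * sscanf returns the number of conversions stored (1 on success). */
        return sscanf(url, "http://%63[^/]", host) == 1;
    }
    ```

    The scanset also stops at the end of the string, so a URL without a trailing '/' works as well. Anything between "http://" and the first '/' is kept, including a port such as :3128.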


    ssharish

  5. #5
    tpe
    Registered User
    Join Date
    Nov 2010
    Posts
    21
    OK, I know that this is a very late "answer", but in my case it is not working.
    I have the following data:
    Code:
    char *srv;
    char *prxHostname;
    
    srv=getenv("http_proxy");
    sscanf(strcat(srv, "/"), "http://%[^/]", prxHostname);
    printf("Hostname: %s\n", prxHostname);
    The http_proxy variable is: http://192.168.0.10:3128
    Now, whatever I do, the prxHostname is always NULL!
    Any ideas why?

  6. #6
    Registered User ssharish2005's Avatar
    Join Date
    Sep 2005
    Location
    Cambridge, UK
    Posts
    1,682
    There are a few issues with your code. When you concatenate '/' onto the string, have you made sure that srv has enough space for it? And you're trying to store the fetched value in prxHostname. Have you allocated memory for it before doing that? It's just a pointer, not a buffer that can hold the fetched value, is it?
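
    A sketch of what addressing both points could look like. The helper name proxy_host and the 256-byte buffer size are illustrative assumptions. The string is taken as const because the result of getenv should not be modified (so no strcat on it), and the caller supplies a real buffer. The scanset [^/:] is a variation on the original format that also stops at the port separator.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: srv is the proxy URL (may be NULL, never modified),
     * out must point to at least 256 bytes.
     * Returns 1 if a host was stored, 0 otherwise. */
    int proxy_host(const char *srv, char *out)
    {
        if (srv == NULL)   /* getenv returns NULL when the variable is unset */
            return 0;

        /* No strcat needed: %[ also stops at the end of the string.
         * [^/:] stops at '/' or ':', so the port is left out. */
        return sscanf(srv, "http://%255[^/:]", out) == 1;
    }
    ```

    With char prxHostname[256]; declared as an array and <stdlib.h> included for getenv, it could be called as proxy_host(getenv("http_proxy"), prxHostname).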

    ssharish

    EDIT: You need to open a new thread for this kind of issue, instead of reopening a thread which is 3 years old!!
    Last edited by ssharish2005; 11-17-2010 at 07:57 AM.
    Life is like riding a bicycle. To keep your balance you must keep moving - Einstein

  7. #7
    tpe
    Registered User
    Join Date
    Nov 2010
    Posts
    21
    OK, I will open a new thread then
