Thread: Reading HTML files and Images from Website

  1. #1
    Registered User
    Join Date
    Jul 2005
    Posts
    2

    Reading HTML files and Images from Website

    Maybe I am looking in the wrong place, but I was having trouble googling this information.

    Basically, what I want to do is connect to a web server, get a page's HTML, and save it locally (I also want to do this with the images on the given page). Can anyone point me in the right direction?

    Thanks.

  2. #2
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Try google or yahoo or another such search engine.

    [edit]
    Or post this question in the Networking board.
    [/edit]
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  3. #3
    Just kidding.... fnoyan's Avatar
    Join Date
    Jun 2003
    Location
    Still in the egg
    Posts
    275
    Hi
    I am currently working on an application that retrieves HTML pages from a web server, so this is your lucky day. For images, though, I have no idea!
    Just use the ordinary socket functions, and don't forget to set the server's port to 80. Your sockaddr_in structure will look like this:

    const char *msg="GET / HTTP/1.0\r\n\r\n"; /* request line, then the blank line that ends the request */
    struct sockaddr_in server;
    ....
    ....
    server.sin_family=AF_INET;
    server.sin_port=htons(80); /* for HTTP */
    server.sin_addr.s_addr=inet_addr("server_IP_address");
    memset(&(server.sin_zero),'\0',8);

    connect(....);

    send(sockfd,msg,strlen(msg),0);

    then read the reply with recv() until it returns 0, and save the received data to a file with an "html" extension. This is the way I do it.
    The "GET /" specifies the page that you want to get. For example, "GET /foo/tux.html" will get http://www.your_server.com/foo/tux.html. You can find more about the HTTP specification through Google.
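
    A rough sketch of the two pieces that tend to trip people up: building the request for a given page, and separating the reply's headers from the body before saving it. The function names (build_get_request, find_body_offset, save_body) are made up for illustration, and the code assumes the whole reply has already been collected into one buffer with recv(). Opening the file in "wb" mode should make the same approach work for images as well, since the body is written byte for byte.

```c
#include <stdio.h>
#include <string.h>

/* Build an HTTP/1.0 GET request for `path` on `host` into `buf`.
   Returns the request length, or -1 if `buf` is too small.
   (Hypothetical helper, not part of any library.) */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}

/* Return the offset of the first body byte, i.e. just past the blank
   line ("\r\n\r\n") that ends the headers, or -1 if it is not there. */
long find_body_offset(const char *data, size_t len)
{
    size_t i;
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(data + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}

/* Write only the body of a complete reply to `filename`.
   Binary mode ("wb") keeps image data intact. Returns 0 on success. */
int save_body(const char *filename, const char *data, size_t len)
{
    long off = find_body_offset(data, len);
    size_t body_len, written;
    FILE *fp;

    if (off < 0)
        return -1;
    fp = fopen(filename, "wb");
    if (fp == NULL)
        return -1;
    body_len = len - (size_t)off;
    written = fwrite(data + off, 1, body_len, fp);
    if (fclose(fp) != 0)
        return -1;
    return written == body_len ? 0 : -1;
}
```

    For the example above, build_get_request(buf, sizeof buf, "www.your_server.com", "/foo/tux.html") gives the string to pass to send(), and save_body("tux.html", reply, reply_len) writes everything after the headers. The sketch does not check the status line, so an error page would be saved just like a real one.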

  4. #4
    Skunkmeister Stoned_Coder's Avatar
    Join Date
    Aug 2001
    Posts
    2,572
    Free the weed!! Class B to class C is not good enough!!
    And the FAQ is here :- http://faq.cprogramming.com/cgi-bin/smartfaq.cgi

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > Anyone point me in the right direction?
    http://www.gnu.org/software/wget/wget.html

    Or if you want to write some code yourself
    http://curl.haxx.se/

    Just some extra points for not mentioning your OS and Compiler in your vacuous request.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Registered User
    Join Date
    Jul 2005
    Posts
    2
    Thanks for the help, everyone. I would prefer not to use curl or wget, as I'm trying to make something that's not completely operating-system specific.

    I'll give fnoyan's socket example a try. Thanks again.
