Thread: List of URLS

  1. #1
    Registered User
    Join Date
    Mar 2007
    Posts
    109

    List of URLS

    Does anyone have a list of about 2000 URLs? I need them for a project, but searching Google and copying and pasting each URL is taking forever. Any help would be greatly appreciated.

  2. #2
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    So write a simple program to programmatically extract URLs from an HTML page.

    Or maybe do the same from your browser cache.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Math wizard
    Join Date
    Dec 2006
    Location
    USA
    Posts
    582
    In HTML, links to other web pages look like this:

    href="The URL"

    For images and other items, it's

    src="The URL"

    Sometimes the href or src value isn't quoted, and other times the URL is relative, so it lacks the leading "http://www.examplewebsite.com/" (where "examplewebsite" stands for the site's domain). Mailto links should be ignored; those are e-mail addresses. This should help you get started. With copy/paste at 2 URLs per minute, 2000 of them would take 1000 minutes, which is nearly 17 hours.
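    As a rough illustration of the rules above, here is a minimal C++ sketch that scans an HTML string for href= and src= values, whether quoted or not, and skips mailto links. The names extract_urls, matches_at and is_space are made up for this example, and it is a plain text scan rather than a real HTML parser, so treat it as a starting point only.

```cpp
#include <cctype>
#include <string>
#include <vector>

// True if c is whitespace (cast avoids undefined behaviour on negative chars).
static bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Case-insensitive test that `text` contains lowercase `word` at `pos`.
static bool matches_at(const std::string& text, std::size_t pos, const std::string& word) {
    if (pos + word.size() > text.size()) return false;
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(text[pos + i])) != word[i]) return false;
    }
    return true;
}

// Hypothetical helper: collect href= and src= values from an HTML string.
// Relative URLs are returned exactly as written; mailto: links are skipped.
std::vector<std::string> extract_urls(const std::string& html) {
    std::vector<std::string> urls;
    std::size_t i = 1;  // an attribute name is never the first character
    while (i < html.size()) {
        std::size_t name_len = 0;
        if (is_space(html[i - 1])) {
            if (matches_at(html, i, "href")) name_len = 4;
            else if (matches_at(html, i, "src")) name_len = 3;
        }
        if (name_len == 0) { ++i; continue; }

        // Expect optional spaces, '=', optional spaces, then the value.
        std::size_t p = i + name_len;
        while (p < html.size() && is_space(html[p])) ++p;
        if (p >= html.size() || html[p] != '=') { ++i; continue; }
        ++p;
        while (p < html.size() && is_space(html[p])) ++p;
        if (p >= html.size()) break;

        std::size_t end;
        if (html[p] == '"' || html[p] == '\'') {
            // Quoted value: read up to the matching quote.
            const char quote = html[p++];
            end = html.find(quote, p);
            if (end == std::string::npos) break;  // unterminated quote
        } else {
            // Unquoted value: read up to whitespace or the end of the tag.
            end = p;
            while (end < html.size() && !is_space(html[end]) && html[end] != '>') ++end;
        }
        const std::string url = html.substr(p, end - p);
        if (!url.empty() && !matches_at(url, 0, "mailto:")) urls.push_back(url);
        i = end + 1;
    }
    return urls;
}
```

    Calling extract_urls on the text of a saved page gives you a vector you can print one URL per line. Relative URLs would still need the site's address put in front of them.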

  4. #4
    Registered User
    Join Date
    Mar 2007
    Posts
    109
    I'd like to write a program to do it, but I have no idea what I'm doing. Could someone help me get started? It would be appreciated beyond belief.

  5. #5
    Mayor of Awesometown Govtcheez
    Join Date
    Aug 2001
    Location
    MI
    Posts
    8,823
    I bet you could write 2000 random words surrounded by www. and .com and get damn near 2000 links.
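    For what it's worth, that idea takes only a few lines of C++. This is just a sketch: words_to_urls is a made-up name, the word list is assumed to come from wherever you like (for example a dictionary file such as /usr/share/dict/words on many Unix systems), and nothing here checks that a generated address actually exists.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: wrap each word in "www." and ".com".
// Empty words are skipped; no check is made that the site exists.
std::vector<std::string> words_to_urls(const std::vector<std::string>& words) {
    std::vector<std::string> urls;
    urls.reserve(words.size());
    for (const std::string& word : words) {
        if (word.empty()) continue;
        urls.push_back("www." + word + ".com");
    }
    return urls;
}
```

    Feed it 2000 words and you get 2000 candidate addresses, though some of them will not resolve to a real site.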

  6. #6
    Registered User
    Join Date
    Mar 2007
    Posts
    109
    Yeah, the point is I don't want to type all of that.

  7. #7
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    > does anyone have a list of 2000 urls?
    Um, just look in a Web directory.
    For instance, http://dir.yahoo.com/
    You should be able to find all sorts of sites there.

  8. #8
    Registered User
    Join Date
    Mar 2007
    Posts
    109
    Yeah, but I'd still have to copy and paste. It would be much better if I could just have a script or a list and not have to copy and paste anything.

  9. #9
    Mayor of Awesometown Govtcheez
    Join Date
    Aug 2001
    Location
    MI
    Posts
    8,823
    Jesus Christ, in the time since your lazy ass posted this thread, you could have done this 3 times over.

  10. #10
    l'Anziano DavidP
    Join Date
    Aug 2001
    Location
    Plano, Texas, United States
    Posts
    2,743
    Here. Some people might get mad at me for basically giving you the apple here, but anyway: for my Internet Programming course this past semester, as the first half of our first project, we had to write a web client that would accept a web address, download that page, and then download all pages linked from that page.

    The code is fairly well documented, so if you can read code and my documentation, you should be able to understand what is going on. It does not do exactly what you want, but you can get there by seriously changing only about 5 lines of code. It should compile fine with g++. If you can't understand the code, you need to study socket programming a bit and then come back, but honestly you should only need to change a few lines in the main() function.

    Oh, the main() function is in client.cpp.

    It will not run on Windows, only on Linux (and possibly Mac), because I use the Unix socket API.

    This should give you a start. Now go for it.

    Oh, and change the extension from .txt to .zip. It is a compressed zip folder with my code.

    Like I said, my code doesn't do what you want, but it does something similar, and if you can understand code, you should be able to make it do what you want by changing fewer than 5 lines.
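    For anyone reading without the attachment, the page download step of such a client might look roughly like the sketch below. To be clear, this is not the attached code: build_request and fetch_page are names invented for this example, and it assumes plain HTTP on port 80 (no HTTPS, no redirects) with a POSIX socket API, so like the attachment it targets Linux rather than Windows.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <string>

// Hypothetical helper: build a minimal HTTP/1.0 GET request.
std::string build_request(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.0\r\nHost: " + host +
           "\r\nConnection: close\r\n\r\n";
}

// Hypothetical helper: fetch a page over plain HTTP (port 80).
// Returns the raw response (headers plus body), or "" on failure.
std::string fetch_page(const std::string& host, const std::string& path) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP
    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), "80", &hints, &results) != 0) return "";

    // Try each resolved address until one connects.
    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd < 0) return "";

    // Send the whole request; send() may accept only part of it per call.
    const std::string request = build_request(host, path);
    std::size_t sent = 0;
    while (sent < request.size()) {
        const ssize_t n = send(fd, request.data() + sent, request.size() - sent, 0);
        if (n <= 0) { close(fd); return ""; }
        sent += static_cast<std::size_t>(n);
    }

    // Read until the server closes the connection.
    std::string response;
    char buffer[4096];
    ssize_t n;
    while ((n = recv(fd, buffer, sizeof buffer, 0)) > 0) {
        response.append(buffer, static_cast<std::size_t>(n));
    }
    close(fd);
    return response;
}
```

    The returned string still contains the HTTP headers, so the HTML starts after the first blank line ("\r\n\r\n"). From there, pulling out the links is a matter of scanning for href and src values.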

    "Circular logic is good because it is."

  11. #11
    Mayor of Awesometown Govtcheez
    Join Date
    Aug 2001
    Location
    MI
    Posts
    8,823
    He's probably going to ask you to write, compile, and release a Windows-compatible version.

    Better start a SourceForge group to help Captain Dumbass finish his homework.

  12. #12
    Registered Abuser
    Join Date
    Jun 2006
    Location
    Toronto
    Posts
    591
    I lol'd.

  13. #13
    Registered User divineleft
    Join Date
    Jul 2006
    Posts
    158
    Quote Originally Posted by @nthony
    I lol'd.

    seriously LOL
    Gentoo Linux - 2.6.22.1
    GCC version 4.2.0
