does anyone have a list of about 2000 urls? i need them for a project, but searching google and copying and pasting each url is taking forever. Any help would be appreciated.
So write a simple program to programmatically extract URLs from an HTML page.
Or maybe do the same from your browser cache.
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
If at first you don't succeed, try writing your phone number on the exam paper.
In HTML, URLs for different webpages have
href="The URL"
For images and other items, it's
src="The URL"
Sometimes the href or src part doesn't have quotes, and other times the "http://www.examplewebsite.com/" beginning is missing (where "examplewebsite" is the site's domain) because the link is relative. Mailto links should be ignored; those are for e-mail. This should help you get started. By hand, at 2 URLs a minute, 2000 of them is well over 16 hours of copy/paste.
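To make that concrete, here is a minimal sketch of the href/src scan described above. This isn't code anyone in the thread posted: it assumes the page has already been saved to a local file, "page.html" is just an example filename, and std::regex needs a far more modern g++ than this thread's era (the same scan could be done with plain string searching).

// url_extract.cpp -- a minimal sketch of the href/src scan described
// above. Assumes the page is already saved locally (e.g. with wget);
// "page.html" is only an example filename.
#include <fstream>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>

int main(int argc, char* argv[])
{
    const char* file = (argc > 1) ? argv[1] : "page.html";
    std::ifstream in(file);
    if (!in) {
        std::cerr << "cannot open " << file << '\n';
        return 1;
    }
    std::stringstream ss;
    ss << in.rdbuf();                       // slurp the whole page
    std::string html = ss.str();

    // Match href=... or src=..., quoted or unquoted, as noted above.
    std::regex attr(R"rx((?:href|src)\s*=\s*("([^"]*)"|'([^']*)'|([^\s>]+)))rx",
                    std::regex::icase);

    for (std::sregex_iterator it(html.begin(), html.end(), attr), end;
         it != end; ++it) {
        // Pick whichever alternative actually matched.
        std::string url = (*it)[2].matched ? (*it)[2].str()
                        : (*it)[3].matched ? (*it)[3].str()
                        : (*it)[4].str();
        if (url.compare(0, 7, "mailto:") == 0)   // ignore mailto links
            continue;
        std::cout << url << '\n';
    }
    return 0;
}

Compile with g++ -std=c++11 url_extract.cpp, run it on each saved page, and append the output to one list.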
i'd like to create a program to do it, i just have no idea what i'm doing. maybe someone can hook me up or something? It would be appreciated beyond belief.
I bet you could write 2000 random words surrounded by www. and .com and get damn near 2000 links.
-Govtcheez
[email protected]
yea the point is i don't want to type all of that
> does anyone have a list of 2000 urls?
Um, just look in a Web directory.
for instance, http://dir.yahoo.com/
You should be able to find all sorts of sites there.
yea, but i'd still have to copy and paste. it would be much better if i could just have a script or a list and not have to do it all by hand
Jesus Christ, since your lazy ass posted this thread, you could have done this 3 times.
-Govtcheez
[email protected]
Here... some people might get mad at me for basically handing you the apple here, but anyway: for my Internet Programming course this past semester, as the first half of our first project, we had to write a web client that would accept a web address, download that page, and then download all the pages linked from it.
The code is fairly well documented, so if you can read code and my documentation, you should be able to follow what is going on. It does not do exactly what you want, but seriously, you can make it do what you want by changing only about 5 lines of code. It should compile fine using g++. If you can't understand the code, then you need to study socket programming a bit and come back... but honestly, you should only need to change a few lines in the main() function.
Oh... the main() function is in client.cpp.
It will not run on Windows, only on Linux (and possibly Mac); I use the Unix Socket API.
This should give you a start...now go for it.
Oh...change the extension from .txt to .zip...it is a compressed zip folder with my code.
Like I said, my code doesn't do what you want to do, but it does something similar, and if you can understand code, you should be able to make it do what you want it to do by changing less than 5 lines.
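The attachment itself isn't reproduced in the thread, but a minimal sketch of the kind of Unix-socket fetch client.cpp apparently performs might look like this. This is not the attached code, and "example.com" is only a placeholder host:

// fetch.cpp -- a minimal sketch of an HTTP GET over the Unix Socket API,
// the same general technique the attached client.cpp uses. NOT the
// attached code; "example.com" is just a placeholder.
#include <cstdio>
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    const char* host = "example.com";       // placeholder host

    // Resolve the host name to an address (service "80" = HTTP).
    struct addrinfo hints, *res;
    std::memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    int rc = getaddrinfo(host, "80", &hints, &res);
    if (rc != 0) {
        std::fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }

    // Open a TCP connection to the server.
    int sock = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (sock < 0 || connect(sock, res->ai_addr, res->ai_addrlen) < 0) {
        std::perror("connect");
        return 1;
    }
    freeaddrinfo(res);

    // A bare-bones HTTP/1.0 request: the server closes the connection
    // when the page is done, so we can just read until EOF.
    char request[256];
    std::snprintf(request, sizeof request,
                  "GET / HTTP/1.0\r\nHost: %s\r\n\r\n", host);
    send(sock, request, std::strlen(request), 0);

    // Dump the response (headers + HTML) to stdout.
    char buf[4096];
    ssize_t n;
    while ((n = recv(sock, buf, sizeof buf, 0)) > 0)
        std::fwrite(buf, 1, (size_t)n, stdout);

    close(sock);
    return 0;
}

Like the attachment, this is Linux-only (Unix Socket API). Pipe its output into the URL scanner sketched earlier and you have roughly the pipeline the original poster wants.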
He's probably going to ask you to write, compile, and release a Windows-compatible version.
Better start a sourceforge group to help Captain Dumbass finish his homework.
-Govtcheez
[email protected]