I am looking to extract some data from a few websites for some statistics work. I know I could do that by looking at the websites in question and copy-pasting the information I want into a file, but that's not the most interesting thing in the world to do, so I thought this would be a great chance to bring out my programming skills.
So I was wondering if there is any way I can write a script (the language is not very important; I like learning new things), give it a starting webpage, and have it do the rest. All I need is to be able to scan through the source of the webpage looking for keywords and save them to a text file. It would also be nice if it could open new links and run as a recursive process. And the last thing I would like, though it is not that important, is the ability to save images as well.
Any ideas where I should start? Just to be clear, I'm not asking anyone to write any code; I'm just asking for a push in the right direction when it comes to choosing what to use. I would also like to do it myself, even though I am sure there are lots of programs that already do this for you.
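To give a sense of the scale of the task, here is a minimal sketch in Python of the two core pieces described above: collecting links and image URLs from a page's source, and scanning that source for keywords. Everything here uses only the standard library; the page content, URLs, and helper names (`LinkParser`, `find_keywords`) are made up for illustration. A real crawler would fetch each page with something like `urllib.request.urlopen()` and recurse over the links it collects.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collects link targets and image sources from an HTML document."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []   # pages a recursive crawl could visit next
        self.images = []  # image URLs that could be downloaded

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            # Resolve relative links against the page's own URL
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and "src" in attrs:
            self.images.append(urljoin(self.base_url, attrs["src"]))

def find_keywords(source, keywords):
    """Return the lines of `source` that contain any of `keywords`."""
    pattern = re.compile("|".join(map(re.escape, keywords)), re.IGNORECASE)
    return [line for line in source.splitlines() if pattern.search(line)]

# Demo on an inline page instead of a live HTTP fetch:
page = """<html><body>
<p>Population statistics for 2020</p>
<a href="/data/2021.html">next year</a>
<img src="/charts/growth.png">
</body></html>"""

parser = LinkParser("http://example.com/data/2020.html")
parser.feed(page)
hits = find_keywords(page, ["statistics"])
```

After `feed()`, `parser.links` and `parser.images` hold absolute URLs, and `hits` holds the matching source lines ready to be written to a text file. A recursive version would loop over `parser.links`, keeping a set of already-visited URLs to avoid crawling the same page twice.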