Thread: Web Spider

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    329

    Web Spider

    Hi,

    I am starting to think about my final year University project and have the idea of writing a web spider. Would this be a good fit for C++?

    The basic premise would be a program that web managers can use to monitor all internal and external links within their domain, and report on the health of those links at any given interval.

    Is this something that is suited to C++ or more so with C#? Also, would you think this is of sufficient difficulty and challenge for a final year project?

    I am fairly comfortable with both C++ and C#, but I haven't done much web programming.

    Thanks,

    Darren.

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Difficult - you write everything in C++ starting with the socket code, writing the HTML parser and all the other bits that go along with it.

    Easy - you write everything in C# using an HTML component which provides you with a handy DOM interface to the HTML. The project is reduced to little more than a 1-liner "for each link on page do" loop and a bit of stats gathering.
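    As a rough illustration of the "for each link on page" step, here is a minimal C++ sketch. It assumes the page has already been downloaded into a string (the socket/HTTP part is not shown), and it uses a regular expression as a stand-in for a real HTML parser. The names extract_links and is_internal are made up for this example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect the href targets of <a> tags found in an HTML string.
// Assumption: a regex is only a rough substitute for a real parser. It
// ignores unquoted attribute values and will also match inside comments.
std::vector<std::string> extract_links(const std::string& html) {
    static const std::regex href_re(
        R"re(<a\s[^>]*?href\s*=\s*["']([^"']*)["'])re",
        std::regex::icase);
    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), href_re), end;
         it != end; ++it) {
        links.push_back((*it)[1].str());  // capture group 1 is the URL
    }
    return links;
}

// Decide whether a link stays on the monitored host.
// Simplification: anything without a "scheme://" prefix is treated as a
// relative (internal) link, which is wrong for e.g. "mailto:" links.
bool is_internal(const std::string& link, const std::string& host) {
    const std::string::size_type scheme_end = link.find("://");
    if (scheme_end == std::string::npos) {
        return true;
    }
    const std::string::size_type host_start = scheme_end + 3;
    const std::string::size_type host_end =
        link.find_first_of("/?#", host_start);
    const std::string link_host =
        (host_end == std::string::npos)
            ? link.substr(host_start)
            : link.substr(host_start, host_end - host_start);
    return link_host == host;
}
```

    Calling extract_links on each fetched page and passing every result through is_internal would give the internal/external split mentioned in the first post. Checking whether each link is actually alive still needs an HTTP request per link.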

    What do YOU find easy or difficult?
    What does your academic institution regard as easy or difficult?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Feb 2009
    Posts
    329
    Based on your summary, I think that if I did it in C++ rather than C#, my University might question why I didn't use the 'right tool for the job' and point out that there is 'no need to re-invent the wheel'; they are pretty keen on that. On the other hand, with the libraries already available in C#, it sounds a little too easy for a final project and would not give me enough substance to discuss in the dissertation.

    Ah well, I think trying to figure out a final year project is the most difficult thing I've had to do whilst at Uni!

    Thanks for the response Salem.

  4. #4
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    I don't know - it's hard to advise on a project when we know so little about your skill or the demands of your faculty.

    Being genuinely unique "never been done before" is pretty hard at the undergraduate level. It's always possible to find some similarity with something that has already been done.

    Tough choice - use the right tools and be accused of lack of depth, or be accused of not using the right tools for the job.

    It sounds like a nice idea - perhaps expand on it by adding more features:
    - which domains are "stable" and which are "volatile", e.g. knowing that www.crap-news.com links always seem to expire within a month, whereas www.betternews.com links last for years.
    - which links get used most (integrate with server / firewall / gateway logs).

    Why try to maintain a link on a page no-one ever clicks on anyway?
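    One way the "stable" vs. "volatile" idea could be sketched in C++ is to keep a per-domain tally of link checks. Everything here is an assumption made for illustration: the LinkHealth class, the dead-link ratio as the measure of volatility, and the 0.5 default threshold. Tracking how long links survive would need timestamps as well.

```cpp
#include <map>
#include <string>

// Running totals for one domain.
struct DomainStats {
    int checked = 0;  // number of link checks recorded
    int dead = 0;     // how many of those found a dead link
};

// Hypothetical bookkeeping class: records the outcome of each link check
// and reports which domains tend to produce dead links.
class LinkHealth {
public:
    void record(const std::string& domain, bool alive) {
        DomainStats& s = stats_[domain];
        ++s.checked;
        if (!alive) {
            ++s.dead;
        }
    }

    // Fraction of recorded checks that found a dead link.
    // Returns 0.0 for a domain that has never been checked.
    double dead_ratio(const std::string& domain) const {
        const auto it = stats_.find(domain);
        if (it == stats_.end() || it->second.checked == 0) {
            return 0.0;
        }
        return static_cast<double>(it->second.dead) / it->second.checked;
    }

    // Assumption: a domain counts as "volatile" once more than `threshold`
    // of its checked links are dead. The default of 0.5 is arbitrary.
    bool is_volatile(const std::string& domain, double threshold = 0.5) const {
        return dead_ratio(domain) > threshold;
    }

private:
    std::map<std::string, DomainStats> stats_;
};
```

    The spider would call record() once per link after each crawl, and the report could then list domains with is_volatile() set so they get rechecked more often.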
    Last edited by Salem; 09-28-2010 at 04:48 AM. Reason: Even my madeup links are live

  5. #5
    Registered User
    Join Date
    Feb 2009
    Posts
    329
    Thanks Salem, I will give it some thought. I am in the lucky position of just starting a 12 month industrial placement prior to my final year at Uni, so I have a little time to think things through and learn new skills. Ideally I want to start actually coding it within the next 3-4 months, so I will keep reading and searching for ideas and/or expansions to this idea.

    Cheers.

  6. #6
    'Allo, 'Allo, Allo
    Join Date
    Apr 2008
    Posts
    639
    Don't know if it'll be useful, but what I did was listen to interviews with prominent people in my areas of interest and see what ideas they had for what's going to happen in the future.

    For instance, I got my idea from watching an interview with the lead dev on Microsoft's driver framework when it was first released. The interviewer asked him if managed languages would ever be involved on the kernel level. I can't remember what the response was, but that became my project. 14-15 months later, I took my laptop and a Windows XP VM into a meeting with the profs marking my dissertation and demonstrated the Kenco language (named after the coffee, since I was in Java class at the time), the compiler, a runtime and a short proof of concept.

    So yeah, I'd say make sure it's a project you'll enjoy because you're going to be seeing quite a lot of it.

  7. #7
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Your industrial placement will be a good source for ideas.

    With the added bonus of a job offer if you make a decent attempt.

    Makes a change from the usual "gimme project title" types who roll through with 1 post and 24 hour deadlines.

  8. #8
    Registered User
    Join Date
    Feb 2009
    Posts
    329
    Quote: Originally posted by adeyblue
    Don't know if it'll be useful, but what I did was listen to interviews with prominent people in my areas of interest and see what ideas they had for what's going to happen in the future.

    For instance, I got my idea from watching an interview with the lead dev on Microsoft's driver framework when it was first released. The interviewer asked him if managed languages would ever be involved on the kernel level. I can't remember what the response was, but that became my project. 14-15 months later, I took my laptop and a Windows XP VM into a meeting with the profs marking my dissertation and demonstrated the Kenco language (named after the coffee, since I was in Java class at the time), the compiler, a runtime and a short proof of concept.

    So yeah, I'd say make sure it's a project you'll enjoy because you're going to be seeing quite a lot of it.

    There's some good advice there. It is difficult trying to find something at undergraduate level that's innovative and progressive. Projects like that tend to be more common at masters or PhD level.

    Thanks for the advice.

  9. #9
    Registered User
    Join Date
    Feb 2009
    Posts
    329
    Quote: Originally posted by Salem
    Your industrial placement will be a good source for ideas

    With the added bonus of a job offer if you make a decent attempt.

    Makes a change from the usual "gimme project title" types who roll through with 1-post and 24 hour deadlines
    Cheers.

    As a mature student (34 upon graduation), I feel that I need to make my CV stand out more than most. One way of doing that is having a project on there that catches the eye. Hopefully I will think of one over the next few months.

    Cheers.
