| | #1 |
| Banned Join Date: Jun 2005
Posts: 594
| Beginners Contest #2
For those who wanted more! I think it will make the competition more aggressive. If this is a big success I will continue to post competitions.
This contest will be judged on the following criteria:
1. How well it conforms to C++ (this means I don't want to see character arrays where strings would be better used).
2. Length of code (shorter being better).
3. Comments.
Object of this contest: parse a file (HTML, though it doesn't need to be in HTML format). The filename should be provided by the user; the command line is acceptable but not required. The user should be able to select which file types he wants pulled from the file for later use.
Pretend this is a file you are given, which contains, in HTML format, many links to pictures, video and music.
Code:
<a href="http://www.image.com/images/blah.jpg">words</a>
<a href="http://www.image.com/images/blah.bmp">words</a>
<a href="http://www.image.com/images/blah.gif">words</a>
<a href="http://www.image.com/video/blah.wma">words</a>
<a href="http://www.image.com/video/blah.mpg">words</a>
<a href="http://www.image.com/video/blah.avi">words</a>
<a href="http://www.image.com/video/blah.asf">words</a>
<a href="http://www.image.com/music/blah.mp3">words</a>
<a href="http://www.image.com/music/blah.mp3">words</a>
<a href="http://www.image.com/music/blah.wma">words</a>
Some of the links won't start with "a href="; some will be "img src=", along with a few others, so it may take some research if you're not familiar with HTML.
YOUR job as the coder is to accept file extensions, from user input or from a config file, such as ".jpg", ".gif", ".zip", ".mp3", ".avi" and so on. Once you have collected the file extensions you must sort out the links in the file so that only the links with the extensions supplied by the user still exist. You can either overwrite the file with the links that were pulled or put them in a new file.
Once you have completed a request by the user to save the links with, for example, the extensions (.mp3, .jpg), your output file, if the above was your file, would look like this:
Code:
http://www.image.com/images/blah.jpg
http://www.image.com/music/blah.mp3
http://www.image.com/music/blah.mp3
All of this is open to interpretation. That means, as long as you get the links the user wants, it's entirely up to you how to make it happen. You must still stay within the bounds of the criteria to win.
I know this next part may exclude some of you, but you are required to have below 800 posts at time of submission to compete in this contest.
Due Date: August 9th (yep, that's one week).
Good luck to those who participate. As a side note, please message me or post in the C++ board any questions you have when you start coding this project, especially if you get discouraged from competing in it.
P.S. Many of you might be wondering what the point of this program is. Well, for one, I come across many sites that have a lot of pictures or video on them, but there are a thousand individual links and I have to click each one. Now I can just get all the links I want real fast and then use a mass downloader. I wrote this program a few days ago myself, so I thought it was a good idea to pass it on. |
| ILoveVectors is offline | |
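The task in the post above can be sketched roughly as follows. This is only a minimal illustration, not the contest author's program or any entrant's solution: the names `has_extension` and `extract_links` are made up for the example, it assumes C++11, and it assumes lowercase `href=`/`src=` attributes whose values are quoted.

```cpp
#include <string>
#include <vector>

// True when `url` ends with `ext`, for example ".mp3".
bool has_extension(const std::string& url, const std::string& ext)
{
    return url.size() >= ext.size() &&
           url.compare(url.size() - ext.size(), ext.size(), ext) == 0;
}

// Hypothetical helper: returns, in document order, every quoted href=/src=
// value in `html` that ends with one of the extensions in `wanted`.
std::vector<std::string> extract_links(const std::string& html,
                                       const std::vector<std::string>& wanted)
{
    const std::string attrs[] = { "href=", "src=" };
    std::vector<std::string> links;
    std::string::size_type pos = 0;

    while (true)
    {
        // Locate the start of whichever attribute value comes next after `pos`.
        std::string::size_type start = std::string::npos;
        for (const std::string& attr : attrs)
        {
            const std::string::size_type hit = html.find(attr, pos);
            if (hit != std::string::npos &&
                (start == std::string::npos || hit + attr.size() < start))
                start = hit + attr.size();
        }
        if (start == std::string::npos || start >= html.size())
            break;  // no further attributes

        // The value must open with a single or double quote.
        const char quote = html[start];
        if (quote != '"' && quote != '\'')
        {
            pos = start;
            continue;
        }
        const std::string::size_type end = html.find(quote, start + 1);
        if (end == std::string::npos)
            break;  // unterminated attribute: stop scanning

        const std::string url = html.substr(start + 1, end - start - 1);
        for (const std::string& ext : wanted)
        {
            if (has_extension(url, ext))
            {
                links.push_back(url);
                break;  // keep each link once even if several extensions match
            }
        }
        pos = end + 1;
    }
    return links;
}
```

Calling `extract_links` on the whole file contents with the user's extension list and writing one result per line would give an output file in the shape the post asks for. Matching the closing quote to the opening one means both `"..."` and `'...'` values are handled.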
| | #2 |
| Registered User Join Date: Mar 2003
Posts: 134
| Could you please give a sample run of the program ? |
| noob2c is offline | |
| | #3 |
| Banned Join Date: Jun 2005
Posts: 594
| Pretend this is the command prompt. (This is what it should look like at a minimum, that is, if you don't use a config file to get your file types.)
Code:
Please Enter a filename :
Please Enter the file type you wanted saved :
Completed!
Your output should be a file holding the links to the specified file type. I'm going to attach the one I wrote. It does a little more than what I'm asking for here and I'm still working on it. I can't post the code, for contest reasons of course, but I will include a readme and an example file. You will want to look in file.html after you run the program for the results. Mine also asks for a little more information because it takes some things into account that I didn't ask for here, so when it asks you for the URL you can just type junk there; it won't affect the output in file.html for the input here. Enter file.html for the filename, of course. Here is the link to the archive file |
| ILoveVectors is offline | |
| | #4 |
| Registered User Join Date: May 2003
Posts: 2,787
| so let me get this straight - the program asks the user what file they want (filename and extension), and it searches the HTML file given to it on the command line (or hard-coded) for it? a few more technical questions: do we have to account for single quotes as well:
Code:
<A HREF='test.jpg'>bleh</A>
Code:
<A HREF="test.jpg">test.jpg</A>
Code:
<A HREF="http://www.mythingy.com/images/image5.jpg">jgieopa</A>
as opposed to
<A HREF="../images/image5.jpg">jvieopaj</A>
/me wishes C++ had some standard regex syntax...
__________________ Join is in our Unofficial Cprog IRC channel Server: irc.phoenixradio.org Channel: #Tech Team Cprog Folding@Home: Team #43476 Download it Here Detailed Stats Here More Detailed Stats 52 Members so far, are YOU a member? Current team score: 1223226 (ranked 374 of 45152) The CBoard team is doing better than 99.16% of the other teams Top 5 Members: Xterria(518175), pianorain(118517), Bennet(64957), JaWiB(55610), alphaoide(44374) Last Updated on: Wed, 30 Aug, 2006 @ 2:30 PM EDT Last edited by major_small; 08-02-2005 at 09:16 PM. |
| major_small is offline | |
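On the regex wish in the post above: since C++11 the standard library ships `<regex>`, so the single-quote versus double-quote question can be handled in one pattern. The sketch below is an illustration only; the function name `attribute_urls` is invented for the example, and it assumes attribute values are always quoted. It returns relative values such as `../images/image5.jpg` unchanged rather than resolving them.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every quoted href= or src= value in `html`.
std::vector<std::string> attribute_urls(const std::string& html)
{
    // (["']) captures the opening quote and \1 requires the same one to close,
    // so "a.jpg" and 'a.jpg' both match; icase also accepts HREF and SRC.
    static const std::regex attr(R"re((?:href|src)\s*=\s*(["'])(.*?)\1)re",
                                 std::regex::ECMAScript | std::regex::icase);

    std::vector<std::string> urls;
    for (std::sregex_iterator it(html.begin(), html.end(), attr), last;
         it != last; ++it)
    {
        urls.push_back((*it)[2].str());  // group 2 is the text between the quotes
    }
    return urls;
}
```

A regular expression is still not a real HTML parser, so this is best treated as good enough for a link list like the contest's test file rather than for arbitrary pages.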
| | #5 | |||
| Banned Join Date: Jun 2005
Posts: 594
| Quote:
Code: yes Quote:
I was thinking about that. My program accounts for it, but since this is for beginners and I didn't know how many people had experience with HTML, I didn't want a lot of rules to turn people off. So no, my test file will assume the entire link is there. The test file will not contain any links to ftp, irc, aim or any other protocol that you can link to, only http. And of course you should be extracting links from the tag types that display images and so on, such as a href, img src, embed src, and variations of those. Does that answer your question, or do you have more? By the way, thank you for your questions; they will help other people be clearer on the goal. However, you do realize you can't compete in this one. | |||
| ILoveVectors is offline | |
| | #6 | |
| Registered User Join Date: May 2003
Posts: 2,787
| Quote:
I'll just post my solution after the contest closes, just for kicks
| |
| major_small is offline | |
| | #7 |
| Banned Join Date: Jun 2005
Posts: 594
| Well, you know what, submit your solution to me anyway. I usually like your work a lot, so if there aren't many posted for this contest I'll put you in the running. |
| ILoveVectors is offline | |
| | #8 |
| Registered User Join Date: Mar 2003
Posts: 134
| In this step: "Please Enter the file type you wanted saved :" the user can specify more than one file type, right? And can we ask the user to enter the file types in a particular manner, say separated by spaces or by commas? |
| noob2c is offline | |
| | #9 |
| Banned Join Date: Jun 2005
Posts: 594
| Yes, you can interpret it freely. You can do it any way you want as long as the end result is the same, meaning the correct links are in the file; that's the only important part. |
| ILoveVectors is offline | |
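Since the input format is left up to the entrant, one way to accept several file types on a single line, separated by spaces or commas, is sketched below. The helper name `split_extensions` is invented for this example, and adding a missing leading dot is an assumption of the sketch rather than a contest rule.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turns ".mp3, jpg .avi" into {".mp3", ".jpg", ".avi"}.
std::vector<std::string> split_extensions(std::string input)
{
    // Treat commas as whitespace so one extraction loop handles both styles.
    std::replace(input.begin(), input.end(), ',', ' ');

    std::vector<std::string> extensions;
    std::istringstream tokens(input);
    std::string ext;
    while (tokens >> ext)
    {
        if (ext[0] != '.')
            ext.insert(ext.begin(), '.');  // accept "jpg" as well as ".jpg"
        extensions.push_back(ext);
    }
    return extensions;
}
```

Reading the whole answer with `std::getline` and passing it to a function like this avoids having to end the input with an end-of-file keystroke.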
| | #10 |
| Registered User Join Date: May 2003
Posts: 2,787
| well, since this contest is over, I'm posting the code I came up with (even though I'm not in the running) Code:
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char* argv[])
{
    std::string source;  // HTML file to search
    std::string target;  // filename (including extension) to look for
    std::string line;

    if (argc > 2)
    {
        std::cout << "\nUsage:\n\tGetIt\n\tGetIt <source>\n";
        return 0;
    }
    // use the file named on the command line, or fall back to a default
    source = (argc == 2) ? argv[1] : "default.html";

    std::ifstream infile(source.c_str());
    std::cout << "Enter the Filename (including extension): ";
    std::getline(std::cin, target);

    while (std::getline(infile, line))
    {
        // find() reports "no match" as npos, so compare against that
        const std::string::size_type index = line.find(target);
        if (index != std::string::npos)
        {
            // back up to the quote that opens the attribute value...
            line = line.substr(line.find_last_of("\"'", index) + 1);
            // ...and cut the line at the quote that closes it
            line = line.substr(0, line.find_first_of("\"'"));
            std::cout << line << std::endl;
            break;  // only the first matching line is reported
        }
    }
    return 0;
}
Code: <a href="http://www.image.com/images/blah.jpg">words</a> <a href="http://www.image.com/images/blah.bmp">blah.bmp</a> <a href="http://www.image.com/images/blah.gif">words</a> <a href='http://www.image.com/video/blah.wma'>blah.wma</a> <a href="video/blah.avi">blah.avi</a> <a href="http://www.image.com/video/blah.asf">words</a><a href="http://www.image.com/music/blah.mp3">words</a> and here's our <a href="http://www.image.com/music/blah.mp3">new mp3</a> for your listening enjoyment <a href="http://www.image.com/music/blah.wma"><img src="http://www.image.com/images/blah.jpg"></A> <a href="ftp://ftp.image.com/music/blah.cpp"></a> <embed src="music.ogg" width="20%" height="5%"></embed>
Last edited by major_small; 08-09-2005 at 05:26 PM. Reason: syntax highlighter tripped up by escape characters >.< |
| major_small is offline | |
| | #11 |
| Weak. Join Date: Apr 2005
Posts: 163
| Mine. Code:
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
#include <fstream>
using namespace std;
struct url {
    string address;
    string extension;
};
//predicate for the find_if function in url_end
//returns true if the char is either a single or double quote, false otherwise
bool quotes ( char c ){
    return ( c == '\'' || c == '\"' );
}
//function to find the end of the url, simply looks for the single or double quote
string::iterator url_end ( string::iterator a, string::iterator b ){
    //take two iterators which delimit a string
    //a will already be at the position of h
    //iterate through the string until "quotes" returns true,
    //so the result will be at the position of a " or '
    return find_if( a, b, quotes );
}
//looks for the beginning of the protocol http
string::iterator url_beg ( string::iterator a, string::iterator b){
    //take two iterators which delimit a string
    string link = "<a href="; //present in any link
    //search for "link" string to make sure we found a link and not <img src> or anything else
    string::iterator i = search( a, b, link.begin(), link.end() );
    //i will be at < in <a href=", return 9 places past it
    //(the 8 characters of <a href= plus the opening quote),
    //but only if the string is long enough for that step to stay in range
    if ( i != b && b - i >= 9 ){
        return i + 9;
    }
    else return b;
}
//looks for the file extension in the url
string extensions ( string url ){
    //accepts a string (the url)
    //search backwards through url for the last . as in music.wav
    string::size_type dot = url.rfind( '.' );
    //a url without a . has no extension
    if ( dot == string::npos ){
        return "";
    }
    //create a string running from the . to the end of the url
    return url.substr( dot );
}
//find the urls within the string
vector<url> find_urls ( string& link ){
    typedef string::iterator iter;
    iter a = link.begin(), b = link.end();
    //this will hold all of the urls
    vector<url> urls;
    url add;
    //continue until a reaches the end of the string
    while ( a != b ){
        //set a to the beginning of the next url
        a = url_beg ( a, b );
        //if a doesn't equal the end of the string, a link was found
        if ( a != b ){
            //create an iterator to delimit the end of the url
            iter c = url_end ( a, b );
            //create the string
            string d = string ( a, c );
            add.address = d;
            add.extension = extensions ( d );
            urls.push_back( add );
            //set a to equal c so we can look through the rest of the string
            a = c;
        }
    }
    return urls;
}
int main(){
    string link, in, in_file;
    cout << "Enter path of the file: ";
    getline( cin, in_file );
    cout << endl << "Results will output to a file called results.txt" << endl;
    ifstream file( in_file.c_str() ); // for testing
    ofstream out( "results.txt" );
    while ( getline( file, in ) ){
        //a string is read into in, and it is added to link to make one big string
        link = link + in;
    }
    //find urls in the link string, put them into a vector
    vector<url> urls = find_urls ( link );
    if ( urls.empty() ){
        cout << "No urls found.";
        cin.get();
        return 0;
    }
    else{
        cout << "Please enter the extensions you wish to keep ( ex .mp3 ):" << endl;
        string ext;
        vector<string> extensions;
        //input extensions you want to keep
        //(the loop ends at end-of-file: Ctrl+D, or Ctrl+Z on Windows)
        while ( cin >> ext ){
            extensions.push_back( ext );
        }
        vector<url>::iterator i = urls.begin();
        //run through each url
        while ( i != urls.end() ){
            vector<string>::iterator j = extensions.begin();
            //check to see if the extension of the current url matches any of
            //the ones typed by the user
            while ( j != extensions.end() ){
                //if we found a match
                if ( *j == (*i).extension ){
                    out << (*i).address << endl;
                    //no need to walk through the rest
                    j = extensions.end();
                }
                else{
                    //otherwise, check the other extensions the user typed
                    j++;
                }
            }
            //check next url
            i++;
        }
    }
    return 0;
}
Last edited by dra; 08-09-2005 at 06:20 PM. |
| dra is offline | |
| | #12 |
| Banned Join Date: Jun 2005
Posts: 594
| I am always sad when I have to block major_small from a contest, because he's just a good contestant and always has lovely code. I didn't test your code, major, but from looking at it I don't think it does what it's supposed to. Needless to say, dra was the only competitor, so it goes without saying that you're the winner. Thank you for competing. By the way, overall I thought your code was lovely and was almost exactly what I was wanting to see. In a day or so, when I finish some changes I had planned for my code, I will post it to show you what inspired this competition. I hope you two will compete in one of the other competitions I just posted; they're welcome to all levels. |
| ILoveVectors is offline | |
| | #13 | |
| Registered User Join Date: May 2003
Posts: 2,787
| you should test it... it does exactly what it's supposed to (unless I read it wrong), except in one case, which I could have easily fixed, but didn't because I wasn't in the competition Quote:
Last edited by major_small; 08-10-2005 at 12:06 AM. | |
| major_small is offline | |
| | #14 |
| Banned Join Date: Jun 2005
Posts: 594
| You read it a little bit wrong, I think. You should be finding all files of a specific file type, not so much a specific filename.type. A user should enter .jpg and the file should return all links to a .jpg file, regardless of whether it's a href or img src. So if the file type asked for was .jpg in the following
Code:
<a href="http://www.image.com/images/blah1.jpg">words</a>
<a href="http://www.image.com/images/blah2.jpg">words</a>
<a href="http://www.image.com/images/blah3.jpg">words</a>
<a href="http://www.image.com/images/blah4.jpg">words</a>
<a href="http://www.image.com/images/blah5.jpg">words</a>
<a href="http://www.image.com/images/blah6.jpg">words</a>
<a href="http://www.image.com/images/blah7.jpg">words</a>
<a href="http://www.image.com/images/blah8.jpg">words</a>
<a href="http://www.image.com/images/blah9.jpg">words</a>
<a href="http://www.image.com/images/blah10.jpg">words</a>
<a href="http://www.image.com/images/blah1.avi">words</a>
<a href="http://www.image.com/images/blah2.avi">words</a>
<a href="http://www.image.com/images/blah3.avi">words</a>
<a href="http://www.image.com/images/blah4.avi">words</a>
<a href="http://www.image.com/images/blah5.avi">words</a>
<a href="http://www.image.com/images/blah6.avi">words</a>
<a href="http://www.image.com/images/blah7.avi">words</a>
the following would be returned with the choice of .jpg as the extension
Code:
<a href="http://www.image.com/images/blah1.jpg">words</a>
<a href="http://www.image.com/images/blah2.jpg">words</a>
<a href="http://www.image.com/images/blah3.jpg">words</a>
<a href="http://www.image.com/images/blah4.jpg">words</a>
<a href="http://www.image.com/images/blah5.jpg">words</a>
<a href="http://www.image.com/images/blah6.jpg">words</a>
<a href="http://www.image.com/images/blah7.jpg">words</a>
<a href="http://www.image.com/images/blah8.jpg">words</a>
<a href="http://www.image.com/images/blah9.jpg">words</a>
<a href="http://www.image.com/images/blah10.jpg">words</a>
Last edited by ILoveVectors; 08-10-2005 at 12:11 AM. |
| ILoveVectors is offline | |
| | #15 |
| Registered User Join Date: May 2003
Posts: 2,787
| oh I see... you were looking for file extensions, not individual files... meh.. it would have been the same amount of code
|
| major_small is offline | |