C Board  

Old 08-02-2005, 07:39 PM   #1
Banned
 
Join Date: Jun 2005
Posts: 594
Beginners Contest #2 For those who wanted more!!

I'm only offering one contest choice at a time this time around;
I think it will make the competition more aggressive. If this is a big
success I will continue to post competitions.

Contest #1 will be judged on the following criteria:

1. How well it conforms to C++ (this means I don't want to see
character arrays where strings would be better used)
2. Length of code (shorter being better)
3. Comments

Object of this contest:

Parse a file (HTML, although it doesn't need to be in HTML
format). The filename should be provided by the
user; the command line is acceptable but not required.

The user should be able to select which file type
he wants extracted from the file for later use.

Pretend this is a file you are given, which contains, in HTML
format, many links to pictures, video and music.

Code:
<a href="http://www.image.com/images/blah.jpg">words</a>
<a href="http://www.image.com/images/blah.bmp">words</a>
<a href="http://www.image.com/images/blah.gif">words</a>
<a href="http://www.image.com/video/blah.wma">words</a>
<a href="http://www.image.com/video/blah.mpg">words</a>
<a href="http://www.image.com/video/blah.avi">words</a>
<a href="http://www.image.com/video/blah.asf">words</a>
<a href="http://www.image.com/music/blah.mp3">words</a>
<a href="http://www.image.com/music/blah.mp3">words</a>
<a href="http://www.image.com/music/blah.wma">words</a>
Now obviously there will be more stuff in the HTML file than just
links, and some of the links won't start with "a href="; some will
be "img src=" along with a few others, so it may take some research
if you're not familiar with HTML. YOUR job as the coder is to
accept, from user input or from a config file, file extensions such as
".jpg", ".gif", ".zip", ".mp3", ".avi" and so on. Once you have
collected the file extensions you must sort out the links in the
file so that only the links with the extensions supplied by the user
still exist. You can either overwrite the file with the links that
were pulled or you can put them in a new file.

Once you have completed a request by the user to save the links
with, for example, the extensions .mp3 and .jpg, your
output file, if the above was your input file, would look like this:

Code:
http://www.image.com/images/blah.jpg
http://www.image.com/music/blah.mp3
http://www.image.com/music/blah.mp3
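
The filtering step itself can be sketched roughly like this. It is only an illustration of the idea, not a contest entry: `ends_with` and `keep_wanted` are made-up names, the comparison is case-sensitive (the rules don't say either way), and it assumes a C++11 compiler.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if url ends with ext (case-sensitive).
bool ends_with(const std::string& url, const std::string& ext)
{
    return url.size() >= ext.size() &&
           url.compare(url.size() - ext.size(), ext.size(), ext) == 0;
}

// Keep only the urls that end with one of the requested extensions,
// preserving their original order (duplicate links stay duplicated).
std::vector<std::string> keep_wanted(const std::vector<std::string>& urls,
                                     const std::vector<std::string>& exts)
{
    std::vector<std::string> kept;
    for (const std::string& url : urls)
        for (const std::string& ext : exts)
            if (ends_with(url, ext)) {
                kept.push_back(url);
                break;  // one match is enough, do not add the url twice
            }
    return kept;
}
```

Called with the ten links from the sample file and the extensions .mp3 and .jpg, this keeps the one .jpg link and both .mp3 links, in file order.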

All of this is free to interuptation.
That of course means, as long as you get the links the user wants
its entirly up to you on how to make it happen. That is you must
still stay in the bounds of the criteria to win. I know this next
part my neglect some of you, but you are required to have below
800 posts at tiem of submission to compete in this contest.

Due Date
August 9th (Yep thats one week)
Good Luck to those who participate, as a side note
please message me or post in c++ board question you
have when you start coding this project, especially if you get
discouraged to compete in it.

P.S.
Many of you might be wondering what the point of this program
is, well for one i come to many site that have alot of picture or
video on it but there a thousand individual links and i have
to click each one, well now i can just get all the links i want
real fast then use a mass downloader. I just wrote this program
a few days ago myself, so i though was a good idea to pass
it on.
ILoveVectors is offline   Reply With Quote
Old 08-02-2005, 08:02 PM   #2
Registered User
 
Join Date: Mar 2003
Posts: 134
Could you please give a sample run of the program ?
noob2c is offline   Reply With Quote
Old 08-02-2005, 08:30 PM   #3
Banned
 
Join Date: Jun 2005
Posts: 594
Pretend this is the command prompt.
(This is what it should look like at a minimum, that is,
if you don't use a config file to get your file types.)

Code:
Please Enter a filename :
Please Enter the file type you wanted saved :
Completed!
Your input should be a file containing HTML
code, and your output should be a file
holding the links to the specified file type.


I'm going to attach the one I wrote. It does a little more
than what I'm asking for here and I'm still working on it.
I can't post the code, for contest reasons of course,
but I will include a readme and an example file.
You will want to look in file.html
after you run the program for the results.
Mine also asks for a little more information because
it takes some stuff into account that I didn't ask for here,
so when it asks you for the URL, you can just type junk
there; it won't affect the output in file.html for the input
here. Enter file.html for the filename, of course.

here is the link to the archive file
ILoveVectors is offline   Reply With Quote
Old 08-02-2005, 08:44 PM   #4
Registered User
 
major_small's Avatar
 
Join Date: May 2003
Posts: 2,787
so let me get this straight - the program asks the user what file they want (filename and extension), and it searches the HTML file given it on the command-line (or hard-coded) for it?

a few more technical questions: do we have to account for single-quotes as well:
Code:
<A HREF='test.jpg'>bleh</A>
and I'm guessing we have to worry about things like this as well:
Code:
<A HREF="test.jpg">test.jpg</A>
also, what about the pathnames themselves... do we have to take into account that most webmasters use relative pathnames, or can we assume they hardcode the full path to each file?
Code:
<A HREF="http://www.mythingy.com/images/image5.jpg">jgieopa</A>
as opposed to
<A HREF="../images/image5.jpg">jvieopaj</A>
one other thing: what about protocols? can we assume that it's all going to be done over HTTP, or do you want to account for FTP or anything else as well?

/me wishes C++ had some standard regex syntax...
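
For anyone reading this later: since C++11 the standard library does include `<regex>`. A rough sketch of pulling out link targets with it follows, under the assumption that every attribute value is wrapped in matching single or double quotes. `extract_links` is a made-up name, and a regex like this is a shortcut, not a real HTML parser.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the quoted values of href= and src=
// attributes, accepting either quote style and any letter case.
std::vector<std::string> extract_links(const std::string& html)
{
    // group 1: the opening quote, group 2: everything up to the same quote
    static const std::regex attr("(?:href|src)\\s*=\\s*([\"'])(.*?)\\1",
                                 std::regex::icase);
    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), attr), end;
         it != end; ++it)
        links.push_back((*it)[2].str());
    return links;
}
```

The backreference `\1` is what handles the single-quote versus double-quote question: whichever quote opens the value has to close it.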
__________________
Join is in our Unofficial Cprog IRC channel
Server: irc.phoenixradio.org
Channel: #Tech


Team Cprog Folding@Home: Team #43476
Download it Here
Detailed Stats Here
More Detailed Stats
52 Members so far, are YOU a member?
Current team score: 1223226 (ranked 374 of 45152)

The CBoard team is doing better than 99.16% of the other teams
Top 5 Members: Xterria(518175), pianorain(118517), Bennet(64957), JaWiB(55610), alphaoide(44374)

Last Updated on: Wed, 30 Aug, 2006 @ 2:30 PM EDT

Last edited by major_small; 08-02-2005 at 09:16 PM.
major_small is offline   Reply With Quote
Old 08-02-2005, 09:27 PM   #5
Banned
 
Join Date: Jun 2005
Posts: 594
Quote:
Originally Posted by major_small
so let me get this straight - the program asks the user what file they want (filename and extension), and it searches the HTML file given it on the command-line (or hard-coded) for it?

a few more technical questions: do we have to account for single-quotes as well:
Code:
<A HREF='test.jpg'>bleh</A>
and I'm guessing we have to worry about things like this as well:
Code:
<A HREF="test.jpg">test.jpg</A>
also, what about the pathnames themselves... do we have to take into account that most webmasters use relative pathnames, or can we assume they hardcode the full path to each file?
Code:
<A HREF="http://www.mythingy.com/images/image5.jpg">jgieopa</A>
Code:
yes

Quote:
Originally Posted by major_smalls
as opposed to <A HREF="../images/image5.jpg">jvieopaj</A>
Quote:
Originally Posted by major_smalls

one other thing: what about protocols? can we assume that it's all going to be done over HTTP, or do you want to account for FTP or anything else as well?

/me wishes C++ had some standard regex syntax...

I was thinking about that; my program accounts for that, but since
this was for beginners and I didn't know how many people had experience
with HTML, I didn't want a lot of rules to turn people off. So no,
my test file will assume the entire link is there. The test file
will not contain any links to ftp, irc, aim or any other
protocols that you can link to, only http, and also of course
the types that display images and whatnot, such as

a href, img src, embed src, and the variations of those; these are what you
should be extracting links from.
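
One way to cover those variations without special-casing each tag is to look for the attribute names only. The sketch below is just that, a sketch: `attribute_values` is a made-up name, unquoted values are skipped, a name like "src=" would also match inside a longer attribute name, and the results come back grouped by attribute name rather than in file order. It assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: return the quoted values that follow any of the
// given attribute names (for example "href=" and "src=").
std::vector<std::string> attribute_values(const std::string& html,
                                          const std::vector<std::string>& names)
{
    // Search a lowercased copy so HREF, Href and href all match,
    // but cut the values out of the original text.
    std::string lower(html);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::vector<std::string> values;
    for (const std::string& name : names) {
        std::string::size_type pos = 0;
        while ((pos = lower.find(name, pos)) != std::string::npos) {
            pos += name.size();
            if (pos >= html.size())
                break;
            const char quote = html[pos];
            if (quote != '"' && quote != '\'')
                continue;  // unquoted value: skipped in this sketch
            const std::string::size_type close = html.find(quote, pos + 1);
            if (close == std::string::npos)
                break;  // no closing quote anywhere after this point
            values.push_back(html.substr(pos + 1, close - pos - 1));
            pos = close + 1;
        }
    }
    return values;
}
```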

Does that answer your question or do you have more?

By the way, I thank you for your questions; they will help other people
be clearer on the goal. However, you do realize you
can't compete in this one.
ILoveVectors is offline   Reply With Quote
Old 08-02-2005, 09:40 PM   #6
Registered User
 
major_small's Avatar
 
Join Date: May 2003
Posts: 2,787
Quote:
Originally Posted by ILoveVectors
Btw i thank you for your quesiton they will help other people
be more clear on the goal, however you do realize you
cant compete in this one.
hah... party pooper

I'll just post my solution after the contest closes, just for kicks
major_small is offline   Reply With Quote
Old 08-02-2005, 10:32 PM   #7
Banned
 
Join Date: Jun 2005
Posts: 594
Well, you know what, submit your solution to me anyway.
I usually like your work a lot, so if there aren't many posted
for this contest I'll put you in the running.
ILoveVectors is offline   Reply With Quote
Old 08-03-2005, 07:34 AM   #8
Registered User
 
Join Date: Mar 2003
Posts: 134
in this step :

Please Enter the file type you wanted saved :


the user can specify more than one file type right ? and can we ask the user to enter the file types in a particular manner say separated by space or by a comma ?
noob2c is offline   Reply With Quote
Old 08-03-2005, 09:13 AM   #9
Banned
 
Join Date: Jun 2005
Posts: 594
Yes, you can interpret it freely. You can do it any way you want as
long as the end result is the same, meaning the correct links
are in the file; that's the only important part.
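
For example, one line of input could be split on spaces and commas alike. This is only a sketch of one possible input format; `split_extensions` is a made-up name and nothing in the rules requires doing it this way.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split one line of input such as ".mp3, .jpg .gif"
// into separate extensions, treating spaces, tabs and commas as separators.
std::vector<std::string> split_extensions(const std::string& input)
{
    const std::string separators = " ,\t";
    std::vector<std::string> exts;
    std::string::size_type start = input.find_first_not_of(separators);
    while (start != std::string::npos) {
        const std::string::size_type stop = input.find_first_of(separators, start);
        // when stop is npos, substr simply runs to the end of the input
        exts.push_back(input.substr(start, stop - start));
        start = input.find_first_not_of(separators, stop);
    }
    return exts;
}
```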
ILoveVectors is offline   Reply With Quote
Old 08-09-2005, 05:20 PM   #10
Registered User
 
major_small's Avatar
 
Join Date: May 2003
Posts: 2,787
well, since this contest is over, I'm posting the code I came up with (even though I'm not in the running)
Code:
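#include <iostream>
#include <fstream>
#include <string>

int main(int argc,char*argv[])
{
	std::string filename;
	std::string line;
	
	if(argc>2)
	{
		std::cout<<"\nUsage:\n\tGetIt\n\tGetIt <source>\n";
		return 0;
	}
	else if(argc==2)
	{
		filename=argv[1];
	}
	else
	{
		filename="default.html";
	}

	std::ifstream infile(filename.c_str());
	if(!infile)
	{
		std::cerr<<"Could not open "<<filename<<'\n';
		return 1;
	}

	//filename is reused from here on to hold the name we are searching for
	std::cout<<"Enter the Filename (including extension): ";
	getline(std::cin,filename,'\n');

	while(getline(infile,line,'\n'))
	{
		//find() returns std::string::npos, not -1, when nothing matches
		std::string::size_type index=line.find(filename);
		if(index!=std::string::npos)
		{
			//the link starts just after the last quote before the match
			//(with no quote, npos+1 wraps to 0 and we start at the beginning of the line)
			line=line.substr(line.find_last_of("\"\'",index)+1);
			//and it ends just before the next quote (or at the end of the line)
			line=line.substr(0,line.find_first_of("\"\'"));
			std::cout<<line<<std::endl;
			break;
		}
	}

	infile.close();
	return 0;
}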
and my (better) test input file:
Code:
<a href="http://www.image.com/images/blah.jpg">words</a>
<a href="http://www.image.com/images/blah.bmp">blah.bmp</a>
<a href="http://www.image.com/images/blah.gif">words</a>
<a href='http://www.image.com/video/blah.wma'>blah.wma</a>

<a href="video/blah.avi">blah.avi</a>

<a href="http://www.image.com/video/blah.asf">words</a><a href="http://www.image.com/music/blah.mp3">words</a>

and here's our <a href="http://www.image.com/music/blah.mp3">new mp3</a> for your listening enjoyment

<a href="http://www.image.com/music/blah.wma"><img src="http://www.image.com/images/blah.jpg"></A>

<a href="ftp://ftp.image.com/music/blah.cpp"></a>

<embed src="music.ogg" width="20%" height="5%"></embed>

Last edited by major_small; 08-09-2005 at 05:26 PM. Reason: syntax highlighter tripped up by escape characters >.<
major_small is offline   Reply With Quote
Old 08-09-2005, 06:00 PM   #11
dra
Weak.
 
dra's Avatar
 
Join Date: Apr 2005
Posts: 163
Mine.

Code:
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
#include <fstream>

using namespace std;


struct url {
           string address;
           string extension;
       };


//predicate for the find_if function in url_end
//returns true if the char is either a single or double quote, false otherwise
bool quotes ( char c ){

		      return ( c == '\'' || c == '\"' );
}
//function to find the end of the url, simply looks for the single or double quote
string::iterator url_end ( string::iterator a, string::iterator b ){  
                           //take two iterators which delimit a string
                           //a will already be at the position of h 
                           
                           //iterates through the string until "quotes" returns true,
                           //so i will be at the position of a " or ' (or at b if none is found)
                           string::iterator i = find_if( a, b, quotes );
                           return i;
                 }
//looks for the beginning of the protocol http
string::iterator url_beg ( string::iterator a, string::iterator b){
                           //take two iterators which delimit a string

                           string link = "<a href="; //present in any link

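                           //search for "link" string to make sure we found a link and not <img src> or anything else
                           string::iterator i = search( a, b, link.begin(), link.end() );
                          
                           //make sure there is still text after the opening quote before skipping to it
                           if ( i != b && b - i > 9 ){
                                //i will be at the < in <a href=", return 9 places past it
                                //(8 for the tag text plus 1 for the opening quote)
                                return i + 9;
                           }

                           else return b;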
                 }
//looks for the file extension in the url
string extensions ( string url ){
                    //accepts a string (the url)
                    //find the last . in the url, as in music.wav
                    //(rfind avoids dereferencing url.end() and running off the front)
                    string::size_type dot = url.rfind( '.' );

                    //no . at all means no extension, so return an empty string
                    if ( dot == string::npos ){
                         return string();
                    }
                    //create a string running from the . to the end of the url
                    string d = url.substr( dot );
                    return d;
       }  
//find the urls within the string
vector<url> find_urls ( string& link ){
                        
                        typedef string::iterator iter;
                               
                        iter a = link.begin(), b = link.end();
                        //this will hold all of the urls
                        vector<url> urls;
                        
                        url add;
                        //continue until a reaches the end of the string
                        while ( a != b ){
                                //set a to the beginning
                                a = url_beg ( a, b );
                                //if a doesn't equal the end of the string, a link was found
                                if ( a != b ){
                                     //create an iterator to delimit the end of the url 
                                     iter c = url_end ( a, b );
                                     //create the string
                                     string d = string ( a, c );
                                     
                                     add.address = d;
                                     add.extension = extensions ( d );
                                     urls.push_back( add );

                                     //set a to equal c so we can look through the rest of the string
                                     a = c;

                                 }
                         }
                         
                         return urls;
              }

int main(){

         string link, in, in_file; 
         cout << "Enter path of the file: ";
         getline( cin, in_file );
         cout << endl << "Results will output to a file called results.txt" << endl;
         ifstream file( in_file.c_str() );  // for testing
         ofstream out( "results.txt" );

         while ( getline( file, in ) ){
                 //a string is read into in, and it is added to link to make one big string
                 link = link + in;

         }
         //find urls in the link string, put them into a vector
         vector<url> urls = find_urls ( link );
         
         if ( urls.empty() ){ 
              cout << "No urls found.";
              cin.get();
              return 0;
         }

         else{
              cout << "Please enter the extensions you wish to keep ( ex .mp3 ):" << endl;
              
              string ext;
              vector<string> extensions;
              //input extensions you want to keep
              while ( cin >> ext ){
                      extensions.push_back( ext );
              }
              
              vector<url>::iterator i = urls.begin();
              //run through each url
              while ( i != urls.end() ){
              
                      vector<string>::iterator j = extensions.begin();
                      //check to see if the extension of the current url matches any of
                      //the ones typed by the user
                      while ( j != extensions.end() ){
                              //if we found a match
                              if ( *j == (*i).extension ){ 
                                   out << (*i).address << endl;
                                   //no need to walk through the rest
                                   j = extensions.end();
                              }
                              
                              else{
                                  //otherwise, check the other extension the user typed
                                  j++;
                              }
                      }
                      //check next url
                      i++;
              }
         }
         return 0;
    }
Nowhere near as short. haha.
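
One place it could probably be trimmed: the inner while loop in main is a plain linear search, which `std::find` from `<algorithm>` (already included above) does in one call. A small sketch, where `is_wanted` is a made-up helper name and not part of the posted program:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true if ext is one of the extensions the user typed.
bool is_wanted(const std::string& ext, const std::vector<std::string>& wanted)
{
    return std::find(wanted.begin(), wanted.end(), ext) != wanted.end();
}
```

With that, the body of the outer loop would shrink to a single `if ( is_wanted( (*i).extension, extensions ) )` around the `out <<` line.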

Last edited by dra; 08-09-2005 at 06:20 PM.
dra is offline   Reply With Quote
Old 08-09-2005, 09:46 PM   #12
Banned
 
Join Date: Jun 2005
Posts: 594
I am always sad when I have to block major_small from
a contest, because he's just a good contestant and always has
lovely code. I didn't test your code, major, but by looking at
it I don't think it does what it's supposed to.

Needless to say, dra was the only competitor, so it goes
without saying that you're the winner.

Thank you for competing. By the way, overall I thought your code
was lovely and was almost exactly what I was wanting to see.

In a day or so, when I finish some changes I had planned for my
code, I will post it to show you what inspired this competition.

By the way, I hope you two will compete in one of the other competitions
I just posted; they're open to all levels.
ILoveVectors is offline   Reply With Quote
Old 08-10-2005, 12:01 AM   #13
Registered User
 
major_small's Avatar
 
Join Date: May 2003
Posts: 2,787
you should test it... it does exactly what it's supposed to (unless I read it wrong), except in one case, which I could have easily fixed, but didn't because I wasn't in the competition

Quote:
jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.jpg
http://www.image.com/images/blah.jpg

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.bmp
http://www.image.com/images/blah.bmp

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.gif
http://www.image.com/images/blah.gif

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.wma
http://www.image.com/video/blah.wma

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.avi
video/blah.avi

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.asf
http://www.image.com/video/blah.asf

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.mp3
http://www.image.com/music/blah.mp3

jshao@MCP ~/Programming/C++/HTMLFind $ ./GetIt test.html
Enter the Filename (including extension): blah.cpp
ftp://ftp.image.com/music/blah.cpp

Last edited by major_small; 08-10-2005 at 12:06 AM.
major_small is offline   Reply With Quote
Old 08-10-2005, 12:07 AM   #14
Banned
 
Join Date: Jun 2005
Posts: 594
You read it a little bit wrong, I think. You should be finding all of a
specific file type, not so much a specific filename.type.


A user should enter .jpg
and the file should return all links to a .jpg file
regardless of whether it's a href or img src.



So if the file type asked for was .jpg in the following

Code:
<a href="http://www.image.com/images/blah1.jpg">words</a>
<a href="http://www.image.com/images/blah2.jpg">words</a>
<a href="http://www.image.com/images/blah3.jpg">words</a>
<a href="http://www.image.com/images/blah4.jpg">words</a>
<a href="http://www.image.com/images/blah5.jpg">words</a>
<a href="http://www.image.com/images/blah6.jpg">words</a>
<a href="http://www.image.com/images/blah7.jpg">words</a>
<a href="http://www.image.com/images/blah8.jpg">words</a>
<a href="http://www.image.com/images/blah9.jpg">words</a>
<a href="http://www.image.com/images/blah10.jpg">words</a>
<a href="http://www.image.com/images/blah1.avi">words</a>
<a href="http://www.image.com/images/blah2.avi">words</a>
<a href="http://www.image.com/images/blah3.avi">words</a>
<a href="http://www.image.com/images/blah4.avi">words</a>
<a href="http://www.image.com/images/blah5.avi">words</a>
<a href="http://www.image.com/images/blah6.avi">words</a>
<a href="http://www.image.com/images/blah7.avi">words</a>

the following would be returned with the choice of .jpg as extension

Code:
<a href="http://www.image.com/images/blah1.jpg">words</a>
<a href="http://www.image.com/images/blah2.jpg">words</a>
<a href="http://www.image.com/images/blah3.jpg">words</a>
<a href="http://www.image.com/images/blah4.jpg">words</a>
<a href="http://www.image.com/images/blah5.jpg">words</a>
<a href="http://www.image.com/images/blah6.jpg">words</a>
<a href="http://www.image.com/images/blah7.jpg">words</a>
<a href="http://www.image.com/images/blah8.jpg">words</a>
<a href="http://www.image.com/images/blah9.jpg">words</a>
<a href="http://www.image.com/images/blah10.jpg">words</a>

Last edited by ILoveVectors; 08-10-2005 at 12:11 AM.
ILoveVectors is offline   Reply With Quote
Old 08-10-2005, 12:27 AM   #15
Registered User
 
major_small's Avatar
 
Join Date: May 2003
Posts: 2,787
oh I see... you were looking for file extensions, not individual files... meh.. it would have been the same amount of code
major_small is offline   Reply With Quote