It needs two parts: a first part (a program or a script) that downloads the file, and a second program that does the processing and saving.
For the first part you can use PHP, but read its manual first, because it differs from C in many ways (for example, a string is not an array of chars, so character-by-character handling works differently).
It is flexible for web work, though, and it can save the data to a file, optionally processing it first.
A bash script only runs where bash is available, which in practice means Linux, and what would it use for the download? wget? wget is not uniform across systems either. PHP runs on both Linux and Windows (the Windows installation takes about ten megabytes).
Then, once you have good data on your disk, you can run the main program on it.
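To make the two-part split concrete, here is a rough sketch. It is written in Python rather than PHP purely for illustration, and the names `download`, `process` and `run` are made up for this example. The "processing" here is only a placeholder that counts lines.

```python
import urllib.request
from pathlib import Path


def download(url: str, dest: Path) -> Path:
    """Part 1: fetch `url` and save the raw bytes to `dest`."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    dest.write_bytes(data)
    return dest


def process(src: Path) -> int:
    """Part 2 (placeholder): read the saved file and return its line count."""
    text = src.read_text(encoding="utf-8", errors="replace")
    return len(text.splitlines())


def run(url: str, dest: Path) -> int:
    """Download first, then process only what is already on disk."""
    saved = download(url, dest)
    return process(saved)
```

The point of the split is that `process` never touches the network, so it can be rerun on the saved file as often as needed.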
If you use wget, keep in mind that it has settings, and the user can change them at any time. Then what? Your program will stop because it cannot find the downloaded file, so we need a check for wget's settings.
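One possible way to guard against this, sketched in Python with made-up helper names (`wget_command`, `require_download`): pass the output path explicitly with wget's `-O` option instead of relying on the user's settings, and verify that the file really exists before the main program starts. This is an assumption about how the check could look, not the only way to do it.

```python
from pathlib import Path
from typing import List


def wget_command(url: str, dest: Path) -> List[str]:
    """Build a wget call that names the output file explicitly via -O."""
    return ["wget", "-O", str(dest), url]


def require_download(path: Path, min_bytes: int = 1) -> Path:
    """Fail early, with a clear message, if the download is missing or empty."""
    if not path.is_file():
        raise FileNotFoundError(f"expected downloaded file is missing: {path}")
    size = path.stat().st_size
    if size < min_bytes:
        raise ValueError(f"downloaded file is too small ({size} bytes): {path}")
    return path
```

The list from `wget_command` could be handed to `subprocess.run`, and `require_download` called right after it.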
Also, wget will save the pictures. Do you want pictures? No? Then the script should exclude pictures, and .js and .css files too.
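wget has a `--reject` option for recursive downloads, but if the filtering is done in your own script, it could look roughly like this Python sketch. The extension list and the name `is_wanted` are assumptions for the example; adjust the list to whatever you want to skip.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed list of unwanted file types: common pictures plus .js and .css.
SKIPPED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".js", ".css"}


def is_wanted(url: str) -> bool:
    """Return False for URLs whose path ends in a skipped extension."""
    path = urlsplit(url).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension not in SKIPPED_EXTENSIONS
```

For example, `is_wanted("http://example.com/index.html")` is true, while `is_wanted("http://example.com/logo.PNG?x=1")` is false.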
You can also build only the tree from the file and then load it in the processing program.
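If "the tree" here means the tag tree of the downloaded HTML file, a minimal sketch with Python's standard `html.parser` could look like this. The node layout (a dict with `tag` and `children`) and the names `TreeBuilder`, `build_tree` and `outline` are made up for illustration. Text content and attributes are skipped on purpose.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not stay open on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Collect only the tag structure of a document."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#root", "children": []}
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self._stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i]["tag"] == tag:
                del self._stack[i:]
                break


def build_tree(html: str) -> dict:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node: dict, depth: int = 0) -> list:
    """Flatten the tree into indented tag names, two spaces per level."""
    lines = []
    for child in node["children"]:
        lines.append("  " * depth + child["tag"])
        lines.extend(outline(child, depth + 1))
    return lines
```

The processing program can then walk the returned dict instead of parsing the raw file again.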