I need to write an application that can download the content of an .html page. The page contains a lot of JS code, which uses document.write and similar calls.
So how can I "render" an HTML page (i.e., execute its JS)?
You can use WebBrowser.
Typically you don't need to display it at all; it can run in the background, execute all the JS code, and then you can try to find the URL.
This is not that straightforward, and there are some tricks and problems you will need to solve in the long run. Try searching for examples of the WebBrowser class and go from there.
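As a rough sketch of the idea above, assuming a Windows Forms project: the WebBrowser control needs an STA thread with a running message loop, otherwise navigation never completes and the object stays empty. The class and method names (HeadlessRender, Render) are made up for illustration, and reading Document.Body.OuterHtml is just one way to get the markup after scripts have run.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class HeadlessRender
{
    // Hypothetical helper: loads a URL in an off-screen WebBrowser and
    // returns the body markup as it looks after the page's scripts ran.
    public static string Render(string url)
    {
        string html = null;
        var worker = new Thread(() =>
        {
            using (var browser = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                browser.DocumentCompleted += (sender, e) =>
                {
                    // The event can fire several times (e.g. once per frame);
                    // only act once the whole document reports Complete.
                    if (browser.ReadyState != WebBrowserReadyState.Complete) return;
                    html = browser.Document?.Body?.OuterHtml;
                    Application.ExitThread();   // ends the message loop below
                };
                browser.Navigate(url);          // asynchronous: returns immediately
                Application.Run();              // pump messages until ExitThread
            }
        });
        worker.SetApartmentState(ApartmentState.STA); // required by the control
        worker.Start();
        worker.Join();
        return html;
    }
}
```

Something like `string html = HeadlessRender.Render("http://example.com/");` would then block the caller until the page has loaded. There is no timeout in this sketch, so a page that never finishes loading would hang the call.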
I tried that, but somehow the WebBrowser object does not seem to contain any data.
It has nulls and ""s everywhere.
WebBrowser web = new WebBrowser();
// breakpoint here
What am I doing wrong?
I just checked out these JS interpreters. They seem to be able to evaluate expressions like "x*2", but they cannot execute a whole HTML page?
Any suggestions? Which interpreter should I use?
Parse the HTML, rip out the JS, and feed it to your JS engine.
It's all that any browser would do for you anyway.
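A minimal sketch of the "rip out the JS" step, assuming inline script blocks only. The ScriptExtractor class is hypothetical, and a regex is just an approximation here; a real HTML parser would be more robust. Note that a bare JS engine has no document object, so code calling document.write would still need the host to supply one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ScriptExtractor
{
    // Matches <script ...>...</script> pairs, case-insensitively and across lines.
    private static readonly Regex ScriptBlock = new Regex(
        @"<script\b[^>]*>(?<code>.*?)</script\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the source text of every non-empty inline script block, in page order.
    public static List<string> ExtractInlineScripts(string html)
    {
        var scripts = new List<string>();
        foreach (Match m in ScriptBlock.Matches(html))
        {
            string code = m.Groups["code"].Value.Trim();
            if (code.Length > 0)   // empty body: typically an external <script src=...>
                scripts.Add(code);
        }
        return scripts;
    }
}
```

Each returned string could then be handed to whatever JS engine you pick. External scripts (those with a src attribute) would have to be downloaded separately.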
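The reason is that Thread.Sleep() simply blocks the current thread. If that is the thread the WebBrowser lives on, the control cannot process its events while the thread sleeps, so this approach is not reliable.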
You have to do something like this:
The first part, the most important one, will ensure that the document is completed and thus the page is loaded. BUT sometimes the event fires more than once (for example, once per frame), which is why I have an additional delay. Note that the Delay function is executed on another thread (delayThread), so it doesn't block the current thread.
private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    completedLoading = true;
}

public void LoadPage(string url, bool wait = true)
{
    completedLoading = false;
    delaying = true;
    delayThread = new Thread(new ParameterizedThreadStart(delayDel));
}

public static void Delay(object sec)
{
    Thread.Sleep((int)sec * 1000);
    delaying = false;
}
Just a note: I had problems finding links even with the method above. For example, I was trying to get links from search engine pages. For the most part I could, but some things were still missing, so it might not always be possible to do what you want.
That WebBrowser example of yours seems to be the way to go, but it uses 100% CPU and somehow does not load anyway?
The page should load; that is what the DocumentCompleted event is there for. I am assuming you also included
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
somewhere earlier in your code.
Yes, a tight while loop that only checks a flag will use 100% of a CPU core while it waits. You can change this by any method you like, but in my case there was no reason to optimize.
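One possible way to tame the busy loop, sketched under the assumption that the wait runs on the same thread that owns the WebBrowser: let pending events run with Application.DoEvents() and sleep briefly on each pass. The BrowserWait class, the 10 ms interval, and the timeout parameter are all made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;

static class BrowserWait
{
    // Hypothetical helper: polls isDone() until it returns true or the
    // timeout expires. Returns false if the timeout was hit first.
    public static bool WaitUntil(Func<bool> isDone, TimeSpan timeout)
    {
        var clock = Stopwatch.StartNew();
        while (!isDone())
        {
            if (clock.Elapsed > timeout) return false;
            Application.DoEvents();  // let DocumentCompleted and other events fire
            Thread.Sleep(10);        // give up the CPU instead of spinning
        }
        return true;
    }
}
```

With the fields from the code above, the call might look like `BrowserWait.WaitUntil(() => completedLoading && !delaying, TimeSpan.FromSeconds(30));`. DoEvents has re-entrancy pitfalls of its own, so a fully event-driven design may be cleaner if you have the time.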