Parsing HTML to get script variable value

Here is a very simple example of how this can be done, using the HtmlAgilityPack library to extract the script and the Jurassic library to evaluate it:

using System.Linq; // needed for Where() and First()

var html = @"<html>
             // Some HTML
             <script>
               var spect = [['temper', 'init', []],
               ['fw\/lib', 'init', [{staticRoot: '//'}]]];
             </script>
             // More HTML
             </html>";

// Load the markup, then grab the content of the first script element
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var script = doc.DocumentNode.Descendants()
                             .Where(n => n.Name == "script")
                             .First().InnerText;

// Wrap the script in a function that returns spect, then stringify the result into a JSON string
var engine = new Jurassic.ScriptEngine();
var result = engine.Evaluate("(function() { " + script + " return spect; })()");
var json = Jurassic.Library.JSONObject.Stringify(engine, result);




Note: I am not accounting for errors or anything else; this merely serves as an example of how to grab the script and evaluate it for the value of spect.
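If you do want some basic error handling, a minimal sketch could look like the following. The class and method names (SpectReader, TryReadVariable) are hypothetical and not part of either library. It assumes that a page without a script element, or a script that fails in Jurassic (for example because the variable is undefined), should simply yield null.

```csharp
using System.Linq;
using HtmlAgilityPack;
using Jurassic;
using Jurassic.Library;

static class SpectReader
{
    // Hypothetical helper: returns the JSON text of the named script variable,
    // or null if the page has no <script> element or the script fails to run.
    public static string TryReadVariable(string html, string variableName)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // FirstOrDefault() returns null instead of throwing when nothing matches.
        var scriptNode = doc.DocumentNode.Descendants("script").FirstOrDefault();
        if (scriptNode == null)
            return null;

        try
        {
            var engine = new ScriptEngine();
            var result = engine.Evaluate(
                "(function() { " + scriptNode.InnerText + " return " + variableName + "; })()");
            return JSONObject.Stringify(engine, result);
        }
        catch (JavaScriptException)
        {
            // Jurassic reports script failures, such as a ReferenceError
            // for an undefined variable, as JavaScriptException.
            return null;
        }
    }
}
```

You would call it as SpectReader.TryReadVariable(html, "spect") and check the result for null before using it.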

There are a few other libraries for executing/evaluating JavaScript as well.
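One such alternative is Jint. As a rough sketch only, and assuming the Jint 3.x API (where Engine.Evaluate returns a JsValue), the evaluation step might look like this. The JintSketch class and ReadVariableAsJson method are hypothetical names used for illustration, and the script text is assumed to have been extracted with HtmlAgilityPack already.

```csharp
using Jint;

static class JintSketch
{
    // Hypothetical helper: runs the script text, then asks the engine's own
    // JSON.stringify to serialize the named global variable.
    public static string ReadVariableAsJson(string script, string variableName)
    {
        var engine = new Engine();

        // Top-level `var` declarations become globals on the engine.
        engine.Execute(script);

        return engine.Evaluate("JSON.stringify(" + variableName + ")").AsString();
    }
}
```

Here the serialization is done inside the script engine itself, so no separate stringify helper is needed on the .NET side.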
