These functions require the Swing API.
parseHTML() reads an HTML document and invokes the callback functions defined in handlers as it encounters the HTML elements.
When handlers is a Map or a pnuts.lang.Package, it can hold zero or more key-value mappings. Each key should be one of the following function names, and its value should be the corresponding function.
handleText(char[] data, int position)
handleStartTag(HTML.Tag tag, Map attributeSet, int position)
handleEndTag(HTML.Tag tag, int position)
handleSimpleTag(HTML.Tag tag, Map attributeSet, int position)
handleError(String errorMessage, int position)
The javax.swing.text.html.HTML.Tag and javax.swing.text.html.HTML.Attribute classes can be accessed by the short names Tag and Attribute. They provide static fields that represent the tags and attributes of HTML documents.
For example, the following script extracts the links in an HTML document.
use("html.parser")
function htmllinks(url) htmllinks(url, reader(url))
function htmllinks(url, r) {
  w = list()
  callback = $(
    function handleStartTag(tag, aset, i) {
      if (tag == Tag::A){
        ref = aset.get(Attribute::HREF)
        if (ref != null) push(w, getURL(url, ref))
      }
    }
  )
  parseHTML(r, callback, true)
  w
}
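The handlers can also be supplied as a Map, as described above. The following is a minimal sketch of that form, not a definitive implementation: htmltext is a hypothetical helper name, map() is assumed to be available from the standard library in the same way as list() in the script above, and the third argument to parseHTML() is simply copied from that script. It registers only handleText and collects the text content of the document.

```
use("html.parser")

// htmltext is a hypothetical helper, not part of the module.
function htmltext(r) {
  buf = StringBuffer()

  // Called with each run of text as a char[] and its position.
  function handleText(data, pos) buf.append(data).append(" ")

  // Assumption: map() creates an empty java.util.Map, like list() above.
  handlers = map()
  handlers.put("handleText", handleText)

  // Third argument copied from the htmllinks example above.
  parseHTML(r, handlers, true)
  buf.toString()
}
```

Because the Map contains no entries for handleStartTag, handleEndTag, handleSimpleTag, or handleError, those events have no registered function, which fits the statement that handlers may hold zero or more mappings.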