Saturday, April 4, 2009

Abandoned trying to work with webkitgtk

The biggest problem when using WebKit is accessing the DOM structure of the rendered web page. For practical purposes you have only two choices of toolkit, Qt and GTK. The third option, the Objective-C bindings, is just not practical unless you are on Mac OS X.

Both Qt and GTK are nice GUI toolkits. Qt has the upper hand in that its documentation is much nicer. GTK has the advantage that it is plain C, so it is much easier to write bindings for other languages. When WebKit was first released with Qt, they said the next release would provide direct access to the underlying DOM structure through the API. Now they say they are only considering it. There is a way to access the DOM structure, but only through JavaScript.

With WebKitGTK there is a patch you can apply to access the DOM structure directly. The problem is that I want to use Python, and I cannot get pywebkitgtk to compile. This would be the ultimate solution, though.

My solution now is to build a Python -> JavaScript bridge on Qt. JavaScript object members and properties can be examined at run time. The catch is that QWebFrame::evaluateJavaScript only returns a QVariant, and a QVariant can only hold basic types such as ints and strings. This is not a real obstacle, though: have the bridge examine the result. If it is a basic type, just pass it back. If it is a JavaScript object, pass the object back as JSON with all of its properties and members included. Basically this is possible because of the "in" operator in JavaScript, which a for...in loop uses to enumerate an object's properties.
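Here is a rough sketch of how such a bridge might look. It is not my actual proof of concept, and it makes a few assumptions: the names JS_WRAPPER, build_script, decode_result and evaluate are made up for illustration, run_js stands for whatever function hands a script to the page and returns the result as a Python string (in PyQt that would presumably wrap QWebFrame.evaluateJavaScript and convert the QVariant), and the page's JavaScript engine is assumed to provide JSON.stringify. One simplification compared to the description above: basic values are also wrapped in a small JSON envelope, so a plain string result cannot be confused with a serialized object.

```python
import json

# JavaScript run inside the page. __EXPR__ is replaced with the expression
# to inspect. The result always comes back as a JSON string, which is a
# basic type that a QVariant can carry.
JS_WRAPPER = """
(function () {
    var value = (__EXPR__);
    var kind = typeof value;
    if (value === null || (kind !== "object" && kind !== "function")) {
        // Basic type: pass the value itself back (undefined becomes null).
        return JSON.stringify({kind: "basic",
                               value: value === undefined ? null : value});
    }
    // Object: enumerate its members with for...in and describe each one.
    var members = {};
    for (var name in value) {
        var member;
        try { member = value[name]; } catch (e) { continue; }
        var t = typeof member;
        if (member === null || (t !== "object" && t !== "function")) {
            members[name] = {type: t,
                             value: member === undefined ? null : member};
        } else {
            // Nested objects and functions are reported by type only.
            members[name] = {type: t};
        }
    }
    return JSON.stringify({kind: "object", members: members});
})()
"""


def build_script(expression):
    """Return the JavaScript source that inspects the given expression."""
    return JS_WRAPPER.replace("__EXPR__", expression)


def decode_result(payload):
    """Turn the JSON string returned by the page into Python data."""
    data = json.loads(payload)
    if data["kind"] == "basic":
        return data["value"]
    return data["members"]


def evaluate(run_js, expression):
    """Evaluate a JavaScript expression through run_js (an assumed callable
    that takes script text and returns the result as a Python string)."""
    return decode_result(run_js(build_script(expression)))
```

With something like this, evaluate(run_js, "document.title") would give back a plain string, while evaluate(run_js, "document.body") would give back a dictionary of member names. Nested objects are only reported by type here; the bridge could look them up on demand with another call.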

I have proof-of-concept code in PyQt and will post it once I have fine-tuned it. It seems to work very well, since WebKit is much faster than Internet Explorer and Firefox.
