Thursday, March 26, 2009

Parsing large XML documents - StAX

We came across a scenario where the size of the XML content to be parsed or transformed varied between 1 MB and 10 MB. We needed to compute data from the available XML, which requires traversing the whole document. Until then we had used XSLT, which is geared toward transformation rather than toward computing derived data from the XML.

I tried different options before settling on StAX.
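To give a feel for what computing derived data in a single pass looks like with StAX, here is a minimal sketch using the standard javax.xml.stream cursor API. The XML layout, the "amount" element name and the class and method names are all made up for illustration; our actual documents and computations were different. The idea is simply to pull events one at a time and keep a running total, so the whole document never has to sit in memory as a tree.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTotalExample {

    // Hypothetical element name; the real schema is not shown in this post.
    private static final String AMOUNT_ELEMENT = "amount";

    // Streams through the document once and sums the text of every <amount> element.
    static double sumAmounts(Reader xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        double total = 0.0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && AMOUNT_ELEMENT.equals(reader.getLocalName())) {
                    // getElementText() reads the text content and leaves the
                    // cursor on the matching END_ELEMENT.
                    total += Double.parseDouble(reader.getElementText().trim());
                }
            }
        } finally {
            reader.close();
        }
        return total;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Tiny made-up sample; a real input would be a 1 MB to 10 MB file or stream.
        String xml = "<orders>"
                + "<order><amount>10.5</amount></order>"
                + "<order><amount>4.5</amount></order>"
                + "</orders>";
        System.out.println(sumAmounts(new StringReader(xml)));
    }
}
```

For a real file, the StringReader would be replaced by a reader or input stream over the file. Because the parser only hands over the current event, memory use should stay roughly flat as the document grows.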

Santiago Pericas-Geertsen compares DOM and JAXB in this blog post:
http://weblogs.java.net/blog/spericas/archive/2005/12/dom_vs_jaxb_per.html

The following white paper discusses DOM, SAX and StAX, and compares different StAX implementations:
http://java.sun.com/performance/reference/whitepapers/StAX-1_0.pdf