and now it's StAX's turn


It seems I need some bug spray or something...

I might be wrong (corrections and comments most welcome!) but I think I've found a bug in StAX 1.0.

The bug is as follows: when parsing an element of the form

<id>&lt;code01&gt;</id>

which should return

<code01>

when calling getElementText() (or when collecting the text from CHARACTERS events), StAX actually returns:

<code01
To prove it, I wrote this small test program that uses both StAX and kXML 2 (which implements the Common XML Pull Parsing API) to parse the same XML document (included in the program as a String, and read through a StringReader).
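For reference, the check described above can be sketched roughly as follows. This is a minimal reconstruction, not the original test program: it assumes a Java runtime that bundles javax.xml.stream (so it exercises whatever StAX implementation that runtime ships, not the StAX 1.0 reference implementation), it leaves out the kXML 2 side entirely, and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical name; not the class from the original test program.
class StaxEntityCheck {
    // The document from the post, held in a String and read through a StringReader.
    static final String XML = "<id>&lt;code01&gt;</id>";

    // Reads the element's text in one call with getElementText().
    static String viaGetElementText(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            reader.nextTag();               // move from START_DOCUMENT to <id>
            return reader.getElementText(); // text up to the matching </id>
        } finally {
            reader.close();
        }
    }

    // Collects the same text by concatenating every CHARACTERS event.
    static String viaCharactersEvents(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    // A parser may split the text into several events, so append.
                    text.append(reader.getText());
                }
            }
            return text.toString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println("getElementText(): " + viaGetElementText(XML));
        System.out.println("CHARACTERS:       " + viaCharactersEvents(XML));
    }
}
```

A conforming parser should give "<code01>" from both methods; the behaviour described in this post would show up as "<code01" instead.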

This bug is a deal-breaker for my use of StAX, and what's much worse is that I have no way of looking at the code to fix it. (And yes, I've tried parsing past that element to see if there's more text, but it seems that StAX is simply making the "&gt;" at the end disappear.) Yes, StAX was supposed to be hosted at codehaus by now, but when I go to the site there's nothing there in the way of sources, the JSRs don't include sources (they reference private BEA packages), and there is no indication of when or where this might change.

So I guess I'll have to switch everything to one of the other parsers, such as kXML. Oh, well.

Categories: soft.dev
Posted by diego on July 12 2004 at 6:48 PM
Comments (please see the comments & trackback policy).

Diego,

As I mentioned in my post re open sourcing of StAX, the source code is in CVS. Here are the details:

cvs -d:pserver:anon@cvs.stax.codehaus.org:/scm/stax login
cvs -d:pserver:anon@cvs.stax.codehaus.org:/scm/stax checkout stax

The bug you mentioned is pretty bad though. Someone should write a unit test suite for StAX, possibly leveraging existing SAX or DOM test suites.

Posted by: Don Park at July 13, 2004 12:14 AM

Diego, it would be good to report this problem on the StAX reference implementation's Bugzilla (at http://www.extreme.indiana.edu/bugzilla/), if it's not there yet. It certainly is a nasty bug, and hopefully it can be fixed quickly.

That said, one thing worth noting is that StAX is an API, like DOM and SAX, so the reference implementation is not the only implementation in existence.
You can find other implementations at:

http://stax-utils.dev.java.net/

Granted, at this point there's just one more implementation (plus, I'm the author of the other one, so I don't want to comment too much on it... but I'm pretty sure you won't hit the specific problem you mentioned), but you may want to give the alternative implementation a try. And I do know there are at least 2 other implementations being worked on, which will hopefully be released soon; if so, they will be added to the stax-utils page.
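Since application code only talks to the javax.xml.stream interfaces, trying another implementation should not require code changes: the factory lookup picks up whichever provider is configured, for example through the javax.xml.stream.XMLInputFactory system property or a provider jar on the classpath. As a rough sketch (assuming a Java runtime with javax.xml.stream available; the class name below is made up for illustration), this prints which implementation is actually in use:

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper; not part of any StAX distribution.
class StaxImplementationCheck {
    // Returns the class name of the factory that the standard lookup resolved to.
    static String inputFactoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("StAX input factory: " + inputFactoryClassName());
    }
}
```

Running it before and after adding an alternative implementation to the classpath is a quick way to confirm which parser a test is really exercising.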

Finally, now that the ref. impl. sources are available from codehaus I expect obvious bugs to be weeded out fairly fast.

Posted by: Cowtowncoder at July 15, 2004 8:16 PM

Copyright © Diego Doval 2002-2007.