
now with Atom support: java google feeds bridge


After last week's experiment with a Google-RSS bridge in Java, I took the next step and decided to check how hard it would be to generate a valid Atom feed as well. The result is an update to the Google bridge page and new code.

The idea, as before, was to write code that could generate valid feeds with as few dependencies as possible (it might even be considered a quick and dirty solution). For reference I re-checked the Atom Wiki as well as Mark's prototype Atom 0.2 feed. In the end, it worked. Adding support for Atom called for some generalization and refactoring (which I did), but aside from that, adding Atom support took a few minutes. Here are some notes:

  • ISO 8601 Dates are terrible. I'd much rather Atom had used RFC 822 dates, which are not only easier to generate but way more readable. It's true, however, that once you have the date generator working it doesn't matter. But boy are they a pain. I put forward my opinion on that when date formats were being discussed, but I didn't get my way.
  • I was confused at the beginning regarding the entry content, particularly because of the more stringent requirements that the feed puts on content type. For example, the content of an entry must be tagged with something like <content type="text/html" mode="escaped" xml:lang="en">. Now, I must be honest here: about a month ago there was a huge discussion on the Wiki about whether content should be escaped or not, and how, but I didn't think it was too crucial since on the parser side (which I added to clevercactus back in July) it's pretty clear that you get the content type and you deal with it. But I was sort of missing the point, which is generation. When generating... what do you do? Do you go for a particular type? Is it all the same? Would all readers support it? The pain of generating multiple types would seem to outweigh any advantages... Hard to answer, these questions are, Master Yoda would say. So I went for a basic text/html type enclosed in a CDATA section. (Btw, enclosing in CDATA doesn't seem to be required. The Atom feed validator was happy either way.)
  • Another thing that was weird was that the author element was required, but that it could go either in the entry or the feed. I understand the logic behind it, but it's slightly confusing (for whatever reason...)
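
To make the first two notes a bit more concrete, here is a minimal Java sketch. It is not the bridge's actual code: the class and method names (AtomNotesSketch, iso8601, rfc822, contentElement) are made up for illustration, the sample timestamp is simply assumed to be UTC, and it uses the modern java.time API rather than whatever the bridge relies on. It shows the two date styles side by side and one way to wrap HTML in the content element quoted above, using a CDATA section.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not taken from the bridge's source.
final class AtomNotesSketch {

    // ISO 8601 timestamp in UTC, the style Atom uses for dates.
    static String iso8601(Instant when) {
        return DateTimeFormatter.ISO_INSTANT.format(when);
    }

    // RFC 822-style date as used by RSS (with the four-digit year from RFC 1123).
    static String rfc822(Instant when) {
        return DateTimeFormatter.RFC_1123_DATE_TIME.format(when.atOffset(ZoneOffset.UTC));
    }

    // Wraps HTML in a content element with the attributes quoted in the post.
    // A CDATA section cannot contain "]]>", so any occurrence is split
    // across two adjacent CDATA sections.
    static String contentElement(String html) {
        String safe = html.replace("]]>", "]]]]><![CDATA[>");
        return "<content type=\"text/html\" mode=\"escaped\" xml:lang=\"en\">"
                + "<![CDATA[" + safe + "]]>"
                + "</content>";
    }

    public static void main(String[] args) {
        // Assumption: the timestamp is treated as UTC purely for the example.
        Instant posted = Instant.parse("2003-09-05T22:26:00Z");
        System.out.println(iso8601(posted));
        System.out.println(rfc822(posted));
        System.out.println(contentElement("<p>Hello, <b>Atom</b>!</p>"));
    }
}
```

Once a helper like this exists, the date format stops mattering much in practice, which matches the note above. The CDATA wrapping is only one option; escaping the markup as entities should be an equally plausible route, given that the validator accepted the content either way.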
Overall, not bad. But Atom, while similar to RSS, is more complex. While I have been able to implement a feed that validates relatively easily, it concerns me a bit that I might be missing something (what with all those content types and all). Maybe all that's needed is a simple step-by-step tutorial that explains the dos and don'ts of feed generation. Maybe all that's needed is a simple disclaimer that says "Don't Panic!" in good H2G2 style.

Is it bad that Atom would need something like a tutorial? Probably. Is it too high a price to pay? Probably not. After all, stricter guidelines for the content are good for reader software. I thought "maybe if there's a way to create a simple feed without all the content-type stuff..." but then everyone would do that, and ignore the rest, wouldn't they?

Of course, maybe I misunderstood the whole issue... comments and clarifications on this area would be most welcome.

I guess there's no silver-bullet solution to this. The price of stricter definitions is loss of (some) simplicity. The comparison between a language with weak typing (say, LISP) and one with strong typing (say, Java) comes to mind when comparing RSS and Atom in this particular sense. I think that I would go with RSS when I can, since it will be more forgiving... on the other hand, I do like strong typing. But should content be "strongly typed"? I'll have to think more about this.

Interesting stuff nevertheless.

PS: there's a hidden feature for the search. It's a hack, yes. It might not work forever. Still worth checking the code for it though :-)

Categories: soft.dev
Posted by diego on September 5 2003 at 10:26 PM

Copyright © Diego Doval 2002-2011.