d2r diego's weblog
on rss autodiscovery

After reading Jeremy's post on creating a sort of "auto discovery for RSS 2.0" (and agreeing that it was a great idea) I thought, what the hell. Why not try it on for size? Here it goes, then...

The idea is basically that a site could publish a centralized directory of all the feeds it serves. This would allow auto-discovery by aggregators, which could then suggest new feeds to users when they come online on sites they already watch. Jeremy was proposing to do it in OPML, and Russ floated the idea of using RSS directly, based on this, which I liked.

To get an actual idea of what this would look like, I decided to whip up possible versions of it in both OPML and RSS. So I spent some time looking at both the OPML and RSS specs and thinking about how they would be used in this case (from my point of view, of course, which is that of a user and of someone who would have to add this to an aggregator :) -- it's possible that I missed something that would be obvious to a content producer).

The results of my little experiment are here: in OPML and in RSS, for a fictitious news site "News4Humans". RSS, being a richer format than OPML (and possibly more generic as far as content is concerned), has no problem accommodating all the elements. There are a few elements that I added to the OPML version to mirror the data exposed; even though they are not included in the OPML spec, that's ok since the spec does not preclude adding new elements. That said (and as Dave mentioned specifically regarding the issue of recursive inclusion), everyone would have to agree on them or they would be useless--possibly an addendum to the spec would be useful as well.

One of the main differences is in structure. OPML supports recursivity, RSS does not. So where OPML can define the category and the feeds for that category as sub-tags, RSS needs to use the category tag, essentially making the structure flat.
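To make the nested-versus-flat difference concrete, here is a minimal sketch in Python. Everything in it is invented for illustration and is not taken from the example files mentioned above: the two tiny directory documents, the news4humans.example URLs, and the helper names feeds_from_opml and feeds_from_rss. It assumes that the OPML version nests feed outlines (with the usual text and xmlUrl attributes) under one outline per category, and that the RSS version lists one item per feed with a single category element.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical OPML directory: one outline per category, feeds nested inside.
OPML_DIRECTORY = """<opml version="1.0">
  <head><title>News4Humans feeds</title></head>
  <body>
    <outline text="World">
      <outline text="Europe" type="rss" xmlUrl="http://news4humans.example/world/europe.xml"/>
      <outline text="Asia" type="rss" xmlUrl="http://news4humans.example/world/asia.xml"/>
    </outline>
    <outline text="Technology">
      <outline text="Software" type="rss" xmlUrl="http://news4humans.example/tech/software.xml"/>
    </outline>
  </body>
</opml>"""

# Hypothetical RSS directory: a flat list of items, each tagged with a category.
RSS_DIRECTORY = """<rss version="2.0">
  <channel>
    <title>News4Humans feeds</title>
    <link>http://news4humans.example/</link>
    <description>Directory of feeds published by News4Humans</description>
    <item>
      <title>Europe</title>
      <link>http://news4humans.example/world/europe.xml</link>
      <category>World</category>
    </item>
    <item>
      <title>Asia</title>
      <link>http://news4humans.example/world/asia.xml</link>
      <category>World</category>
    </item>
    <item>
      <title>Software</title>
      <link>http://news4humans.example/tech/software.xml</link>
      <category>Technology</category>
    </item>
  </channel>
</rss>"""


def feeds_from_opml(xml_text):
    """Group feed URLs by category by walking the nested outlines."""
    root = ET.fromstring(xml_text)
    feeds = defaultdict(list)
    for category in root.findall("./body/outline"):
        for feed in category.findall("./outline"):
            feeds[category.get("text")].append(feed.get("xmlUrl"))
    return dict(feeds)


def feeds_from_rss(xml_text):
    """Group feed URLs by category by reading each item's category tag."""
    root = ET.fromstring(xml_text)
    feeds = defaultdict(list)
    for item in root.findall("./channel/item"):
        feeds[item.findtext("category")].append(item.findtext("link"))
    return dict(feeds)


if __name__ == "__main__":
    print(feeds_from_opml(OPML_DIRECTORY))
    print(feeds_from_rss(RSS_DIRECTORY))
```

With only one level of categories, both helpers recover the same category-to-feeds mapping, so in this sketch the flat form loses nothing; the nesting would only start to matter with deeper levels.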
The flat structure seems fine to me, unless "deeper recursivity" (or is it recursiveness?) is needed--but I can't think of a news site with more than one level down from the main category at the moment, so I let it stand.

Second, the OPML version contains two additional tags. The first is link, which specifies the feed on feeds'... well, link :) (it matches the same tag in RSS and could potentially be useful for redirects). The second is dateCreated, a per-entry element. The idea with this tag is that the aggregator can record the last time the feed on feeds was checked, and diff against this date to very easily find out which feeds were added since the last check. (Of course, keeping a full list and doing a diff on that is possible, but then again if there are, say, 50 feeds, and the user subscribes to only one, the aggregator would have to keep all 50 to do a proper diff against number 51, which seems kind of wasteful.) RSS, incidentally, supports this functionality through its own basic item date tag. And, again, the date could be useful for redirects if necessary: changing the date on an item you already knew about implies that it has moved.

As I noted in the comments on Jeremy's entry, I sort of instinctively thought that RSS was a better idea. The OPML version however looks enticingly simple and still functional. Hm. Surprising. Anyway, which one do you like best?

Update: As I mentioned in the comments (replying to Zoe's idea of establishing hierarchy through multiple files) the issue of hierarchy is not terribly clear with RSS. I see two ways of doing it:
Update #2: Another advantage of using RSS as-is, which I keep forgetting to mention, is that the language for the feed can be specified. You could automagically define that you only want to see feeds in a certain language and the aggregator could automatically disregard anything else. The same functionality for OPML would require adding a tag for that purpose.

Update #3: I've just noticed that, in the RSS spec, the category element has a domain attribute. If the domain is used to point to sub-category feeds, then hierarchy can be achieved cleanly. Therefore a simple solution that doesn't require any changes to the RSS spec (and as far as I can see doesn't bend its meaning either) could be as follows:
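The sketch below is only one hypothetical reading of that idea, written in Python. The directory document, the news4humans.example URLs, and the helper name subdirectories are all invented for illustration. It assumes that an item whose category carries a domain attribute is pointing at a further feed on feeds for that category, while a category without a domain has nothing below it.

```python
import xml.etree.ElementTree as ET

# Hypothetical top-level directory. The domain attribute on <category> is
# assumed (not specified anywhere) to hold the URL of a sub-category directory.
TOP_LEVEL_DIRECTORY = """<rss version="2.0">
  <channel>
    <title>News4Humans feeds</title>
    <link>http://news4humans.example/</link>
    <description>Top-level directory of feeds</description>
    <language>en-us</language>
    <item>
      <title>World headlines</title>
      <link>http://news4humans.example/world.xml</link>
      <category domain="http://news4humans.example/feeds/world.xml">World</category>
    </item>
    <item>
      <title>Weather</title>
      <link>http://news4humans.example/weather.xml</link>
      <category>Weather</category>
    </item>
  </channel>
</rss>"""


def subdirectories(xml_text):
    """Map each category name to the sub-directory feed named by its domain.

    Categories without a domain attribute are treated as having no
    further level below them and are left out of the result.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for category in root.iter("category"):
        domain = category.get("domain")
        if domain:
            found[category.text] = domain
    return found


if __name__ == "__main__":
    print(subdirectories(TOP_LEVEL_DIRECTORY))
```

An aggregator following this convention could fetch each URL returned by subdirectories and repeat the same step on it, which gives as many levels as a site cares to publish while every individual file stays a plain, valid RSS 2.0 feed.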
Posted by diego on September 13 2003 at 12:55 AM

Comments (please see the comments & trackback policy).
Let's stick to RSS! Less is more! As you mention on Jeremy's blog: "There's a certain symmetry to it". And symmetry is esthetically pleasant, among other things... Regarding OPML's hierarchical structure vs. RSS's flat one, this could be "solved" by simply having your RSS feed point to another RSS feed, which in turn points to another one, and so on... which is a nice way to create a lightweight (and dynamic) directory, as it doesn't require specifying the entire hierarchy up front... Good work :)
Posted by: Zoe at September 13, 2003 9:12 AM

Zoe, thanks. Interesting idea, that of using different files to express the hierarchy. I was wondering what could be done with RSS if more than one level was necessary. Now, what if, say, two levels are necessary for one category but only one for another? How would we differentiate between links to other sub-feeds-on-feeds and actual feeds? Maybe using the GUID or one of the optional subelements of the item tag?
Posted by: Diego at September 13, 2003 10:30 AM

Regarding "differentiation"... a simplistic scheme would be to agree on a "standard" extension for the RSS link... perhaps "rss.xml"... or alternatively... add an attribute to RSS's link tag in the same fashion as HTML's link tags are annotated for "auto discovery"... something like:
<link rel="index" type="application/rss+xml">/url/to/directory/feed/rss.xml</link>
The relation attribute could tell what the purpose of the link is, and the type its format. Or something like that.
Posted by: Zoe at September 13, 2003 11:10 AM

Regarding using the source tag as a back pointer... very good point :) This could indeed be (ab)used to provide a mechanism for traversing the hierarchy back and forth.
Posted by: Zoe at September 13, 2003 11:19 AM

I considered that and thought that solution would be ok. The problem is that the link element in RSS doesn't allow attributes, so this use would effectively create non-valid feeds. I'd much rather use a namespace (judiciously!) to add an extra tag or two that would do the trick, but I'd rather avoid that too. Maintaining simplicity should be a goal for this methinks. :)
Posted by: Diego at September 13, 2003 11:21 AM

Regarding RSS 2.0's rumored frozen state... recent events have proven beyond reasonable doubt that if enough pressure is applied any supposedly inflexible structure will bend... Regarding namespaces, they overly complicate things... and have an unfortunate tendency to get ignored altogether :/ Regarding the use of the category's domain attribute to express a "virtual hierarchy"... this sounds fine... perhaps there could be a way to leverage that... but it doesn't seem to address the semantics of what the domain is pointing to... would it be the content of the directory... or its structure? And is it pointing to anything at all...? An additional complication (?) is that there could be several category tags... what would that mean? Multiple hierarchies? And how does one know that the domain has any meaning beside being a random collection of strings? Perhaps there is a simple answer to all those questions which is presently escaping me...
Posted by: Zoe at September 13, 2003 11:50 AM

Here's my attempt at an OPML feed-list for multi-blog sites using MovableType: http://www.movabletype.org/support/index.php?act=ST&f=14&t=27995 Strangely, I researched this before I saw your recent thoughts on this issue. Group consciousness at work, eh? By the way, consider "index.opml" in the root of the site for auto-discovery. It seems logical to me.
Posted by: -lc- at September 23, 2003 9:25 AM

Copyright © Diego Doval 2002-2007.
