Now blogging at diego's weblog. See you over there!

the power of simplicity


and this from the it-seems-obvious-in-retrospect dept...

hit by a virtual hammer

This rant has been brewing in my head for a couple of weeks now, but every time I started writing it fizzled out, for whatever reason. I can see that this might seem obvious to a lot of people. It wasn't to me though! :)

The thought process behind this started when I spent a couple of hours writing my google-rss bridge. While I had written an RSS reader component, I had never written an application that created RSS. I was pretty astonished at the power of a standard that can be used so easily both ways. It was like being hit by a virtual hammer. I thought about it further when I added Atom support to it, and I mentioned some of these issues in the entries in a roundabout sort of way...

What jelled today in my head was the distinction of three elements of the process that weblogs and RSS make possible.

The three elements are: Content creation, publishing, and access.

The magic is that each step can happen in a decentralized fashion. All tied together through the thin ice of a few XML and HTML tags.

HTML is very similar (and similarly disruptive), but less oriented towards decentralization because it has evolved to be oriented towards display rather than automated consumption (something that, at least theoretically, CSS was supposed to fix).

Understanding that split between creation and publishing is what brought everything together for me. Where I write the content (creation) has nothing to do with where it resides (publishing). The last part, access, is obviously separate. The other two weren't, at least not to me, before today.

And why is the creation/publishing split important? Because it's what drives the full decentralization of the process, and the one that makes simplicity a lot more relevant than before. Without full decentralization, simplicity is a lot less powerful. Decentralization+simplicity means that everyone's invited to the game. After all, if creation and publishing are together, if you need expensive or complex centralized infrastructure (and let's face it, infrastructure to publish web content is no child's play) to set up a content system, no matter how simple the content format or protocols themselves are, it will still have limited impact.

the price of complexity

Complexity plus its associated cost and monopolies (or oligopolies) go hand in hand, since they constitute one of the most important barriers of entry. But weblogs and feeds, as the web itself, have split the lever of power: now the glue that ties the components together is as much a point of control as actually creating the clients or the servers themselves. In the web in particular, as better development tools for both clients and servers have evolved, the format itself became the most important element that brought complexity into the equation. And the reason, I think, is the separation that happened split between creation, publishing, and access.

Consider, first, HTML. In the days of HTML 2.0, it was relatively trivial to write a web browser. The biggest problem in writing a browser was not, in fact, in parsing or displaying HTML: it was in using the TCP/IP stacks that at the time were difficult to use. Over time, the shift of complexity into HTML has brought us the situation that we have today, where writing a standards-compliant browser requires huge investment and knowledge, and the earlier barrier of entry (the network stack) is now easy to use and readily accessible. Sure, there are many web browsers in the market today. But power is not distributed evenly. HTML 4.0 raised the bar and in fact IE 4 won over many people simply because it worked better than Communicator 4 (I was one of those people).

What I realized today is: there's a huge side effect that the format has on content access: monopolizing the market for access becomes easier the more complex the content format is.

My point: This should give pause to anyone in the "go-Atom-crowd", including me.

Keeping the barrier of entry low applies in more than one case, of course, but here it's crucial because weblogs, creating RSS feeds and accessing them, etc., is just so damned easy today. This allows developers to concentrate on making the tool good rather than dealing with the format.

Dave has said things along these lines repeatedly, but honestly I hadn't fully understood what he meant until now.

Let's see if we're all on the same page. The message is: It is no coincidence that basically every single RSS reader out there is high-quality software.

Big and small companies, single developers, groups, whatever.

A simple statement, with profound implications.

Back to Atom.

I am not implying that the (slight) additional complexity found on Atom will make it fail. I am saying that its increased flexibility brings on complexity that also increases the barriers of entry for using it with the consequent loss of vitality on the area. Without proper care, these barriers can slowly chip away at the ease with which tools can be created, and in the process split the fields into incomplete or low-quality software used for tinkering and mainstream software available for general users, with a lot of entries in the first category and a few on the second.

and why is this important?

This stuff matters. There are many examples today of how weblogs are changing things, from influencing politics to breaking down proprietary software interfaces and affecting how the spread of news itself happens. In my opinion a big part of that is because weblogs really, finally, put the power of publishing on individuals' hands, something that "the plain web" had promised but had failed to do (after all, here's-a-picture-of-my-dog-type-homepages were around for quite a while without anything interesting happening). But if barriers are raised, the Microsofts and the AOLs suddenly have a fighting chance.

Jump to the future: Microsoft announces support for Atom, built into a new IIS content-management system. Great! Says everyone. Then you look at the feed itself and you discover that every single entry is published using content type "application/ms-word-xml". This wouldn't be new. Already Microsoft claims to great effect that Office supports XML but everyone knows that trying to parse a Word XML document is literally impossible. XML is too generic to be taken over though, in a sense it was designed for that, as a template for content formats. HTML wasn't, but it was subverted anyway. RSS is, quite purposefully I think, holding out. With Atom coming up, there's a chance it might happen.

I hate to point out problems without also proposing at least one possible solution. So: An extremely simple way of getting around this problem would be to specify that text/html content is required on a feed. If it's not, then your feeds don't validate. That simple.

Call it the anti-monopoly requirement. :-) The same focus on simplicity should be, IMHO, the drive of every other feature.

Another example: Today, with email, Microsoft has used MIME to great effect to screw up clients that are not Outlook or Outlook Express. It couldn't happen with simple plain text. But MIME allowed it. The result: people can send each other Microsoft-generated HTML that only the Outlook+IE combination can display without hacks.

We can't allow that to happen.

Some might argue (quite persuasively) that it doesn't really matter whether content-creation is slightly more complex. To that I'd say: it's just my opinion, but I think it does. It could also be argued that this is all simply a matter of evolution, it was bound to happen, etcetera. But it wasn't "bound to happen". We are making it happen. It's in our hands.

If the rise of the web was a lost opportunity in this sense, well, amazingly, we have been given a second chance.

Let's not blow it.

Categories: soft.dev
Posted by diego on September 10 2003 at 9:03 PM

Copyright © Diego Doval 2002-2011.
Powered by
Movable Type 4.37