Now blogging at diego's weblog. See you over there!


I had to do it. I just went across the road to Nancy Hands and had a Pint of Guinness. Sooo good....

Categories: personal
Posted by diego on September 30, 2003 at 7:08 PM

...and done!

Yay! thesis submitted!!

So I left everything ready, slept a couple of hours and went out in the morning to submit the thesis. Before that (at around 5:30?), something interesting happened: my alarm clock broke. I had bought the clock two years ago, almost to the day, when I had just got to Dublin, and today the night light, which is supposed to turn on when you change a setting and then off after a couple of seconds, just stayed on. Which meant that the battery wouldn't have lasted long. Funny how the clock that I got when I started the thesis stops working the exact day that I'm submitting it. Anyway, I ended up using the alarm on the cell phone. Not that I needed it much... (the alarm I mean).

So I got to campus, with rain on and off throughout (the weather seems to have returned to normal a bit, at least for now). Bound the two copies of the thesis and got to the Graduate Studies office around 10:15 (they are open only between 10:00 and 12:00 and 2:00 and 4:00--go figure--and considering that I could only submit it today, getting the timing right is about as difficult as planning a space launch. I didn't get ESA involved though).

So, submitted it. Got a yellow slip that I had to fill out by hand with my student info, thesis info, etc. The woman receiving it sent me to a Fees Office outside of campus that I had never been to before. Got there. Guard on the entrance. Can I help you? Yes, I'm looking for the fees office, I say. Okay, take a seat. I complied. Waited. Nothing seemed to happen. The guy was sitting there and not calling anyone, or doing anything in particular, and I started to wonder how I was supposed to know when it was my turn.

As it turns out it was basically Token Ring (haven't they heard Ethernet won?). They had one "pass" card that you got and went up to the office. Then once the previous person was done they gave the card back to the guard, who gave it to you, and up you went. And so on. So I went up. Got the piece of paper stamped. (This whole runaround with the paper-stamping business really put in perspective the fact that my work is on self-organization of large scale networks!). Went back to the GSO. By then it was pouring, but I almost didn't notice. Gave the woman the piece of paper. She folded it, put it in an envelope.

"Well, that's it! Congratulations!" she said.

"That's it?" I said, thinking, no rite of passage? Nothing? Then I said: "This is a bit ... anticlimactic..."

"Yes. Heh." She said. "Well, congratulations again. Go have a pint."

Ah, right. Ireland. That's the rite of passage. :-))

Categories: personal
Posted by diego on September 30, 2003 at 4:53 PM


... to submit my thesis. In the end a bit more work than I thought due to ... unforeseen circumstances. All printed. Will bind it shortly before submission. Crossing my fingers so that they don't come up with any strange objection to the format, the font, or something of the sort. Now off for a couple of hours of sleep--and then into the city. Almost there! More later.

Categories: personal
Posted by diego on September 30, 2003 at 4:42 AM

sunday entertainment: the Riemann Hypothesis

Working on some stuff on n-dimensional topologies I remembered the Riemann Hypothesis. I'm not sure when I first read about it... but it keeps popping up in my head (simply because it's a challenge--not that I have any delusions that I can actually solve it! :)). So as a way of relaxing a bit from work, here's some background on that. Not that it's something useful, but it should be at least entertaining.

A couple of years ago, a $1,000,000 prize was created for whoever came up with the proof. What I didn't know was that they actually had seven "Millennium Prize Problems" of which the Riemann Hypothesis was one. Here is the list of all seven problems. One million per proof. Not bad!

Going back to the Riemann Hypothesis. Proving it would be interesting for number theory and the distribution of prime numbers, but it would have no direct practical applications whatsoever. Otherwise, I can see the headlines (maybe for The Onion?):

UN Security Council Declares Understanding the Distribution of Prime Numbers is Priority One - Troops To Be Deployed in the Real Region of C with R > 1 - President declares "we will hunt down these prime number folks and we will characterize them. If you ask me, there's something evil about a number being divisible only by itself and one."

Joking aside, the new mathematical techniques that usually have to be invented to solve these open problems do have applicability. But I digress.

The Hypothesis is generally considered one of the most important unproven hypotheses in mathematics, and I've read somewhere that some people think it is even more important than Fermat's famous Last Theorem. To put things in context, Fermat's Last Theorem stated that given:

x^n + y^n = z^n

there are no integer solutions for n > 2 and x,y,z != 0. The statement of Fermat's Last Theorem is relatively simple and self-contained. The Riemann Hypothesis is not. Consider, from the prize page:

The Riemann hypothesis is that all nontrivial zeros of the Riemann zeta function have a real part of 1/2

Sure, it reads more easily than Fermat's theorem, but it gets away with that by putting all the complexity in the definition of the Riemann zeta function.

What I find fascinating is how you actually get to it. After some reading, it would appear to go like this: Riemann was trying to derive a formula that would calculate the number of primes lower than a given boundary number n. In doing this, he started to look at an infinite series based on a complex number, s. That series is defined by the Riemann zeta function:

\zeta(s) = \sum_{n=1}^{\infty} 1/n^s = 1/1^s + 1/2^s + 1/3^s + ...

which converges when the real part of s is greater than 1 (the function is then extended to the rest of the complex plane by analytic continuation). Okay, given that function, the Riemann Hypothesis is saying that all the nontrivial zeros exist only at values of s that have a Real component of 1/2.
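
To get a feel for the series behind the zeta function, here is a minimal Java sketch that adds up the first N terms of 1/n^s for a real s greater than 1. The class and method names are made up for illustration, and it only handles real values of s, so it says nothing about the zeros (those live in the analytically continued function, not in this sum). For s = 2 the partial sums creep up towards pi^2/6, which is a well-known value of the function.

```java
public class ZetaSketch {

    /** Sum of 1/n^s for n = 1..terms. Only meaningful for real s > 1. */
    public static double partialZeta(double s, int terms) {
        double sum = 0.0;
        // Add the smallest terms first to limit floating-point rounding error.
        for (int n = terms; n >= 1; n--) {
            sum += 1.0 / Math.pow(n, s);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("partial sum for s = 2: " + partialZeta(2.0, 1_000_000));
        System.out.println("pi^2 / 6:              " + (Math.PI * Math.PI / 6.0));
    }
}
```

The partial sum always stays a little below the limit, and the gap shrinks roughly like 1/N, which is why nobody computes the function this way in practice.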

And this is one of the things that I find great about mathematics: you start pulling out threads and in the end you are left with just one or two hypotheses. If you prove those, everything else falls into place, domino-style. And what you're actually proving often appears to have no relationship at all with what you were originally interested in!

For more there's this cool page on MathWorld with a lot of interesting information on the Hypothesis and some of the attempts to solve it so far.

As I said, maybe not useful, but at least entertaining. :)

Categories: science
Posted by diego on September 28, 2003 at 3:56 PM

reading the matrix tea-leaves

Another break and I've been watching the recently-released trailers and tv-spots for The Matrix Revolutions (funny: "released". Software is data. Data is software. Ahem. Moving on...)

So I thought I'd write down a few wild, ridiculously early speculative comments, going back to all that was discussed following my review of Reloaded. Since this is based on about 3 minutes of video, I don't expect my speculations to be too accurate, but what the hell.

First, a couple of complaints. It seems that overbearing language is still an overriding necessity. The trailer has this set of sequences where everyone says things like "I believe in him. He believes in us. I don't believe in The One. I believe in him. We have to believe." And on and on and on, in quick succession. Jeez, I get the point, enough already with the belief. Is this a movie, or a new religion?

Then there is a short scene where Neo is having a nice chat with what seems to be a personification of the Matrix itself, telling "it" that Agent Smith is out of control. Looks bad. Baad bad bad bad. What is up with the Matrix anthropomorphising itself? The whole idea of Matrix-like concepts is that these systems are meta-human-conscious, and as such not only do they exist on a different plane so to speak, but they also cannot relate to the human mind. Plus, this whole notion of a piece of software running around "out of control" within a system as refined as the Matrix is patently ridiculous (and besides, the Matrix needs a puny human to find out that there's a problem?). I wonder how, or even if, they are going to explain that. Maybe they'll say that the Matrix subcontracted systems development to Microsoft, and there was a buffer overflow in port 25 or something of the sort... but come on, even msblaster was stopped, wasn't it? Should we be looking forward to Norton Antivirus--The Matrix Edition?

Another thing that is apparent is that the Oracle is gone. Or rather, there's a different actress in her place. Younger. Similar-looking. I knew that the actress that played the Oracle in the first two movies had, sadly, died mid-shooting of the sequels, but I wasn't sure if she had finished her role in the story. It seems she didn't. I'd bet the new Oracle is supposed to be the old Oracle except younger, through some twist of plot... I mean, fate. As far as other characters are concerned, it appears that the Merovingian is back, and it has an ace hidden somewhere. Which would fit with the speculation that it was a Neo in a previous iteration of the Matrix...

Third, and more importantly, it seems to me that the matrix-within-matrix theory is a more than likely plot-twist. Characters in the trailers for Revolutions never say "Matrix" and "Real World" (as in the previous two movies, particularly in the first). Now they speak of "Two Worlds" (big difference). Plus, there's a new effect similar to the "Matrix view of the world" (that coolish classic-CRT-green) and this one is in tones of yellow. Example: this is a frame in which Neo is sitting on a chair surrounded by Agent Smith copies, as seen with the now classic inside-the-Matrix perception:


And these are two frames: the first one is apparently what Neo sees in the "real" world when he looks closely at the man who was "infected" by Agent Smith at the end of Reloaded, and the second is Neo walking through... well, a corridor of some sort :):


So the "real world Matrix" is yellowish? Hmmm...

Another item of note is yet more "borrowing" of concepts from Anime, with the concept of robotic body armors that looks a lot like that of Macross (aka Robotech):

(Incidentally, James Cameron used a similar concept for Aliens, but I hadn't connected it to Anime until now).

Regardless. It will definitely be entertaining, we just have to hope the story doesn't screw up the experience (I am optimistic).

Now to wait until November 5...

Posted by diego on September 27, 2003 at 12:14 AM

atom is atom is atom

On a break I wandered off to the Atom Wiki and looked at the naming process. And, surprise, a message from Morbus to the atom-syntax list seems to have finally opened up the floodgates. From the Wiki: Let's just use Atom:

We could go through the voting process for the hundredth time. We could say that, come September the 30th, this format is called Nota or or Zing, or whatever wins this vote... but with under 20 votes per name? We could go through all the rigmorale of properly vetting the candidate that actually wins, just to find that really it's no good anyway, and have to start voting for the [too much]+1th time.

But let's not, okay? Instead, Just Use Atom.

Yes. Most definitely. I agree. Not just because of the seemingly never-ending nature of the naming process, but also because the other names currently being proposed are dreadful.

Hopefully this will also re-ignite the "closure process" that is necessary for several parts of the Atom spec, which have been iterating with no end in sight.

Categories: technology
Posted by diego on September 26, 2003 at 11:13 AM


The novel Catch-22 by Joseph Heller is a masterpiece of literature, funny, and simply a great piece of writing.

It's also about a concept that, sooner or later, we all encounter in our daily lives.

The concept of catch-22 is, in a sense, another expression of the chicken and egg problem. The book makes references to this constantly. Example: During war, you've got, sometimes, pilots flying missions that no sane person would agree to. Therefore, the pilots must be, in some sense, "insane". On the other hand, no one would give control of an airplane to an insane person...

Or: there's an officer in the book that only agrees to meet with you in his office, but only if he's not in his office. If he's in the office, you can't meet him. If he isn't there, then you can.

And so on, and so forth. :)

Today, I had my own tiny catch-22. It's like this: at Trinity, you cannot do a PhD thesis by research in less than two years. It's the minimum time: a year in the Masters program, then another year in the PhD program.

The deadline for submitting the thesis is September 30, right before the new academic year begins in October. If you are doing it in two years, you cannot submit the thesis before September 30, because the minimum required period wouldn't have passed. On the other hand, you cannot submit it after September 30 either, because then you'd be into the new academic year which means you'd have to pay fees for the whole year.

There's usually the possibility of getting a short extension to the deadline (generally a month).

Now here's the thing: you can only get an extension after you've completed the minimum period for a PhD. That is, two years. Getting an extension in, say, your 3rd year is a simple process. Getting it when you're doing it in two years (when one would argue that a small extension would make the most difference) is simply not possible. All the deadlines come together.


In fact, submitting a thesis after only two years is only possible during the day of September 30 of the second year. Not before, not after. Well, not after, that is, if you don't want to pay for another year's worth of fees.

My thesis supervisor had suggested I request the extension so that we could iterate a couple of times more over the text. No can do. Everything will have to happen this week. By next Tuesday, it will all be over.

Should be fun! :-)

Categories: personal
Posted by diego on September 23, 2003 at 4:12 PM

Karlin on AdSense

Karlin writes about Google AdSense in yesterday's Guardian:

I don't understand this smoked-salmon-socialist approach to personal websites. Many, if not most, of these people are the same ones who, until Adsense, loathed banner ads, not to mention the dreaded pop-up.

Now that they have caught the scent of cash, bloggers are more than happy to slap the damn things prominently on their webpages in the hope that their readers will do what they never do - click on through. As one blogger confessed on his blog, he hopes others don't use the banner-ad blocking software he uses, the faster to bring him his Google dosh.

I must admit that my immediate gut-reaction was similar, but it didn't last. I ended up taking the more err... capitalist approach of saying that, well, if people accept the ads on weblogs they read, then so be it. I also disagree with the blanket statement "ads are bad for weblogs". Her point about people who have turned around 180 degrees (from deriding news sites and such that used ads to suddenly getting ad-religion) is a good one, though. Ah, the ironies of technology.

Ads are appropriate in some contexts, and they do help people pay for their hosting and maybe make a little money on the side too. After all, writing a weblog takes effort. And if people are making some money with their ads, then it follows that readers must be interested in them, no?

Categories: technology
Posted by diego on September 19, 2003 at 6:17 PM

what's the solution again?

I was just watching this video of Steve Ballmer talking (supposedly) about how Microsoft is going to solve their security problems.

The summary goes something like this:

Blahblah blah security problems ... blahblah blah hackers get the information from our patches [yeah right!] ... blahblah blah to solve this we need innovation blahblah blah ... and customer education... blahblah blah ... innovation blahblah blah we believe viruses should be stopped before they get to the computer [read: we are not going to fix those memory overflows or our engineering processes, we are just going to give you a chain and a couple of locks to put around your house... and if you don't like living in a cage, well, too bad.] blahblah blah ... innovation blahblah blah the whole industry needs to innovate blahblah the solution is innovation... blahblah blah innovation blah blah .... innovation [and it goes on like this]...

So, that's great! Apparently the solution to insecure runtime environments is innovation!

What's the URL for that? Or do I get it on a CD or what?

Seriously, though, I thought it was the performance of a spin-addicted politician, rather than a CEO of a technology company.

I would challenge anyone to explain in two short sentences what, exactly, Ballmer said.

You can't, because he didn't say anything. Pure rhetoric. Lots of obvious points ("we need to improve the entire patch management process [...] we have to continue to improve". Yeah, no kidding, Steve). Again, pure rhetoric. No content.

I suddenly remembered this excellent, excellent article by Cringely from a couple of weeks ago: The Innovator's Ball. Note this quote:

[...] there is another issue here, one that is hardly ever mentioned and that's the coining of the term "innovation." This word, which was hardly used at all until two or three years ago, feels to me like a propaganda campaign and a successful one at that, dominating discussion in the computer industry. I think Microsoft did this intentionally, for they are the ones who seem to continually use the word. But what does it mean? And how is it different from what we might have said before? I think the word they are replacing is "invention." Bill Shockley invented the transistor, Gordon Moore and Bob Noyce invented the integrated circuit, Ted Hof invented the microprocessor. Of course others claimed to have done those same three things, but the goal was always invention. Only now we innovate, which is deliberately vague but seems to stop somewhere short of invention. Innovators have wiggle room. They can steal ideas, for example, and pawn them off as their own. That's the intersection of innovation and sharp business.

Yes, Microsoft is an innovator and I don't think that is good.


I can't help but compare it to McNealy's keynote the other day. While McNealy is a bit dry as a speaker, he actually talks about solutions. He doesn't descend into useless generalities (I can imagine Ballmer talking about famine problems in Africa: "eating some food every day is good to stay alive... we need more innovation... people shouldn't have problems to get food.... innovation... we need to improve things ... innovation..."). McNealy doesn't say "let's get more customer education." He doesn't imply that "the way to fix viruses is to hide your computer in the closet and disconnect it from the Internet". He doesn't utter the word innovation every two milliseconds.

It's sad that Microsoft, instead of using their tremendous resources (both human and financial) to actually fix problems and invent new stuff and create new ways of thinking, are more interested in spinning the situation and proposing that somehow the best way to create security is not to fix the obvious and widespread problems in the architecture of Windows, but rather to "not fix the backdoor, but secure the front door" (whatever that means--if the "front door" is also running Windows we'd have a problem again, wouldn't we?).

What's next? Force everyone to stand heavy weaponry next to their Ethernet cards, you know, just in case? (With "customer education" of course: "if your PC is attacked, shoot the cable immediately!" and so on...)


Categories: technology
Posted by diego on September 19, 2003 at 12:54 PM

java xml pull and push: a comparison

Yesterday Russ pointed to an article at O'Reilly's about StAX (XML Streaming API), a Java API that allows parsing of XML through a pull-mechanism, currently in the final laps of the JSR process as JSR 173. I was immediately intrigued. While many find it common to use additional APIs to solve some problems, I tend to prefer removing layers of complexity and abstraction that aren't absolutely necessary, using only JDK-standard (or standard extension) classes as much as possible. Since this API appears to be not only frozen, but also on track to be a standard extension (and hopefully it will be included in the next JDK release!), I decided to give it a try.

While the benefits in simplicity of parsing are quite obvious, I was a little wary of the performance of this package, since it's a reference implementation and it is almost certainly not fully optimized (one of the main uses that I'd give it is to parse RSS, and I was already looking at how to improve performance of RSS checking--but that's another story). So I created two parsers, one using StAX and one using plain SAX, and compared them in terms of usability and performance.

I ran the tests against an RSS 2 feed of 500 KB, essentially 20 copies of this morning's RSS feed for my weblog.

The results are pretty surprising. StAX wins by a mile. Check out the results:

SAX results

Start element count = 2109
Characters event count = 4303
Time elapsed = 140 (msec)

StAX (Streaming) results

Start element count = 2109
Characters event count = 4236
Time elapsed = 63 (msec)

A couple of notes: the "characters event" count refers to the number of times an event is of type characters in StAX, or the number of times the characters method is called by the parser in SAX. For an entry that contains text (e.g., CDATA), multiple character calls may be received, depending on new lines, etc. Apparently StAX splits the text a bit differently than plain SAX, since it reports fewer character events, but that's ok (to parse those you only need to keep state and append the new characters value to the current element you're parsing). Both element counts are the same, which only proves that both are parsing the same structure properly. The results are, of course, consistent over several runs, with the expected slight differences. (And, btw, I tried testing memory usage but the variability in initial free memory, etc, was too big to be able to measure which one is better. At a minimum, they appear to be equivalent in that sense).
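
The "keep state and append" idea can be sketched in a few lines of Java. This is an illustration rather than the code used in the tests: the class and method names are made up, the choice of the title element is arbitrary, and it uses the javax.xml.stream package as bundled with later JDKs (Java 6 onwards) instead of the JSR 173 reference implementation jars.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TextAccumulatorSketch {

    /** Returns the full text of every title element, however the parser splits it. */
    public static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside a title element
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if ("title".equals(reader.getLocalName())) {
                            current = new StringBuilder();
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        // Text may arrive in several pieces: just keep appending.
                        if (current != null) {
                            current.append(reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (current != null && "title".equals(reader.getLocalName())) {
                            result.add(current.toString());
                            current = null;
                        }
                        break;
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse the feed", e);
        }
        return result;
    }
}
```

Because the text is only read back when the end tag arrives, it doesn't matter whether a given parser delivers it in one characters event or in five.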

Here is the code used in the tests: for the TestSAXParser class and the TestXMLStream class. Note: to run the code you'll need to download the current specification of the JSR. In that ZIP file there's the spec itself (a PDF) and a JAR file, jsr173.jar, which contains a number of classes, API docs and such. Of those jars, the only ones necessary to run the example (i.e., that must be added to the classpath) are jsr173_07_api.jar and jsr173_07_ri.jar, that is, the API and the reference implementation respectively.

As is plain to see, the StAX code is a lot simpler than the SAX code, because it doesn't require a wrapper to make it look more event-based. Add to that the fact that StAX code runs at more than twice the speed of SAX, and, as I said, StAX wins hands down. No contest.
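
For a rough idea of what the two styles look like side by side, here is a small self-contained sketch that counts start elements and characters events both ways. It is not the TestSAXParser or TestXMLStream code from the test: the class name, method names and the tiny sample document are invented for illustration, and it relies on the javax.xml.stream package that ships with later JDKs (Java 6 onwards) rather than on the JSR 173 reference implementation, so timings from it say nothing about the numbers above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class PullVsPushSketch {

    /** Push style: the parser drives and calls back. Returns {start elements, characters events}. */
    public static int[] countWithSax(byte[] xml) {
        final int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                counts[0]++;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                counts[1]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return counts;
    }

    /** Pull style: our loop drives and asks for the next event. Same return shape as above. */
    public static int[] countWithStax(byte[] xml) {
        int[] counts = new int[2];
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    counts[0]++;
                } else if (event == XMLStreamConstants.CHARACTERS) {
                    counts[1]++;
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parse failed", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample feed, standing in for the 500 KB RSS file.
        byte[] xml = ("<rss version=\"2.0\"><channel><title>demo</title>"
                + "<item><title>one</title><description>some text</description></item>"
                + "</channel></rss>").getBytes(StandardCharsets.UTF_8);

        long t0 = System.nanoTime();
        int[] sax = countWithSax(xml);
        long t1 = System.nanoTime();
        int[] stax = countWithStax(xml);
        long t2 = System.nanoTime();

        System.out.println("SAX:  elements=" + sax[0] + " characters=" + sax[1]
                + " time(us)=" + (t1 - t0) / 1000);
        System.out.println("StAX: elements=" + stax[0] + " characters=" + stax[1]
                + " time(us)=" + (t2 - t1) / 1000);
    }
}
```

The difference in shape is the point: with SAX the logic lives in callbacks on a handler object, while with StAX it is an ordinary loop that you control. A single run on a tiny document is dominated by parser start-up, so any serious timing would need a large input and several warm-up runs.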

Cool eh?

Posted by diego on September 19, 2003 at 10:40 AM

update on rss autodiscovery

In response to my previous entry that further explored some of the issues we face on RSS autodiscovery, Tima has posted examples of how my original mockups would look in WSIL. Looks interesting--a bit more complex, but it's a recognized standard.

It seems to me that the main element that would have to be decided at this point is whether to go with OPML, RSS, or WSIL (the decision of "stakeholders" in this process---such as Jeremy, Dave, and other members of the community, particularly those that would either create the content or write the aggregators---being the most important IMO).

This format could be a big help in simplifying the subscription of news feeds for users. Hopefully we will be able to get it done quickly!

Posted by diego on September 18, 2003 at 6:48 PM

google code jam 2003

Coming in October: Google Code Jam 2003. The proverbial carrot-on-a-stick is, in this case, $10,000 for a first prize and the possibility of an interview at Google. Will certainly be something to check out--it's always fun to solve interesting problems under time-pressure!

Posted by diego on September 18, 2003 at 10:17 AM

sun's new strategy

I spent some time yesterday watching the presentations and keynotes for Sun's NC Q3, which happened in sync with SunNetwork in San Francisco...

The highlights were McNealy's and Schwartz's keynotes, in which they finally presented the big picture of the strategy that Sun has been embarking on for the last year or so, bringing together all the stuff we've been hearing about for the last few months: Orion, N1, the MadHatter desktop, etc.

The strategy is, actually, simplicity itself (here's coverage from Wired and eWeek). Essentially Sun is switching to offer a single set of integrated (sorry, "integratable" as they say, which means they're integrated but you can integrate your own stuff if you want) products, for servers, desktops, and developer environments, with a flat annual subscription model for each, each comprising a different "Java System". $100 for the server stack per employee. $60 for the desktop per employee. And an additional $5 per employee for the development environment (or a one-time $1895--I'm not terribly clear about these two options but anyway). That's it.

With that, you get everything, the software, training, migration, support, setup, and the ability to run this to any scale you want, no limits. That is, if you have 200 employees you pay $20,000 for all your server software needs per year and that lets you serve an infinite (well, theoretically at least) number of customers. Sounds good eh?
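
The per-employee arithmetic is simple enough to put in a few lines. This is only a back-of-the-envelope sketch: the class and method names are made up, the prices are the ones quoted above, and it assumes all three figures are annual per-employee fees (the one-time developer option is left out since it isn't clear how it works).

```java
public class JavaSystemPricingSketch {

    // Prices per employee per year, as quoted in the keynotes (in US dollars).
    static final int SERVER_PER_EMPLOYEE = 100;
    static final int DESKTOP_PER_EMPLOYEE = 60;
    static final int DEVELOPER_PER_EMPLOYEE = 5; // assumed to be annual as well

    /** Annual cost for a company of the given size and choice of "Java Systems". */
    public static int annualCost(int employees, boolean server, boolean desktop, boolean developer) {
        int perEmployee = 0;
        if (server) perEmployee += SERVER_PER_EMPLOYEE;
        if (desktop) perEmployee += DESKTOP_PER_EMPLOYEE;
        if (developer) perEmployee += DEVELOPER_PER_EMPLOYEE;
        return employees * perEmployee;
    }

    public static void main(String[] args) {
        System.out.println("200 employees, server only: $" + annualCost(200, true, false, false));
        System.out.println("200 employees, everything:  $" + annualCost(200, true, true, true));
    }
}
```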

There's more. The license agreement for this thing is three pages. Yes. Three. Isn't that great?

It's not clear to me, however, what the lower limit for the licenses is, if there is one. Somehow I can't believe that you'd get all that for $100 per year per employee if you only have two employees, but who knows.

The other note of interest is that, of course, you have to run this somewhere so you'll need to get a ton of hardware. Guess from whom. Heh.

I wondered at times if this was all a clever ploy to sell more metal, but the strategy is reasonable, and it would set a good precedent for properly-priced, simple-licensed software subscription services. (Oh, and btw, why use employee count for licenses? Because it's one of the few measures of a company that doesn't require extensive audits to be determined).

They also announced that Sun would indemnify anyone using the Java Desktop System (Mad Hatter). This was an obvious reference to SCO's FUD, but it was unclear if it went beyond that (say, Star Office crashes and you lose a crucial document...)--I'd say it doesn't, and it only applies to litigation related to UNIX licensing. Again, who knows.

And for those who are skeptical because Microsoft might be too entrenched on the desktop, there's some anecdotal evidence that the recent (okay, long-time) security problems of Windows at all levels might be beginning to make a dent. There's a story in today's WSJ (subscription required) that talks about a number of companies that are considering switching out of a Microsoft environment on the basis of security problems alone. I'm sure that cost will also start playing into the picture, at least in part, with Sun's new offering.

This info was spread across both keynotes. Schwartz's presentation, aside from the details, went on to some demos (one of which bombed on stage). In the Mad Hatter demo, Schwartz showed the features of a standard desktop, using Mozilla, Star Office 7, etc. Nothing groundbreaking here, except that it was Star Office 7 (which is about to be released I guess...)

Much, much better than the Mad Hatter demo was a demo of a futuristic new user interface project titled Looking Glass. Now, it was totally unclear what relation, if any, this had with everything else that came before (Schwartz said that the project was open source, but after a bit of googling all I could find that was remotely concrete was this news item on it, about another demo given by Schwartz a month ago). This was a nice demonstration of 3D on windowed user interfaces, with perspectives, transparency (layers--you can put one window behind another and see through the one in front as though it were translucent), etc. To make space on the screen Schwartz rotated several windows back and forth along the Z axis (like opening and closing a door), performing 180 and 360 degree vertical rotations of the windows, etc. All of this while some of the windows were playing MPEG video--and the icons in the taskbar showed a minimized version of the window. Yes, it sounds awfully similar to OS X (except for the 3D). As I said, it was impressive but not clear at all what Looking Glass had to do with anything else.

Overall though, the strategy is reasonable and they don't even need to sell me on the idea of simplifying software and licensing. The idea of different Java Systems for different usage needs is also very cool.

The question is, will they pull it off? We'll know in a year or two.

Posted by diego on September 18, 2003 at 10:09 AM

paper rocks

Today I completed the final draft of my dissertation (deadline for submission is September 30).

At peace. For a few hours at least. :)

The past few weeks have meant a lot more juggling than usual between the thesis and clevercactus, as I double-checked the theory and performed additional simulations and such. I had started the previous revision cycle at the beginning of August, after the public release of cc beta2, and back then I had to reorganize the document structure that I had completed by the beginning of July (refactoring). What's interesting about this (yeah, I was wondering too) is that I ended up doing the core of the document-refactoring work on paper. For the math part, I would have expected it, since when working with equations nothing beats pen and paper. But for the document outline itself...

Yes, I know there is outliner software. Yes, I tried it. Yes, it didn't work.

And no, it wasn't a problem with the software, but rather with the hardware.

That is, the main reason why it didn't work is that, to be able to connect all (in my mind) the pieces into a... err... "coherent narrative", I really needed to look at everything at once. And given that the final document is 3 parts, 12 chapters and two appendixes, well, doing it on-screen would mean a lot of scrolling up and down or a really tiny font. I couldn't deal with either. So what I did was write down different parts of the contents on paper and organize and reorganize them on the wall (invisible tape to the rescue), redoing some of them, until the structure was finished...

For example, this is how part of my living room wall looked a couple of weeks ago (since then, most of those papers have migrated over to a wall in my room, where I can look at them sitting at the computer).

I don't know if it was an efficient solution or not, but it worked pretty well for me. And I have to say, I learned all over again (as it happens every few months when I pick up pen and paper) that when you can't just press Backspace to delete what you just wrote you think a lot more carefully about what you're going to write. It forces you, in a sense, to pace yourself since you can't write as quickly as you think, whereas on the machine you can type before you've even thought about what you're saying! Okay, okay, maybe I'm exaggerating, but you get the point. :-) All of this was, in this case, a big bonus, and a big help in the process.

Conclusion: barring blackboards (even the fancy ones that transmit data to a PC, or print the image on the board), wallscreen (preferably touch-sensitive) displays, or holographic projectors that can express my thoughts, paper is still the best choice. Portable. Low power (very!) and user-friendly.

So today when I finished (sometime ago) I went out for a Pint of Guinness at Nancy Hands (a great pub across the street--okay, one of the four great pubs across the street :)) and watched the sun setting filtered through the hues of the pub's windows. Then back home and cooked a good, wholesome, healthy dinner: cheeseburgers and fries. Ah. Heaven.

Now off to disconnect my brain for a while. A sort of Pavlovian self-reward. Watch a movie. Or something. Read a book. Ah, yes. I'm reading this incredibly great book on the history of the KGB called The Sword and the Shield: The Secret History of the KGB but I obviously haven't been paying much attention to it since after a few fits and starts I really started it about a week ago and I'm still only 150 pages into it. The book is based on actual KGB files stolen by an ex-KGB agent (okay, he wasn't an ex-agent when he stole them!) and smuggled out of Russia after the Wall fell. It covers basically everything from its post-revolutionary beginnings (Lenin's Cheka) to today's SVR (the resulting foreign-intelligence arm of the KGB after it was split in the early 90's along CIA-FBI or MI5-MI6 lines)--although information on the SVR is understandably thin, the book constantly traces past events back to the present, along with the "official" SVR story surrounding them. Who needs spy novels when you can read about the real thing?

Anyway. Back on Planet Earth tomorrow.

Categories: personal
Posted by diego on September 16, 2003 at 8:35 PM

dublin skies


When I was coming to Dublin I was "warned" to no end by friends and acquaintances about the "terrible weather" that was expecting me in Ireland in general and in Dublin in particular. "Awful weather" they'd say. "It rains all the time" they'd say. "It's always cloudy" they'd say. When I got here, I was warned again by both Irish and non-Irish alike.

Well, I, for one, like rain a lot. I like the cold too. I like sunny days. And hot summer days. In fact, I like equally both rainy and sunny days precisely because their opposite exists. Variability lets us appreciate what's good (and bad) about the alternatives. Without changes, there would be nothing to like about, say, sunny days. And actually, one of the things that sometimes (only sometimes) irritated me a bit about California (and the Valley in particular) was that the weather was... well, always so damn beautiful. Clear skies and beautiful days were so common that you just stopped appreciating them (hard rain, however, when it did arrive, was just great).

Anyway, what I found after living here is that I really, really like the constant change of Dublin climate. Dublin is right smack in the middle of the Gulf Stream (which crosses the Atlantic) and weather "moves" incredibly fast: what is mostly low grassland can't slow down the currents, so it's very windy (okay, in the Winter, with the cold, I do suffer it more, but so what). I have quite literally seen the weather go from sunny (no clouds) to cloudy, to dark clouds, to rain, to hail, on to snow (!), then back to cloudy, then no clouds again---all in the space of twenty minutes! Isn't that great? (This was in February last year, btw).

And, constantly shifting cloud-cover makes for absolutely astonishing sunrises and sunsets. Like the image above (click on it to see a full-size version).

I took that picture last Saturday at dusk, from my balcony, and I tell you: If the price for having sunsets like that every other day is sometimes-unpredictable-and-maybe-irritating weather, let me say: I pay it gladly.

Categories: personal
Posted by diego on September 15, 2003 at 2:59 PM

microsoft-motorola announcement coming up


The Wall Street Journal is reporting today that Microsoft and Motorola are set to announce their partnership tomorrow:

[T]he partnership also highlights the challenges Microsoft faces in that quest as its new partner loses market share amid stiff competition in the mobile-phone business.

The companies expect to announce that Motorola will begin building phones that run on Microsoft's Mobile Windows software, according to executives at the companies. The phone, a "clamshell" style unit, will be the first phone sold in the U.S. that runs the Microsoft software.

AT&T Wireless Services Inc. plans to offer the unit to its subscribers beginning later this year, according to an executive at the mobile operator.


With Motorola as a partner, Microsoft now has support from the world's second-largest handset maker, behind market leader Nokia Corp. of Finland. To date, Microsoft has relied heavily on HTC Corp. of Taiwan to manufacture its phones, which are mostly sold in Europe.

Before Motorola, the only other major handset maker to announce support for MS's Smartphone has been Samsung. Samsung, however, has pushed back the release of their MS-powered phone repeatedly (according to them, due to technical problems). I wonder if Motorola is going to run into the same problems--probably not, since the MS mobile phone software has gotten relatively decent. As the article notes, apparently the announcement will include AT&T Wireless as a carrier, but Orange had also been rumored to be ready to deploy Motorola-MS phones.

We'll have to wait until tomorrow to see if Motorola's phone will feature any of the good stuff we've gotten used to with Symbian (such as Bluetooth). Based on the picture above, it seems that at least it will have a built-in camera with what I expect will be support for Windows Media and such. Various news sites are reporting, however, that it doesn't have Bluetooth, Java, or a camera. Strange.

Categories: technology
Posted by diego on September 15, 2003 at 2:13 PM

ping yourself!

Unbelievable. How a solution can be staring you right in the face and you still don't see it.

Call it a blinding flash of the obvious.

One of the problems I tend to think about when writing follow-ups to things I've already discussed is that, while you can point back to a previous entry, you don't really want to go back and start modifying the original entry to point forward to the follow-up, even though that link is what would let a reader follow the evolution of an idea, for example. Now, I think there are plugins for MT that let you specify "related" entries, but I've never gotten around to investigating them (I tend to prefer the minimal number of plugins and extensions necessary, both on software I use and software I write).

Now, as I was posting my previous entry on RSS discovery, I just realized that I could simply ping the entry through trackback and so provide a pointer to the follow up. Simple. Effective. To the point.
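For reference, a ping like that is nothing more than a form-encoded HTTP POST to the entry's TrackBack URL. Here is a rough sketch in Python that builds such a request without sending it. The URLs, titles and the function name are made up for illustration, and the parameter names (url, title, excerpt, blog_name) are the ones I understand the TrackBack spec to define.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_trackback_ping(trackback_url, entry_url, title, excerpt, blog_name):
    """Build (but do not send) a TrackBack ping as a form-encoded POST."""
    body = urlencode({
        "url": entry_url,        # the follow-up entry doing the pinging
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }).encode("utf-8")
    return Request(
        trackback_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )


# Hypothetical URLs: the original entry's TrackBack address and the follow-up.
ping = build_trackback_ping(
    "http://example.org/mt/mt-tb.cgi/123",
    "http://example.org/archives/follow-up.html",
    "rss autodiscovery, take 2",
    "Follow-up to the original entry.",
    "example weblog",
)
fields = parse_qs(ping.data.decode("utf-8"))
```

Pinging yourself just means that both the TrackBack URL and the entry URL happen to live on the same weblog.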

I can come up with all sorts of reasons why I've never seen this before (like "Oh, well, you've always thought of trackback as a useful tool to connect different weblogs"), but the reality is that this is so incredibly obvious that... I can't understand at all why I didn't see it before. :-)


Isn't trackback great?

Categories: technology
Posted by diego on September 14, 2003 at 6:50 PM

rss autodiscovery, take 2

Being a weekend, not a whole lot has happened, but there have been several good comments on the topic of rss autodiscovery, which have made me think further about the choices we face in making it a reality...

As a recap, Jeremy proposed coming up with an OPML-based standard for specifying lists of feeds on sites, and Russ brought up the idea of using RSS, which some, including me, liked. I followed that up with a couple of mockups with both OPML and RSS so that we could compare the pros and cons of each, along with some comments on the apparent tradeoffs for each.

A note, before going on. The second I read Jeremy's post I thought: "but don't we already have RSD?" I immediately checked and got my answer, but I thought that, for completeness, I'd include it here. The answer is: no. RSD is intended for autodiscovery of APIs that will be used to access/modify the content programmatically. Similar solutions that have been proposed for Atom also deal with APIs rather than feeds. The bottom line is that re-using current autodiscovery techniques/specs from APIs would imply re-spec'ing them, at least partially, which brings us back to square one.

Using either RSS or OPML seems to me like a good solution that will get things done. It might not be the most perfect solution, but it will work. There seems to be some resistance to using OPML. The main basis for this resistance is that OPML can't be validated (or easily used) because its spec is relatively loose. However, I think that if OPML were used for this, it would have to be specified properly, which means that what could (and could not) be done would be known and therefore validation would be possible. The fact that the current iteration of OPML cannot be validated is not enough grounds to reject it out of hand, in my opinion. A few small improvements in the spec of OPML or this new OPML-derived format would do the trick.

In summary, my (possibly narrow-minded) view is that all we're doing is agreeing on using a number of tags and the structure of a document. Any solution will look similar to any other, and I think it's eminently useful to base things on a format that can already be parsed by most, if not all, aggregators, as is the case with both RSS and OPML.

There are other possibilities though. Tima put forward the idea of using WSIL (also echoed in this lockergnome entry). As I don't know the intricacies of the format I can't come up with an example that will re-write in WSIL either of my two mockups and be sure that it won't be broken, and for comparison purposes we need, I think, to be looking at exactly the same content. Conclusion: if Tima or someone else has a bit of time to re-write my mock-up structure using WSIL, it would be most welcome!

Regardless of format, the main issue that seems to me would drive how the format is used is how hierarchy in the feed is handled. Hierarchy will be necessary to provide the structure used by many news sites (e.g. "Technology/Mobile Technology/Phones"). So, with a heavy emphasis on how hierarchy would be represented, here's a summary of the issues in choosing one format or another as far as I can see:

  • Regarding hierarchy, OPML is clearly a winner here since it is designed to support hierarchies. OPML would, however, require properly spec'ing a couple of elements to represent the data that we'd like to represent. OPML, for example, has been variously used to specify links with url, htmlUrl, as well as others like href, as this example from Philip Pearson demonstrates (in fact, Philip was actually using OPML to provide a feed directory there). That would be the extent of the work required for an OPML implementation.
  • RSS, on the other hand, is not "naturally" geared towards dealing with hierarchical content: the structure of the information represented is flat. This can be solved in one of two ways:
    1. It is possible to create an implied hierarchy within the file by using category names. All the feeds for a site would be in a single file, and hierarchy would be specified by using a forward slash "/" between category levels. Pros: it's simple, it doesn't stretch the use of RSS beyond its single-file origin, and it simplifies checking for new feeds on a given "watched" site since only a single GET is required. Cons: it would be a semantic convention, rather than a syntactical one, which makes it harder to verify properly.
    2. The alternative is to specify sub-feed sets through the use of the domain attribute in category elements. That is, whenever a category in an entry includes a domain, then the entry is defined as pointing to another feed-of-feeds subset, rather than to a particular feed itself. A backpointer to the original "parent" feed set can be defined by using the source element on RSS entries, which gives us the good side-effect of making the hierarchy fully traversable from any starting point. Pros: the connections between feed sets and their children would be syntactically defined, thus making it easier to validate and verify, all without having to bend in any way the definition of what an RSS feed is. Cons: it makes the structure a bit more difficult to maintain (multiple files) and to access (multiple gets) which also impacts the ease of the process of validation a bit.
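To make the first RSS option a bit more concrete, here is a minimal sketch in Python of how an aggregator might rebuild the implied hierarchy from slash-separated category names. The function name, the shape of the resulting tree and the example URLs are all assumptions made for illustration; nothing here comes from a spec.

```python
def build_category_tree(feeds):
    """Turn (category_path, feed_url) pairs into a nested dict.

    Each node has the shape {"feeds": [...], "children": {name: node}}.
    """
    root = {"feeds": [], "children": {}}
    for path, url in feeds:
        node = root
        # "Technology/Mobile Technology/Phones" -> three nested levels
        for level in filter(None, path.split("/")):
            node = node["children"].setdefault(
                level, {"feeds": [], "children": {}}
            )
        node["feeds"].append(url)
    return root


# Hypothetical feed list, as it might be read from a single flat RSS file.
tree = build_category_tree([
    ("Technology/Mobile Technology/Phones", "http://example.com/phones.xml"),
    ("Technology", "http://example.com/technology.xml"),
    ("News/Sports", "http://example.com/sports.xml"),
])
```

Note that the hierarchy only exists because both sides agree on what "/" means, which is exactly the "semantic convention" drawback mentioned above.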

In his entry, Tima mentions that WSIL describes hierarchy through the use of multiple files, much like the second RSS alternative mentioned above.

A final element that would also have to be agreed upon is how this master file is usually found. Jeremy, in his original posting, proposed using a standard location similar to robots.txt, and with a standard name, like feeds.opml or rss.opml which sounds quite reasonable.
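Finding the master file from any page on a site would then be trivial. A small sketch in Python, assuming the file lives at the site root under one of the names floated above (the function name and the example URL are made up):

```python
from urllib.parse import urljoin


def feed_directory_url(page_url, filename="feeds.opml"):
    """Return the site-root location of the feed directory for any page URL."""
    return urljoin(page_url, "/" + filename)


location = feed_directory_url("http://example.com/archives/2003/09/entry.html")
```

This is the same trick clients use for robots.txt: no link tags or page parsing needed, just one well-known path per site.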

Okay, so what would be the steps necessary to be able to spec this? A possible outline would be:

  • Define which format would be used, based on pros and cons.
  • For the format used, define the structure and the meaning of the tags used.
  • Agree on a standard location for the top feed-of-feeds set.
  • Formalize the results in a spec.
How does that sound? Did I miss anything?

Posted by diego on September 14, 2003 at 6:30 PM

on rss autodiscovery

After reading Jeremy's post on creating a sort of "auto discovery for RSS 2.0" (and agreeing that it was a great idea) I thought, what the hell. Why not try it on for size? Here it goes, then...

The idea is basically that a site could publish a centralized directory of all the feeds it serves. This would allow auto-discovery by aggregators and suggestions to users when new feeds come online on sites they already watch.

Jeremy was proposing to do it on OPML, and Russ floated the idea of using RSS directly, based on this, which I liked.

To get some actual idea of what something like this would look like, I decided to whip up possible versions of this both in OPML and in RSS. So I spent some time looking at both the OPML and RSS specs and thinking how they would be used in this case (the use, of course, from my point of view which is that of a user and someone who would have to add this to an aggregator :) -- it's possible that I missed something that would be obvious to a content producer).

The results of my little experiment are here: in OPML and in RSS, for a fictitious news site "News4Humans".

RSS, being a richer format than OPML (and possibly more generic as far as content is concerned), has no problem accommodating all the elements. There are a few elements that I added to the OPML version to mirror the data exposed; even though they are not included in the OPML spec it's ok since the spec does not preclude adding new elements. That said (and as Dave mentioned specifically regarding the issue of recursive inclusion), everyone would have to agree on them or they would be useless--possibly an addendum to the spec would be useful as well.

One of the main differences is in structure. OPML supports recursivity, RSS does not. So where OPML can define the category and the feeds for that category as sub-tags, RSS needs to use the category tag, essentially making the structure flat. This seems to be fine to me, unless "deeper recursivity" (or is it recursiveness?) is needed--but I can't think of a news site with more than one level down from the main category at the moment, so I let it stand.
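To illustrate the structural difference, here is a small Python sketch that walks a nested OPML outline and flattens it into (category, name, url) rows, which is roughly what the RSS version has to express directly. The sample document is invented for this example and is not the mockup linked above; in particular the xmlUrl attribute name is an assumption, since (as far as I can tell) OPML doesn't pin down which attribute carries the feed link.

```python
import xml.etree.ElementTree as ET

# Invented sample; attribute names are assumptions, not part of any agreed spec.
OPML = """<opml version="1.0">
  <head><title>News4Humans feeds</title></head>
  <body>
    <outline text="Technology">
      <outline text="Phones" xmlUrl="http://example.com/phones.xml"/>
      <outline text="Software" xmlUrl="http://example.com/software.xml"/>
    </outline>
    <outline text="Sports">
      <outline text="Football" xmlUrl="http://example.com/football.xml"/>
    </outline>
  </body>
</opml>"""


def list_feeds(outline, parents=()):
    """Yield (category_path, name, url) for every feed at or below an outline."""
    url = outline.get("xmlUrl")
    if url is not None:
        yield "/".join(parents), outline.get("text"), url
    # Recurse: nesting depth is unlimited, unlike the flat RSS layout.
    for child in outline.findall("outline"):
        yield from list_feeds(child, parents + (outline.get("text"),))


body = ET.fromstring(OPML).find("body")
feeds = [row for top in body.findall("outline") for row in list_feeds(top)]
```

The flattened rows are what a single-level RSS category tag can carry; anything deeper needs one of the conventions discussed in the update at the end of this entry.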

Second, the OPML version contains two additional tags: link, to specify the feed of feeds'... well, link :) (to match the same tag in RSS, which could potentially be useful for redirects), and dateCreated, a per-entry element. The idea with this tag is that the aggregator can record when the feed of feeds was last checked, and diff against this date to very easily find out which feeds were added since the last check. (Of course, keeping a full list and doing a diff on that is possible, but then again if there are, say, 50 feeds, and the user subscribes only to one, the aggregator would have to keep all 50 to do a proper diff against number 51, which seems kind of wasteful.) RSS, incidentally, supports this functionality through its own basic item date tag. And, again, the date could be useful for redirects if necessary: changing the date on an item you already knew about implies that it has moved.
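The date-based check is about as simple as it sounds. A minimal Python sketch, assuming each entry carries an RFC 822-style date string and that the aggregator stores only the time of its last check (the function name and the sample data are made up):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def new_since(entries, last_checked):
    """Return the URLs of entries dated after the last check."""
    return [
        url
        for url, date in entries
        if parsedate_to_datetime(date) > last_checked
    ]


# Hypothetical feed-of-feeds entries: (feed URL, per-entry date).
entries = [
    ("http://example.com/world.xml", "Mon, 01 Sep 2003 09:00:00 +0000"),
    ("http://example.com/phones.xml", "Fri, 12 Sep 2003 18:30:00 +0000"),
]
last_checked = datetime(2003, 9, 10, tzinfo=timezone.utc)
added = new_since(entries, last_checked)
```

The aggregator keeps a single timestamp per watched site instead of the full list of feeds it has already seen.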

As I noted in the comments on Jeremy's entry, I sort of instinctively thought that RSS was a better idea. The OPML version however looks enticingly simple and still functional. Hm. Surprising.

Anyway, which one do you like best?

Update: As I mentioned in the comments (replying to Zoe's idea of establishing hierarchy through multiple files) the issue of hierarchy is not terribly clear with RSS. I see two ways of doing it:

  • One, as Zoe proposed, using files. This would require that we agree on a convention that says, for example, that if the item has only a link and nothing else (allowed by the RSS spec--all items are optional) then the link is to a sub-directory. This is feasible and would imply, on the client that subscribes, a multi-step process to obtain the full list.
  • Two, the creation of a "virtual" hierarchy by way of category names. Already the mockup is using category to specify the main topic to which the feed belongs. If the category is, for example News/Sports and there's another category News/Politics then the hierarchy is implicit in a single RSS file, even though the actual structure is flat. This would require a single GET but a bit more processing on the data once received.
I prefer option two since option one, while enticing, implies that we would be giving two different meanings to the tag link, something that's never desirable, but it's possible that I missed something there... Also, if the "hierarchy through files" method was chosen, the connections could be made two way between the files, which is nice, by using RSS's source sub-element for item, so a feed can be traced back to its "parent" feed.

Update #2: Another advantage of using RSS as-is that I keep forgetting to mention is that the language for the feed can be specified. You could automagically define that you only want to see feeds in a certain language and the aggregator could automatically disregard anything else. The same functionality for OPML would require adding a tag for that purpose.

Update #3: I've just noticed that, in the RSS spec, the category element has a domain attribute. If the domain is used to point to sub-category feeds, then hierarchy can be achieved cleanly. Therefore a simple solution that doesn't require any changes to the RSS spec (and as far as I can see doesn't bend its meaning either) could be as follows:

  • When an item contains a link, that item points to an actual news feed.
  • When there's no link, then the category must have a domain, which points to the sub-tree. description and title in that case are the description and title of the subfeed's category, respectively.
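On the aggregator side, those two rules boil down to a short check per item. A Python sketch, with an invented sample feed and made-up names (classify, the example URLs); it only illustrates the convention proposed here, not anything the RSS spec itself mandates:

```python
import xml.etree.ElementTree as ET

# Invented sample feed-of-feeds following the two rules above.
RSS = """<rss version="2.0">
  <channel>
    <title>News4Humans feeds</title>
    <link>http://example.com/</link>
    <description>Feed of feeds</description>
    <item>
      <title>Top stories</title>
      <link>http://example.com/top.xml</link>
      <category>News</category>
    </item>
    <item>
      <title>Technology</title>
      <description>All technology feeds</description>
      <category domain="http://example.com/feeds/technology.xml">Technology</category>
    </item>
  </channel>
</rss>"""


def classify(item):
    """Return ("feed", url), ("subtree", url) or ("invalid", None) for an item."""
    link = item.findtext("link")
    if link:
        return ("feed", link)
    category = item.find("category")
    if category is not None and category.get("domain"):
        return ("subtree", category.get("domain"))
    return ("invalid", None)


items = ET.fromstring(RSS).find("channel").findall("item")
kinds = [classify(item) for item in items]
```

An item that has neither a link nor a category domain is flagged as invalid, which is the kind of syntactic check that the category-name convention can't offer.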
How does that sound?

Posted by diego on September 13, 2003 at 12:55 AM

you like your myths rare or well done?

This pisses me off slightly, which accounts for the more sarcastic tone. In case you were wondering. :-)

I was just reading Vasanth's entry "study reveals not-so-hot java" in which he happily perpetuates myths that for some reason keep sticking to the Java platform. The only point that is half-true in his list is that Java is not managed by an open standards body. I know that many people are not happy with the JSRs, but it's half way there, and Eclipse keeps gaining momentum (memory refresh: Eclipse started about three years ago).

As for his other "points": I'd suggest this: the next time you hear someone say things like "Java is slow on the desktop" you should ask:

  • compared to what? Assembly code? and
  • Where, exactly, is your proof? How about a few examples of non-performing Java desktop applications?
I have a number of examples that actually prove quite the opposite to the slow-on-the-desktop claim.

"Write once, run anywhere not true" he says. Really? Then how is it possible that I could write a client application that was deployed successfully on everything between Windows, Linux, MacOS, and even OS/2? Without a single line of platform-dependent code? Is it magic? I can't remember any chanting being involved...

And as far as those much discussed "scalability problems", hey, isn't eBay's use of J2EE enough proof that Java can scale?

Case closed.

Posted by diego on September 12, 2003 at 8:35 PM


As usual, for cultural (rather than technological) reasons, Japan is on the forefront of these kinds of things:

Mrs Tanaka is 84. Today, as usual, she wakes just before 7am, slips on her dressing gown and flips a switch to start water boiling for her first green tea of the day. She's about to get dressed when she pauses. She turns to the low table near the door, where a soft toy sits incongruously, and greets it in her distinctive west-Japan accent.

"Good morning Teddy. How are you today?" "Pretty good, thanks Tanaka-san," comes the reply. "Have you remembered to take your pills? It's the pink ones this morning," the robot bear continues.

A scene from AI 2 or a vision of a slightly over-cooked future nanny state? Actually, it's here and now in Japan.

And, yes, I'm one of those that would get an AIBO if I could...

Btw, I just realized that William Gibson, for all his understanding of Japanese culture and his uncanny ability to "see beyond", has never made much of a deal of personal robots in his novels or stories--even if one might argue that the trend is just too new, it precedes both All Tomorrow's Parties and Pattern Recognition, so a tiny mention would've been expected. Or maybe by now they are too mainstream to qualify for the wonderful techno-kitschness of some of his characters. I wonder.

Categories: technology
Posted by diego on September 12, 2003 at 8:12 PM

simplicity applied

I was going off the deep end in Win32 (yes, I know...) to finish some tests I need to do for my thesis research, and I decided to get some instant gratification by doing something simple: update my templates, since there were a couple of weblogs in my blogroll that had recently changed location, and check out my feeds to see what, exactly, was being generated. I had been using the default Movable Type templates (which in my installation, an upgrade from 2.4 or something, were RSS 0.91 and RDF). In the process I discovered that my original 0.91 feed did not validate due to the date format, which was not RFC 822 (as the RSS spec requires), but ISO 8601...
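For anyone who hasn't stared at both, this is what the same instant looks like in each format. A small Python illustration; the timestamp and the +01:00 offset are arbitrary choices for the example, and the first form uses a four-digit year, which the RSS 2.0 spec allows.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Arbitrary example instant with a +01:00 offset.
posted = datetime(2003, 9, 12, 11, 46, tzinfo=timezone(timedelta(hours=1)))

rfc822_date = format_datetime(posted)   # the style RSS expects
iso8601_date = posted.isoformat()       # the style the old template produced

print(rfc822_date)    # Fri, 12 Sep 2003 11:46:00 +0100
print(iso8601_date)   # 2003-09-12T11:46:00+01:00
```

Same information, different spelling, and a strict validator will only accept the first one in an RSS date element.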

So I went looking and I found that Movable Type now has a template for RSS 2.0 feeds. Nice! Grabbed it, updated, and tested it through the feed validator. It worked.

So far so good.

Then I read Sam's great presentation on RSS at Seybold, and there he had a mention of "Funky" feeds.

I remembered that a big argument had started a few weeks ago in this regard. At the time the discussion had turned ugly so fast that I simply stayed away, and didn't even follow it that much.

But now I was intrigued. So I started looking at the RSS 2.0 feed that was being generated by MT, and I understood what the discussion was about.

What was happening was that MT's RSS 2.0 template was using Dublin Core elements to replace elements for which RSS 2.0 had equivalents.

Aha! It wasn't clear to me why this was being done. The feed was valid, true, but somehow it didn't feel quite right... I felt it was like using JNI to access C code for, say, calculating the tangent of a value when using java.lang.Math would suit just fine.

If RSS 2.0 had the elements, then why replace them with something else? I revisited the discussion a bit and saw that Mark had argued that DC elements were more of a standard than RSS 2.0 equivalents, which was a fair point but still didn't quite explain why you'd require aggregators to deal with additional namespaces when you could get away with simply using "built-in" tags. Besides, it was Mark's opinion, rather than MT's, so as reasonable as his argument was it didn't definitely explain why MT was going in a certain direction. Furthermore, I didn't quite agree with the logic; as much as I like the idea of Dublin Core, I'd prefer to go with built-in elements any day of the week (as Atom has done, btw, in not using DC elements even when it could have done so). I now had the opportunity to follow up on what I had been talking about a couple of days ago regarding simplicity, with something small but concrete.
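To show what "dealing with additional namespaces" means for an aggregator, here is a minimal Python sketch that reads an item's date whether it arrives as the built-in pubDate or as a Dublin Core dc:date. The helper name and the two sample items are made up; the namespace URI is the standard Dublin Core elements one.

```python
import xml.etree.ElementTree as ET

# ElementTree spells namespaced tags as "{namespace-uri}localname".
DC = "{http://purl.org/dc/elements/1.1/}"


def item_date(item):
    """Return the item's date string, preferring the built-in pubDate."""
    return item.findtext("pubDate") or item.findtext(DC + "date")


plain = ET.fromstring(
    "<item><pubDate>Fri, 12 Sep 2003 11:46:00 +0100</pubDate></item>"
)
funky = ET.fromstring(
    '<item xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:date>2003-09-12T11:46:00+01:00</dc:date></item>"
)
dates = [item_date(plain), item_date(funky)]
```

It's only one extra lookup here, but every replaced element needs the same treatment, and the two dates don't even share a format, so the parser has to handle that difference too.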

Okay, so I started investigating more and trying to change the feed template into pure RSS 2.0 (no namespaces). Everything seemed to be going fine until I hit the pubDate and lastBuildDate elements. MT was using, for example, dc:date. When I tried to take the date "out of the namespace" it didn't work, even if I changed the formatting to match that of RFC 822. Why? Because MT does not have a tag to generate RFC 822 timezones. The only tag to generate a timezone included in MT is $MTBlogTimezone$, which generates ISO 8601-style timezones.
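The gap is tiny, which is what makes it frustrating. As a sketch of the missing piece, here is a Python function that converts an ISO 8601 offset into the RFC 822 form. This is just an illustration of the conversion with a made-up function name; it says nothing about how MT or any plugin actually does it.

```python
import re


def iso_to_rfc822_zone(zone):
    """Convert an ISO 8601 offset such as "+01:00" (or "Z") to RFC 822 form."""
    if zone == "Z":
        return "+0000"
    match = re.fullmatch(r"([+-])(\d{2}):(\d{2})", zone)
    if match is None:
        raise ValueError(f"not an ISO 8601 offset: {zone!r}")
    sign, hours, minutes = match.groups()
    # RFC 822 uses the same digits, just without the colon.
    return f"{sign}{hours}{minutes}"


converted = [iso_to_rfc822_zone(z) for z in ("+01:00", "-08:00", "Z")]
```

One colon is all that stands between a feed that validates as plain RSS 2.0 and one that doesn't.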

Things now started to make sense. MT didn't have a tag for that, hence the best way to generate a valid feed was to use an ISO 8601 date, which can only be included if you're using Dublin Core elements, rather than the RFC 822 elements that the feed requires. And after you include one namespace, well, why not do it all on namespaces, since the line has been crossed so to speak. At first I thought that this "line crossing" had been because of the use of category in an entry through dc:subject tags, but rechecking the RSS spec I saw that RSS 2.0 has a category tag for items as well as for feeds, so that wasn't it. It was only the date that was bringing this whole cascade of namespaces tumbling in. That's my theory anyway. :-)

Regardless of why this was happening, I was sure there must be a solution. The Movable Type tutorials at were empty, so no luck there. Some googling, though, turned up John Gruber's RFC 822 plugin for MT. John's plugin adds the $MTrfc822BlogTimeZone$ tag, which is all that was missing to generate the correct date. Great! Now I had all I needed.

The result is this template which depends on John's RFC plugin and generates valid RSS 2.0 with the tags that I need and avoids using namespaces (maybe when adding more functionality not supported in the base spec, namespaces will be necessary, but I prefer to avoid them if possible). Now I have a pointer for both RSS and RDF feeds on the page. Still have to re-generate the whole site, though, which will take a while.


Posted by diego on September 12, 2003 at 11:46 AM


Categories: personal
Posted by diego on September 11, 2003 at 9:54 AM

the power of simplicity

and this from the it-seems-obvious-in-retrospect dept...

hit by a virtual hammer

This rant has been brewing in my head for a couple of weeks now, but every time I started writing it fizzled out, for whatever reason. I can see that this might seem obvious to a lot of people. It wasn't to me though! :)

The thought process behind this started when I spent a couple of hours writing my google-rss bridge. While I had written an RSS reader component, I had never written an application that created RSS. I was pretty astonished at the power of a standard that can be used so easily both ways. It was like being hit by a virtual hammer. I thought about it further when I added Atom support to it, and I mentioned some of these issues in the entries in a roundabout sort of way...
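To give a sense of how little is involved in going "both ways", here is a rough Python sketch that writes a bare-bones RSS 2.0 document and reads it back. It is not the bridge code mentioned above; the function names and sample data are made up, and it only uses the handful of elements (title, link, description, item) that every RSS 2.0 feed has.

```python
import xml.etree.ElementTree as ET


def make_feed(title, link, description, items):
    """Serialize a minimal RSS 2.0 feed from (item_title, item_link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


def read_feed(xml_text):
    """Parse the feed back into (item_title, item_link) pairs."""
    channel = ET.fromstring(xml_text).find("channel")
    return [(i.findtext("title"), i.findtext("link")) for i in channel.findall("item")]


# Hypothetical content.
entries = [("First result", "http://example.com/1"), ("Second result", "http://example.com/2")]
xml_text = make_feed("Example feed", "http://example.com/", "A made-up feed", entries)
round_trip = read_feed(xml_text)
```

Writer and reader together fit on one screen, which is the whole point.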

What jelled today in my head was the distinction of three elements of the process that weblogs and RSS make possible.

The three elements are: Content creation, publishing, and access.

The magic is that each step can happen in a decentralized fashion. All tied together through the thin ice of a few XML and HTML tags.

HTML is very similar (and similarly disruptive), but less oriented towards decentralization because it has evolved to be oriented towards display rather than automated consumption (something that, at least theoretically, CSS was supposed to fix).

Understanding that split between creation and publishing is what brought everything together for me. Where I write the content (creation) has nothing to do with where it resides (publishing). The last part, access, is obviously separate. The other two weren't, at least not to me, before today.

And why is the creation/publishing split important? Because it's what drives the full decentralization of the process, and the one that makes simplicity a lot more relevant than before. Without full decentralization, simplicity is a lot less powerful. Decentralization+simplicity means that everyone's invited to the game. After all, if creation and publishing are together, if you need expensive or complex centralized infrastructure (and let's face it, infrastructure to publish web content is no child's play) to set up a content system, no matter how simple the content format or protocols themselves are, it will still have limited impact.

the price of complexity

Complexity plus its associated cost and monopolies (or oligopolies) go hand in hand, since they constitute one of the most important barriers of entry. But weblogs and feeds, like the web itself, have split the lever of power: now the glue that ties the components together is as much a point of control as actually creating the clients or the servers themselves. In the web in particular, as better development tools for both clients and servers have evolved, the format itself became the most important element that brought complexity into the equation. And the reason, I think, is the split that happened between creation, publishing, and access.

Consider, first, HTML. In the days of HTML 2.0, it was relatively trivial to write a web browser. The biggest problem in writing a browser was not, in fact, in parsing or displaying HTML: it was in using the TCP/IP stacks that at the time were difficult to use. Over time, the shift of complexity into HTML has brought us the situation that we have today, where writing a standards-compliant browser requires huge investment and knowledge, and the earlier barrier of entry (the network stack) is now easy to use and readily accessible. Sure, there are many web browsers in the market today. But power is not distributed evenly. HTML 4.0 raised the bar and in fact IE 4 won over many people simply because it worked better than Communicator 4 (I was one of those people).

What I realized today is: there's a huge side effect that the format has on content access: monopolizing the market for access becomes easier the more complex the content format is.

My point: This should give pause to anyone in the "go-Atom-crowd", including me.

Keeping the barrier of entry low applies in more than one case, of course, but here it's crucial because weblogs, creating RSS feeds and accessing them, etc., is just so damned easy today. This allows developers to concentrate on making the tool good rather than dealing with the format.

Dave has said things along these lines repeatedly, but honestly I hadn't fully understood what he meant until now.

Let's see if we're all on the same page. The message is: It is no coincidence that basically every single RSS reader out there is high-quality software.

Big and small companies, single developers, groups, whatever.

A simple statement, with profound implications.

Back to Atom.

I am not implying that the (slight) additional complexity found in Atom will make it fail. I am saying that its increased flexibility brings on complexity that also increases the barriers of entry for using it, with the consequent loss of vitality in the area. Without proper care, these barriers can slowly chip away at the ease with which tools can be created, and in the process split the field into incomplete or low-quality software used for tinkering and mainstream software available for general users, with a lot of entries in the first category and a few in the second.

and why is this important?

This stuff matters. There are many examples today of how weblogs are changing things, from influencing politics to breaking down proprietary software interfaces and affecting how the spread of news itself happens. In my opinion a big part of that is because weblogs really, finally, put the power of publishing in individuals' hands, something that "the plain web" had promised but had failed to do (after all, here's-a-picture-of-my-dog-type-homepages were around for quite a while without anything interesting happening). But if barriers are raised, the Microsofts and the AOLs suddenly have a fighting chance.

Jump to the future: Microsoft announces support for Atom, built into a new IIS content-management system. Great! Says everyone. Then you look at the feed itself and you discover that every single entry is published using content type "application/ms-word-xml". This wouldn't be new. Already Microsoft claims to great effect that Office supports XML but everyone knows that trying to parse a Word XML document is literally impossible. XML is too generic to be taken over though, in a sense it was designed for that, as a template for content formats. HTML wasn't, but it was subverted anyway. RSS is, quite purposefully I think, holding out. With Atom coming up, there's a chance it might happen.

I hate to point out problems without also proposing at least one possible solution. So: An extremely simple way of getting around this problem would be to specify that text/html content is required on a feed. If it's not, then your feeds don't validate. That simple.
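To make the proposal concrete, here is a minimal sketch of what such a check could look like. The class and method names are hypothetical (this is not code from any real feed validator), and it assumes the draft-style markup where each entry carries one or more content elements with a type attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical checker for the proposed rule; not part of any real validator.
final class HtmlContentRule {

    /** Returns true only if every entry has at least one content element typed text/html. */
    static boolean everyEntryHasHtml(String feedXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(feedXml)));
            NodeList entries = doc.getElementsByTagName("entry");
            for (int i = 0; i < entries.getLength(); i++) {
                if (!hasHtmlContent((Element) entries.item(i))) {
                    return false;
                }
            }
            // Note: a feed with no entries passes trivially.
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Feed is not well-formed XML", e);
        }
    }

    private static boolean hasHtmlContent(Element entry) {
        NodeList contents = entry.getElementsByTagName("content");
        for (int i = 0; i < contents.getLength(); i++) {
            Element content = (Element) contents.item(i);
            if ("text/html".equals(content.getAttribute("type"))) {
                return true;
            }
        }
        return false;
    }
}
```

A feed whose entries only carry something like "application/ms-word-xml" would fail this check, which is the whole point: other types remain allowed, but never on their own.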

Call it the anti-monopoly requirement. :-) The same focus on simplicity should be, IMHO, the drive of every other feature.

Another example: Today, with email, Microsoft has used MIME to great effect to screw up clients that are not Outlook or Outlook Express. It couldn't happen with simple plain text. But MIME allowed it. The result: people can send each other Microsoft-generated HTML that only the Outlook+IE combination can display without hacks.

We can't allow that to happen.

Some might argue (quite persuasively) that it doesn't really matter whether content-creation is slightly more complex. To that I'd say: it's just my opinion, but I think it does. It could also be argued that this is all simply a matter of evolution, it was bound to happen, etcetera. But it wasn't "bound to happen". We are making it happen. It's in our hands.

If the rise of the web was a lost opportunity in this sense, well, amazingly, we have been given a second chance.

Let's not blow it.

Posted by diego on September 10, 2003 at 9:03 PM

a small news item

From NewsForge: a short (but cool!) mention of clevercactus on this article: "The Java-based clevercactus beta software looks quite promising." Thanks! :)

Categories: clevercactus
Posted by diego on September 10, 2003 at 5:48 PM

the new Swing GTK look and feel in JDK 1.4.2

I took a few minutes today to test clevercactus against the new GTK look and feel, introduced with JDK 1.4.2, on my Red Hat 9 Linux machine. My first reaction was sheer horror at seeing how awful the application looked. I think I even blacked out for a moment.

A little investigation showed what was at the root of how the app looked: the GTK L&F does not behave like a "typical" Swing L&F but rather defines its look dynamically, based on gtkrc files, and it ignores the programmatic settings you might give to your components.

Let me say that again: the Swing GTK L&F ignores the programmatic settings you give to your components.

Are you setting your own borders for, say, a panel? Gone. Different colors for menus? Poof. You prefer a different font for your lists? Sorry, can't help you. Changing the look of a button by setting setBorderPainted(false)? Bye-bye.

But no fear, all of these things are set in the gtkrc file. Therefore, whatever stuff you were doing programmatically now has to be duplicated in the RC file. And there is a relatively simple way to load (i.e. package) your own RC file for your application. Which means that, yes, you can modify the L&F but in a non-Swing-standard way.

In the end, after some tinkering with the RC file, cc still doesn't look quite right: the default colors and fonts for lists are all wrong and I can't find which setting is responsible for that. Using the Metal L&F (or Motif) on Linux is still the only viable option until I get a decent RC file in place.
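For what it's worth, the "stick with Metal on Linux" workaround can be sketched in a few lines. The class and method names here are made up for illustration; the only real APIs used are UIManager.setLookAndFeel and UIManager.getSystemLookAndFeelClassName, and the OS check on the os.name property is an assumption about how you'd want to detect Linux.

```java
import java.util.Locale;
import javax.swing.UIManager;

// Hypothetical helper: pin Metal on Linux, use the system L&F everywhere else.
final class LookAndFeelChooser {
    private static final String METAL = "javax.swing.plaf.metal.MetalLookAndFeel";

    /** Picks the L&F class name for the given os.name value. */
    static String chooseLookAndFeel(String osName) {
        if (osName != null && osName.toLowerCase(Locale.ROOT).contains("linux")) {
            // Avoid the GTK L&F, which ignores programmatic component settings.
            return METAL;
        }
        return UIManager.getSystemLookAndFeelClassName();
    }

    public static void main(String[] args) {
        try {
            // Must run before any Swing component is created.
            UIManager.setLookAndFeel(chooseLookAndFeel(System.getProperty("os.name")));
        } catch (Exception e) {
            // Fall back to whatever Swing picks by default.
            System.err.println("Could not set look and feel: " + e);
        }
    }
}
```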

Overall, the new GTK L&F is a good addition. We just have to hope that by the time Javasoft makes it the default L&F for Linux (something that's due to happen in JDK 1.5) programmatic overrides work exactly as with the other L&Fs.

Posted by diego on September 10, 2003 at 3:21 PM

SWT: first impressions

After spending a couple of days actually using SWT and trying out things, these are my first impressions.

First, for an IDEA-junkie like me it takes a while to adapt to Eclipse. There are a few refactoring functions that just aren't there and the editor behaves just... well, weird. But that's not a huge issue.

Specifically about SWT, it is simple and works reasonably well. However, it is too simple. In fact, it is downright primitive, and it seriously changes the way you think about operating system resources (more specifically, Graphics resources). Maybe that's good, but being used to the idea of Swing, where you can create components or colors or whatever and move them around and pass them between contexts with impunity, it is, well, shocking to, for example, not be able to create a component without a parent.

This is more a change in style (application-oriented, rather than component-oriented). What's a bigger problem is how completely, utterly primitive the tools to deal with graphics are. (Yes, even more primitive than AWT). Take, for example, Fonts. You can create a font (and remember to dispose of it!!) but if you want to calculate the length of a string on that font, you're out of luck. In fact, if you want to calculate the length of anything related to fonts without referencing an existing GC (SWT's Graphics context), you're out of luck altogether. It can't be done. (While in AWT/Swing you have Toolkit.getDefaultToolkit().getFontMetrics(font)). Even if you do get a FontMetrics with a reference to a GC, the methods you do have are simply pathetic: getAverageCharWidth(). That's it. There's another method in GC to obtain the actual length of a character (getCharWidth) and the length of a string (textExtent). Color management is also bad: essentially the only way to create colors is to use the RGB values directly -- no predefined constants for anything, not WHITE, not BLACK, and no way to do what's so useful in Swing, call a brighter() method to obtain a variant of the color. (And, again, once you create them, they have to be dispose()d of.)

Lists, Trees, Tables and TreeTables are good, and in fact they are easier to use than Swing. But they are wayyy less customizable. For example, you can't insert an arbitrary component in a table. You can only show strings (single line) or images, or other one-line components (like a combobox). More complex components are also lacking. Take, for example, rich text editing or display. While JEditorPane (and its editor kits) in Swing might be a massive nightmare, at least it exists. SWT has no equivalent to it. JFace, which is a higher-level library built on top of SWT, is an improvement but not enough.

On the other hand, Eclipse itself is built on SWT and Eclipse does have some of these components. It's not clear, however, how to access them. Documentation is improving, but still lacking.

Now for the good points: the platform is thought of as a layer on top of any OS, rather than an independent platform, so it has some simple ways of doing crucial things that the JDK should have added long, long, long ago (think 1996 :-)). Example: launching the default program for a document. In standard Java, you have to resort to ridiculous Runtime.getRuntime().exec() calls that fail half the time and have to be tested in more combinations than is possible. SWT, on the other hand, has a handy Program class that lets you obtain the program for a given file extension as follows: Program.findProgram(".html"); and then obtain the icon (cool!), launch it, etc. Native browser support is currently in beta and it works relatively well; the only question that opens up is whether it's reasonable to resort to platform-dependent browsers when you are bringing in all their baggage (I'm thinking of security problems mainly, yes).
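To show why the Runtime.exec() route is so painful, here is a rough sketch of the kind of per-OS command table you end up maintaining in plain Java. Everything here is illustrative: the class and method names are invented, and the commands themselves (rundll32 with url.dll on Windows, open on Mac OS X, xdg-open elsewhere) are assumptions about what happens to be installed on the machine, which is exactly the problem.

```java
import java.io.IOException;
import java.util.Locale;

// Hypothetical helper illustrating the per-platform guessing game.
final class DocumentLauncher {

    /** Builds the command line that would open the given path with its default program. */
    static String[] openCommand(String osName, String path) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("windows")) {
            // Classic Windows trick: let the shell's URL handler pick the program.
            return new String[] {"rundll32", "url.dll,FileProtocolHandler", path};
        }
        if (os.contains("mac")) {
            return new String[] {"open", path};
        }
        // Assumption: a desktop Linux/UNIX with xdg-open available on the PATH.
        return new String[] {"xdg-open", path};
    }

    /** Launches the default program; fails if the guessed command does not exist. */
    static Process open(String path) throws IOException {
        return Runtime.getRuntime().exec(openCommand(System.getProperty("os.name"), path));
    }
}
```

Every branch is a guess that has to be tested on a real machine, per OS and per desktop environment, which is why a single call like Program.findProgram(".html") feels like such a relief.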

And programs in SWT look fantastic, without a lot of work. (Programs in Swing can look fantastic, but only with a lot of work.) In particular if you have ClearType on Win XP, it's a huge improvement, something that can't be done in Swing at all. Even antialiasing doesn't look too good, and it's a hack to use it. Swing can use it though, as the excellent Mac OS X implementation of JDK 1.4.2 shows, so if only Sun would get really into supporting the desktop and doing a reasonable implementation of text rendering for WinXP...

For many people, I think, SWT would be a good choice. For programmers that are only now approaching Java, I get the feeling that it is definitely easier than Swing. OTOH, it's less customizable once you get a handle on it (the reverse of Swing), although I imagine that will be fixed as the platform evolves.

Overall: As Gordon Gekko says in Wall Street: "Mixed emotions Buddy... like Larry Wildman going over a cliff... in my new Maserati."

The Rich Text editor problem is probably the main issue that seems difficult to ignore at the moment, at least for me. Looking into that now (and have been for the last few hours). More later!

Posted by diego on September 9, 2003 at 12:27 PM


yes, that's it, that's a good title...

So what is happening? Well, workload is up, blogging is down, that's for sure :-). I guess one thing I've learned over time is not to go crazy when I can't blog. Blogflow returns on its own good time. (Just sitting down to write is always good however, just like with fiction).

First, I got an email from Anthony who just moved his weblog to a new location (and to MovableType!). Gotta update my blogroll... anyway, he also posted a lengthy review of clevercactus beta2. Thanks! I replied through email asking for more details on some of the problems, and left a comment on his weblog.

Second, Sam replied in a comment to my posting the other day on adding Atom support to the google-feeds bridge. Regarding RFC dates, I think he (and Murph adding to it later) makes a good point. My viewpoint is from Java, from which parsing and generating RFC 822 dates is easy, and ISO 8601 dates is hard. No big deal however since once you've got the code it's all a-ok (as much as the code is a hack in Java). If it's easier for a majority of languages, all the better. Then, regarding the content-type issue on content for a feed, Sam again makes a good point. Specifying what you mean is good. What I would add though, is that maybe a baseline content-type (say, text/plain or text/html) always be present. If that's not the case, we could easily end up with feeds being generated only in types that half the aggregators don't understand, which would be a compatibility nightmare (for the aggregator-writers :-)). In any case, thanks for the comments, Sam.
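On the date question, a minimal sketch of what I mean from the Java side, using only java.text.SimpleDateFormat. The class and method names are made up for illustration, and forcing everything to UTC is an assumption that conveniently sidesteps the hard part.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: both feed date styles, always rendered in UTC.
final class FeedDates {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    /** RFC 822 style: the "Z" pattern letter already produces the +0000 offset form. */
    static String rfc822(Date date) {
        SimpleDateFormat f = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.US);
        f.setTimeZone(UTC);
        return f.format(date);
    }

    /** ISO 8601 style: the quoted 'Z' is a literal, valid only because we force UTC. */
    static String iso8601(Date date) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US);
        f.setTimeZone(UTC);
        return f.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(rfc822(epoch));   // Thu, 01 Jan 1970 00:00:00 +0000
        System.out.println(iso8601(epoch));  // 1970-01-01T00:00:00Z
    }
}
```

The hack creeps in as soon as you want a local offset in the ISO form: the "Z" pattern letter emits +0100 without the colon, so you end up patching the string by hand.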

That aside, I am taking a look at (gasp!) SWT, my misgivings about going back to chasing memory leaks notwithstanding. More on that soon. :-)

Categories: personal
Posted by diego on September 8, 2003 at 5:16 PM

clevercactus bug/feature tracker online

Finally! After (quite a lot of) looking around for a good solution, in the end it was Mantis (thanks to Stefan for the pointer) that made the cut. Close second was Flyspray (thanks to David).

Bugzilla was a complete nightmare. I didn't even get past installing all the required Perl packages: the CPAN automatic module installer kept finding dependencies that it wanted to download, apparently ad infinitum, and it kept asking questions to which I had no good answer ("Use package Test::whatever to do X Y and Z NOW? [yes]").

Both Mantis and Flyspray are quite nice, but Mantis is more complete, and it includes options (that require additional packages) that I'll look at in the next few days, such as adding a forum for discussion and anonymous reporting.

The current setup allows anyone to register and report bugs, I'll leave it like that for now and change it only if necessary.

Oh, right, the link: The database is here.

Categories: clevercactus
Posted by diego on September 6, 2003 at 2:52 PM

now with Atom support: java google feeds bridge

After last week's experiment of a Google-RSS bridge in Java, I took the next step and decided to check out how hard it was to generate a valid Atom feed as well. The result is an update on the Google bridge page and new code.

The idea, as before, was to write code that could generate valid feeds with as few dependencies as possible (it might even be considered a quick and dirty solution). For reference I re-checked the Atom Wiki as well as Mark's prototype Atom 0.2 feed. In the end, it worked. Adding support for Atom begged for some generalization and refactoring (which I did) but aside from that adding Atom support took a few minutes. Here are some notes:

  • ISO 8601 Dates are terrible. I'd much rather Atom had used RFC 822 dates, which are not only easier to generate but way more readable. It's true, however, that once you have the date generator working it doesn't matter. But boy are they a pain. I put forward my opinion on that when date formats were being discussed, but I didn't get my way.
  • I was confused at the beginning regarding the entry content, particularly because of the more stringent requirements that the feed puts on content type. For example, the content of an entry must be tagged with something like "<content type="text/html" mode="escaped" xml:lang="en">". Now, I must be honest here: about a month ago there was a huge discussion on the Wiki about whether content should be escaped, or not, and how, but I didn't think it was too crucial since, on the parser side, which I added way back when in July to clevercactus, it's pretty clear that you get the content type and you deal with it. But I was sort of missing the point, which is generation. When generating... what do you do? Do you go for a particular type? Is it all the same? Would all readers support it? The pain of generating multiple types would seem to outweigh any advantages...Hard to answer, these questions are, Master Yoda would say. So I went for a basic text/html type enclosed in a CDATA section. (Btw, enclosing in CDATA doesn't seem to be required. The Atom feed validator was happy either way).
  • Another thing that was weird was that the author element was required, but that it could go either in the entry or the feed. I understand the logic behind it, but it's slightly confusing (for whatever reason...)
Overall, not bad. But Atom, while similar to RSS, is more complex than RSS. While I have been able to implement a feed that validates relatively easily, it concerns me a bit that I might be missing something (what with all those content types and all). Maybe all that's needed is a simple step-by-step tutorial that explains the "dos and don't dos" for feed generation. Maybe all that's needed is a simple disclaimer that says "Don't Panic!" in good H2G2 style.
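As a tiny example of the kind of thing such a tutorial might cover, here is a sketch of the content generation choice described in the notes above: text/html wrapped in a CDATA section. The class and method names are hypothetical and this is not the bridge's actual code; the attributes are the ones quoted in the notes.

```java
// Hypothetical generator for a single Atom content element.
final class AtomContent {

    /** Wraps an HTML fragment as a text/html content element inside CDATA. */
    static String htmlContent(String html) {
        // A literal "]]>" would end the CDATA section early,
        // so split it across two adjacent CDATA sections.
        String safe = html.replace("]]>", "]]]]><![CDATA[>");
        return "<content type=\"text/html\" mode=\"escaped\" xml:lang=\"en\"><![CDATA["
                + safe + "]]></content>";
    }

    public static void main(String[] args) {
        System.out.println(htmlContent("<b>hi</b>"));
    }
}
```

The "]]>" case is exactly the sort of "don't do" that is easy to miss when you are writing the feed by string concatenation.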

Is it bad that Atom would need something like a tutorial? Probably. Is it too high a price to pay? Probably not. After all, more strict guidelines for the content are good for reader software. I thought "maybe if there's a way to create a simple feed without all the content-type stuff..." but then everyone would do that, and ignore the rest, wouldn't they.

Of course, maybe I misunderstood the whole issue... comments and clarifications on this area would be most welcome.

I guess there's no silver-bullet solution to this. The price of more strict definitions is loss of (some) simplicity. The comparison between a language with weak typing (say LISP) and one with strong typing (say, Java) comes to mind when comparing RSS and Atom in this particular sense. I think that I would go with RSS when I can, since it will be more forgiving... on the other hand I do like strong typing. But should content be "strongly typed"? I'll have to think more about this.

Interesting stuff nevertheless.

PS: there's a hidden feature for the search. It's a hack, yes. It might not work forever. Still worth checking the code for it though :-)

Posted by diego on September 5, 2003 at 10:26 PM

now preparing...

...for a feature-freeze in clevercactus and the subsequent bugfixing-only period prior to final release. Part of that is installing a bug tracking system, and I'm now looking at Bugzilla, Scarab, FogBugz and JIRA. I've used Scarab internally up until now, but I want to deploy this in the open and I'm not too happy with Scarab's complexity. FogBugz looks nice and its price is reasonable, although if I'm not mistaken it is fully hosted (an ASP model) for the trial and requires Windows if deployed, which wouldn't work for me. JIRA looks nice, but I wonder if I could use the trial as I want to. Bugzilla might be the way to go then; only problem is that the setup is a complete nightmare. We'll see.

Categories: clevercactus
Posted by diego on September 5, 2003 at 1:59 PM

the mobile platform wars


So it only took a couple of days since Motorola bailed out of Symbian for a rumor to surface that they would be releasing a "Microsoft-powered" phone, sold through Orange, later this month. Recently there was another rumor (or news? I can't find a link) that Psion was moving to WinCE for its mobile devices. In the meantime, Linux is making inroads into all sorts of devices. Oh, and, by the way, PalmOS is down but not out yet.

See a pattern emerging here?

Call it the mobile platform wars. In which Symbian has the tactical advantage, but is, strategically, in an entirely different position.

Symbian, by design, allows its licensees to tailor their offering heavily for different devices, which is great for licensees short-term (allows them to obtain early lock-in on features), but not so great for Symbian as a long-term platform. Long term, a platform cannot survive like that, and by extension neither can its licensees. The platform splinters irreversibly, because even though the licensees achieve short-term early lock-in on features, the platform itself has no lock-in.

Symbian won over Palm through faster innovation and larger deployments, but now Symbian is the incumbent, and the game is different. The new entrants are not competing on features, but on platform homogeneity.

It has happened before: Look at UNIX in the 80s.

In theory, you could port applications between UNIX OSes by sharing more than 80% of the code between them through a "standard" called POSIX.

In practice, almost no one did it.

And so the POSIX-UNIX "standard" allowed itself to be overtaken by both Linux and NT. Because once platforms are established, it all goes back to third-party developer support. Why? Because users care about applications and devices, not about OSes. They don't care if, say, the memory space is 32-bit flat. Developers do. Which is why third-party development should be active and growing for any long-term platform.

Which requires a vibrant community. Which requires tinkerers and small developers, as well as big developers. Which requires simplicity and portability within the platform, and a low cost of entry (read: free, well-documented, well-supported, entry-level tools).

And is it a coincidence that, again, both a Windows variant and Linux are emerging as the greatest threat to an innovative platform? I don't think so.

Netscape, by the way, made similar mistakes with regards to developers. And they played a small but important role in their fall from grace.

But Symbian is improving, and listening. It isn't over. Yet.

Even if it comes to the worst, I can't see Symbian ceasing to exist. When you're talking about millions of devices sheer volume wins, so it's quite possible that in one or two years Nokia will just end up owning Symbian outright and it just will be Nokia vs. Microsoft.

But since Nokia is one platform and one vendor, Java would be a better choice.

Paradoxically, excellent Java support might allow Symbian to prosper by providing the best Java mobile platform around. Palm might yet come around as well and figure out that Java is their best weapon to fight the Microsoft/Linux juggernaut.

And in that case, once again, the only thing standing between us and yet another monoculture would be that sweet smell of digital coffee.

Categories: technology
Posted by diego on September 5, 2003 at 12:11 PM

one/There's a long, ragged trail of light that disappears into the distance, reflections of lamps hunching over the wet streets that seem to move away from where I'm standing, away from me. And yet there's no rain; there hasn't been a lot of rain this summer. I wonder why. Global warming, pollution. Natural cycle. It's always been this way, we just don't remember it. All of the above. Pick your favorite.

two/The sun plunged into those buildings and trees standing over the horizon a while ago: the sky is dark blue, turning darker. A blur of clouds insinuates itself in front of the moon. There are no noises, only the distant sounds of the city as it falls into a superficial sleep. Traffic. Sirens that come and go. Voices heard or imagined.

three/I suddenly realize that I've never seen a bird sleep. My mind pulls pseudo-knowledge out of nowhere: Birds don't sleep, or rather, they sleep while flying. I could look it up, but I don't want to. Instead, I wonder: if so, do they dream while sleep-flying? Do they dream of standing still?

four/Reading about post-modernist deconstructionists that say, for example, that everything is ideology or everything is politics, a question pops into my head: How exactly does Missy 'Misdemeanor' Elliott fit into that picture?

five/Nominee for Greatest invention of the century: the OFF button.

Categories: personal
Posted by diego on September 4, 2003 at 10:07 PM

spam gets weird

Received today:

Dimensional Warp Generator Needed

I'm a time traveler stuck here in 2003. Since nobody here seems to be able to get me what I need (safely here to me), I will have to build a simple time travel circuit to get where I need myself. I am going to need an easy to follow picture diagram for a simple time travel circut, which can be built out of (readily available) parts here in 2003. Please email me any schematics you have. I will pay good money for anything you send me I can use. Or if you have the rechargeable AMD dimensional warp generator wrist watch unit available, and are 100% certain you have a (secure) means of delivering it to me please also reply. Send a separate email to me at: [someemailaddress].
Do not reply back directly to this email as it will only be bounced back to you.

Thank You

LOL! I had received this once before a while ago. I assume that the purpose of this is to add your email address to a list for selling it later... but if they already have the email to send it to me... what are they doing? Confirming it? Who knows. Anyway. Definitely a weird and quirky spam-meme. Even weirder than the infamous Nigerian scam!

Ok back to work.

Categories: technology
Posted by diego on September 2, 2003 at 10:26 AM

time off...

...from blogging, or so it seems. Lots happening. Heads-down working on a number of things (around 6E10 things or so). Lately I've been writing longer entries, with a lot more detail, and I've liked it. Problem is, they take time (even if it is 20 minutes...). Now I don't have time, and yet I can't seem to switch into "link-only" mode. Oh well. More comments later.

Categories: personal
Posted by diego on September 1, 2003 at 5:14 PM

Copyright © Diego Doval 2002-2011.