Now blogging at diego's weblog. See you over there!

the lying game

From an article in the New Yorker:

These days, if you're looking for a bunch of New York writers, magazine editors and publishing types on a Friday night, track down Mr. Lethem, who has become a kind of mob boss among an ever-growing salon of poker-faced literati obsessed by the spiky parlor game they call Mafia. There's no money involved, everyone stays clothed, and the alcohol intake is surprisingly moderate--but to witness Mr. Lethem's disciples in the throes of their favorite game is to know that the stakes run high.
Having played roleplaying games in the past, I think this sounds like a lot of fun!

Categories: personal
Posted by diego on January 4, 2003 at 4:34 PM

migrating data structures

from the why-do-we-have-so-many-problems-with-program-versions dept.

Ever upgraded from, say, version 95 to version 98 of a certain program, and found that you had to "convert the files"? Ever felt the anguish of looking at that dreaded dialog window that says "Do you want to convert this file to the new version? Yes, No, Cancel" (whatever "Cancel" means)? Don't you just hate that?

Yeah, me too.

Well, these past few weeks I found myself in the unenviable position of being on the other side of the fence, having to deal with a new database format for spaces and wondering what to do about it. Sure, one option was to just wipe out the old DB and start from scratch: spaces is in alpha, and people know that they shouldn't be using it as their only app yet. In fact, this is the first road I planned to take.

But... but... as days passed, I got some feedback from users and realized that my own data would have to be exported/imported somehow (since I am using spaces as my only PIM), and I arrived at the conclusion that I had to write a migration tool.


This is not as easy as it sounds. You need to ship a program with two incompatible formats that won't step on each other, detect the old one, and make a conversion if necessary. All without bothering users or destroying information.
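A minimal sketch of that detect-and-convert step, in Python. The file names, the JSON layout, and the `format_version` field are all made up for illustration; they are not how spaces actually stores its data. The point is only the order of operations: look for the new format first, fall back to the old one, write the converted data to a separate file, and never touch the old file.

```python
import json
import os
import tempfile

OLD_DB = "store_v1.json"   # hypothetical old-format file name
NEW_DB = "store_v2.json"   # hypothetical new-format file name


def convert_v1_to_v2(old_data):
    """Convert the (assumed) v1 layout, a plain list of notes, to a v2 dict."""
    return {
        "format_version": 2,
        "items": [{"id": i, "text": text} for i, text in enumerate(old_data)],
    }


def open_store(directory):
    """Return the v2 data, converting an old v1 file if that is all we find."""
    old_path = os.path.join(directory, OLD_DB)
    new_path = os.path.join(directory, NEW_DB)

    # The new format wins if it is already there: nothing to convert.
    if os.path.exists(new_path):
        with open(new_path) as f:
            return json.load(f)

    if os.path.exists(old_path):
        with open(old_path) as f:
            new_data = convert_v1_to_v2(json.load(f))
    else:
        new_data = {"format_version": 2, "items": []}

    # Write to a temporary file first, then rename, so a crash mid-conversion
    # never leaves a half-written new database behind. The old file is kept.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(new_data, f)
    os.replace(tmp_path, new_path)
    return new_data
```

Because the old file is left in place, a failed or buggy conversion can be retried after deleting the new file, which is about the cheapest safety net available.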

As I struggled to find a good way to do this (which happened, eventually, and alpha 1.6 was released with an automatic conversion mechanism) so that users wouldn't have to deal with a problem I had created, I realized that this was much harder than it should have been. Then I realized why. I had never done it before.

I have years of experience with databases in real world settings, in applications deployed to thousands of users, in client and in server settings. I've had to design DBs, install them, maintain them... and yes, sometimes migrate their contents, but in a painful "by-hand" process that involved converting tables, etc., and making sure that data was not lost.

Spaces, being a consumer app, was different: the conversion had to happen automatically, and fast. It might look like a small distinction, but it's not. It's a whole different ballgame.

And, I wondered, why did this seem so strange? The answer: I had never been exposed to it.

In abstract terms, a database is a persistent collection of data structures of one sort or another. Both in Computer Science courses (at college or wherever) and in the "real world" we are taught and learn how to design the best data structures, we discuss their efficiency, their tradeoffs (size, performance, etc), their APIs. But rarely, if ever, is the topic of migration discussed. The famous Object-Relational Mapping Problem is, largely, a problem of "impedance" between different paradigms, but I've become convinced that it's also been created in part by this rigid thinking of structures as creatures that never, ever change, so when change happens we don't know what to do.

The roots of this are, I think, in the idea (also taught and practiced widely) that data structures, like programs, should be designed once, and then implemented, and that's it. There is no concept of evolvability built into how data structure design is taught (and expected to be performed).

This, of course, is just plain wrong. And APIs are not enough. Well-designed APIs are an important element of any migration (and they were what finally got me out of my self-created hole). But there is more. The API itself reflects the underlying data structure in one way or another, so the data structure itself should be analyzed, at least a bit, to understand what will be involved in migrating to a possible future (different) structure.

That is, data structures should not be designed to apply to all cases, but to migrate gracefully. As with test-oriented programming (one of the components of eXtreme Programming), where you write the test first and the code later, we should work on data migration first, making sure that one version is compatible with another in some automatic way (and that the migration process is already in place), and only then make the change.

Just like with test-oriented programming, this builds up confidence: since you know the data won't be lost, you are free to design better data structures as you realize what's needed for every particular case.
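To make the analogy concrete, here is a small sketch of what "migration first" might look like with Python's `unittest`. The record layout and the function `migrate_contact_v1_to_v2` are invented for this example: imagine a v1 contact that stores a single `name` string and a v2 that splits it in two. The tests pin down that nothing is lost before the new structure is used anywhere else.

```python
import unittest


def migrate_contact_v1_to_v2(old):
    """Hypothetical change: v1 stored one 'name' string, v2 splits it in two."""
    first, _, last = old["name"].partition(" ")
    # Carry over every field we do not explicitly transform.
    migrated = {k: v for k, v in old.items() if k != "name"}
    migrated.update({"version": 2, "first_name": first, "last_name": last})
    return migrated


class MigrationTest(unittest.TestCase):
    def test_no_field_is_lost(self):
        old = {"version": 1, "name": "Ada Lovelace", "email": "ada@example.com"}
        new = migrate_contact_v1_to_v2(old)
        self.assertEqual(new["version"], 2)
        self.assertEqual(new["first_name"] + " " + new["last_name"], old["name"])
        self.assertEqual(new["email"], old["email"])

    def test_single_word_name_survives(self):
        new = migrate_contact_v1_to_v2({"version": 1, "name": "Cher"})
        self.assertEqual(new["first_name"], "Cher")
        self.assertEqual(new["last_name"], "")


if __name__ == "__main__":
    unittest.main()
```

The second test is the kind of edge case that is easy to forget when the conversion is written after the fact, and cheap to catch when the test comes first.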

This doesn't change the data structure design process itself; it changes what comes before (the planning process) and what comes after (the implementation). The implementation in particular requires hooks that will allow you to perform the migration easily and automatically. Every database manager should include a version manager, with hooks to define when a particular object is of a given version, and, if necessary, how to convert it.
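One possible shape for such a version manager, sketched in Python. The class name `VersionManager`, the `version` field, and the two example converters are assumptions for the sake of illustration, not a description of any existing database manager. Each hook upgrades a record by exactly one version, and the manager chains them until the record is current.

```python
class VersionManager:
    """Registry of hooks that upgrade a record one version at a time."""

    def __init__(self, current_version):
        self.current_version = current_version
        self._converters = {}  # maps from_version -> converter function

    def converter(self, from_version):
        """Decorator registering a hook that upgrades from_version to the next."""
        def register(func):
            self._converters[from_version] = func
            return func
        return register

    def version_of(self, record):
        # Assumption: records written before versioning existed carry no tag.
        return record.get("version", 1)

    def upgrade(self, record):
        """Apply hooks in sequence until the record is at the current version."""
        record = dict(record)  # work on a copy; never mutate the stored original
        while (version := self.version_of(record)) < self.current_version:
            if version not in self._converters:
                raise LookupError(f"no converter registered for version {version}")
            record = self._converters[version](record)
            record["version"] = version + 1
        return record


manager = VersionManager(current_version=3)


@manager.converter(1)
def add_tags(record):
    # Hypothetical v1 -> v2 change: every record gains an empty tag list.
    return {**record, "tags": []}


@manager.converter(2)
def rename_body_to_text(record):
    # Hypothetical v2 -> v3 change: the 'body' field is renamed to 'text'.
    record = dict(record)
    record["text"] = record.pop("body", "")
    return record
```

Calling `manager.upgrade({"body": "hello"})` runs both hooks in order, and a record that is already at version 3 passes through unchanged. Chaining single-step converters means each new release only has to add one hook, rather than a conversion from every older version.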

In all cases (and this is a UI problem more than anything) the user should barely notice that something has changed. If there is a long operation involved, yes, a progress bar of some sort will be necessary. But nothing more.
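As a rough illustration, the conversion loop only needs to expose a progress callback; what the UI does with it (a progress bar, or nothing at all for short runs) is a separate decision. The function name `migrate_all` and its parameters are hypothetical.

```python
def migrate_all(records, convert, on_progress=None):
    """Convert every record, reporting (done, total) after each one."""
    total = len(records)
    migrated = []
    for done, record in enumerate(records, start=1):
        migrated.append(convert(record))
        if on_progress is not None:
            on_progress(done, total)
    return migrated
```

A caller could pass something like `lambda done, total: bar.update(done / total)`, where `bar` stands for whatever progress widget the application already has.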

It's time for programs to be "responsible" and take care of things transparently instead of involving the user in decisions over things they don't want to know about. For all of us developing applications, having more practice with data structure migration (as opposed to simply data migration, since developers deal with the code) and how to automate it would go a long way towards that.

Categories: technology
Posted by diego on January 4, 2003 at 4:32 PM

heavier than heaven

In the last couple of days I re-read Heavier than Heaven, a biography of Kurt Cobain by Charles Cross (just as I wanted back at the end of September). Such a good book. It's one of the best books I've read in a while, and certainly one of the best biographies I've come across. Sometimes he drifts off into fictional territory (in particular, when describing Cobain's actions just before he shot himself, with details such as where he sat as he wrote the suicide note, or which song from R.E.M.'s Automatic for the People was playing on the stereo as he finished writing it) but it doesn't matter: it feels true. And it doesn't change the essence of Cobain's story.

Now I should go back to finishing Molloy and Dispatches but for some reason I am tempted by The Great Gatsby. We'll see.

Categories: personal
Posted by diego on January 4, 2003 at 12:44 PM

the responsibilities of writers

An article from The Guardian. I agree with some of the things he says, but I think he ignores the side of writing that is simply entertainment (and, these days, "commercial events" such as the release of a book by, say, Michael Crichton). Entertainment-oriented writing will not necessarily be art, and it might not necessarily have a message beyond "some people are bad." The portion of the article I like best is where he describes how the story carries the writer forward, rather than the other way around: it's definitely true.

Categories: personal
Posted by diego on January 4, 2003 at 12:36 PM

the fires of war

From CNN:

In Kuwait, the Persian Gulf War left behind heavy environmental damage. Day vanished into night, black rain fell from the sky, and a vast network of lakes was born ... lakes of oil as deep as six feet.

Categories: personal
Posted by diego on January 4, 2003 at 12:34 AM

Copyright © Diego Doval 2002-2011.