conversation finder


My next step is to write something different to get the gears moving in my head again, and Don's conversation category idea is appealing. So, let's do that. :)

Don described it as:

A conversation aggregator subscribes to the category feed of all the participants and merges them into a single feed and publishes a mini-website dedicated to the conversation. The 'referee' of the debate or the conversation moderator gets editorial rights over the merged feed and the mini-website. Hmm. This stuff is very close to what I am currently working on so I think I'll slip this feature in while I am at it.

Since this is probably too big to do in a quick & dirty fashion, I was a little worried. But last night I thought of a different approach. How about something that finds conversations, rather than subscribing to certain categories?

After all, we already have a mechanism to define a conversation thread: the permalink. Generally when you're in a cross-blog thread, you point back at the last "reply" from the other person. A cross-blog thread also has the advantage of being a directed graph, with a definite starting point. So permalinks and some kind of graph traversal thingamagic could be used to find the threads that exist, and maybe are in progress.

As Don notes, sometimes you might refer to the other party by name, or make oblique references. That could be step two, using text-based search to add some more information to the graph formation. But let's say we start with permalinks only.
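To make the permalinks-only idea a bit more concrete, here is a minimal sketch of the graph step in Python. It assumes the crawl has already produced a list of (post, linked post) permalink pairs; the function name `find_conversations` and the choice to identify a blog by its hostname are illustrative assumptions, not part of Don's description. It groups posts that are connected by links, keeps only the groups that span more than one blog, and reports the posts that link to nothing else in the group as the starting points.

```python
from collections import defaultdict
from urllib.parse import urlparse


def find_conversations(edges):
    """Group posts into cross-blog threads.

    edges: iterable of (from_permalink, to_permalink) pairs, meaning
    "the first post links to the second". Hypothetical input format.
    """
    outgoing = defaultdict(set)    # directed: post -> posts it points back at
    neighbours = defaultdict(set)  # undirected view, used for grouping
    for src, dst in edges:
        outgoing[src].add(dst)
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    threads = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collects every post reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component

        # Assumption: one hostname == one blog.
        blogs = {urlparse(post).netloc for post in component}
        if len(blogs) < 2:
            continue  # links within a single blog are not a cross-blog thread

        # A starting point is a post that does not point back at anything.
        starts = sorted(p for p in component if not outgoing.get(p))
        threads.append({"start": starts, "posts": sorted(component)})
    return threads
```

Grouping by connected component is only one possible reading of "graph traversal"; it ignores dates and treats every link between two crawled posts as a reply, which is probably too generous for a real crawl.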

Hm. Okay. So what do I need for this? First things that come to mind:

  • A crawler
  • A DB (the tables, I mean)
  • A parser (to find the set of links)
  • The algorithm to find the conversations
  • Some kind of web front end to make it more usable?

Neither the crawler nor the parser has to be super-sophisticated, so maybe they are doable in a few hours. Or a couple of days?
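As a rough idea of how unsophisticated the parser could be, here is a sketch that pulls the set of links out of a page using only Python's standard library. The names `LinkExtractor` and `extract_links` are made up for illustration, and the rule of keeping only http/https links is an assumption about what counts as a candidate permalink.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                absolute = urljoin(self.base_url, value)
                # Assumption: only web links can be permalinks.
                if absolute.startswith(("http://", "https://")):
                    self.links.add(absolute)


def extract_links(html, base_url):
    """Return the set of absolute http(s) links found in `html`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Fragments (the part after `#`) are kept on purpose, since some blogs use them as part of the permalink.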

This sounds like a good starting point. First step should be DB & crawler. More later!
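For the DB side, something like the following two tables might be enough to start with. This is only a sketch using SQLite through Python's `sqlite3` module; the table names, columns, and helper functions (`open_db`, `save_post`, `reply_edges`) are all assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: one row per crawled post, one row per outgoing link.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    permalink TEXT PRIMARY KEY,
    blog      TEXT NOT NULL,
    posted_at TEXT
);
CREATE TABLE IF NOT EXISTS links (
    from_permalink TEXT NOT NULL,
    to_url         TEXT NOT NULL,
    PRIMARY KEY (from_permalink, to_url)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_post(conn, permalink, blog, posted_at, outgoing_links):
    """Store one crawled post together with the links found in it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO posts VALUES (?, ?, ?)",
            (permalink, blog, posted_at),
        )
        conn.executemany(
            "INSERT OR IGNORE INTO links VALUES (?, ?)",
            [(permalink, url) for url in outgoing_links],
        )


def reply_edges(conn):
    """Links whose target is itself a crawled post: candidate 'reply' edges."""
    rows = conn.execute(
        "SELECT l.from_permalink, l.to_url "
        "FROM links l JOIN posts p ON p.permalink = l.to_url "
        "ORDER BY 1, 2"
    )
    return rows.fetchall()
```

The join in `reply_edges` is what turns a pile of raw links into graph edges: a link only counts if it points at a permalink the crawler has actually stored.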

Categories: soft.dev
Posted by diego on December 1 2004 at 6:52 AM
Comments (please see the comments & trackback policy).

BottomFeeder already supports this - has for about a year now

Posted by: James Robertson at December 1, 2004 12:51 PM

James: That's cool. What I had in mind is something that would be more similar to a search engine of sorts than an aggregator, almost conversation discovery, rather than conversation tracking.

How would this be different? For example, if you just subscribe to two blogs, an aggregator is not likely to help you find long-standing conversations: the RSS feeds only cover the last few entries, so it simply won't have the data. So doing a full crawl and then inferring the conversation threads sounds interesting to me.

Now, I don't think this is going to be some kind of magical potion of an app, it's just something that would be both interesting to do and help me clear my head a bit. I'd probably do it even if it existed exactly as I imagine it, just for the heck of it. :)

Posted by: Diego at December 1, 2004 1:03 PM

It seems like you could leverage off of Technorati...

Posted by: Geof F. Morris at December 1, 2004 2:51 PM

I thought of Feedster, but Technorati could be useful as well. Either way, just for this I want to do it all top-to-bottom. More fun that way. Maybe later it can be redone using a previously existing index like Feedster. :)

Posted by: Diego at December 1, 2004 5:29 PM

Copyright © Diego Doval 2002-2007.