R. Bot Olivaw
R. Bot Olivaw is the bot for the conversation engine. Here's what it does:
- It downloads only robots.txt (used to determine which parts of a site it must stay out of) and pages with a "text/html" content type. These are the only kinds of pages the conversation engine uses at the moment.
- It uses ETag and Last-Modified headers to download only content that has changed (it performs a HEAD request to verify this before doing a full GET).
- It supports GZIP- and Deflate-encoded streams, to minimize bandwidth usage when the site supports either of those encodings.
- It doesn't visit a particular page more than once a week.
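The rules in the list above can be sketched in a few small functions. This is only an illustration in Python, not the bot's actual code: the function names, the lowercase-keyed header dictionaries, and the choice to treat a page with no comparable validators as changed are all assumptions.

```python
import gzip
import zlib
from datetime import datetime, timedelta

# Assumption: "once a week" is taken to mean a fixed seven-day interval.
REVISIT_INTERVAL = timedelta(days=7)


def is_wanted(content_type):
    """Accept only text/html, ignoring parameters such as charset."""
    return content_type.split(";")[0].strip().lower() == "text/html"


def due_for_visit(last_visit, now):
    """True if the page was never fetched or was last fetched a week or more ago."""
    return last_visit is None or now - last_visit >= REVISIT_INTERVAL


def has_changed(head_headers, cached):
    """Compare ETag / Last-Modified from a HEAD response with the stored values.

    Both arguments are dicts keyed by lowercase header name (an assumption).
    If no validator is present on both sides, the page is treated as changed.
    """
    compared = False
    for name in ("etag", "last-modified"):
        new, old = head_headers.get(name), cached.get(name)
        if new is not None and old is not None:
            if new != old:
                return True
            compared = True
    return not compared


def decode_body(body, content_encoding):
    """Undo gzip or deflate Content-Encoding; return other bodies untouched."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            return zlib.decompress(body)  # zlib-wrapped deflate
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    return body
```

A crawler built along these lines would call due_for_visit first, then issue the HEAD request and call has_changed, and only then perform the full GET and pass the body through decode_body.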
Bot source IP address
Right now I'm using a single host (my own) to do the spidering, so the bot's requests have a source IP of 67.18.141.130. If a request claims to be the bot but comes from a different address, it's not me. :)
User Agent string
Here's R. Bot Olivaw's current User-Agent string:
R. Bot Olivaw/0.1 (+http://www.dynamicobjects.com/cengine/rbot.html)
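If you want to check your logs for genuine visits, the source address and User-Agent given above can be tested together. The following Python sketch is only a suggestion; the function name and constants are hypothetical, and both values are described above as current, so they may change.

```python
BOT_ADDRESS = "67.18.141.130"   # the single crawl host named above; may change
BOT_PRODUCT = "R. Bot Olivaw/"  # name before the slash; the version after it may change


def looks_like_rbot(remote_addr, user_agent):
    """True only if both the source IP and the User-Agent prefix match."""
    return remote_addr == BOT_ADDRESS and user_agent.startswith(BOT_PRODUCT)
```

For example, looks_like_rbot("10.0.0.1", "R. Bot Olivaw/0.1 (+http://www.dynamicobjects.com/cengine/rbot.html)") is False, because the User-Agent matches but the address does not.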
The name
The bot's name is a reference to R. Daneel Olivaw, one of the main characters in Isaac Asimov's Robot novels, who also appears in the later Foundation books.
Dec 2 2004