it probably wasn't google

Ok, after a bit of digging I arrived at the conclusion that the problem I mentioned yesterday, of Google's click tracking breaking my logfiles, was probably not Google. Rather, it looks like some search engine that is taking Google's results for a query, doing its own parsing, and possibly presenting them as its own. The hint: the referers were wrong or generally empty (something I should have noticed yesterday but didn't), and so was the user-agent field.
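For anyone who wants to check their own logs for the same pattern, here is a rough sketch of the kind of filter I mean. It assumes the logs are in Apache's "combined" format (I haven't said what format mine are in, so treat that as an assumption), and the names `COMBINED`, `is_blank` and `suspicious_hits` are made up for the example. It only catches the "empty" case, where both referer and user-agent are blank or "-"; a wrong-but-present referer would need a different check.

```python
import re

# Assumed layout (Apache "combined" log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def is_blank(value):
    """True if a logged field is empty or the '-' placeholder."""
    return value.strip() in ("", "-")


def suspicious_hits(lines):
    """Return (host, request) pairs where referer AND user-agent are both blank."""
    hits = []
    for line in lines:
        m = COMBINED.match(line.strip())
        if m and is_blank(m.group("referer")) and is_blank(m.group("agent")):
            hits.append((m.group("host"), m.group("request")))
    return hits
```

Feeding it the lines of an access log (for example `suspicious_hits(open("access.log"))`, with a hypothetical file name) gives a list of hosts and requests worth a closer look. Lines that don't match the assumed format are simply skipped.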

So it probably wasn't Google. Ok. But this small "incident" raises a number of interesting questions. Not just for Google (how do you stop something like that from happening? My guess is that throwing lawyers at them is the only real option, or at most carefully monitoring the source IPs of requests... but then again, if whoever is doing this is smart they could get around that too), but also for end users on both sides. On my side, this creates a problem that I have to keep an eye on, and the person doing the search is looking at something that looks legitimate but isn't. Hm. The problems of openness.

PS: Note that in yesterday's post I mentioned the webmasterworld thread that clarified that Google was tracking through JavaScript. The beta site (new design) clearly tracks directly through URLs. What is not clear is whether the beta site has a new form of tracking, or whether the tracking will go back to JavaScript when the new site is released. We'll have to wait and see.

Posted by diego on March 8 2004 at 11:01 PM

