Google Accelerator Breaks The Web?

SomethingAwful is a relatively old Internet humor site, which may not be to your tastes. That said, their article and complaints about the effects of Google Accelerator seem to be well backed up, and I haven't seen this information come through my normal channels; even some of the dedicated Google weblogs don't seem to have picked this up yet... well, at least the ones I can get through to at the moment.

Apparently Google Accelerator partially works by using your personal cookies to access a site, and someone coming in later can then see what the site sent you. Because the site thinks that the cookie is adequate identification, it can send all sorts of private information, such as message posts for some private group, or... well... anything else. This image, showing the various logins whose pages one person was able to see, is offered up as evidence.

It's an Internet humor site, so I can't guarantee that this is legit, and I'd like to see corroborating evidence before jumping to conclusions. (Hence the question mark in the post title.) But this referenced forum thread from their site seems pretty earnest, and if this is a joke, it is not in their usual style and it involves a lot of people. (Note, if you want to try to replicate the problem, the person who created that image later in that thread mentions you should "Go to preferences, tell it never to check for updates.") Moreover, as an experienced web developer, I can see that there certainly are plausible caching schemes that would result in this particular hole occurring.

I think I can see the logic on the developer side. A cookie is part of the web page request. Technically, you can't return a page accessed with one cookie when a user with a different cookie requests it, because the requests are not identical and the results could wildly differ. So, to do this "right", you need to treat the cookie effectively as part of the URL. However, thanks to the pervasiveness of cookies used for advertising and user tracking, the vast majority of cookies have no effect on the page content, and honoring them all would simply break the caching; visitors to CNN, for instance, get a "CNNid" cookie according to my browser, but I doubt that affects much of the site. So Google chooses to ignore that cookie, because the alternative is to request the page fresh from CNN for every single user, significantly slowing them down.
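To make that trade-off concrete, here is a minimal sketch in Python. It is purely illustrative and is not a description of how Google Accelerator is built; the `cache_key` helper, the `news.example` URL, and the cookie values are all made up for the example. It counts how many times a shared cache would have to go back to the origin server when the cookie is treated as part of the key versus when it is ignored.

```python
from typing import List, Optional, Tuple

def cache_key(url: str, cookie: Optional[str], include_cookie: bool) -> Tuple[str, str]:
    """Build the lookup key for a hypothetical shared cache."""
    if include_cookie:
        # "Right" behavior: the cookie is effectively part of the URL.
        return (url, cookie or "")
    # "Fast" behavior: every cookie is treated as irrelevant.
    return (url, "")

def count_origin_fetches(requests: List[Tuple[str, str]], include_cookie: bool) -> int:
    """Count cache misses, i.e. requests that must go to the origin server."""
    seen = set()
    fetches = 0
    for url, cookie in requests:
        key = cache_key(url, cookie, include_cookie)
        if key not in seen:
            seen.add(key)
            fetches += 1
    return fetches

# Three visitors, same page, each with their own (made-up) tracking cookie.
requests = [
    ("http://news.example/", "CNNid=aaa"),
    ("http://news.example/", "CNNid=bbb"),
    ("http://news.example/", "CNNid=ccc"),
]

strict_fetches = count_origin_fetches(requests, include_cookie=True)
loose_fetches = count_origin_fetches(requests, include_cookie=False)
print(strict_fetches, loose_fetches)
```

With the cookie in the key, every visitor is a cache miss; without it, only the first one is. That difference is the whole appeal of ignoring cookies.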

However, Google can't distinguish, even in theory, between a tracking cookie and a cookie used as a login ID, so if they ignore the cookies, or even just play a little fast and loose with them, they'll leak private information like a sieve. I assume they chose this because the alternative is that they can't accelerate browsing in general as much, but I'm not sure that doing it safely is technically feasible.
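A small Python sketch of how such a leak could happen, again as supposition rather than a claim about Google's actual code. The `origin_server` function, the `SharedCache` class, and the `forum.example` URL are all hypothetical; the "site" here simply trusts whatever the session cookie says.

```python
def origin_server(url, cookie):
    """Hypothetical site that treats the cookie as adequate identification."""
    user = cookie.split("=", 1)[1] if cookie else "guest"
    return f"Private messages for {user}"

class SharedCache:
    """Toy shared cache sitting between many users and the origin server."""

    def __init__(self, ignore_cookies):
        self.ignore_cookies = ignore_cookies
        self.store = {}

    def get(self, url, cookie):
        # Ignoring the cookie means every user shares one entry per URL.
        key = url if self.ignore_cookies else (url, cookie)
        if key not in self.store:
            self.store[key] = origin_server(url, cookie)
        return self.store[key]

inbox = "http://forum.example/inbox"

# Cache that ignores cookies: Alice's page is stored, then handed to Bob.
leaky = SharedCache(ignore_cookies=True)
leaky.get(inbox, "session=alice")
bob_sees_leaky = leaky.get(inbox, "session=bob")

# Cache that keys on the cookie: Bob gets his own page.
safe = SharedCache(ignore_cookies=False)
safe.get(inbox, "session=alice")
bob_sees_safe = safe.get(inbox, "session=bob")

print(bob_sees_leaky)
print(bob_sees_safe)
```

The cache has no way to know that `session=...` matters while `CNNid=...` does not; both are just opaque strings in a request header.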

There is speculation on the forum that Something Awful isn't sending the proper no-cache headers, so this is quite likely an interaction between a flaw in their software and a flaw in Google's.
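For reference, the check a well-behaved shared cache would make on those headers might look something like the following Python sketch. The function name `shared_cache_may_reuse` is my own, and the logic is deliberately conservative and simplified: it only looks at three `Cache-Control` directives and ignores everything else a real cache has to consider (`Vary`, `Expires`, `Authorization`, and so on).

```python
def shared_cache_may_reuse(headers):
    """Return False if Cache-Control forbids a shared cache from
    serving this response to another user without revalidation.

    Simplified: only 'private', 'no-store' and 'no-cache' are checked.
    """
    cache_control = ""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "cache-control":
            cache_control = value.lower()

    directives = set()
    for part in cache_control.split(","):
        part = part.strip()
        if part:
            # Drop any "=value" suffix, e.g. max-age=0 -> max-age.
            directives.add(part.split("=", 1)[0])

    blocked = {"private", "no-store", "no-cache"}
    return not (blocked & directives)

# A page that marks itself as per-user should not be shared.
private_page = {"Content-Type": "text/html", "Cache-Control": "private, max-age=0"}
# A page that says nothing is not explicitly forbidden by this check.
silent_page = {"Content-Type": "text/html"}

print(shared_cache_may_reuse(private_page))
print(shared_cache_may_reuse(silent_page))
```

A site that sends per-user pages without any such directive looks, to this check, exactly like a page that is safe to share, which is how a flaw on each side could combine into a leak.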

I can't use the software myself, and the above is merely supposition to show it is a plausible accusation; it is not necessarily what is happening, nor am I claiming that it is.

This reminds me of the old Third Voice days; forget my communication ethics for a second and just look at the practical issues. When you stand between a user and the entire internet, you take on a grave responsibility. A single programming error in your service, and you may break the security not merely of your own product, but the entire Internet for every user who uses your product. Scale matters; even though other caching products probably have made this same mistake at some point, none grew large as quickly as Google's. Google is leaking out private information, and while Something Awful is concerned about their private forum accounts, the exact same hole on an insufficiently-paranoid website will leak credit card information, or equally valuable information. Third Voice also stood between the user and the web, and created a cross-site scripting vulnerability on every page on the web, no matter how well protected it was by the owners.

When you take the entire Internet under your wing, that is a massive responsibility. The Something Awful forums may not be doing everything correctly, but unfortunately, for better or for worse, non-conformant websites are part of the Internet. I'm not really interested in "fault" or nailing Google; I'm interested in the larger lesson, which is that no matter how you do it, taking on the entire Internet like this is something that can cause a lot of damage and must be done extremely carefully, if at all.

I'm sure this bug can be fixed (though almost certainly at great performance cost, eliminating any Google advantage over conventional web accelerators), but what about the next? As a user, are you willing to risk that? I believe it is possible to rationally answer "yes"; I'm not trying to force your hand. But it should be an informed choice, and yes, it is certainly my opinion that this is a bad trade for end users. (The balance may change once the bugs have been cleaned out, but how will we end users know when that is?)