I think there are few people as paranoid as I am about which Web cookies are being placed in my browser. In most of my browsers, I accept cookies only from the domain I'm connecting to, and I have the browser ask me to review every cookie before I accept it. There are problems with this approach. To quote from Wikipedia's article on HTTP cookies:
Cookies have some important implications on the privacy and anonymity of Web users. While cookies are only sent to the server setting them or one in the same Internet domain, a Web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third-party cookies.[...]
Advertising companies use third-party cookies to track a user across multiple sites. In particular, an advertising company can track a user across all pages where it has placed advertising images or web bugs. Knowledge of the pages visited by a user allows the advertisement company to target advertisement to the user's presumed preferences.
This is a nice synopsis. According to this explanation and conventional thinking, you should just not allow third-party cookies and your privacy will be intact (or at least better protected).
Enter Google Analytics
Google Analytics (GA) is a Web tracking service which Google offers free of charge to customers. Again, Wikipedia has a pretty good article about how it works. As a GA user, you add a little piece of "hidden" JavaScript to a Web page, and as people visit your Web site, cookies are placed in each visitor's browser to keep track of them as they move from page to page. GA keeps track of timestamps, IP addresses, referrers, browser type, etc. A GA customer gets access to a dashboard which shows useful statistics about his Web site. GA has the ability to drill down to specific pages, do time analysis, do geographic distributions, and generate a variety of reports. Don't get me wrong, the "good" part of Google Analytics is the ability for marketers, and those without access to traditional Web logs (or a good analysis program), to get quality Web analytic data about their sites.
Let's get to the "bad" part. In fact, it's not only bad, it's insidious. I'll explain. Since GA cookies come from the "originating" Web site rather than from Google, they are NOT third-party cookies, but they behave much like them. They allow the local GA user to track his Web site, but in order to make the report, that information also goes to Google. As more and more sites use GA, as users use Google Apps (and therefore identify themselves and their current location on the Internet to Google), it becomes possible for one company to track the nearly complete on-line behavior of a typical Internet user. That company is of course Google, by virtue of making life easier for their GA customers. It helps them better target on-line ads and AdWords to me and my typical use of the Web. Google Analytics is both brilliant and scary at the same time.
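To make the first-party point concrete, here is a minimal sketch of the kind of check a "block third-party cookies" setting performs. The host names and cookie names are hypothetical, and the domain comparison is a simplification (real browsers also consult a public-suffix list), so treat it as an illustration rather than how any particular browser works.

```python
def is_third_party(page_host: str, cookie_domain: str) -> bool:
    """Return True when the cookie's domain does not cover the page's host."""
    page = page_host.lower().rstrip(".")
    domain = cookie_domain.lower().strip(".")
    # First-party: the cookie domain equals the page host or is a parent of it.
    first_party = page == domain or page.endswith("." + domain)
    return not first_party


# Hypothetical page that carries an ad image and a GA script.
page = "www.example.edu"
cookies = [
    ("session", "www.example.edu"),  # the site's own cookie
    ("ad_id", "ads.example.net"),    # set while fetching an image from another domain
    ("__utma", ".example.edu"),      # GA-style cookie, set under the site's own domain
]

# Only the ad cookie is caught; the GA-style cookie passes as first-party.
blocked = [name for name, domain in cookies if is_third_party(page, domain)]
print(blocked)
```

The GA-style cookie is scoped to the visited site's own domain, so a third-party filter has no reason to stop it.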
The way I battle GA is to deny GA Web cookies while still allowing cookies which the originating site requires to do business with me. This is sometimes problematic. The previous version of GA required me to block at least 4 cookies per page. Sometimes pages included multiple elements that each had their own GA JavaScript in them. Sites like the RealAge site which Professor Chris Long blogged about require so many cookies for so many elements that it is literally impossible to take the RealAge quiz without allowing all cookies (including many, many GA cookies). I just gave up...
The new version of Google Analytics tries to place 11 cookies per instance of GA JavaScript in a Web page. This is getting ridiculous. Why not just "deny all" cookies for these sites? That, of course, works for some sites (e.g. The College of Education, Penn State Live, and various blogs by my ITS colleagues), but there are many sites using them (e.g. The Penn State Office of Human Resources) where I do need to interact with the site to do my job (as supervisor, as employer, etc.). Right now, most of these sites only use the 4-cookie version of GA. What happens when they start using the 11-cookie version? Will I persist or just give up?
As an exercise in my growing pain, browse the sites you normally would first thing in the morning, but do so with the "ask me every time" flag set on "Accept Cookies?" If you really want to see something, first delete all the stored cookies in your browser and all of the sites you have told your browser to "always allow/block cookies" from. (Hint: the ones that look like "__utm"something are GA cookies -- look at your cookie cache before you delete it. Do you have any?)
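If you would rather count than click, the same check can be sketched in a few lines. This assumes you have already exported your stored cookies as (site, cookie name) pairs, which is a hypothetical format, and that the GA cookies use the double-underscore names such as `__utma`, `__utmb`, `__utmc`, and `__utmz`. The sites listed are made up.

```python
from collections import Counter

GA_PREFIX = "__utm"  # assumption: classic GA cookie names (__utma, __utmz, ...)


def count_ga_cookies(stored):
    """Count GA-looking cookies per site from (site, cookie_name) pairs."""
    counts = Counter()
    for site, name in stored:
        if name.startswith(GA_PREFIX):
            counts[site] += 1
    return counts


# Hypothetical export of a browser's cookie store.
stored = [
    ("www.example.edu", "__utma"),
    ("www.example.edu", "__utmz"),
    ("www.example.edu", "session"),
    ("blog.example.org", "__utma"),
    ("shop.example.com", "cart"),
]

ga_counts = count_ga_cookies(stored)
for site, n in ga_counts.most_common():
    print(site, n)
```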
What can be done?
- One of the things I'm thinking about is a Firefox plug-in to intercept and deny all GA-type cookies. I know there are plug-ins which put incorrect information in the GA cookies and yield an incorrect report, but I think "poisoning" the information is wrong.
- You as a GA customer can stop using it.
- Penn State could issue a policy against the use of GA on Penn State Web sites, as our CIC peer Indiana University has.
The last suggestion is worth thinking about, but I think we can't do it unless we have an alternative -- a Penn State Web statistics/analytics service which any of us can use (regardless of the "official" nature of the Web site). As far as I know, Indiana does not provide a centrally supported alternative.
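Coming back to the first idea in the list, the deny-rather-than-poison rule is simple enough to sketch. This is not a Firefox plug-in; it uses the cookie policy hook in Python's standard library as a stand-in, assumes GA cookies can be recognized by a `__utm` name prefix, and the `DenyGACookies` class, `make_cookie` helper, and host name are all hypothetical.

```python
from http.cookiejar import Cookie, DefaultCookiePolicy
from urllib.request import Request


class DenyGACookies(DefaultCookiePolicy):
    """Refuse cookies whose names look like GA's; defer everything else."""

    def set_ok(self, cookie, request):
        if cookie.name.startswith("__utm"):  # assumption: classic GA naming
            return False
        # Everything else goes through the standard acceptance checks.
        return super().set_ok(cookie, request)


def make_cookie(name, value, domain):
    """Build a minimal version-0 cookie for the demonstration."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=False,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


policy = DenyGACookies()
request = Request("http://www.example.edu/")
decisions = {
    name: policy.set_ok(make_cookie(name, "x", "www.example.edu"), request)
    for name in ("session", "__utma", "__utmz")
}
print(decisions)
```

The site's own session cookie is still accepted, so the site keeps working, while the GA-style cookies are simply never stored and nothing false is reported back.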
Should I learn to stop worrying and love Google Analytics, or should I continue my Pyrrhic battle to keep my browser GA cookie free?
It's enough to make you lose your cookies...
