Google Knows Even More about You Than You Think

Privacy policy opacity isn't limited to Google. It's so prevalent, in fact, that the Federal Trade Commission warned the industry in February that online businesses will face increased regulation unless they produce privacy statements that explain in a "clear, concise, consumer-friendly and prominent" way what data the companies collect, how they use it and how users can opt out.

Google, however, contends that the concerns about opacity and the scope of data it collects are overblown. "I do push back on this notion that what we have is a greater privacy risk to users," says Mike Yang, product counsel in Google's legal department. Google, he says, gives users plenty of transparency and control. "There's this notion that an account has a lot more information than is visible to you, but that tends not to be the case. In most of the products, the information we have about you is visible to you within the service."

In fact, though, the data Google stores about you falls into two buckets: user-generated content, which you control and which is associated with your account; and server log data, which is associated with one or more browser cookie IDs stored on your computer. Server log data is not visible to you and is not considered to be personally identifiable information.

These logs contain details of how you interact with Google's various services. They include Web page requests (the date, the time and what was requested), query history, IP address, one or more cookie IDs that uniquely identify your browser, and other metadata. Google declined to provide more detail on its server log architecture, other than to say that the company does not maintain a single, unified set of server logs for all of its services.

Google says it won't provide visibility into search query logs and other server log data because that data is always associated with a physical computer's browser or IP address, not with an individual or that person's Google account name. Google contends that opening that data up would create more privacy issues than it would solve. "If we made that transparent, you would be able to see your wife's searches. It's always difficult to strike that right balance," Yang says.

You do have more control than ever before. Google says it removes user-generated content within 14 days for many products, but that period can be longer (it's 60 days for Gmail). For retention policies that fall "outside of reasonable user expectations or industry practice," Google says it posts notices either in its privacy policy or in the individual products themselves.

You can control the ads that are served up, either by adding or removing interest categories stored in Google's Ads Preferences Manager or by opting out of Google's DoubleClick cookie, which links the data Google has stored about you to your browser in order to deliver targeted advertising. For more information, see "6 ways to protect your privacy on Google."

Shuman Ghosemajumder, business product manager for trust and safety at Google, says users have nothing to worry about. All of Google's applications, he says, run on separate servers and are not federated in any way. "They exist in individual repositories, except for our raw logs," he says. But some information is shared in certain circumstances, and Google's privacy policies are designed to leave the company plenty of wiggle room to innovate.

Yang points to Google Health as an example. If you are exchanging messages with your doctor, you might want those messages to appear in Gmail or have an appointment automatically appear in Google Calendar, he says.
