Google admits data mining student emails in its free education apps

by Jeff Gould, SafeGov.org
Friday, January 31, 2014

When it introduced a new privacy policy designed to improve its ability to target users with ads based on data mining of their online activities, Google said the policy didn’t apply to students using Google Apps for Education. But recent court filings by Google’s lawyers in a California class action lawsuit against Gmail data mining tell a different story: Google now admits that it does data mine student emails for ad-targeting purposes outside of school, even when ad serving in school is turned off, and its controversial consumer privacy policy does apply to Google Apps for Education.

At SafeGov.org our work has long focused on the risks of allowing targeted online advertising into schools. This issue has come to the fore as companies like Google and Microsoft have launched a worldwide race to introduce their web application suites into as many schools as possible. In this article we review the background of this debate and then present important new evidence regarding the practices of one of the leading players, Google.

The suites in question are known as Google Apps for Education and Office 365 Education, respectively, and they include basic apps such as email, word processing, spreadsheets, live document sharing, simple web forms and messaging. Their key selling point is that they offer students something almost as good as a traditional office suite in the convenient format of a browser window, and – best of all for cash-strapped schools – they do so at no cost.

Of course, as the economist said, there is no such thing as a free lunch, and we must look carefully at the business motives behind these firms’ generosity. Here an important difference between the two leaders emerges. Both Google and Microsoft generate substantial revenues by selling online office suites to governments and enterprises for annual subscription fees. If the firms offer essentially the same suites to schools for free, it is surely in part because they hope that when students move into the workplace they will demand the same online tools they learned to use in school. This is a business model that is honest about its intentions and serves the interests of both students and the firms. However, there is an additional component in the Google business model that involves advertising, and this is where the trouble begins.

Both Google and Microsoft offer free ad-based email services to consumers – Gmail and Outlook.com (formerly Hotmail). Google’s Gmail pioneered the technique of targeting ads to users based on profiles of their interests. Google creates the profiles with the help of sophisticated software algorithms that sift through users’ past and present emails, record the things they search for on Google’s search engine, and track the web sites they visit (via the cookies placed on many sites by its DoubleClick ad-serving subsidiary).

The activity performed by these profiling algorithms is known as “data mining”, and their power to make accurate guesses about the tastes and likely behavior of the profiled users is quite remarkable. However, not every free consumer email service uses data mining to target ads. Microsoft’s Hotmail, for example, relied solely on demographic information (such as age, gender and location) provided by users when they registered. Hotmail’s successor Outlook.com continues this policy, promising that it “doesn't serve targeted ads based on email contents”. While the ad delivery methods used by the major email providers may differ, the basic idea of offering consumers free email in exchange for ads has proven extraordinarily successful. The top three providers – Google, Microsoft, and Yahoo – together count over one billion users. SafeGov does not take a position on the methods used to target ads in these services. In our view all are legitimate business models, provided that consumers are fully informed of how their data is used and have given their consent.

Whether or to what degree these last two conditions are actually met by specific services such as Gmail or Outlook.com is of course a pertinent question. Currently Google faces legal challenges to its use of consumer data mining in both the U.S. and the European Union. EU data protection authorities in particular have determined that Google fails to inform consumers properly of its conduct or obtain their consent, while a major class action lawsuit in California advances similar accusations. Although Outlook.com appears to have avoided such challenges to date, we should certainly expect that regulators and courts will hold it to the same high standards as Gmail.

The Google and Microsoft education suites discussed above operate under quite different rules than the firms’ ad-based consumer email services. Office 365, developed from Microsoft’s enterprise server-based software packages such as Exchange and SharePoint, was never designed to serve ads and does not have the functionality to create ad-targeting user profiles based on data mining. Microsoft’s Office 365 web site makes an entirely unambiguous pledge in this regard: “We do not mine your data for advertising purposes.”

Google Apps for Education, by contrast, has a more ambivalent policy regarding advertising. While Google pledges not to serve ads to students without schools’ permission, its Google Apps suite, which is a repurposed version of Google’s Gmail and other consumer services, was designed from the ground up to include ad-serving as well as highly sophisticated user profiling and data mining capabilities. Google explicitly offers schools the option of enabling ad serving to student users of Google Apps for Education. Although it does not yet offer to share the resulting ad revenues with schools that choose the ad-serving option, it has clearly left the door open to such revenue sharing in the future. Indeed, it is hard to see why Google would explicitly write the ad-serving option into its standard contract with schools if it did not hope one day to make ads for students a default and perhaps even mandatory feature of Apps for Education.

Targeted ads are worth more than untargeted ads, because advertisers will pay more to put their ads in front of customers who are more likely to buy. The uncanny power of Google’s data mining and user profiling algorithms to target ads effectively has made it the world’s largest advertising company. To cite just one data point, the Mountain View giant last year generated more ad revenue in the American market than the entire U.S. newspaper industry. While we take Google’s word that it does not serve ads to its student users unless it has permission from schools, an important question that until now has gone unanswered is whether the targeting algorithms that power Gmail are still running in Google Apps for Education even when ad serving is turned off. Google’s own web site once supplied an explicit and quite satisfactory answer to this question. Specifically, in a FAQ on its web site devoted to Google Apps for Education, the firm promised that:

“If you are using Google Apps (free edition), email is scanned so we can display contextually relevant advertising in some circumstances. Note that there is no ad-related scanning or processing in Google Apps for Education or Business with ads disabled.”

However, at some point during the past year the crucial second sentence in this statement was deleted from Google’s web site.

Of course it’s difficult to draw firm conclusions from fleeting changes in the wording on a vendor’s web page. Accordingly SafeGov has been searching for further evidence that would help to resolve one way or the other the question of Google’s data mining practices in Apps for Education. When a trove of court documents from a class action lawsuit against Google in U.S. Federal Court was recently made public, we decided to do a little data mining of our own, albeit with tools less sophisticated than Google’s. What we found is worthy of attention.

In a remarkable pretrial document filed by Google’s lawyers, Google explicitly admits for the first time that it scans the email of Google Apps for Education users for ad-serving purposes even when ad serving is turned off. The issue at stake in the case is whether Google has properly informed its users and obtained their consent for data mining and ad serving in Gmail and, by extension, in Google Apps for Education. In the filing in question Google’s lawyers seek to prove that email users must have consented to Google’s email scanning practices – if only “impliedly” – because these practices have been widely discussed in the press and can thus be considered to be universally known. The lawyers seek to establish this point by supplying a long list of published articles that discuss these practices.

Regarding Google Apps for Education in particular, the lawyers state that schools which contract with Google to provide Google Apps “have a contractual obligation to obtain their students’ and end users’ consent to Google’s automated scanning”. The document then goes on to list a number of examples of how educational institutions have carried out this duty to inform users and obtain their consent for scanning. Notably the Google filing cites the web site of the University of Alaska as an exemplary instance of such compliance:

The University of Alaska (“UA”) has a “Google Mail FAQs,” which asks, “I hear that Google reads my email. Is this true?” The answer states, “They do not ‘read’ your email per se. For use in targeted advertising on their other sites, and if your email is not encrypted, software (not a person) does scan your mail and compile keywords for advertising. For example, if the software looks at 100 emails and identifies the word ‘Doritos’ or ‘camping’ 50 times, they will use that data for advertising on their other sites.” Attached as Exhibit 79 is a true and correct print out of UA’s Google Mail FAQ page, which is also available at www.alaska.edu/google/faqs/general/#mail (last visited Nov. 13, 2013). [Declaration of Kyle C. Wong in Support of Google Inc.’s Opposition to Plaintiffs’ Motion for Class Certification, p. 41]

In other words, Google’s own lawyers here confirm in a sworn public court declaration that even when ad serving is turned off in Google Apps for Education, the contents of users’ emails are still being scanned by Google in order to target ads at those same users when they use the web outside of Google Apps (for example, when watching a YouTube video, conducting a Google search, or viewing a web page that contains a Google+ or DoubleClick cookie). This statement thus appears to be what American lawyers call “an admission against interest”.

Google’s data mining and ad serving practices in the versions of Google Apps it provides to public sector institutions such as government administrations and schools have long been a subject of controversy. Media and regulator interest in the issue surged in early 2012 when Google launched a sweeping consolidation of the many privacy policies governing its individual products into a single overarching document. The new unified privacy policy was intended, among other things, to facilitate Google’s ability to combine information about users extracted from its different services – such as Gmail, YouTube, Google search, etc. – into a single integrated profile of each user, thereby enabling ever more accurate – and so more profitable – ad targeting. However, Google vehemently denied that this new consumer privacy policy would apply to governments and schools. Indeed, a senior Google executive told the Washington Post that:

“Enterprise customers using Google Apps for Government, Business or Education have individual contracts that define how we handle and store their data. As always, Google will maintain our enterprise customers’ data in compliance with the confidentiality and security obligations provided to their domain. The new Privacy Policy does not change our contractual agreements, which have always superseded Google’s Privacy Policy for enterprise customers.”

But Google’s court filings in the California class action suit discussed above unambiguously contradict this statement. In one of these filings, a Google employee states that:

Google and [the University of] Hawaii executed an agreement titled “Google Apps Education Edition Agreement” on or about June 21, 2010… The agreement places the responsibility to obtain “any necessary authorizations from End Users to enable Google to provide the Services” on Hawaii, the “Customer.” The “Services” includes Gmail… The agreement also requires Google to comply with the Customer Privacy Notice… and the End User Privacy Notice…

In other words, Google here acknowledges that its standard consumer privacy policy is an integral part of its standard Google Apps for Education contract. It is still possible that, in contrast to the situation described in the Google court filing quoted above, some educational institutions have managed to strike individual agreements with Google that do indeed “supersede” the standard privacy policy. If they exist, however, Google has curiously not chosen to make any such agreements public. Indeed, there is evidence that Google imposes “gag clauses” on schools that sign contracts for its Google Apps for Education, forbidding them from disclosing the terms and conditions they have received.

In sum, then, we have learned from Google’s own statements that:

  1. Ad serving remains a standard option in Google Apps for Education,
  2. Even when ads are turned off (as they currently are by default) Google still data mines student emails for ad targeting purposes, and
  3. Google’s consumer privacy policy is incorporated in standard Google Apps for Education contracts.

It is a natural and very plausible – though of course not certain – inference from these facts that Google intends one day to make advertising a standard feature of the version of Google Apps it offers to schools.

Where is the harm in allowing targeted advertising in the online web applications that schools provide to their students? This is a vast and important question that we lack the space to address here but will investigate in future work. Suffice it to say as a starting point that SafeGov surveys of parents around the world have unfailingly shown a very high level of parental opposition to such advertising and the intrusive profiling of student online activity that makes it possible – typically in the 80% to 90% range[1]. We believe that policy makers, education authorities and data protection regulators will not choose to ignore the will of parents on this issue. Stay tuned for further research from SafeGov on this vitally important topic.


[1] For example, see results from our U.S., Australia, and Malaysia parent surveys; results from other countries are forthcoming.
