Two days before Google was set to publicly share over 100,000 images of chest X-rays obtained under a partnership with NIH, the federal health agency called Google and informed company researchers that some of the images included personal health data—leading Google to promptly cancel the project, Douglas MacMillan and Greg Bensinger report for the Washington Post.
Google's project with NIH
In 2017, Google partnered with NIH to collect 112,000 X-ray images taken from more than 30,000 patients, many of whom had lung disease, according to emails obtained by the Post through a Freedom of Information Act request.
According to MacMillan and Bensinger, Google had planned to use the images to show that its artificial intelligence (AI) teaching tool, TensorFlow, could teach computers to recognize which X-rays contained markings of various diseases. Google also had planned to make the raw X-ray data publicly available to other AI researchers, MacMillan and Bensinger report.
NIH shared the images with Google in the summer of 2017 and, according to emails obtained by the Post, NIH and Google employees worked together to rid the images of personal patient data. According to MacMillan and Bensinger, Google was working toward a July 21, 2017, deadline to announce the project and release the data.
However, on July 19, 2017, two days before Google was scheduled to announce the project and publish the images, NIH alerted Google that agency researchers had found that a number of the X-ray images still contained personally identifiable information, including the dates on which the images were taken and jewelry that the patients were wearing at the time of the X-rays, MacMillan and Bensinger report.
Google's lawyers then raised concerns that obtaining and reviewing personal health data could create legal problems for the company, a person familiar with the project told the Post. The lawyers then spoke to a Google engineer, who subsequently emailed NIH asking its staff whether the data was HIPAA-compliant.
Google then deleted all of the images from its servers and told NIH it wouldn't be moving forward with the project, MacMillan and Bensinger report.
A 'post-mortem' review
Google managers eventually conducted a "post-mortem" review of the project and discovered that researchers had rushed toward publicly announcing the project and, as a result, hadn't properly vetted the data or secured any legal agreements regarding the privacy of the information, MacMillan and Bensinger write. A person familiar with the project told the Post that Google hadn't consulted a health privacy expert until the final days before the planned launch of the program, MacMillan and Bensinger report.
Michael Moeschler, a Google spokesperson, said of the NIH project, "We take great care to protect patient data and ensure that personal information remains private and secure. Out of an abundance of caution, and in the interest of protecting personal privacy, we elected not to host the NIH dataset."
But while Google ultimately withdrew from the project, NIH pressed on. NIH spokesperson Justin Cohen said in a statement that Google was one of a variety of cloud providers the agency worked with for hosting its X-ray scans. In September 2017, NIH finished scrubbing all of the X-ray images of personally identifiable information and publicly released the images via the cloud-storage provider Box. According to NIH, no outside company reviewed the images before NIH released them to the public (MacMillan/Bensinger, Washington Post, 11/15).