Health Tech: How “Googling It” Can Be Used for Health Care | Shiv Gaglani

CardioSource WorldNews | Each second, there are an estimated 40,000 searches on Google, which means about 3.5 billion per day, or more than 1.2 trillion per year. The overall number of searches is around 50% higher, given that Google is only one of many search engines that people use (Bing, Yahoo, DuckDuckGo, etc.), albeit the market leader. Given that one in 20 Google searches is health-related, or about 2,000 health-related queries a second, a treasure trove of data is being generated that can be analyzed for insights into both public health and individual health. In this article I want to describe two innovative applications of mining search engine data. Neither of these is ready for “prime time,” but initial results are promising.
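The volume figures above follow from simple arithmetic, which the short Python sketch below reproduces. The only inputs are the two estimates quoted in the text (40,000 searches per second and one health-related search in 20); the function and variable names are illustrative, and a 365-day year is assumed.

```python
SEARCHES_PER_SECOND = 40_000   # estimate quoted in the text
HEALTH_ONE_IN = 20             # "one in 20" searches is health-related
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365            # assumption: ignore leap years


def search_volumes(per_second=SEARCHES_PER_SECOND, one_in=HEALTH_ONE_IN):
    """Scale a per-second search estimate up to daily and yearly totals."""
    per_day = per_second * SECONDS_PER_DAY
    per_year = per_day * DAYS_PER_YEAR
    return {
        "per_day": per_day,
        "per_year": per_year,
        "health_per_second": per_second // one_in,
        "health_per_day": per_day // one_in,
    }


volumes = search_volumes()
for name, value in volumes.items():
    print(f"{name}: {value:,}")
```

The daily total comes to roughly 3.5 billion searches and the yearly total to a little over 1.2 trillion, with about 2,000 health-related queries every second.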

The first application is related to monitoring adverse drug reactions, or pharmacovigilance, which occurs in Phase IV of clinical research. Through a system known as the FDA Adverse Event Reporting System (FAERS), the FDA receives form-based reports of potential adverse drug reactions from patients, physicians, and pharmaceutical companies. More than one million adverse events are reported to the FDA each year, a volume that can be overwhelming; as a result, the system has been criticized as slow to detect safety problems.

The FDA has begun speaking with researchers at Google and Microsoft following two papers showing that search engine data could be mined to detect potential adverse reactions. In a 2013 paper in the Journal of the American Medical Informatics Association (White et al., 2013), researchers at Microsoft and Stanford analyzed millions of searches and showed that people who searched for both pravastatin and Paxil over a 12-month period in 2010 were also more likely to search for terms related to hyperglycemia, such as dry mouth and diabetes. A significantly higher percentage of people who searched for both drugs also searched for hyperglycemic symptoms than of those who searched for only one of them. The following year, the adverse interaction between the two drugs as a cause of hyperglycemia was publicly reported.
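The kind of co-occurrence comparison described above can be sketched in a few lines of Python. This is a toy illustration, not the method used by White et al.: the search histories are invented, each user is reduced to a set of search terms, and the function name `symptom_search_rate` is hypothetical. It simply compares how often symptom terms appear among users who searched for both drugs versus users who searched for only one.

```python
def symptom_search_rate(users, required_drugs, symptom_terms,
                        excluded_drugs=frozenset()):
    """Fraction of a cohort whose searches include at least one symptom term.

    The cohort is every user who searched for all of required_drugs and
    none of excluded_drugs. Returns None if the cohort is empty.
    """
    cohort = [
        terms for terms in users
        if required_drugs <= terms and not (excluded_drugs & terms)
    ]
    if not cohort:
        return None
    with_symptom = sum(1 for terms in cohort if terms & symptom_terms)
    return with_symptom / len(cohort)


# Invented search histories: one set of search terms per user.
TOY_USERS = [
    {"pravastatin", "paxil", "dry mouth"},
    {"pravastatin", "paxil", "diabetes"},
    {"pravastatin", "paxil"},
    {"pravastatin", "paxil", "recipes"},
    {"pravastatin", "dry mouth"},
    {"pravastatin"},
    {"pravastatin"},
    {"pravastatin", "weather"},
    {"paxil"},
    {"paxil", "diabetes"},
    {"paxil"},
    {"paxil"},
]
SYMPTOMS = {"dry mouth", "diabetes"}

both_rate = symptom_search_rate(TOY_USERS, {"pravastatin", "paxil"}, SYMPTOMS)
pravastatin_only_rate = symptom_search_rate(
    TOY_USERS, {"pravastatin"}, SYMPTOMS, excluded_drugs={"paxil"})
paxil_only_rate = symptom_search_rate(
    TOY_USERS, {"paxil"}, SYMPTOMS, excluded_drugs={"pravastatin"})

print(both_rate, pravastatin_only_rate, paxil_only_rate)
```

In this made-up data, half of the users who searched for both drugs also searched for a symptom term, compared with a quarter of those who searched for only one drug. A real analysis would also need to test whether such a gap is statistically significant and to account for confounders.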

In another paper (Yom-Tov and Gabrilovich, 2013), a former Yahoo researcher analyzed 176 billion Yahoo searches and found that searches for common symptoms such as weight gain or cramps differed among people who included the name of a medication in their searches. Search data were more likely to reveal reactions that “occur much later after the beginning of treatment, hence their association to the drug is often overlooked.” This kind of analysis belongs to a growing field known as “infodemiology.”

The second application has the potential to affect screening of individuals for hard-to-spot diseases. In a paper in the Journal of Oncology Practice (Paparrizos et al., 2016), two of the same researchers at Microsoft report analyzing anonymized Bing search logs to identify people whose queries indicated that they had recently been diagnosed with pancreatic cancer. They then looked at the queries these same individuals had made in the months before they started searching specifically about their presumed diagnoses, and found patterns of queries that were likely early signals. These queries, often spaced apart, were not cause for concern by themselves but in aggregate may have predicted the diagnosis. One example is a constellation of subtle symptoms such as itchy skin, back pain, light-colored stool, and jaundiced skin or scleral icterus. The research team found that they could identify 5% to 15% of cases while keeping false positive rates low, between 1 in 10,000 and 1 in 100,000. The eventual hope is that monitoring these queries may help alert someone to seek professional medical attention.
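To get a feel for what those rates mean in practice, the Python sketch below turns them into expected counts. The detection rate and false positive rate are taken from the ranges quoted above; the cohort size and the number of true cases are hypothetical values chosen only for illustration, and they do not come from the paper.

```python
def expected_flags(population, cases, detection_rate, false_positive_rate):
    """Expected numbers of correctly and incorrectly flagged searchers.

    population          -- total searchers screened (hypothetical)
    cases               -- searchers who truly have the disease (hypothetical)
    detection_rate      -- share of true cases the screen identifies
    false_positive_rate -- share of non-cases the screen wrongly flags
    """
    true_positives = cases * detection_rate
    false_positives = (population - cases) * false_positive_rate
    return true_positives, false_positives


# Hypothetical cohort: 1,000,000 searchers, 100 of whom have the disease.
low_tp, low_fp = expected_flags(1_000_000, 100, 0.05, 1 / 100_000)
high_tp, high_fp = expected_flags(1_000_000, 100, 0.15, 1 / 10_000)

print(f"strict threshold: {low_tp:.1f} true, {low_fp:.1f} false flags")
print(f"loose threshold:  {high_tp:.1f} true, {high_fp:.1f} false flags")
```

Under these invented inputs, the stricter setting flags about 5 true cases alongside about 10 false alarms, while the looser setting flags about 15 true cases alongside about 100 false alarms. This shows why even very small false positive rates matter when the disease is rare among the people being screened.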

Of course, there are a number of concerns and limitations to overcome before either of these applications becomes widely used. One of the most obvious is privacy. It’s one thing if Google monitors your searches to recommend where you should buy your next pair of jeans; it’s a whole different story if it is trying to determine your health status. Ideally the applications would be beneficial, but in the wrong hands they could easily be misused.

A second issue is the noisiness of the data. For example, one of the original search engine-based health applications was the initially successful Google Flu Trends site, which used searches to help identify areas that were seeing flu outbreaks, even before the CDC knew about them. However, in some years it failed to predict peak flu season because of noise in the search data, and Google eventually shut the project down.

With these limitations in mind, as researchers develop more sophisticated machine learning algorithms, one can see how these concerns may be mitigated and the applications may become more mainstream.

Learn more about ACC’s recent collaboration with Google in the winter issue of ACC’s Cardiology magazine at

Shiv Gaglani is an MD/MBA candidate at the Johns Hopkins School of Medicine and Harvard Business School. He writes about trends in medicine and technology and has had his work published in Medgadget, The Atlantic, and Emergency Physicians Monthly.

Read the full November issue of CardioSource WorldNews at

Keywords: CardioSource WorldNews, Adverse Drug Reaction Reporting Systems, Biomedical Research, Drug-Related Side Effects and Adverse Reactions, Pharmacovigilance, Public Health, Search Engine
