The term “Big Data” is increasingly used in our everyday lives. But each mention of it means something different, unique to what we use it for and how we interact with it. Big Data is not information. It’s the raw resource that people can use to discover new insights. Just as raw crude needs to be refined to run a car, Big Data needs to be refined to provide useful insights. In 2001, Doug Laney, who currently works for the analyst firm Gartner, defined this raw resource in terms of its three ubiquitous attributes, “the 3 V’s” – Volume, Velocity, and Variety.
According to Eric Schmidt, the Chairman of Google, today we generate as much data in less than two years as we did from the dawn of civilization up to 2003. That’s volume. Every minute, there are 2.4 million Google queries, 547,200 tweets, and 204 million e-mails sent, and those are just three of the thousands of ways data is continuously generated. That’s velocity and variety.
Making healthcare healthier.
According to the consultancy McKinsey & Co., healthcare represents more than 17% of U.S. GDP, almost $600 billion more than expected for a nation as big and as wealthy as the U.S. There is a lot of waste in the almost $3 trillion U.S. healthcare industry and, for the first time, the prevalence of well-integrated electronic health records (EHRs) is allowing health insurers and government programs such as Medicare and Medicaid to identify fraudulent practices automatically. EHRs have become the norm thanks in part to President George Bush’s plan in 2005 to computerize Americans’ healthcare information and President Obama’s Affordable Care Act in 2009, which incorporated incentives to share healthcare information through health information exchanges. As of 2014, these initiatives have given 76% of hospitals the ability to record and access patient data electronically, which has created a digital health map for millions of people.
Electronic Health Record Adoption 2008 – 2014
(*note: Clinician notes denote facilities with EHR systems capable of capturing patient-physician interaction via free-form text)
A recently released free mobile phone application by MicroStrategy allows anyone to look at Medicare billings by any physician in the U.S. Information is available based on the number of procedures performed and the number of patients treated. Anyone with the technical expertise can analyze this data for patterns and anomalies and identify dubious practices. This is exactly how Medicare found physicians who were inappropriately prescribing well-reimbursed procedures, including an ophthalmologist in Florida who billed Medicare more than $21 million in 2012 alone.
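To make the pattern-and-anomaly analysis described above a little more concrete, here is a minimal sketch in Python. It is not the method Medicare or MicroStrategy uses; it simply flags providers whose annual billing totals sit far above the median of their peers, using a modified z-score based on the median absolute deviation. The function name, the 3.5 threshold, and the sample figures are all assumptions made up for illustration.

```python
from statistics import median


def flag_billing_outliers(billings, threshold=3.5):
    """Return the providers whose billing totals are unusually high.

    `billings` maps a provider ID to an annual billing total in dollars.
    A provider is flagged when its modified z-score (based on the median
    and the median absolute deviation, or MAD) exceeds `threshold`.
    """
    totals = list(billings.values())
    med = median(totals)
    mad = median(abs(total - med) for total in totals)
    if mad == 0:
        # At least half of the totals equal the median, so this simple
        # score cannot separate outliers; flag nothing.
        return []

    flagged = []
    for provider, total in billings.items():
        # 0.6745 rescales the MAD so the score is comparable to an
        # ordinary z-score when the data is roughly normal.
        score = 0.6745 * (total - med) / mad
        if score > threshold:
            flagged.append(provider)
    return flagged


# Made-up billing totals for illustration only; not real Medicare data.
sample = {
    "provider_a": 310_000,
    "provider_b": 295_000,
    "provider_c": 342_000,
    "provider_d": 288_000,
    "provider_e": 21_000_000,
}

print(flag_billing_outliers(sample))
```

A median-based score is used here rather than the mean and standard deviation because a single extreme biller would inflate both and could hide itself. A flagged provider is only a candidate for review: a high total can have a legitimate explanation, such as a specialty with costly drugs or an unusually large patient base.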
Healthcare providers are also generating substantial savings due to the increased quality of the data available. Kaiser Permanente created a new platform to ensure data is shared between all medical facilities. The integrated system has helped the company save over $1 billion from fewer required office visits and tests.
In his book, Predictive Analytics, Eric Siegel describes the breadth of uses of Big Data and predictive analysis in the healthcare industry today.
- Google Flu Trends has been shown to forecast an increase in influenza cases at hospitals 7 to 10 days earlier than the Centers for Disease Control and Prevention (CDC) by analyzing online search trends.
- Stanford University has built a predictive model that diagnoses breast cancer better than human doctors by considering a greater number of risk factors.
- The University of Pittsburgh Medical Center predicts a patient’s risk of readmission within 30 days to help decide when the patient should be released.
McKinsey & Co. estimates that increased integration and sharing of data sources will reduce healthcare costs in the U.S. by $300 billion to $450 billion, and that’s not counting the impact of yet undeveloped radical innovations and use cases.
At the individual level, devices are taking patient monitoring to new heights. A new mobile application, Ginger.io, allows physicians to track consenting patients and help them with behavioral-health therapies. Ginger.io collects data about phone calls, texts, location and even motion. Patients also have the ability to complete surveys to better contextualize the data collected about them. The application then combines patient data with research on behavioral health from the NIH to reveal new insights.
Caution: pitfalls ahead
Although Big Data and data science can help the world become a healthier place, the new opportunities are not risk-free. We need to heed the caution signs along the way.
- Privacy: data privacy continues to be a problem in healthcare. Medical data can be sent around to third parties as part of administrative processes or prescriptions. In one case, a mother and daughter’s medications were mixed up, which led to an unintentional disclosure of a medical condition.
- Data integrity: the accuracy of collected data is also a problem. Many patient histories are subjective, and a lot of information concerning prescriptions and patient visits is still entered manually, which makes it prone to errors. While data can give us many answers, we must also question the source to ensure reliable results.
- Education: data analytics is most effective when an industry expert understands the methods and applications of the data. For analytics to reach its full potential, healthcare professionals need to be trained to understand the implications of data analysis.
- Ethics: data ethics is still a nebulous area. Because much of the data available has come into existence only recently, there aren’t many standards in place. Even though health insurers cannot use preexisting conditions to reject applicants, they still use prescription data to identify high-risk patients and set rates. Is this ethical behavior? Maybe not, but there are currently no policies in place to prevent it.
So what does the future hold? Perhaps there will be a time when your social media posts about being sad automatically trigger a notification to your doctor. Or perhaps your Fitbit data will be used to set insurance premiums. Even your diet and medicine could one day be custom-tailored to your genetic makeup at the price of a generic drug today. We’ve already seen the positive impact that Big Data analytics has had across the healthcare field and, as long as we continue to proceed with caution and foresight, the possibilities are endless for creating a healthier and happier world.
Merav Yuravlivker is co-founder of Data Society, a Washington, D.C.-based organization dedicated to democratizing data literacy by teaching everyone how to turn Big Data into Big Insights. Cameron Warren is a Data Scientist and contributor to Data Society’s educational curriculum. Data Society is proud to partner with RealWorldHealthCare.org on its Big Data in Healthcare series.