Why Healthcare Needs the Enterprise Data Lake
The healthcare industry is inundated by myriad data that has the potential to make a real impact on people’s lives — but not as it exists currently.
As the so-called “oil of the twenty-first century,” data is the crown jewel of the digital economy. The Internet of Things is poised not only to shake up individual industries but also to bring them together like never before with the promise of hyper-connected, ultra-personalized experiences. And for consumers, one particular application of IoT is perhaps the most intriguing: the impact of connected devices on health and healthcare. Indeed, McKinsey has projected an $11.1 trillion IoT market by 2025, nearly one-third of which will consist of healthcare-related devices.
Health-conscious wearables have rapidly gained popularity over the last few years and are making strides when it comes to the complexity and accuracy of the data captured. These devices are monitoring everything from blood sugar to heart rate, tracking trends in medication, diet, and exercise, and communicating this information to providers to enhance and extend care beyond the doctor’s office. We can only imagine these applications growing and diversifying as technologies advance and become more affordable.
But as in any conversation around data management, it’s not enough to simply collect massive amounts of data. Information must be captured in a way that makes it readily available and actionable for healthcare organizations and doctors and, in turn, their patients. Data volumes are exploding, the nature of data is changing, and the underlying technologies are being augmented or replaced by newer systems like Hadoop, MapReduce, and Hive. Beyond traditional healthcare data sources like EMR, PACS, transactional databases, CRM systems, and financial and prescription data, new unstructured and semi-structured data sources are rapidly emerging. The result is that the healthcare industry has become inundated by a myriad of data sources from multiple locations, all of which have the potential to make a real impact on people’s lives, but not as they exist currently.
The best way we can hope to unleash the power of big data for healthcare is to rethink how we capture, organize, and analyze it. Healthcare CIOs are already well aware of the shifting landscape and are focusing on refining and advancing internal systems, but they must also broaden that focus to include integration and a system of insights. Both providers and payers are in desperate need of a solution that can act as a common data platform and integrate data originating from multiple locations in a variety of formats while preserving all of the metadata associated with those data objects. In addition, media overrun and rising infrastructure costs pose a big problem: old, seldom-used data accumulates rapidly, reducing performance and even hurting the accuracy of data analysis. This is where an enterprise data lake with archiving comes into play.
Think about it: Medical professionals need immediate, direct, and natural-language access to analyze all patient data in its original format, as well as intelligent tools that can provide recommendations based on all of the available data. In the case of healthcare, this data consists not only of facts and figures about the patient but also highly pertinent free-form text such as physicians’ notes, radiology reports, medical journal articles, email correspondence, images such as CAT scans or MRIs, genome files, and of course, information collected directly from wearables, respirators, blood pressure monitors, and other connected devices.
Instead of pulling this data from separate sources and manually integrating and maintaining it, organizations feed all of the data from these disparate sources into a single enterprise data lake that is capable of reaching across multiple internal as well as public cloud systems. Here, the data is highly organized and maintained, and any kind of external analysis tool can easily be integrated to more effectively transform the information into actionable insights for the provider and patient. The beauty of this approach is that security levels can be individually maintained as appropriate to each separate database. This is critical to ensuring that patient data is managed sensitively so that organizations can adhere to the strict privacy and compliance regulations unique to healthcare. Entire patient records can be handled with full control to ensure that only the right patient data is shared with the right people. In addition, old and inactive data is automatically archived, thereby combating the high costs, potential problems, and inefficiencies of media overrun.
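To make the idea concrete, here is a minimal in-memory sketch of the three behaviors described above: ingesting data in its original format with its metadata, enforcing a security level per source, and archiving inactive data. It is an illustration only, not how any particular data lake product works. The names `LakeObject`, `DataLake`, `source_levels`, and the numeric clearance levels are all hypothetical, and a real deployment would sit on distributed storage with a proper access-control and audit layer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class LakeObject:
    """One item in the lake: the raw payload plus the metadata that came with it."""
    source: str              # originating system, e.g. "emr" or "wearable"
    payload: bytes           # kept in its original format, never transformed
    metadata: dict           # preserved alongside the payload
    last_accessed: datetime


@dataclass
class DataLake:
    # Hypothetical per-source security levels; a higher number is more restricted.
    source_levels: dict
    active: list = field(default_factory=list)
    archive: list = field(default_factory=list)

    def ingest(self, source, payload, metadata, now):
        """Store an object as-is; only sources with a defined level are accepted."""
        if source not in self.source_levels:
            raise ValueError(f"unknown source: {source}")
        self.active.append(LakeObject(source, payload, dict(metadata), now))

    def query(self, clearance, now):
        """Return active objects from sources the caller is cleared to see."""
        visible = [o for o in self.active
                   if self.source_levels[o.source] <= clearance]
        for obj in visible:
            obj.last_accessed = now  # reading an object counts as activity
        return visible

    def archive_inactive(self, now, max_idle):
        """Move objects idle for longer than max_idle into the archive tier."""
        stale = [o for o in self.active if now - o.last_accessed > max_idle]
        self.active = [o for o in self.active if now - o.last_accessed <= max_idle]
        self.archive.extend(stale)
        return len(stale)


# Example run with made-up sources and records.
lake = DataLake(source_levels={"wearable": 1, "emr": 3})
t0 = datetime(2024, 1, 1)
lake.ingest("wearable", b'{"heart_rate": 72}', {"device": "band-01"}, t0)
lake.ingest("emr", b"physician note", {"patient_id": "p-001"}, t0)

# A caller with clearance 1 sees the wearable reading but not the EMR note.
low_clearance_view = lake.query(clearance=1, now=t0 + timedelta(days=200))

# A year in, anything untouched for more than 180 days moves to the archive.
archived_count = lake.archive_inactive(now=t0 + timedelta(days=365),
                                       max_idle=timedelta(days=180))
```

In this run the EMR note is archived because nobody touched it for a year, while the wearable reading stays active because it was read on day 200. Keeping the security level on the source rather than on each record mirrors the point that levels are maintained per database.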
As the applications and capabilities of wearables continue to rise, we need a smarter, scalable way to collect, house, and manage the oceans of data that ensue. Organizations that leverage the enterprise data lake will be empowered to cut costs, streamline resources, and ultimately do more with their data. In the end, this will translate to higher satisfaction among providers and patients alike and drive more effective outcomes in patient health and wellness.
Opinions expressed by DZone contributors are their own.