Healthcare Reform and Application Performance Monitoring
Regardless of your political views, healthcare reform is truly, and no pun intended, reforming healthcare in the United States. Everyone is probably familiar with the Affordable Care Act (ACA) of 2010, or “Obamacare”, which was enacted to increase the quality and affordability of healthcare in the United States. Another piece of legislation that affects the healthcare industry was enacted in 2009 and is commonly known as the “Stimulus”. Among the many provisions of the “Stimulus”, or “The American Recovery and Reinvestment Act (ARRA)”, are new regulations around Healthcare IT (HIT); chief among these is Meaningful Use (MU).
Broken out into 3 stages, the MU programs provide financial incentives for the “meaningful use” of certified Electronic Health Record (EHR) technology. To receive an EHR incentive payment, providers have to show that they are using their certified EHR technology by meeting certain measurement thresholds that range from recording patient information as structured data to exchanging summary care records.
The HIT Industry is Slow to Change
While the ARRA provides financial incentives to hospitals and eligible professionals to automate medical records (let’s call this the carrot), it also penalizes hospitals and eligible professionals that do not demonstrate and attest to MU by reducing Medicare and Medicaid reimbursements over time (let’s call this the stick).
MU will require change and, based on my years of experience as an HIT consultant and application provider, the healthcare industry is slow to adopt change.
I once participated in a major system upgrade for a large hospital system. This upgrade was necessary for Meaningful Use attestation. The hospital was three major releases behind the current release of its EHR software, and the features required for MU attestation were only available in the latest release. By the way, this upgrade latency is not uncommon in HIT.
Because of the severity of the change and the complexity of the environment, contingency plans were put in place to assure the hospital could continue to care for patients should any of the EHR components fail to upgrade properly. However, nothing could prepare the team for what happened next.
A Problem Arises
Two days after the final upgrade outage, and just as everyone was ready to head home after a number of sleepless nights, a frantic call came into the upgrade command center. The call came from the nurse shift supervisor of the emergency department (ED). If you have ever met an ED nurse, you will understand it when I say that an ED nurse is not someone you want to upset.
The vast array of people caring for patients in an emergency department can be overwhelming. In order to provide visibility and bring order to a very intense operation, the ED relies on a number of critical tools. One such tool is the Tracking Board application. The Tracking Board application provides visibility into length of stay, staff assignment, room assignment, lab order tracking, patient criticality and many more vital data points, all of which can have a major effect on patient safety, the top priority for any healthcare professional.
The ED tracking board was unusable and the ED was operating in the dark. Without the visibility provided by the Tracking Board application, the ED was at a standstill. While far from ideal, in such situations the ED normally shuts down. But because this particular hospital is the only Level I trauma center in the region, this wasn’t an option. Because other modules of the software were working properly and there were technical dependencies among the modules, a downgrade wasn’t an option either.
The command center became a war room: clinical analysts from the hospital, project managers from both sides of the implementation, a large ensemble of high-level clinicians and executives from the hospital, the entire infrastructure team, DBAs, interface engine administrators, and developers from 3 different continents were all locked in and given a clear instruction: “Don’t leave until this issue is resolved”.
Minutes turned into hours, then into days. The situation in the emergency room was coincidentally turning into an emergency itself. The Chief Medical Information Officer (CMIO) of the hospital brought the vendor project manager to tears and the ED nurses were gathering their torches and pitchforks and marching against the IT department. All appeared lost and after days of outage and close to 1,000 man-hours spent trying to find the root cause of the problem, everyone was ready to walk out.
The patients, however, could not walk away. Many of them had life-threatening conditions, and the queue outside the ED was only growing longer. Patient safety is job #1 for everyone in healthcare, and the unavailability of the ED Tracking Board application was affecting every patient’s safety!
APM Solution to the Rescue!
Clearly, it was time for an intervention. Unlike the TV reality series, this intervention didn’t come in the form of over-emotional family members, but in the form of an APM solution. AppDynamics was deployed and quickly generated a flow map of the entire application environment. Within minutes, business transactions (BT) from within the software itself and from all adjacent systems that interface with the Tracking Board application began to pour in.
A business transaction (BT) is a key feature of AppDynamics which, in simple terms, allows users to map the application based on how its users experience it.
A key BT captured was one happily named “UpdateCycle”. As reported by the Tracking Board application vendor, the “UpdateCycle” BT was responsible for querying its own database, the interface engine, and a variety of disjointed data sources, and for updating an operational dashboard displayed via digital signage throughout the ED.
As the team monitored the application, looking for clues as to why it was failing, we noticed that the UpdateCycle transaction volume was 100x what was expected. In general, the Tracking Board dashboards update every minute for each of the viewers. Considering there were ~10 viewers at any given time, the system was designed to support tens of transactions per minute, and it was failing because it was receiving thousands of transactions per minute.
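The arithmetic behind that observation can be sketched in a few lines of Python. This is only an illustration of the baseline comparison, not anything AppDynamics exposes: the function names, the 10x alert threshold, and the observed figure of 1,000 transactions per minute (chosen to match the 100x ratio described above) are all assumptions.

```python
def expected_tpm(viewers: int, refresh_interval_s: float) -> float:
    """Expected transactions per minute if every viewer polls once per interval."""
    return viewers * 60.0 / refresh_interval_s


def overload_factor(observed_tpm: float, viewers: int, refresh_interval_s: float) -> float:
    """How many times larger the observed volume is than the design baseline."""
    return observed_tpm / expected_tpm(viewers, refresh_interval_s)


def is_anomalous(observed_tpm: float, viewers: int, refresh_interval_s: float,
                 threshold: float = 10.0) -> bool:
    """Flag the transaction when volume exceeds the baseline by `threshold` times.

    The threshold is a hypothetical value picked for this example.
    """
    return overload_factor(observed_tpm, viewers, refresh_interval_s) >= threshold


# ~10 viewers, each refreshing once per minute (figures from the story above).
baseline = expected_tpm(viewers=10, refresh_interval_s=60)

# Illustrative observed volume, consistent with the 100x ratio we saw.
factor = overload_factor(observed_tpm=1000, viewers=10, refresh_interval_s=60)

print(f"baseline: {baseline:.0f} tpm, observed is {factor:.0f}x the baseline")
```

A check like this only works when someone knows the intended refresh interval and viewer count, which is why seeing the transaction volume next to the vendor's design expectations was what made the problem obvious.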
A faulty client-side configuration was overloading the server and causing it to send slow responses back to the clients. The listener was working overtime, getting a response back every few seconds and trying to update the ED tracking board constantly. The result was a stream of constant updates to the signage stations and webpages, which made the system inoperable.
Using the APM solution, the team was able to locate the root cause within one hour of deployment and the change itself took less than 5 minutes. The web server was restarted and all was calm in the kingdom of the ED.
Four hours was all it took to download the software, install it, capture traffic, and resolve the issue!