Tackling Privacy and Security When Building AI in Healthcare
Want to learn more about safeguarding privacy and security when automating data and building AI in the healthcare industry?
The diagram above shows the uses of healthcare data (from Appari and Johnson, 2010).
Getting quality data for training AI models is a challenge, but a much bigger challenge is trying to work with large quantities of data in the healthcare space. This is largely due to the privacy issues associated with medical data.
Because the privacy risks and sensitivities around people's private health information are greater than those around, for example, their spending or TV viewing habits, it is often illegal to share such information or use it to build an AI model. This poses a challenge for developing and using AI in healthcare.
However, there are appropriate methods for collecting data with AI in the proper legal and ethical manner, and this is what we want to cover in this article.
Higher Level of Risk
The 2018 HIMSS Cybersecurity Survey mentions how hackers and negligent insiders are a security risk for AI healthcare systems. Hackers are outside intruders, while negligent insiders are healthcare personnel who inappropriately access and use healthcare data.
A recent breach in healthcare data included unauthorized access to medical records by a pharmacist. There were also intrusions by outsiders, including a phishing attack, hackers, and even attacks from ransomware and malware. These attacks were not limited to one organization or even one country.
A significant challenge when dealing with healthcare data is the issue of privacy. Many countries have strict privacy laws and regulations that have to be adhered to when it comes to dealing with an individual patient’s healthcare information. This can make collecting healthcare information and sharing such information a significant challenge. Regulations are important for protecting the privacy of healthcare information. However, this can limit the usefulness of the data.
Regulations and privacy concerns can, therefore, make collecting healthcare data a challenging and time-consuming process, and obtaining data can be very difficult and expensive. Patients have a right to privacy: healthcare data is sensitive and could affect a patient's life if made public. For example, a patient might be passed over for a promotion at work or denied life insurance if their medical condition were disclosed to certain individuals.
Threats to healthcare data can come from both inside and outside an organization; a hacker is an example of an outside threat. This means that EHRs need to be maintained to be as secure as possible to prevent outside intrusions, and since most EHRs are likely to be web-based, security becomes very important. Disclosing information for secondary use in research is challenging because identifiers of individual patients have to be removed while the data remains useful. The challenge is maintaining the privacy of data while still being able to collect and use it.
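The tension between removing identifiers and keeping the data useful can be illustrated with a small sketch. This is a toy example, not a verified method: the record fields and the generalization choices (age bands, postal prefixes) are assumptions for illustration only.

```python
from datetime import date

# Hypothetical patient record; field names are illustrative only.
record = {
    "name": "Jane Doe",
    "date_of_birth": date(1984, 6, 2),
    "postal_code": "V6B 1A1",
    "diagnosis": "type 2 diabetes",
}

def generalize(rec, today=date(2024, 1, 1)):
    """Drop direct identifiers and coarsen quasi-identifiers so the
    record stays useful for research but is harder to re-identify."""
    age = today.year - rec["date_of_birth"].year
    band = (age // 10) * 10
    return {
        "age_band": f"{band}-{band + 9}",   # e.g. "40-49" instead of a birth date
        "region": rec["postal_code"][:3],   # keep only the coarse area prefix
        "diagnosis": rec["diagnosis"],      # the clinically useful field survives
    }

print(generalize(record))
```

The name is dropped entirely, while the birth date and postal code are coarsened rather than deleted, so researchers can still study outcomes by age group and region.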
Benefits of Sharing Data
There are significant benefits to sharing healthcare data. Sharing data is beneficial for medical treatment since doctors can gain a comprehensive picture of a patient’s symptoms, prior treatment, and medication history. This is the primary use of the data for the direct benefit of the patient. Data is also very useful for the secondary use of medical research. Such research can provide insights into disease trends and indicate if treatments are effective.
Many types of research benefit from access to large amounts of data: for example, examining whether certain treatment options work for patients with certain diseases, or which age groups and patients are more susceptible to a particular disease. Such secondary use of healthcare data is, therefore, important and will be beneficial for patients, since the information gained can inform future healthcare management decisions. For instance, data may indicate the best time to start screening for a disease, as in which age group.
Healthcare data can be used for quality control to evaluate the effectiveness of a program or healthcare system. For example, outcomes of patients seen in an emergency situation can give insights into how well a hospital is performing. Data indicating the incidence of nosocomial (hospital-acquired) infections is also useful in quality control, since it can indicate whether sterilization standards within the hospital are adequate and being met. Such information benefits all patients, since a hospital needs to know how it is performing and whether improvements are needed to ensure patient safety. There are, therefore, numerous benefits to the sharing of healthcare data.
Legal Frameworks for Handling This Tradeoff
It is crucial that security risks be evaluated and that security systems be put in place to thwart attacks from outside intruders who may send malware into a system. Internal threats to security may be harder to address, since many individuals need access to data for patient care, billing, and insurance processing. The increasing complexity of data also makes maintaining privacy and security difficult. Rules and regulations need to be made and adhered to by the individuals who do have access to healthcare data.
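One concrete control for insider risk is to audit and rate-limit every export of patient records. The sketch below is a minimal illustration: the function name, the policy limit, and the logger setup are assumptions, not a real platform's API.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
audit = logging.getLogger("phi_audit")

MAX_EXPORT_ROWS = 100  # hypothetical policy limit on bulk record exports

def export_records(user, records, purpose):
    """Gate and audit every export of patient records.

    Every request is logged for later review; requests above the policy
    limit are refused, so nobody quietly copies years of data to a laptop."""
    records = list(records)
    audit.info("user=%s purpose=%s rows=%d", user, purpose, len(records))
    if len(records) > MAX_EXPORT_ROWS:
        raise PermissionError(
            f"{len(records)} rows exceeds the {MAX_EXPORT_ROWS}-row export limit"
        )
    return records

# A small, purpose-bound export passes; a bulk copy raises PermissionError.
export_records("analyst1", range(10), purpose="readmission study")
```

The design point is that access is tied to a stated purpose and a bounded volume, and every request leaves an audit trail, which addresses the negligent-insider scenario discussed above.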
Different countries have different regulations when it comes to the privacy of patient healthcare records, and some of these regulations can be very strict. For instance, the United States has the Health Insurance Portability and Accountability Act (HIPAA) of 1996, a federally mandated regulation that sets a minimum standard to ensure patient privacy. Along with that, HIPAA requires that patients give consent before their information can be used, even for secondary uses such as research. HIPAA has been criticized as needlessly complex, and many researchers feel that this complexity has led to delays in conducting research and increased costs as a result.
British Columbia changed its Ministry of Health Act so that a person's healthcare data can be used for secondary purposes, such as research studies. This has caused much debate: some individuals are concerned that this is a breach of privacy, while others believe that the benefits outweigh the privacy concerns.
Systems need to be kept secure from outside intrusion, and data privacy regulations need to be followed to ensure the privacy of data.
What You Should Do
- Make sure that you understand the laws relating to the data you operate on. Each country has its own laws relating to data privacy in healthcare, and violating them can expose you or your organization to criminal or civil penalties. It is, therefore, crucial that everybody in the organization is educated about the law.
- Make sure that your AI platform supports all of the security, privacy, and compliance controls that you are required to have in place. In other words, don’t be surprised to find that a data scientist copied two years worth of patient records to their laptop for convenience. The AI platform needs to be set up with security in place to avoid issues, such as people unlawfully copying data. Software programs can also be used to monitor activity on computers.
- Make sure that your AI platform, tools, and processes still enable your data scientists to be productive and get their job done despite the extra security. It’s very possible to do this right — it simply requires more work. The appropriate and correct training of individuals should enable them to use AI systems correctly and productively even with the extra security in place.
- De-identify datasets whenever possible and to the maximum extent possible. This should be done using externally verified de-identification methods; home-grown, one-off solutions have been shown to be highly susceptible to re-identification schemes. De-identifying data is crucial to ensuring the privacy of patients and is an important part of privacy laws, so it needs to be taken seriously, with proper, robust methods used.
- Lastly, use a healthcare-specific AI platform that, among other things, has been externally vetted by a team of practicing data scientists to process personal health information (PHI). This is how the system at John Snow Labs stays secure and ensures that rules and regulations are adhered to.
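The de-identification advice above can be sketched in a few lines. This is a toy illustration, not an externally verified method: the field names and salt are assumptions, and a real pipeline must cover far more than this (HIPAA's Safe Harbor method, for example, enumerates 18 categories of identifiers).

```python
import hashlib

# Illustrative subset of direct-identifier fields; a real identifier
# list would be much longer and come from validated tooling.
DIRECT_IDENTIFIERS = {"name", "phone", "email", "ssn"}

def deidentify(rec, patient_id, salt):
    """Strip direct identifiers and replace the patient ID with a salted
    one-way pseudonym, so longitudinal records can still be linked
    without exposing who the patient is."""
    clean = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}
    digest = hashlib.sha256((salt + patient_id).encode()).hexdigest()
    clean["pseudonym"] = digest[:16]
    return clean

rec = {"name": "Jane Doe", "ssn": "123-45-6789", "diagnosis": "asthma"}
out = deidentify(rec, patient_id="P001", salt="site-secret")
# "name" and "ssn" are gone; the pseudonym links visits across time.
```

Note that the salt must be kept secret: an unsalted hash of a known identifier can be reversed by brute force, which is exactly the kind of weakness that makes home-grown schemes susceptible to re-identification.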
Data privacy and security are an important component of AI systems in healthcare today. The privacy of data needs to be maintained while the data remains useful and accessible for both primary and secondary uses. You need to be aware of the laws and ensure that AI platforms are secure and compliant to avoid unlawful data access. Security risks range from unauthorized data access by insiders to hacking and phishing attacks by outside intruders, so system security, robust de-identification, and well-trained individuals are all necessary parts of maintaining the privacy of data.