Over 1.5 million people are diagnosed with cancer each year in America alone. Yet despite these numbers, only a tiny fraction enroll in clinical trials. Indeed, research currently relies on data from just 3% of patients.
MIT’s Regina Barzilay is hoping to rectify that. Working through the MIT Stata Center, she leads a machine learning project that aims to derive insight from patient data.
She’s working with a consortium of partners, including Massachusetts General Hospital (MGH), to bring data science to clinical research.
Mining Medical Data
The project allows a patient’s pathology reports to be searched and interpreted. To date, over 100,000 such reports have been extracted and analyzed using natural language processing (NLP). The accuracy of the algorithm is currently around 98%, and the next step is to incorporate treatment outcomes into the process.
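To make the idea of extracting structured findings from free-text pathology reports concrete, here is a deliberately minimal sketch. It is not the team's method: their system uses machine learning, whereas this toy version uses hand-written patterns. The term list, the negation cues and the function name `extract_findings` are all assumptions made for illustration.

```python
import re

# Hypothetical finding terms; a real system would learn these from labeled reports.
FINDING_PATTERNS = {
    "atypia": re.compile(r"\batypi(?:a|cal)\b", re.IGNORECASE),
    "carcinoma": re.compile(r"\bcarcinoma\b", re.IGNORECASE),
}

# Hypothetical negation cues, checked only in the text preceding a finding
# within the same sentence.
NEGATION = re.compile(r"\b(?:no|negative for|without)\b[^.]*$", re.IGNORECASE)


def extract_findings(report: str) -> dict:
    """Return a flag per finding: True if mentioned and not negated."""
    findings = {name: False for name in FINDING_PATTERNS}
    # Split the report into sentences on terminal punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        for name, pattern in FINDING_PATTERNS.items():
            match = pattern.search(sentence)
            if match and not NEGATION.search(sentence[: match.start()]):
                findings[name] = True
    return findings


report = (
    "Sections show atypical ductal hyperplasia. "
    "No evidence of invasive carcinoma."
)
print(extract_findings(report))
```

Even this small example shows why the task is harder than keyword search: the word "carcinoma" appears in the report, but the sentence rules it out, so the extractor has to account for negation.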
The initiative follows on from previous work conducted by Barzilay and her team around atypias, which are used to identify patients at risk of developing cancer as they get older.
A key part of their work is to ensure that the data and recommendations that come out of the model are in a format that can be interpreted and understood by people who aren’t trained data scientists, whether that’s clinicians, nurses or patients themselves. What’s more, it’s also important that the models can be explained, and the team is working on systems whereby the algorithm also explains its reasoning.
Prevention Rather Than Cure
The next step is to look at developing tools that help prevent illness from occurring. Mammograms, for instance, contain a lot of information, often too much for humans to decipher. Machines have few such capacity issues and are capable of spotting patterns that humans miss. As such, the team is working on machine learning algorithms that can hunt for insights in mammogram data automatically.
As a first step, the team is computing density and other metrics typically used by radiologists when analyzing mammogram images. Eventually, they hope to be able to spot patients at risk of developing a tumor before it becomes visible on the scan. They also believe they will be able to spot patients at risk of a recurrence in a similar way.
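As a rough illustration of what a density metric can look like, the sketch below computes the share of tissue pixels that are bright enough to count as dense. This is a simplified, threshold-based stand-in and not the team's algorithm. The function name `percent_density`, the two threshold values and the assumption that pixel intensities are scaled to the range 0 to 1 are all hypothetical.

```python
def percent_density(image, tissue_threshold=0.1, dense_threshold=0.6):
    """Fraction of tissue pixels that exceed the dense-tissue threshold.

    `image` is a list of rows of intensities in [0, 1]. Both thresholds
    are illustrative assumptions, not clinically validated values.
    """
    tissue = 0
    dense = 0
    for row in image:
        for pixel in row:
            if pixel > tissue_threshold:  # ignore dark background pixels
                tissue += 1
                if pixel > dense_threshold:  # bright pixels count as dense
                    dense += 1
    return dense / tissue if tissue else 0.0


# Tiny made-up "image": two background pixels, four tissue pixels.
image = [
    [0.0, 0.2, 0.7],
    [0.0, 0.8, 0.3],
]
print(percent_density(image))
```

A single number like this is easy for a clinician to read, which fits the project's emphasis on interpretable output, though it discards the spatial patterns that a learned model could exploit.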
Such innovations are increasingly common, and will only become more so as researchers gain access to more and richer data about our lives and how they affect our health. There have already been some fascinating developments in the field, and I feel we are at the very beginning of the journey.