
Creating AI That Matches the Accuracy of Trained Radiologists

In time-critical diagnoses like brain bleeds, early detection can significantly increase the chances of survival. This is where AI can help.


Can an AI become as good as a radiologist and aid care? The answer is yes. In time-critical diagnoses like brain bleeds, early detection can significantly increase the chances of survival. There can be cases where a radiologist is not on duty, or where a radiologist misses a finding. In such cases, AI can help. We have trained a deep neural network to be a second pair of eyes that is as good as the radiologists themselves and does not need to take breaks. Our research is now available as a preprint and will be submitted to peer-reviewed journals soon.

In a previous article, we told you about our all-new NLP stack, which is facilitated by our data tagging engine and our data science team’s efforts to stay in touch with cutting-edge tech. With this new series of articles, which tries to make common sense of our research, we want to bring you closer to what happens in the AI deployment trenches. We hope you enjoy it.



Head injuries, or any other condition that can cause a brain hemorrhage (brain bleed), are serious and need to be detected as soon as possible. The preferred method of diagnosis is a computed tomography (CT) scan. A CT scan is detailed enough (though not as detailed as an MRI), and it’s quick (an MRI is slower). A non-contrast CT (NCCT) scan is used to detect brain hemorrhages. It is a 3D map of the brain that can be viewed as a sequence of 2D slices. Doctors generally scroll up and down this sequence of slices to locate anomalies.


Our medical data annotation team tagged a dataset of 2D CT slice sequences. Because the annotators are medical professionals who work with this modality more than any other, it was the obvious one for them to annotate. For each slice, any area with an anomaly (in this case, a brain hemorrhage) was marked with a loose, approximate boundary. The team has tagged over 300,000 CT slices in this fashion for the presence of multiple pathologies. The first subset that the data science team chose for training a neural network was the slices marked for brain hemorrhage.
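
To make the annotation format concrete, here is a minimal Python sketch of how one tagged slice might be represented and turned into a binary mask. The article does not describe the actual storage format, so the `Box` and `AnnotatedSlice` types, the rectangular regions, and the `region_mask` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Box:
    """Hypothetical loose boundary: an axis-aligned rectangle in pixel coordinates."""
    top: int
    left: int
    bottom: int  # exclusive
    right: int   # exclusive


@dataclass
class AnnotatedSlice:
    """One 2D CT slice plus the regions an annotator marked (empty if none)."""
    index: int
    height: int
    width: int
    regions: List[Box] = field(default_factory=list)

    @property
    def has_hemorrhage(self) -> bool:
        # Slice-level label: positive if at least one region was marked.
        return bool(self.regions)


def region_mask(s: AnnotatedSlice) -> List[List[int]]:
    """Return a height x width grid with 1 inside any marked region, else 0."""
    mask = [[0] * s.width for _ in range(s.height)]
    for box in s.regions:
        # Clip the box to the slice so a loose boundary cannot index out of range.
        for r in range(max(box.top, 0), min(box.bottom, s.height)):
            for c in range(max(box.left, 0), min(box.right, s.width)):
                mask[r][c] = 1
    return mask


# A toy 4x6 slice with one marked region, and a slice with no finding.
tagged = AnnotatedSlice(index=12, height=4, width=6, regions=[Box(1, 2, 3, 5)])
clean = AnnotatedSlice(index=13, height=4, width=6)
tagged_mask = region_mask(tagged)
clean_mask = region_mask(clean)
```

A mask like this is one plausible way to feed a loosely marked region to a model as a spatial hint alongside the slice-level label.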


RADnet architecture

We defined the problem statement for the data science team as follows: given a CT scan, the AI needs to tell whether a hemorrhage is present in each slice and which structures in the slice led it to that conclusion. In the GIF above, you can see the approximate area the AI considers important in determining hemorrhages (also called the attention of the AI). Choosing a deep neural network for the task was almost a no-brainer. What the architecture should look like, however, was a question our data science team had to solve.

We decided to treat this as a sequence modeling problem in which each element of the sequence is a 2D slice that may or may not contain an area of interest. A convolutional network (a DenseNet) models each image, with the tagged region of interest serving as attention for classification. The DenseNet representations for an entire sequence are then passed through a bidirectional LSTM to model context across slices. This combination of a recurrent (LSTM) model with attention-guided DenseNets is what we call RADnet.
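
The data flow can be illustrated without a deep learning framework. The sketch below is a toy stand-in, not RADnet itself: a single number per slice replaces the DenseNet representation, and a simple exponential recurrence run in both directions replaces the bidirectional LSTM. The function names, the `decay` parameter, and the sample scores are all assumptions made for illustration.

```python
from typing import List


def recurrent_pass(scores: List[float], decay: float) -> List[float]:
    """One-directional recurrence: state = decay * state + (1 - decay) * x."""
    state = 0.0
    out = []
    for x in scores:
        state = decay * state + (1.0 - decay) * x
        out.append(state)
    return out


def bidirectional_context(scores: List[float], decay: float = 0.5) -> List[float]:
    """Average a forward and a backward pass so each slice gets context from both sides."""
    forward = recurrent_pass(scores, decay)
    # Run the same recurrence over the reversed sequence, then flip it back.
    backward = recurrent_pass(scores[::-1], decay)[::-1]
    return [(f + b) / 2.0 for f, b in zip(forward, backward)]


# Hypothetical per-slice scores: the slice at index 3 looks clean in isolation,
# but it sits between two suspicious slices.
slice_scores = [0.0, 0.1, 0.9, 0.2, 0.8, 0.1, 0.0]
contextual = bidirectional_context(slice_scores)
```

In this toy run, the score at index 3 rises once its neighbors are taken into account. That is the intuition behind modeling the slices as a sequence rather than classifying each one independently.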

Performance Benchmarking and Result

For any automated system to be deployed in a clinical emergency setup, it must give reliable estimates with sensitivity at the level of human specialists. This makes it necessary to benchmark against specialists in the field, so we compared the performance of our algorithm to that of real-world radiologists. The performance of three senior radiologists and RADnet was measured on a dataset of 77 brain CTs. RADnet demonstrated 81.82% hemorrhage prediction accuracy at the CT level, which is comparable to the radiologists. The results are shown in this table:


The accuracy benchmarks for RADnet against the three radiologists.

RADnet achieves higher recall than two of the three radiologists, which is remarkable.
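
For reference, accuracy and recall (sensitivity) at the CT level can be computed from per-scan labels as sketched below. The labels in the example are made up for illustration and are not the study data. Only the 77-scan total and the 81.82% accuracy come from the text above; 63 correct scans out of 77 is the count consistent with that figure. The `ct_level_metrics` function name is hypothetical.

```python
from typing import Dict, Sequence


def ct_level_metrics(truth: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, recall, and precision from per-scan boolean labels."""
    if len(truth) != len(predicted) or len(truth) == 0:
        raise ValueError("truth and predicted must be non-empty and the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    return {
        "accuracy": (tp + tn) / len(truth),
        # Recall (sensitivity): share of true hemorrhage scans that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Precision: share of flagged scans that truly had a hemorrhage.
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Made-up labels for 10 scans (True = hemorrhage present).
truth = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
metrics = ct_level_metrics(truth, predicted)

# Arithmetic check on the reported figure: 63 correct out of 77 scans.
reported_accuracy_pct = round(63 / 77 * 100, 2)
```

Recall is the metric that matters most for an emergency tool, because a missed hemorrhage (a false negative) is the costliest error.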

The Path Forward

The RADnet algorithm emulates radiologists’ method for diagnosing brain hemorrhages from CT scans and is on par with radiologists in detecting them. Notably, very high sensitivity is required to deploy automated emergency diagnostic tools. There are also many other equally severe brain conditions that the algorithm is unaware of.

We envision a future where similar emergency diagnostic tools can detect different anomalies from brain CT scans. We stress that the presented solution should not be misinterpreted as a replacement for actual radiologists in the field. RADnet demonstrates potential to be deployed as an emergency diagnosis tool. However, its real-world performance is still subject to further experimentation.


Published at DZone with permission of
