Lung cancer is a major health problem and the leading cause of cancer death. Over the last seven years, computed tomography has allowed more cancers to be diagnosed at an early stage, and the mortality rate has been reduced by 20%.
Computed tomography, or CT scanning, is the main method of lung cancer diagnosis and is becoming increasingly popular. The more it is used for lung cancer detection, the more tomographic images are produced worldwide. Because tomograms are usually read manually by radiologists, medical personnel are flooded with routine work, and tomogram processing is not an easy task. A growing load of complex work can reduce the quality of diagnosis and prevent health professionals from delivering medical care in time.
Why Manual Tomogram Processing Is Flawed
Diagnosis is slowed down.
Radiologists are overloaded with complex manual work.
Human error can lead to a false diagnosis.
Machine learning and neural networks make it possible to automatically process X-ray images, tomograms, MRI scans, and PET images to detect diseases. Many companies and research labs are adapting machine learning algorithms to speed up diagnosis and improve the quality of medical care.
Training an Image Processing Algorithm for Cancer Detection
The goal was to teach the algorithm to find abnormal areas in the lungs called nodules, which can indicate cancer. To train the neural networks, we used a dataset of 1,000 cancer CT scans with annotated lesions from LIDC-IDRI.
The source images were in DICOM format, which made them heterogeneous and large, so we first needed to standardize all the images.
Then, we normalized the data to keep network training stable.
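The article does not say how the data was normalized. One common approach, shown here only as a minimal sketch, is to clip intensities to a fixed window and rescale them linearly to the range [0, 1]. The window bounds `MIN_BOUND` and `MAX_BOUND` and the function name `normalize` are assumptions made for illustration.

```python
import numpy as np

# Assumed intensity window; the article does not state the bounds it used.
MIN_BOUND = -1000.0
MAX_BOUND = 400.0


def normalize(volume, lo=MIN_BOUND, hi=MAX_BOUND):
    """Rescale intensities linearly so that lo maps to 0 and hi maps to 1.

    Values outside the window are clipped to 0 or 1.
    """
    volume = np.asarray(volume, dtype=np.float32)
    scaled = (volume - lo) / (hi - lo)
    return np.clip(scaled, 0.0, 1.0)


# Toy values standing in for scan intensities.
sample = np.array([-2000.0, -1000.0, -300.0, 400.0, 3000.0])
normalized = normalize(sample)
```

Keeping every input in the same bounded range means a few extreme voxels, such as metal implants, cannot dominate the values the network sees.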
After that, we expressed CT numbers in a standardized and convenient form using Hounsfield units (HU). Here is a table of densities in HU.
Fat: from -100 to -50
Blood: from +30 to +45
Muscle: from +10 to +40
Soft tissue with contrast: from +100 to +300
Bone: from +700 to +3000
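Converting stored pixel values to HU is a linear rescale. DICOM files carry the two coefficients in the Rescale Slope and Rescale Intercept attributes. The sketch below assumes those two numbers have already been read from the file header. The function name `to_hounsfield` and the sample values are illustrative, not taken from the dataset.

```python
import numpy as np


def to_hounsfield(raw, slope, intercept):
    """Apply the linear DICOM rescale: HU = raw * slope + intercept."""
    hu = raw.astype(np.float32) * slope + intercept
    # Round before casting so that values such as 39.999 become 40, not 39.
    return np.rint(hu).astype(np.int16)


# Toy 2x2 slice with a slope of 1 and an intercept of -1024 (assumed values).
raw_slice = np.array([[0, 1024], [1064, 2024]], dtype=np.uint16)
hu_slice = to_hounsfield(raw_slice, slope=1.0, intercept=-1024.0)
```

With these assumed coefficients, a raw value of 1024 maps to 0 HU (water) and 1064 maps to +40 HU, which falls in the muscle and blood range.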
Finally, we resampled all scans to a voxel size of 1×1×1 mm so that we could use a 3D CNN.
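Resampling to an isotropic grid can be sketched as follows. This version uses nearest-neighbour lookup to stay short. A real pipeline would more likely use an interpolating routine, and the article does not say which one was used. The function name `resample_nearest` and the spacing values are assumptions.

```python
import numpy as np


def resample_nearest(volume, spacing, new_spacing=(1.0, 1.0, 1.0)):
    """Resample a 3D volume to new_spacing (mm) by nearest-neighbour lookup."""
    spacing = np.asarray(spacing, dtype=np.float64)
    new_spacing = np.asarray(new_spacing, dtype=np.float64)
    old_shape = np.array(volume.shape)
    # Physical extent stays the same, so the voxel count scales with spacing.
    new_shape = np.round(old_shape * spacing / new_spacing).astype(int)
    # For every output voxel centre, find the source voxel that contains it.
    index = [
        np.minimum(np.floor((np.arange(n) + 0.5) * old / n).astype(int), old - 1)
        for n, old in zip(new_shape, old_shape)
    ]
    return volume[np.ix_(*index)]


# Toy scan: 2 slices 2.5 mm apart, 4x4 pixels of 0.5 mm each (assumed spacing).
scan = np.arange(32).reshape(2, 4, 4)
resampled = resample_nearest(scan, spacing=(2.5, 0.5, 0.5))
```

After resampling, one voxel covers the same physical volume in every scan, so a 3D convolution kernel sees nodules at a consistent scale whichever scanner produced the image.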
We then segmented the lungs to narrow the search for nodules. Bright areas on the image represent either blood vessels or air. Using a threshold of -400 HU, we separated the lung structure and created a binary mask of the lungs.
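The thresholding step itself is a single comparison. The sketch below covers only that step and assumes the volume is already in HU. A full segmentation would also have to discard the air surrounding the body, for example with connected-component analysis, which is not shown here. The function name and the strict `<` comparison are assumptions.

```python
import numpy as np

LUNG_THRESHOLD_HU = -400  # threshold quoted in the text


def threshold_mask(hu_volume, threshold=LUNG_THRESHOLD_HU):
    """Return a binary mask: 1 where the voxel is below the threshold, else 0."""
    return (np.asarray(hu_volume) < threshold).astype(np.uint8)


# Toy slice: air, lung tissue, a voxel exactly at the threshold, muscle.
hu_slice = np.array([[-1000, -700], [-400, 40]])
mask = threshold_mask(hu_slice)
```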
Training Neural Networks to Find Abnormal Areas
First, we applied image classification and separated all images into two groups: suspicious and normal. Then, we localized all suspicious areas using the neural network U-Net. See its architecture below.
We chose U-Net because it is one of the best networks for medical image processing and is particularly good at recognizing cellular structures. See the example below.
We trained our network to detect solitary pulmonary nodules. U-Net performed very well and found almost all abnormal areas. However, it also flagged some normal areas as “abnormal” by mistake.
In the next step, we reduced the number of false positives and improved the detection accuracy using a 3D CNN.
We trained a 3D convolutional neural network to classify the abnormal areas found by the first network as either true or false detections. As a result, the number of false positives dropped drastically. The resulting accuracy was 0.925.
Localizing nodules: A = true positive (found correctly); B = false positive (found by mistake).
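To show how the second stage and the accuracy figure relate, here is a minimal sketch. It assumes the 3D CNN outputs a score between 0 and 1 for each candidate area, and that accuracy is the share of candidates classified correctly. The helper names, the 0.5 cutoff, and the toy data are all hypothetical and are not the results reported above.

```python
def keep_valid(candidates, scores, threshold=0.5):
    """Keep only the candidates whose classifier score reaches the threshold."""
    return [c for c, s in zip(candidates, scores) if s >= threshold]


def accuracy(y_true, y_pred):
    """Share of candidates whose predicted label matches the annotation."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy data: 8 candidate areas, 1 = real nodule, 0 = false detection.
candidates = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
scores = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3, 0.6, 0.4]
labels = [1, 1, 0, 0, 1, 0, 0, 0]

kept = keep_valid(candidates, scores)
predicted = [1 if s >= 0.5 else 0 for s in scores]
toy_accuracy = accuracy(labels, predicted)
```

In this toy run, one false detection (`c6`) passes the filter, so seven of the eight candidates are classified correctly.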
Taken as a whole, the algorithm is good at finding abnormal areas in the lungs. Even the false positives were not a big problem, since they were mostly located very close to real cancer. Furthermore, the algorithm often classified areas with false positives as “suspicious”, so radiologists could check them anyway and detect cancer.
We believe this solution can help radiologists do less manual, routine work and speed up cancer diagnosis.