
Multi-level Classification, Cohen's Kappa, Krippendorff's Alpha, and Cancer

There are two interesting methods in R for computing agreement between potential classifiers and testing the performance of classifiers that predict 33 cancer types.



I was facing an interesting problem last week. Playing with data from The Cancer Genome Atlas (full genetic and clinical data for thousands of patients), I was building a classifier that predicts the type of cancer based on sets of genetic signatures.

In the PANCAN33 subset, there are samples for 33 different types of cancer, and the classifier should be able to assign a new sample to one of these 33 classes. I tried different methods, like random forest, SVM, bgmm, and a few others, and ended up with a collection of classifiers. How do you choose the best one?

We need a method that computes the agreement between classifier predictions and the true labels (cancer types). For binary classifiers, there are many commonly used metrics, like precision, recall, and accuracy. But here we have 33 classes, so the confusion matrix is 33×33 cells: a lot of numbers to compare.

Of course, there are some straightforward solutions, like the fraction of samples for which the classifier correctly guesses the true label. But such simple metrics suffer a lot when the distribution of classes is unequal (which is quite common). They may be high even for a dummy classifier that always votes for the most common class, as the sketch below shows. It is better to avoid such metrics.
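
Here is a minimal sketch (with made-up label proportions, and hypothetical variable names) of why plain accuracy can mislead: if 90% of samples belong to one class, a classifier that always predicts that class scores 90% accuracy while carrying no information at all.

trueLabels  <- c(rep("BRCA", 90), rep("LUAD", 10))  # made-up data: 90% of samples in one class
predictions <- rep("BRCA", 100)                     # dummy classifier: always vote for the most common class
mean(predictions == trueLabels)                     # accuracy: 0.9, yet the classifier is useless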

Other Measures of Agreement

I used two interesting ones: Cohen's Kappa and Krippendorff's Alpha. Both take into account the distribution of votes for each rater, so agreement that could arise by chance is corrected for. Moreover, Krippendorff's Alpha also takes missing data into account (find more information here).
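
To make the chance correction concrete, here is a hand-rolled sketch of Cohen's Kappa (not the irr implementation): it compares the observed agreement with the agreement expected by chance, given each rater's vote distribution.

# kappa = (observed agreement - chance agreement) / (1 - chance agreement)
cohen_kappa <- function(a, b) {
  classes <- union(a, b)
  p_o <- mean(a == b)                                   # observed agreement
  p_e <- sum((table(factor(a, classes)) / length(a)) *
             (table(factor(b, classes)) / length(b)))   # agreement expected by chance
  (p_o - p_e) / (1 - p_e)
}
cohen_kappa(predictions, trueLabels)  # ~0 for the dummy classifier sketched above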

Both coefficients are widely used by psychometricians (e.g., to assess how well two psychiatrists agree on a diagnosis). Here, we use them to estimate the performance of a classifier. Both coefficients are implemented in the irr package.

Below you will find an example application:

library(irr)  # provides kappa2() and kripp.alpha()

# kappa2() expects a matrix with one column per rater (subjects in rows)
kappa2(cbind(predictions, trueLabels))
# Cohen's Kappa for 2 Raters (Weights: unweighted)
#
# Subjects = 3599 
#   Raters = 2 
#    Kappa = 0.941 
#
#        z = 160 
#  p-value = 0 

# kripp.alpha() expects a matrix with one row per rater (subjects in columns)
kripp.alpha(rbind(predictions, trueLabels))
# Krippendorff's alpha
#
# Subjects = 3599 
#   Raters = 2 
#    alpha = 0.941 
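
Since Krippendorff's Alpha takes missing data into account, here is a small sketch (with ratings dropped at random purely for illustration) of what that looks like in practice: samples one rater could not score are simply left as NA, and kripp.alpha() still returns a coefficient, something plain Kappa has no provision for.

ratings <- rbind(predictions, trueLabels)
ratings[1, sample(ncol(ratings), 50)] <- NA  # pretend the classifier abstained on 50 samples
kripp.alpha(ratings, method = "nominal")     # alpha is still computed from the remaining pairs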




Published at DZone with permission of deepsense.io Blog, DZone MVB.
