Blazing Through Commonly Used Statistical Performance Metrics

Performance metrics for statistical models are a great way to gauge the accuracy of your predictive analytics. Read on to learn more!

By Abhijit Telang · Mar. 26, 19 · Tutorial

Performance metrics are often used to evaluate how effective statistical models have been in predicting response variables based on the given set of observations that the models were trained on.  

In this post, we are restricting the discussion to classification problems. Examples of binary classification include:

  1. A person being classified as either creditworthy or non-creditworthy.

  2. A consumer deciding either to switch services or to stay with you.

Your statistical model may classify some response variables correctly, and some will probably always be missed. If it misses nothing at all, it has likely adapted too closely to this particular set of observations and may not be able to reproduce that learning on a different set of observations. This is called variance in learning, as opposed to bias.

Bias prevents your model from becoming flexible enough to learn the nuances. Too much flexibility, on the other hand, makes the model too meek to stand up and adapt to a different set of circumstances.

So, every model has a trade-off between bias and variance. You don't want a model that is too rigid (it would miss out on learning from variations), but at the same time you don't want a model that is too meek (one that bends to every variation it sees, so its behavior changes whenever the observation set changes, such as when going from training to test).

Leaving those aspects aside, how do you measure the performance of your model in sheer quantitative terms?

Here are a few measures, explained. For the sake of simplicity, let's assume the predicted classification is binary (TRUE or FALSE, 0 or 1, POSITIVE or NEGATIVE).

Often, it is easy to visualize this using a tabular approach:

[Figure: a 2x2 table of predicted versus actual classes, showing true positives, false positives, false negatives, and true negatives]
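
As a minimal sketch of that tabular view (using made-up example data, not anything from this article), here is how you can build such a table, known as a confusion matrix, in R:

# Hypothetical data: true labels and the model's predicted labels
actual    <- factor(c(1, 1, 1, 0, 0, 0, 1, 0, 1, 0), levels = c(1, 0))
predicted <- factor(c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0), levels = c(1, 0))

# Rows are the predicted class, columns are the actual class:
# the diagonal holds true positives and true negatives,
# the off-diagonal cells hold false positives and false negatives
confusion <- table(Predicted = predicted, Actual = actual)
print(confusion)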

1) Precision: I know how many Positives, or "1"s, my model has predicted for the response variable.

The question is how many of those predicted positives are actually positive in the original response variable.

Hence,

Precision = TP/MP, where TP is True Positives and MP is Marked Positives. 

But TP = MP - FP, because my model may have misclassified some Negative values as Positive.

Hence, 

Precision = (MP-FP)/MP  

So, precision compares how many values my model marked as positive in the response variable against how many of those are actually positive. This tells you how precisely your model recognizes a given class versus the other classes (in this case, the positive value versus all other values).
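
As a quick sanity check with made-up counts (these numbers are purely illustrative), precision can be computed in R like this:

TP <- 40; FP <- 10           # hypothetical counts of true and false positives
MP <- TP + FP                # marked (predicted) positives
precision <- (MP - FP) / MP  # equivalently TP / (TP + FP)
precision                    # 0.8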

2) Recall: When we measured the precision of our model, we did not think about the symmetry of misclassification. Recall gauges a model's ability not only to recognize a given class, such as positive (while adjusting for any false positive misclassifications), but to do so relative to how many such positives actually existed (including any false negative misclassifications).

That is, just as a model can misclassify a negative as a positive, it can also misclassify a positive as a negative, and we need to quantify that aspect too.

Let's adjust the True Positive (TP) count by adding back False Negatives: TP+FN. 

Hence, 

Recall = (MP-FP)/(MP-FP+FN) = TP/(TP+FN)

So, the difference from precision is in the denominator, which brings in two symmetric corrections: adjusting for false positives as well as for false negatives.

This measure is also called sensitivity.  
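
Continuing with the same made-up counts, recall (sensitivity) works out as follows in R:

TP <- 40; FP <- 10; FN <- 20          # hypothetical counts
MP <- TP + FP                         # marked (predicted) positives
recall <- (MP - FP) / (MP - FP + FN)  # equivalently TP / (TP + FN)
recall                                # 40 / 60, roughly 0.667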

3) Specificity: This measure mirrors recall (sensitivity), but it examines the negative class rather than the positive one: of all the values that are actually negative, how many did the model correctly mark as negative?

Specificity = (MN-FN)/((MN-FN)+FP) = TN/(TN+FP)

So, you are measuring how well your model recognizes the values that are actually negative. Specificity, taken together with sensitivity, lets you see the contribution of both positive and negative misclassifications.
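
Again with illustrative counts, specificity in R:

TN <- 25; FP <- 10             # hypothetical counts
specificity <- TN / (TN + FP)  # true negative rate
specificity                    # 25 / 35, roughly 0.714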

4) Accuracy: Observe the numerators and denominators in the metrics above. Each is concerned with only one specific state of the class in the response variable: either positive or negative.

What is so easy to miss is that when a model falsely recognizes a value as positive, it also misses an opportunity to mark this value as negative.

Similarly, when a model does not recognize a value as positive, it also ends up falsely recognizing that value as negative.

The most difficult part is remembering this symmetrical impact of misclassification.

Accuracy = ((MP-FP)+(MN-FN)) / (MP+MN) = (TP+TN) / (TP+FP+TN+FN)

Thus, the accuracy measure tells you how far off your model is in recognizing not only positives but also negatives.
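
Putting the same hypothetical counts together, accuracy in R:

TP <- 40; FP <- 10; TN <- 25; FN <- 20       # hypothetical counts
accuracy <- (TP + TN) / (TP + FP + TN + FN)  # share of all values classified correctly
accuracy                                     # 65 / 95, roughly 0.684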

If this looks tedious, R has a package called ROCR (available on CRAN) that does the bookkeeping for you.

All you need is a collection of predictions and corresponding labels.

library(ROCR)  # provides prediction() and performance(); install.packages("ROCR") if needed

# dfpredictmatrix holds the model's predicted scores and the true labels
pred1 <- prediction(dfpredictmatrix$predictions, dfpredictmatrix$labels)
perf1 <- performance(pred1, "prec", "rec")  # precision vs. recall
plot(perf1)
abline(h = 0.8, col = "red", lty = 2)  # reference line at 80% precision

That's it! This lets you examine the trade-off between precision and recall.
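
The same prediction object can be reused for the other measures discussed above. For example, reusing pred1 from the snippet above (and still assuming the hypothetical dfpredictmatrix data frame):

# Sensitivity/specificity trade-off across classification cutoffs
perf2 <- performance(pred1, "sens", "spec")
plot(perf2)

# Accuracy as a function of the classification cutoff
perf3 <- performance(pred1, "acc")
plot(perf3)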
