Simultaneous, Multiple Proportion Comparisons Using Marascuilo Procedure

Using the Marascuilo procedure, we can pinpoint which specific sample proportions differ significantly.

By Abhijit Telang · Aug. 20, 21 · Tutorial

We often need to compare proportions across several samples, where a proportion describes a product's performance or any other characteristic that can be classified into identifiable categories. Binary classifications such as pass/fail or yes/no are typical examples.

Just as central tendencies, or the variances around them, cannot be assumed to be equal across samples, neither can proportions.

A chi-square test of equal proportions is a good starting point.

For this test, the observations from the test samples can be laid out as below, with the observed results compared against the expected outcomes. (In this case, the expected outcomes are the counts we would see if the proportion in each outcome category, such as Pass/Fail, were equal across samples.)


Outcome   Observed   Expected
Failed    X1         X2
Failed    X3         X4
Failed    X5         X6
Failed    X7         X8
Failed    X9         X10
OKed      X11        X12
OKed      X13        X14
OKed      X15        X16
OKed      X17        X18
OKed      X19        X20
Chi-square metric = SUM{(Observed - Expected)^2 / Expected}


The metric is then compared against the critical threshold from a chi-square table, which lists chi-square distribution values by degrees of freedom. If the metric exceeds the threshold, the null hypothesis of equal proportions is rejected.

Degrees of freedom = (number of proportions being compared for equality - 1)
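The test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's spreadsheet: the function name `chi_square_equal_proportions`, the failure counts, and the sample sizes are all hypothetical, and the expected counts are assumed to come from the pooled failure proportion across all samples. The critical values are standard upper 5% chi-square table entries.

```python
# Upper 5% critical values of the chi-square distribution, keyed by degrees of freedom
# (standard table values; extend the table as needed).
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_equal_proportions(failed, totals):
    """Return (chi-square metric, degrees of freedom) for k pass/fail samples."""
    pooled_fail = sum(failed) / sum(totals)  # failure proportion under the null hypothesis
    metric = 0.0
    for f, n in zip(failed, totals):
        cells = (
            (f, n * pooled_fail),            # Failed: observed, expected
            (n - f, n * (1 - pooled_fail)),  # OKed: observed, expected
        )
        for observed, expected in cells:
            metric += (observed - expected) ** 2 / expected
    return metric, len(failed) - 1


# Hypothetical data: failures out of 200 units in each of five samples.
failed = [10, 20, 28, 28, 24]
totals = [200, 200, 200, 200, 200]
metric, dof = chi_square_equal_proportions(failed, totals)
reject = metric > CHI2_CRITICAL_05[dof]
print(f"metric={metric:.3f}, dof={dof}, reject equality: {reject}")
```

The sketch sums over both the Failed and OKed cells of every sample, matching the layout of the table above, and uses (number of proportions - 1) degrees of freedom.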


However, testing the null hypothesis of equality is only the first step.

Rejecting the null hypothesis merely allows us to conclude that the proportions are not all equal. It may also be necessary to find which specific proportions stand out from the rest.

For this purpose, the differences between all pairs of proportions can be laid out. For instance, comparing 10 proportions requires 45 pairs, comparing 5 proportions requires 10 pairs, and so on.

5C2 Combinations
P1-P2
P1-P3
P1-P4
P1-P5
P2-P3
P2-P4
P2-P5
P3-P4
P3-P5
P4-P5
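The pair list above can also be generated rather than typed by hand. The following is a small sketch using Python's standard library; the labels `P1` to `P5` simply mirror the list above.

```python
from itertools import combinations
from math import comb

labels = [f"P{i}" for i in range(1, 6)]  # P1 .. P5
pairs = [f"{a}-{b}" for a, b in combinations(labels, 2)]

print(comb(5, 2), comb(10, 2))  # number of pairs for 5 and for 10 proportions
print(pairs)
```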

Then, for each such pair, the critical value for the Marascuilo procedure can be computed using the following formula.

criticalVal(i,j) = sqrt(chiSqCritical) * sqrt(pi*(1-pi)/ni + pj*(1-pj)/nj)


Here, pi*(1-pi)/ni and pj*(1-pj)/nj are the variances of proportions i and j, respectively. Their sum gives the variance of the difference, and its square root gives the standard deviation. That standard deviation is then multiplied by the square root of the chi-square distribution value for the desired confidence level, using the same degrees of freedom as in the test above. The difference between proportions i and j is significant when its absolute value exceeds this critical value.
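As a worked illustration of the critical value for a single pair, here is a minimal Python sketch. The function name `marascuilo_critical_value`, the two proportions, and the sample sizes are hypothetical; 9.488 is the standard upper 5% chi-square table value for 4 degrees of freedom, which assumes five proportions are being compared.

```python
from math import sqrt


def marascuilo_critical_value(p_i, n_i, p_j, n_j, chi2_critical):
    """Critical value for the difference between proportions i and j."""
    variance_sum = p_i * (1 - p_i) / n_i + p_j * (1 - p_j) / n_j
    return sqrt(chi2_critical) * sqrt(variance_sum)


# Hypothetical pair of proportions with 200 observations each.
critical = marascuilo_critical_value(0.10, 200, 0.19, 200, chi2_critical=9.488)
difference = abs(0.10 - 0.19)
print(f"difference={difference:.4f}, critical={critical:.4f}, "
      f"significant={difference > critical}")
```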

5C2 Combinations (10)   Absolute difference   Critical value (Marascuilo procedure)   Significant?
P1-P2                   0.0500                0.10684                                 FALSE
P1-P3                   0.0900                0.08815                                 TRUE
P1-P4                   0.0900                0.08815                                 TRUE
P1-P5                   0.0700                0.09813                                 FALSE
P2-P4                   0.0400                0.06036                                 FALSE
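A table like the one above can be produced for every pair at once. The sketch below is one possible way to do it, not the author's spreadsheet: the function name `marascuilo`, the proportions, and the sample sizes are hypothetical and do not reproduce the article's figures, and the chi-square value 9.488 assumes a 5% significance level with 4 degrees of freedom.

```python
from itertools import combinations
from math import sqrt


def marascuilo(proportions, sizes, chi2_critical):
    """Compare every pair of proportions; return one row per pair."""
    rows = []
    for i, j in combinations(range(len(proportions)), 2):
        p_i, p_j = proportions[i], proportions[j]
        difference = abs(p_i - p_j)
        critical = sqrt(chi2_critical) * sqrt(
            p_i * (1 - p_i) / sizes[i] + p_j * (1 - p_j) / sizes[j]
        )
        rows.append((f"P{i + 1}-P{j + 1}", difference, critical, difference > critical))
    return rows


# Hypothetical failure proportions and sample sizes for five samples.
proportions = [0.05, 0.10, 0.30, 0.12, 0.08]
sizes = [200, 200, 200, 200, 200]
rows = marascuilo(proportions, sizes, chi2_critical=9.488)
for pair, difference, critical, significant in rows:
    print(f"{pair}  {difference:.4f}  {critical:.5f}  {significant}")
```

Each row holds the pair label, the absolute difference, the critical value, and the significance flag, in the same column order as the table.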

Thus, using the Marascuilo procedure, we can pinpoint which specific sample proportions differ significantly.

In summary, we first looked at how to apply the chi-square test to infer whether the proportions are all equal. Here, we need to account for every outcome category (such as {false, true} or {accept, reject}) when computing the metric.

Moving beyond establishing that a significant inequality exists, we then used the Marascuilo procedure to identify the specific pairs of proportions that differ.
