Simultaneous, Multiple Proportion Comparisons Using Marascuilo Procedure
Using the Marascuilo procedure, we can pinpoint which specific sample proportions differ significantly.
Often, there is a need to compare multiple proportions across samples, where a proportion can describe a product's performance or any other characteristic that can be classified into identifiable categories. Examples of binary classifications are pass/fail and yes/no.
Just as the equality of central tendencies, or of the variances around them, cannot be assumed, the equality of proportions cannot be assumed either; it has to be tested.
A chi-square test of equal proportions is a good starting point.
For this, observations from the test samples can be laid out as below, where the observed test results are compared against the expected outcomes. (In this case, the expected outcome represents equality of proportions across the samples for each test outcome category, such as Pass/Fail.)
Outcome | Observed | Expected |
Failed | X1 | X2 |
Failed | X3 | X4 |
Failed | X5 | X6 |
Failed | X7 | X8 |
Failed | X9 | X10 |
OKed | X11 | X12 |
OKed | X13 | X14 |
OKed | X15 | X16 |
OKed | X17 | X18 |
OKed | X19 | X20 |
Chi-square metric = SUM{(Observed - Expected)^2 / Expected}
The metric is calculated over all cells of the table and compared against the critical threshold from a chi-square table, which lists chi-square distribution values by degrees of freedom.
Degrees of freedom = (number of proportions being compared for equality - 1)
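The calculation above can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function name, the failure counts, and the sample sizes are made up, and the expected counts are assumed to come from the pooled failure proportion across all samples.

```python
def chi_square_equal_proportions(failed, totals):
    """Return (chi-square metric, degrees of freedom) for k samples.

    failed[i] is the number of failures in sample i, totals[i] its size.
    Expected counts assume every sample shares the pooled failure rate.
    """
    if len(failed) != len(totals):
        raise ValueError("failed and totals must have the same length")
    pooled = sum(failed) / sum(totals)
    metric = 0.0
    for f, n in zip(failed, totals):
        # One "Failed" cell and one "OKed" cell per sample.
        cells = ((f, n * pooled), (n - f, n * (1 - pooled)))
        for observed, expected in cells:
            metric += (observed - expected) ** 2 / expected
    return metric, len(failed) - 1


# Hypothetical data: five samples of 100 units each.
failed = [12, 7, 3, 3, 5]
totals = [100, 100, 100, 100, 100]
metric, dof = chi_square_equal_proportions(failed, totals)

# 9.488 is the tabulated chi-square critical value for 4 degrees of
# freedom at the 0.05 significance level.
print(metric, dof, metric > 9.488)
```

If the metric exceeds the tabulated critical value, the hypothesis of equal proportions is rejected at that significance level.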
However, validating the null hypothesis of equality is only the first step.
Rejecting the null hypothesis merely allows us to conclude that the proportions are not all equal. It may also be necessary to find which specific proportions stand out in their differences, compared to the rest.
For this purpose, the differences between all pairs of proportions can be laid out. For instance, to compare among 10 proportions, you will need 45 combinations. For 5 proportions, you will need 10 combinations, and so on.
5C2 Combinations |
P1-P2 |
P1-P3 |
P1-P4 |
P1-P5 |
P2-P3 |
P2-P4 |
P2-P5 |
P3-P4 |
P3-P5 |
P4-P5 |
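The pair list above can also be generated rather than written by hand. The short sketch below uses Python's standard library; the labels P1 to P5 simply mirror the table, and the variable names are illustrative.

```python
from itertools import combinations
from math import comb

# Labels for five proportions, matching the table above.
labels = [f"P{k}" for k in range(1, 6)]

# Every unordered pair, in the same order as the table.
pairs = list(combinations(labels, 2))
for first, second in pairs:
    print(f"{first}-{second}")

# Number of pairs for 5 and for 10 proportions.
print(len(pairs), comb(5, 2), comb(10, 2))
```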
Then, for each such pair, the critical value for the Marascuilo procedure can be computed using the following formula.
criticalVal(i,j) = sqrt(chi.sq.value@d.f.) * sqrt[pi*(1-pi)/ni + pj*(1-pj)/nj]
Here pi*(1-pi)/ni and pj*(1-pj)/nj are the variances of proportions i and j respectively. Their sum is the variance of the difference, and its square root is the standard deviation of the difference. This is multiplied by the square root of the chi-square distribution value, taken at (number of proportions - 1) degrees of freedom for the desired confidence level. The pair (i,j) is judged significantly different when the absolute difference between pi and pj exceeds this critical value.
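As a small worked sketch of the critical value, the function below multiplies the standard deviation of the difference by the square root of the chi-square value. The function and parameter names are hypothetical, and the chi-square value is passed in as a number looked up from a table rather than computed.

```python
from math import sqrt


def marascuilo_critical_value(p_i, n_i, p_j, n_j, chi_sq_value):
    """Critical value for the difference between proportions i and j.

    chi_sq_value is the chi-square table value at (k - 1) degrees of
    freedom, where k is the number of proportions being compared.
    """
    variance_of_difference = p_i * (1 - p_i) / n_i + p_j * (1 - p_j) / n_j
    return sqrt(chi_sq_value) * sqrt(variance_of_difference)


# Hypothetical example: two proportions from samples of 100 units,
# with 9.488 as the table value for 4 degrees of freedom at 0.05.
print(marascuilo_critical_value(0.12, 100, 0.03, 100, 9.488))
```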
5C2 Combinations | Absolute difference | Critical value for Marascuilo procedure | Is the difference significant? |
P1-P2 | 0.0500 | 0.10684 | FALSE |
P1-P3 | 0.0900 | 0.08815 | TRUE |
P1-P4 | 0.0900 | 0.08815 | TRUE |
P1-P5 | 0.0700 | 0.09813 | FALSE |
P2-P4 | 0.0400 | 0.06036 | FALSE |
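A table like the one above can be produced with a loop over all pairs. The sketch below is an illustration under stated assumptions: the function name, the proportions, and the sample sizes are invented for the example and are not the data behind the table, and the chi-square value is supplied from a table.

```python
from itertools import combinations
from math import sqrt


def marascuilo(proportions, sizes, chi_sq_value):
    """Compare every pair of proportions against its critical value.

    Returns a list of (i, j, abs_difference, critical_value, significant)
    tuples with 1-based sample indices.
    """
    scale = sqrt(chi_sq_value)
    results = []
    for i, j in combinations(range(len(proportions)), 2):
        p_i, p_j = proportions[i], proportions[j]
        variance = p_i * (1 - p_i) / sizes[i] + p_j * (1 - p_j) / sizes[j]
        critical = scale * sqrt(variance)
        difference = abs(p_i - p_j)
        results.append((i + 1, j + 1, difference, critical, difference > critical))
    return results


# Hypothetical data: three proportions from samples of 100 units each.
# 5.991 is the chi-square table value for 2 degrees of freedom at 0.05.
rows = marascuilo([0.5, 0.5, 0.9], [100, 100, 100], 5.991)
for i, j, difference, critical, significant in rows:
    print(f"P{i}-P{j} | {difference:.4f} | {critical:.5f} | {significant}")
```

Passing the chi-square value in as a parameter keeps the sketch free of third-party dependencies; a statistics library could supply it instead.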
Thus, using the Marascuilo procedure, we can pinpoint which specific sample proportions differ significantly.
In summary, we first looked at how to apply the chi-square test to infer whether the proportions are all equal or not. Here, we need to account for all types of evaluations or scorings (such as {false, true} or {accept, reject}) to compute the metric score.
Moving beyond proving the existence of significant inequality, we then used a statistical procedure to identify which specific pairs of proportions differ.