Simultaneous, Multiple Proportion Comparisons Using Marascuilo Procedure

Using the Marascuilo procedure, we can pinpoint which specific sample proportions differ significantly.


Often, we need to compare proportions across several samples. A proportion here can describe a product's performance or any other characteristic that can be classified into identifiable categories; binary classifications such as pass/fail or yes/no are typical examples.

Just as equality of central tendencies, or of the variances around them, cannot be assumed across samples, equality of proportions cannot be assumed either; it has to be tested.

A chi-square test of equal proportions is a good starting point.

For this test, observations from the test samples can be laid out as below, where observed test results are compared against expected outcomes. (In this case, the expected outcome represents equal proportions across samples for each test outcome category, such as Pass/Fail.)

| Outcome | Observed | Expected |
|---|---|---|
| Failed | X1 X2 X3 X4 X5 | X6 X7 X8 X9 X10 |
| OKed | X11 X12 X13 X14 X15 | X16 X17 X18 X19 X20 |

``Chi Sq Metric= SUM{(Observed- Expected)^2/Expected}``

This metric is then compared against the critical threshold from a chi-square table, which lists chi-square distribution values by degrees of freedom.

``Degrees of freedom  = (number of proportions being compared for equality-1)``
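The metric and degrees of freedom above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `chi_square_equal_proportions`, the counts, and the sample sizes are all hypothetical, and the expected counts are assumed to come from the pooled failure rate across samples. The critical value 5.991 is the standard chi-square table entry for 2 degrees of freedom at the 5% significance level.

```python
def chi_square_equal_proportions(failed, totals):
    """Return (chi-square metric, degrees of freedom) for k samples.

    failed[i] is the number of failures in sample i and totals[i] is
    that sample's size. Expected counts assume every sample shares the
    pooled failure proportion (the null hypothesis of equality).
    """
    if len(failed) != len(totals) or len(failed) < 2:
        raise ValueError("need matching counts for at least two samples")
    pooled = sum(failed) / sum(totals)
    if pooled in (0.0, 1.0):
        raise ValueError("expected counts would be zero; metric undefined")

    metric = 0.0
    for f, n in zip(failed, totals):
        expected_failed = n * pooled
        expected_ok = n * (1 - pooled)
        # SUM{(Observed - Expected)^2 / Expected} over both outcome rows
        metric += (f - expected_failed) ** 2 / expected_failed
        metric += ((n - f) - expected_ok) ** 2 / expected_ok
    return metric, len(failed) - 1


# Hypothetical data: three samples of 100 units each.
failed = [10, 20, 30]
totals = [100, 100, 100]
metric, dof = chi_square_equal_proportions(failed, totals)

CRITICAL_5PCT_DF2 = 5.991  # chi-square table value, 2 d.f., 5% level
reject_equality = metric > CRITICAL_5PCT_DF2
print(metric, dof, reject_equality)
```

For other sample counts or significance levels, the critical value would have to be looked up for the matching degrees of freedom.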

However, testing the null hypothesis of equality is only the first step.

Rejecting the null hypothesis merely allows us to conclude that the proportions are not all equal. It may also be necessary to find which specific proportions stand out from the rest.

For this purpose, the pairwise differences among the proportions can be laid out. For instance, comparing 10 proportions requires 45 pairs; comparing 5 proportions requires 10 pairs, and so on.

5C2 combinations: P1-P2, P1-P3, P1-P4, P1-P5, P2-P3, P2-P4, P2-P5, P3-P4, P3-P5, P4-P5
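As a quick check of the pair counts, the combinations can be enumerated with Python's standard library. The helper name `proportion_pairs` and the `P1`, `P2`, ... labels are illustrative choices, not part of the procedure itself.

```python
from itertools import combinations
from math import comb


def proportion_pairs(k):
    """List every unordered pair among k proportions, labelled P1..Pk."""
    labels = [f"P{i}" for i in range(1, k + 1)]
    return [f"{a}-{b}" for a, b in combinations(labels, 2)]


pairs = proportion_pairs(5)
print(len(pairs), pairs)        # 10 pairs, from P1-P2 through P4-P5
print(comb(10, 2))              # number of pairs among 10 proportions
```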

Then, for each such pair, the critical value for the Marascuilo procedure can be computed using the following formula.

``criticalVal(i,j)=sqrt[pi*(1-pi)/ni + pj*(1-pj)/nj]*sqrt(chi.sq.value@d.f.)``

Here, pi*(1-pi)/ni and pj*(1-pj)/nj are the variances of proportions i and j respectively. Their sum gives the variance of the difference, and its square root gives the standard deviation, which is then multiplied by the square root of the chi-square value for the chosen confidence level and degrees of freedom. If the absolute difference |pi-pj| exceeds this critical value, the pair (i,j) is significantly different.

| 5C2 combination (10 in total) | Absolute difference | Critical value for Marascuilo procedure | Difference significant? |
|---|---|---|---|
| P1-P2 | 0.05 | 0.10684 | FALSE |
| P1-P3 | 0.09 | 0.08815 | TRUE |
| P1-P4 | 0.09 | 0.08815 | TRUE |
| P1-P5 | 0.07 | 0.09813 | FALSE |
| P2-P4 | 0.04 | 0.06036 | FALSE |
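The critical-value computation and the comparison against each observed difference can be sketched as follows. This is an illustration under stated assumptions: the function name `marascuilo`, the proportions, and the sample sizes are hypothetical (they are not the data behind the figures quoted in this article), and the chi-square value 5.991 is the standard table entry for 2 degrees of freedom at the 5% level.

```python
from itertools import combinations
from math import sqrt


def marascuilo(proportions, sizes, chi_sq_critical):
    """Compare every pair of proportions against its critical value.

    Returns a list of (pair label, absolute difference, critical value,
    significant?) tuples, one per unordered pair.
    """
    root_chi = sqrt(chi_sq_critical)
    results = []
    for i, j in combinations(range(len(proportions)), 2):
        pi, pj = proportions[i], proportions[j]
        # Sum of the two variances, then the square root of that sum
        variance = pi * (1 - pi) / sizes[i] + pj * (1 - pj) / sizes[j]
        critical = sqrt(variance) * root_chi
        difference = abs(pi - pj)
        label = f"P{i + 1}-P{j + 1}"
        results.append((label, difference, critical, difference > critical))
    return results


# Hypothetical data: three samples of 100 units each.
proportions = [0.10, 0.20, 0.30]
sizes = [100, 100, 100]
CHI_SQ_CRITICAL = 5.991  # chi-square table value, 2 d.f., 5% level

results = marascuilo(proportions, sizes, CHI_SQ_CRITICAL)
for label, difference, critical, significant in results:
    print(f"{label}: diff={difference:.2f} critical={critical:.5f} {significant}")
```

With these made-up inputs only the P1-P3 pair exceeds its critical value, which shows how an overall rejection of equality can trace back to a single pair.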

Thus, using the Marascuilo procedure, we can pinpoint which specific sample proportions differ significantly.

In summary, we first looked at how to apply the chi-square test to infer whether all the proportions are equal or not. Here, we need to account for all types of evaluations or scorings (such as {false, true} or {accept, reject}) to compute the metric score.

Moving beyond establishing that a significant inequality exists, we then used a statistical procedure to identify which specific pairs of proportions differ.

