Testing for Equal Variance: F-Test, Bartlett, Levene

Learn how to test whether two datasets come from distributions with the same variance using the F-test, Bartlett's test, and Levene's test.

By John Cook · May 18, 2018 · Tutorial

The two-sample t-test is a way to test whether two datasets come from distributions with the same mean. I wrote a few days ago about how the test performs under ideal circumstances, as well as less-than-ideal circumstances.

This is an analogous post for testing whether two datasets come from distributions with the same variance. Statistics textbooks often present the F-test for this task, then warn in a footnote that the test is highly dependent on the assumption that both datasets come from normal distributions.

Sensitivity and Robustness

Statistics texts give too little attention to robustness, in my opinion. Modeling assumptions never hold exactly, so it's important to know how procedures perform when the assumptions are violated. Since the F-test is one of the rare instances where textbooks warn about a lack of robustness, I expected the F-test to perform terribly under simulation relative to its recommended alternatives, Bartlett's test and Levene's test. That's not exactly what I found.

Simulation Design

For my simulations, I drew 35 samples from each of two distributions. I selected significance levels for the F-test, Bartlett's test, and Levene's test so that each would have roughly a 5% error rate under a null scenario, in which both sets of data come from the same distribution, and a 20% error rate under an alternative scenario.

I chose my initial null and alternative scenarios to use normal (Gaussian) distributions, i.e. to satisfy the assumptions of the F-test. Then, I used the same designs for data coming from a heavy-tailed distribution to see how well each of the tests performed.

For the normal null scenario, both datasets were drawn from a normal distribution with mean 0 and standard deviation 15. For the normal alternative scenario, I used normal distributions with standard deviations 15 and 25.
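As a rough sketch of this design (not the code behind the results in this post), the two normal scenarios and the variance ratio underlying the F-test could be set up as follows. The helper names `draw_normal` and `variance_ratio` and the seed are assumptions made for illustration.

```python
import random
import statistics

def draw_normal(n, sd, rng):
    """Draw n values from a normal distribution with mean 0 and the given sd."""
    return [rng.gauss(0.0, sd) for _ in range(n)]

def variance_ratio(x, y):
    """F statistic: ratio of the two sample variances (n - 1 denominators)."""
    return statistics.variance(x) / statistics.variance(y)

rng = random.Random(42)  # arbitrary seed, only for reproducibility
n = 35                   # samples per dataset, as in the design above

# Null scenario: both datasets have standard deviation 15.
null_x, null_y = draw_normal(n, 15, rng), draw_normal(n, 15, rng)

# Alternative scenario: standard deviations 15 and 25.
alt_x, alt_y = draw_normal(n, 15, rng), draw_normal(n, 25, rng)

print("null variance ratio:       ", variance_ratio(null_x, null_y))
print("alternative variance ratio:", variance_ratio(alt_y, alt_x))
```

Under the null scenario the ratio should hover around 1. Under the alternative, the ratio of the true variances is 25²/15², about 2.78, so the sample ratio tends to be well above 1.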

Normal Distribution Calibration

Here are the results from the normal distribution simulations.

|----------+-------+--------+---------|
| Test     | Alpha | Type I | Type II |
|----------+-------+--------+---------|
| F        |  0.13 | 0.0390 |  0.1863 |
| Bartlett |  0.04 | 0.0396 |  0.1906 |
| Levene   |  0.06 | 0.0439 |  0.2607 |
|----------+-------+--------+---------|

Here, the Type I column is the proportion of times the test incorrectly concluded that identical distributions had unequal variances. The Type II column reports the proportion of times the test failed to conclude that distributions with different variances indeed had unequal variances. Results were based on simulating 10,000 experiments.
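To make the two error columns concrete, here is a minimal sketch of how such rates can be estimated by simulation for the variance ratio. It is not the code behind the table: instead of tuning a nominal alpha for the textbook F-test, it takes the rejection threshold directly from the simulated null distribution, and it uses 2,000 experiments rather than 10,000 to keep the run short. The function names and seed are assumptions.

```python
import random
import statistics

def f_stat(x, y):
    """Two-sided variance ratio: larger sample variance over the smaller."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return max(vx, vy) / min(vx, vy)

def simulate(sd_x, sd_y, n, n_sims, rng):
    """Return the statistic from n_sims simulated experiments on normal data."""
    stats = []
    for _ in range(n_sims):
        x = [rng.gauss(0.0, sd_x) for _ in range(n)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n)]
        stats.append(f_stat(x, y))
    return stats

rng = random.Random(0)  # arbitrary seed
n, n_sims, alpha = 35, 2000, 0.05

# Calibrate: use the empirical (1 - alpha) quantile of the statistic under
# the null scenario as the rejection threshold.
null_sorted = sorted(simulate(15, 15, n, n_sims, rng))
critical = null_sorted[int((1 - alpha) * n_sims)]

# Estimate the error rates on fresh simulations.
null_stats = simulate(15, 15, n, n_sims, rng)
alt_stats = simulate(15, 25, n, n_sims, rng)

# Type I: rejecting equal variances when they are in fact equal.
type_1 = sum(s > critical for s in null_stats) / n_sims
# Type II: failing to reject when the variances really differ.
type_2 = sum(s <= critical for s in alt_stats) / n_sims

print(f"critical value: {critical:.3f}")
print(f"Type I: {type_1:.4f}  Type II: {type_2:.4f}")
```

The same loop works for any test: swap `f_stat` for another statistic, or for a function that returns a p-value and a comparison against alpha.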

The three tests had roughly equal operating characteristics. The only difference that stands out above simulation noise is that the Levene test had a larger Type II error than the other tests when calibrated to have the same Type I error.

To calibrate the operating characteristics, I used alpha levels 0.13, 0.04, and 0.06 respectively for the F, Bartlett, and Levene tests, as listed in the Alpha column.

Heavy-Tail Simulation Results

Next, I used the design parameters above, i.e. the alpha levels for each test, but drew data from distributions with a heavier tail. For the null scenario, both datasets were drawn from a Student t distribution with 4 degrees of freedom and scale 15. For the alternative scenario, the scale of one of the distributions was increased to 25. Here are the results, again based on 10,000 simulations.
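One way to draw such data with only the Python standard library is sketched below. It assumes that "scale 15" means a standard Student t variate multiplied by 15, and it builds the t variate as a standard normal divided by the square root of a chi-squared variate over its degrees of freedom. The names `draw_student_t` and `tail_fraction` are hypothetical helpers.

```python
import math
import random

def draw_student_t(n, df, scale, rng):
    """Draw n scaled Student t values using T = Z / sqrt(V / df),
    with Z standard normal and V chi-squared with df degrees of freedom."""
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # A chi-squared(df) variate is a gamma variate with shape df/2, scale 2.
        v = rng.gammavariate(df / 2.0, 2.0)
        out.append(scale * z / math.sqrt(v / df))
    return out

def tail_fraction(values, cutoff):
    """Fraction of values whose absolute value exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)

rng = random.Random(1)  # arbitrary seed
t_draws = draw_student_t(20000, 4, 15, rng)
normal_draws = [rng.gauss(0.0, 15) for _ in range(20000)]

# Compare how often each distribution lands more than 3 scale units from 0.
print("t(4), scale 15:", tail_fraction(t_draws, 45))
print("normal, sd 15: ", tail_fraction(normal_draws, 45))
```

The t distribution with 4 degrees of freedom puts roughly 4% of its mass beyond three scale units, compared with under 0.3% for the normal. Those occasional extreme values inflate sample variances, which is what the variance-based tests react to.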

|----------+-------+--------+---------|
| Test     | Alpha | Type I | Type II |
|----------+-------+--------+---------|
| F        |  0.13 | 0.2417 |  0.2852 |
| Bartlett |  0.04 | 0.2165 |  0.2859 |
| Levene   |  0.06 | 0.0448 |  0.4537 |
|----------+-------+--------+---------|

The operating characteristics degraded when drawing samples from a heavy-tailed distribution, t with 4 degrees of freedom, but they didn't degrade uniformly.

Compared to the F-test, the Bartlett test had a slightly lower Type I error (0.2165 vs. 0.2417) and essentially the same Type II error.

The Levene test had a much lower Type I error than the other tests (0.0448, hardly higher than the 0.0439 it had when drawing from a normal distribution), but had a higher Type II error (0.4537).
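A look at how a Levene-type statistic is built may help explain this behavior. Rather than comparing sample variances directly, it runs a one-way analysis of variance on the absolute deviations of each observation from its group's center, so a single extreme value counts linearly rather than quadratically. The sketch below centers on the median (the Brown-Forsythe variant); whether the simulations above centered on the mean or the median is not stated, so that choice is an assumption, as is the function name.

```python
import statistics

def levene_statistic(x, y):
    """Levene-type W statistic for two samples, centered on the median."""
    # Absolute deviations of each observation from its own group's median.
    z = []
    for group in (x, y):
        center = statistics.median(group)
        z.append([abs(v - center) for v in group])

    k = len(z)                          # number of groups
    n_total = sum(len(g) for g in z)    # total number of observations
    group_means = [sum(g) / len(g) for g in z]
    grand_mean = sum(sum(g) for g in z) / n_total

    # One-way ANOVA on the absolute deviations.
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(z, group_means))
    within = sum((v - m) ** 2
                 for g, m in zip(z, group_means) for v in g)
    return (n_total - k) / (k - 1) * between / within

w = levene_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"W = {w:.4f}")
```

W is referred to an F distribution with k - 1 and N - k degrees of freedom to get a p-value. In practice a library routine such as `scipy.stats.levene` would normally be used instead of hand-rolled code.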

Conclusion

The F-test is indeed sensitive to departures from the Gaussian assumption, but Bartlett's test doesn't seem much better in these particular scenarios. Levene's test, however, can perform better than the F-test, depending on the relative importance you place on Type I and Type II error: it held its Type I error near the calibrated level, at the cost of a higher Type II error.


Published at DZone with permission of John Cook, DZone MVB. See the original article here.
