How to Interpret the Number of Spring ApplicationContexts in Integration Tests

When optimizing Spring Boot integration tests, developers often focus on obvious metrics, but those metrics do not always explain why an integration test suite is slow.

By Constantin Kwiatkowski · Jun. 08, 26 · Analysis

When optimizing Spring Boot integration tests, developers often focus on obvious metrics: total build time, test execution time, CPU usage, memory consumption, or the number of failed tests. These metrics are useful, but they do not always explain why an integration test suite is slow. One of the most important hidden metrics in Spring Boot integration testing is the number of distinct ApplicationContext instances created during the test run, a topic I also cover in another article.

Spring’s TestContext framework can cache an ApplicationContext and reuse it across test classes, but only if the effective test configuration is the same. If the configuration differs, Spring has to create another context. In large enterprise applications, this can become expensive very quickly.

This raises several questions:

  • How can the number of contexts be interpreted correctly?
  • If a test suite creates two contexts, is that good?
  • If it creates six contexts, is that acceptable?
  • If it creates twenty contexts, is that already a design smell?
  • And most importantly: where should such a judgment come from?

Spring itself does not define a universal threshold for a “good” or “bad” number of cached ApplicationContext instances. However, the official documentation explicitly points out that a large number of loaded contexts can make a test suite unnecessarily slow.  This means the number of contexts is not just an implementation detail. It is a relevant diagnostic signal.

This article explains how I derived a practical interpretation table for a real-world Spring Boot integration test suite and why such a table should be understood as a case-study heuristic, not as a universal Spring Framework rule. 

Test Grouping Is a Valid Concept

General testing research supports the idea that tests can be grouped by similarity, cost, coverage, or runtime behavior. This is highly relevant for Spring Boot integration tests, where MergedContextConfiguration can be interpreted as one practical grouping dimension: tests with the same effective Spring configuration belong to the same context group.

In this case, similarity means shared Spring test configuration. That does not mean all tests should use the same context. It means that tests should not accidentally create different contexts when they are actually testing under the same architectural conditions.

Spring’s Context Cache as a Framework-Specific Grouping Mechanism 

Spring Boot integration tests are not plain unit tests. They often require infrastructure such as dependency injection, database configuration, security configuration, web layer configuration, mock infrastructure, external API clients, messaging components, or tenant-specific setup.

Spring’s TestContext framework handles this through the ApplicationContext.

The framework can reuse a context if the effective configuration is the same. The cache key is based on configuration parameters such as configuration classes, active profiles, property sources, context customizers, initializers, and other test context settings. Spring’s documentation describes this context caching mechanism and explains that contexts can be reused when the same unique context configuration is encountered again.
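The reuse rule can be sketched with a toy model in plain Java. This is not Spring's implementation: the ContextKey record, its three fields, and the ToyContextCache class are hypothetical simplifications, and the real cache key covers more settings than these. The sketch only illustrates that an equal key leads to a cache hit, while any difference leads to a new context.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a context cache; NOT Spring's implementation. */
public class ToyContextCache {

    /** Hypothetical, simplified stand-in for the effective test configuration. */
    record ContextKey(Set<String> configClasses,
                      List<String> activeProfiles,
                      Map<String, String> properties) {}

    private final Map<ContextKey, String> cache = new LinkedHashMap<>();
    private int loads = 0;

    /** Returns the cached context for the key, "loading" a new one on a miss. */
    String getContext(ContextKey key) {
        return cache.computeIfAbsent(key, k -> "context-" + (++loads));
    }

    /** Number of contexts that had to be created so far. */
    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        ToyContextCache cache = new ToyContextCache();

        ContextKey base = new ContextKey(Set.of("AppConfig"), List.of("test"), Map.of());
        ContextKey same = new ContextKey(Set.of("AppConfig"), List.of("test"), Map.of());
        ContextKey extraProperty = new ContextKey(
                Set.of("AppConfig"), List.of("test"), Map.of("feature.x", "true"));

        cache.getContext(base);          // miss: the first context is created
        cache.getContext(same);          // hit: equal configuration, context reused
        cache.getContext(extraProperty); // miss: one extra property, new context

        System.out.println("Contexts created: " + cache.loadCount()); // prints 2
    }
}
```

In this model, a single additional property is enough to turn a cache hit into a second context, which is the effect that matters when interpreting the context count.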

Two tests may look similar to a developer but still produce different contexts if they use different profiles, properties, mocks, or imported configuration classes. Conversely, tests with genuinely different infrastructure requirements should normally form separate context groups. For example, a database-focused test and a test involving an external OData destination may have different infrastructure requirements.

In that case, a separate context is not a problem. It reflects a real test configuration group. The problem arises when every test class introduces a slightly different property, mock, or configuration import without a strong technical reason. Then the number of contexts grows not because the architecture requires it, but because the test suite suffers from configuration drift.
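One way to make such drift visible is to compare the effective settings of test classes and flag contexts that are used by a single class and differ from another group by only one property. The following is a minimal sketch in plain Java under stated assumptions: the test class names, the property keys, and the DriftReport helper are all hypothetical, and a flat property map stands in for the full effective configuration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical helper that flags near-duplicate test configurations. */
public class DriftReport {

    /** Groups test class names by their effective properties (one group per context). */
    static Map<Map<String, String>, List<String>> groupByConfig(
            Map<String, Map<String, String>> testConfigs) {
        Map<Map<String, String>, List<String>> groups = new LinkedHashMap<>();
        testConfigs.forEach((testClass, props) ->
                groups.computeIfAbsent(props, k -> new ArrayList<>()).add(testClass));
        return groups;
    }

    /** Returns the property keys whose values differ between two configurations. */
    static Set<String> differingKeys(Map<String, String> a, Map<String, String> b) {
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        keys.removeIf(k -> Objects.equals(a.get(k), b.get(k)));
        return keys;
    }

    /** Lists test classes that own a context differing from another group by one key. */
    static List<String> driftCandidates(Map<String, Map<String, String>> testConfigs) {
        Map<Map<String, String>, List<String>> groups = groupByConfig(testConfigs);
        List<String> candidates = new ArrayList<>();
        for (var entry : groups.entrySet()) {
            if (entry.getValue().size() != 1) {
                continue; // contexts shared by several classes are not flagged
            }
            boolean nearDuplicate = groups.keySet().stream()
                    .filter(other -> !other.equals(entry.getKey()))
                    .anyMatch(other -> differingKeys(entry.getKey(), other).size() == 1);
            if (nearDuplicate) {
                candidates.add(entry.getValue().get(0));
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> configs = new LinkedHashMap<>();
        configs.put("OrderRepositoryIT", Map.of("profile", "db"));
        configs.put("CustomerRepositoryIT", Map.of("profile", "db"));
        configs.put("InvoiceRepositoryIT", Map.of("profile", "db", "cache.enabled", "false"));
        configs.put("ODataClientIT", Map.of("profile", "odata", "destination", "mock"));

        System.out.println("Contexts: " + groupByConfig(configs).size());
        System.out.println("Drift candidates: " + driftCandidates(configs));
    }
}
```

A flagged class is only a candidate for review. A single differing setting can still be a justified configuration group, so the decision remains a judgment about technical need.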

Why Multiple Contexts Can Be Legitimate in Enterprise Applications

Spring Boot itself supports different testing styles. The documentation describes @SpringBootTest for loading the application context through SpringApplication, and it also provides more focused test annotations for specific slices of an application. Spring Boot’s test slices include annotations such as @WebMvcTest, @DataJpaTest, @JsonTest, and others. These annotations intentionally load only selected parts of the application and import different auto-configurations depending on the target slice. 

Beyond the Spring documentation, many community blogs report that enterprise systems often have separate integration test groups, such as database-focused tests, web/controller tests, and security-related tests. The goal should therefore be to minimize unnecessary context fragmentation while preserving justified test configuration groups, rather than forcing the entire integration test suite into one ApplicationContext.

From Test Grouping to a Context-Count Heuristic 

Based on this reasoning, I used the following interpretation in a case study:

  • 1-3 application contexts indicate excellent context reuse,
  • 4-8 are acceptable if each one is justified,
  • 10+ should be investigated as a possible signal of fragmented test configuration.
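The bands above can be written down as a small helper, for example to print a hint at the end of a test run. This is a minimal sketch in plain Java: the ContextCountHeuristic class and its enum names are hypothetical, and the thresholds are the case-study values from this article, not a Spring rule. The list does not assign a band to nine contexts, so the sketch reports that value as unclassified rather than guessing.

```java
/** Hypothetical helper encoding the case-study bands; not a Spring API. */
public class ContextCountHeuristic {

    enum Band { EXCELLENT_REUSE, ACCEPTABLE_IF_JUSTIFIED, INVESTIGATE, UNCLASSIFIED }

    /** Maps a number of created contexts to the band from the case study. */
    static Band classify(int contextCount) {
        if (contextCount >= 1 && contextCount <= 3) {
            return Band.EXCELLENT_REUSE;
        }
        if (contextCount >= 4 && contextCount <= 8) {
            return Band.ACCEPTABLE_IF_JUSTIFIED;
        }
        if (contextCount >= 10) {
            return Band.INVESTIGATE;
        }
        return Band.UNCLASSIFIED; // 9 and values below 1 are not covered by the list
    }

    public static void main(String[] args) {
        for (int count : new int[] {2, 6, 9, 20}) {
            System.out.println(count + " contexts -> " + classify(count));
        }
    }
}
```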

Let's discuss the numbers. 

1-3: Most integration tests share the same effective configuration. For example:

Context 1: default integration test context
Context 2: database-specific context
Context 3: external-system-specific context


Such a structure is usually easy to understand. It suggests that the team has standardized its test profiles, properties, and infrastructure setup. 

4-8: Several contexts are acceptable when each one corresponds to a justified configuration group. This is consistent with broader software-testing research, where test suites are not treated as one homogeneous block. They are often optimized, selected, prioritized, or clustered according to meaningful technical criteria such as coverage, execution cost, change relevance, or runtime behavior. For example:

Context 1: default SpringBootTest context
Context 2: database-heavy context
Context 3: external API integration context
Context 4: security-specific context
Context 5: multi-tenant context
Context 6: messaging context
Context 7: no-external-destination context
Context 8: migration-specific context


10+: Once the number of contexts reaches double digits, investigation becomes worthwhile. This does not automatically mean the test suite is badly designed. Community articles on Spring test optimization show that a very large enterprise platform with many modules, tenant variants, data stores, messaging systems, and external integrations may legitimately require more contexts. So the 10+ threshold is not a firm limit; it marks the point where the risk of accidental fragmentation becomes higher.

Conclusion

Test grouping is a recognized concept in software-testing research. Large test suites are often optimized through minimization, selection, prioritization, and clustering. These techniques are based on the idea that tests have different costs, purposes, coverage, runtime behavior, and relevance. For Spring Boot integration tests, context reuse is a framework-specific grouping criterion. In other words, test grouping can guide how Spring application contexts are defined.

Tests with the same effective MergedContextConfiguration belong to the same context group and can share the same cached ApplicationContext. Tests with genuinely different infrastructure needs may require different contexts.

Therefore, the goal is not to reduce every enterprise test suite to a single context. The goal is to distinguish between justified test configuration groups and accidental configuration fragmentation.

The numbers shown are a practical case-study heuristic, not a universal rule.

But the underlying principle is robust: A small number of well-defined context groups is healthy, but a growing number of slightly different contexts is a performance smell.  That principle connects Spring’s TestContext cache mechanism with a broader idea from software-testing research: large test suites should be structured intentionally, not allowed to fragment accidentally.
