Building Realistic Test Data in Java: A Hands-On Guide for Developers
Learn how to build a simple API that delivers believable fake users, perfect for testing, demos, or UI prototyping. No more “John Doe” data: finally, mocks that feel real.
There’s something that every backend or API developer faces sooner or later: the need for good fake data.
Whether you’re testing a new API, populating a database for demos, or simply trying to make your unit tests less “boring”, fake data is part of your daily routine. The problem? Most fake data feels… fake. You end up with “John Doe” and “123 Main Street” repeated over and over, which doesn’t look great when showing a prototype to your team or client.
So today, let’s fix that.
In this article, we’ll explore two powerful Java libraries that make generating fake yet realistic data a breeze: DataFaker and EasyRandom.
We’ll go beyond just generating names and emails — we’ll learn how to integrate both libraries inside a Spring Boot 3 project, how to combine their strengths, and how to make everything available through a REST API that returns test data.
This isn’t a theoretical overview. We’ll look at real code, and you’ll walk away knowing exactly how to reproduce it in your next project.
Why Bother Generating Fake Data?
Let’s face it: manually crafting test data is time-consuming and error-prone.
Imagine you’re developing a system for managing users. You need to test pagination, filtering, sorting, and edge cases (like missing emails or very long names). Instead of hand-writing 100 lines of sample JSON, wouldn’t it be nicer to generate it automatically and instantly?
Good fake data helps you:
- Validate your logic in a more realistic scenario
- Showcase prototypes with data that “looks real”
- Stress test APIs or UI components with variable inputs
- Automate unit tests without boilerplate “mock builders”
So instead of hardcoding “Alice” and “Bob,” we’ll let DataFaker and EasyRandom do the heavy lifting.
DataFaker: The Modern, Improved JavaFaker
If you’ve used JavaFaker in the past, DataFaker is its modern, actively maintained successor.
Its 2.x releases target recent Java versions (Java 17+), while the older 1.x line still runs on Java 8. It is fast and offers hundreds of data categories, including names, addresses, finance, company information, internet data, crypto keys, and even Star Wars characters if you feel nostalgic.
Let’s see a quick example:
import net.datafaker.Faker;
Faker faker = new Faker();
System.out.println(faker.name().fullName());
System.out.println(faker.internet().emailAddress());
System.out.println(faker.address().fullAddress());
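Run that, and you’ll get something like the lines below. This sample shows Portuguese-style output, as a Portuguese locale produces; an English locale yields English-style names and addresses.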
Matilde Marques
matilde.marques@example.com
Rua do Carmo 45, 1200-093 Lisboa
Pretty cool, right? And it even looks localized if you change the locale.
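// Locale.forLanguageTag avoids the Locale constructor, which is deprecated since Java 19
Faker faker = new Faker(Locale.forLanguageTag("pt"));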
Now your data fits your language and region — an enjoyable touch for international testing.
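To see the locale switch in action, here is a minimal sketch that prints one name per language tag. The class name LocaleSketch and the helper namesFor are made up for this illustration, and the tags "en", "pt", and "de" are just examples; the output changes on every run.

```java
import java.util.List;
import java.util.Locale;
import net.datafaker.Faker;

// Hypothetical helper class, not part of the article's project
class LocaleSketch {

    // Returns one generated full name per requested language tag
    static List<String> namesFor(List<String> languageTags) {
        return languageTags.stream()
                .map(tag -> new Faker(Locale.forLanguageTag(tag)).name().fullName())
                .toList();
    }

    public static void main(String[] args) {
        // Example tags only; pick whichever locales your tests need
        namesFor(List.of("en", "pt", "de")).forEach(System.out::println);
    }
}
```

Creating a Faker per call keeps the sketch short. In a real service you would probably keep one Faker per locale and reuse it.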
EasyRandom: Because We Need More Than Fields
While DataFaker focuses on realistic field-level data, EasyRandom (formerly Random Beans) takes a different approach.
It’s great when you have complex Java objects — like entities or DTOs — and you want them automatically filled with random but valid values.
Think of EasyRandom as a smart “object generator” that knows how to populate your classes, including nested objects, lists, and maps.
Example:
import org.jeasy.random.EasyRandom;
EasyRandom easyRandom = new EasyRandom();
Person randomPerson = easyRandom.nextObject(Person.class);
This will create a fully populated Person instance, with random strings, numbers, and even nested attributes.
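The article doesn’t show the Person class, so here is a hedged sketch of what it could look like. Person, Address, and their fields are assumptions made up for this example; the point is only that EasyRandom also fills the nested object and the list.

```java
import java.util.List;
import org.jeasy.random.EasyRandom;

// Hypothetical classes for illustration; the article does not define Person
class PersonSketch {

    static class Address {
        String street;
        String city;
    }

    static class Person {
        String name;
        int age;
        Address address;        // nested object, populated recursively
        List<String> nicknames; // collection, filled with random strings
    }

    static Person randomPerson() {
        // Default parameters: random seed, random string and collection sizes
        return new EasyRandom().nextObject(Person.class);
    }

    public static void main(String[] args) {
        Person p = randomPerson();
        System.out.println(p.name + " lives in " + p.address.city
                + " and has " + p.nicknames.size() + " nicknames");
    }
}
```

The printed values are random letter sequences, which is exactly the gap DataFaker fills later on.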
So, where DataFaker gives realism (e.g., a plausible name like “John Smith” and an email address to go with it), EasyRandom gives structure and automation (e.g., filling an entire POJO graph).
And the best part? You can combine both — letting EasyRandom create your object and then using DataFaker to polish specific fields with more believable data.
Combining DataFaker and EasyRandom: The Sweet Spot
Here’s where things get fun.
We’ll create a small Spring Boot REST API that exposes endpoints to generate fake users. Each user will have an id, fullName, email, phone, and address. We’ll use DataFaker for realism and EasyRandom for automation.
Our project structure looks like this:
src/
├─ main/java/com/example/fakedata/
│  ├─ Application.java
│  ├─ config/
│  ├─ api/
│  ├─ controller/
│  ├─ domain/
│  ├─ dto/
│  ├─ service/
│  └─ mapper/
└─ resources/
   └─ static/index.html
The User Domain Class
We’ll keep it simple, using Lombok to avoid boilerplate:
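import lombok.Builder;
import lombok.Data;

// Lombok generates getters, setters, equals/hashCode, toString, and a builder
@Data
@Builder
public class User {
    private String id;
    private String fullName;
    private String email;
    private String phone;
    private String address;
}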
And for the API responses, we’ll use a Java record for immutability and readability:
public record UserDto(String id, String fullName, String email, String phone, String address) { }
The Service: Combining Both Libraries
Here’s the core of our project:
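import java.util.List;
import java.util.Locale;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import net.datafaker.Faker;
import org.jeasy.random.EasyRandom;
import org.jeasy.random.EasyRandomParameters;
import org.springframework.stereotype.Service;

@Service
public class DataGenService {
    private final Faker faker = new Faker(Locale.ENGLISH);
    private final EasyRandom easyRandom;

    public DataGenService() {
        // A time-based seed gives different data on every start;
        // use a fixed seed instead when you need repeatable output
        EasyRandomParameters params = new EasyRandomParameters()
                .seed(System.currentTimeMillis())
                .stringLengthRange(5, 20);
        this.easyRandom = new EasyRandom(params);
    }

    // Every field comes from DataFaker, so all values look realistic
    public User randomUserViaDatafaker() {
        return User.builder()
                .id(UUID.randomUUID().toString())
                .fullName(faker.name().fullName())
                .email(faker.internet().emailAddress())
                .phone(faker.phoneNumber().cellPhone())
                .address(faker.address().fullAddress())
                .build();
    }

    // EasyRandom fills the whole object, then DataFaker overrides name and email.
    // Note that phone and address keep EasyRandom's random strings.
    public User randomUserViaEasyRandom() {
        User u = easyRandom.nextObject(User.class);
        if (u.getId() == null || u.getId().isBlank()) {
            u.setId(UUID.randomUUID().toString());
        }
        u.setFullName(faker.name().fullName());
        u.setEmail(faker.internet().emailAddress());
        return u;
    }

    public List<User> manyUsers(int count, boolean easyRandomMode) {
        return IntStream.range(0, count)
                .mapToObj(i -> easyRandomMode ? randomUserViaEasyRandom() : randomUserViaDatafaker())
                .collect(Collectors.toList());
    }
}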
You can see how we use DataFaker for realism and EasyRandom for structure — like a two-chef recipe: one creates the base, the other adds seasoning.
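An alternative to patching fields after the fact is to register DataFaker calls as field randomizers, so EasyRandom produces realistic values in one pass. The sketch below assumes EasyRandom's randomize(Predicate<Field>, Randomizer) overload together with FieldPredicates.named; the class FieldRandomizerSketch and its plain User stand-in are made up for this example and are not taken from the article's repository.

```java
import java.util.Locale;
import java.util.UUID;
import net.datafaker.Faker;
import org.jeasy.random.EasyRandom;
import org.jeasy.random.EasyRandomParameters;
import org.jeasy.random.FieldPredicates;

// Hypothetical sketch; names here are illustrative
class FieldRandomizerSketch {

    // Plain stand-in for the article's Lombok-based User
    static class User {
        String id;
        String fullName;
        String email;
        String phone;
        String address;
    }

    // Builds an EasyRandom whose User-like fields are fed by DataFaker
    static EasyRandom realisticEasyRandom(Faker faker) {
        EasyRandomParameters params = new EasyRandomParameters()
                .randomize(FieldPredicates.named("id"), () -> UUID.randomUUID().toString())
                .randomize(FieldPredicates.named("fullName"), () -> faker.name().fullName())
                .randomize(FieldPredicates.named("email"), () -> faker.internet().emailAddress())
                .randomize(FieldPredicates.named("phone"), () -> faker.phoneNumber().cellPhone())
                .randomize(FieldPredicates.named("address"), () -> faker.address().fullAddress());
        return new EasyRandom(params);
    }

    public static void main(String[] args) {
        EasyRandom easyRandom = realisticEasyRandom(new Faker(Locale.ENGLISH));
        User u = easyRandom.nextObject(User.class);
        System.out.println(u.fullName + " <" + u.email + ">");
    }
}
```

The predicates match by field name only, so any class with a field called email would pick up the same randomizer. Combine them with FieldPredicates.inClass(...) if that is too broad for your model.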
The REST Controller
Now, let’s make it accessible through a REST API.
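import java.util.List;
import java.util.stream.Collectors;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/users")
public class UserController {
    private final DataGenService service;

    public UserController(DataGenService service) {
        this.service = service;
    }

    // GET /api/users/{count}?easy=true switches to the EasyRandom-based generator
    @GetMapping("/{count}")
    public ApiResponse<List<UserDto>> generateUsers(@PathVariable int count,
            @RequestParam(defaultValue = "false") boolean easy) {
        List<UserDto> users = service.manyUsers(count, easy)
                .stream().map(UserMapper::toDto)
                .collect(Collectors.toList());
        return ApiResponse.of(users);
    }
}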
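The controller calls UserMapper::toDto, which isn’t shown here. A minimal sketch of what such a mapper might look like follows; the hand-written User below is only a stand-in so the example compiles without Lombok, and the real mapper in the repository may differ.

```java
// Stand-in for the Lombok-based User, with the getters @Data would generate
class User {
    private final String id;
    private final String fullName;
    private final String email;
    private final String phone;
    private final String address;

    User(String id, String fullName, String email, String phone, String address) {
        this.id = id;
        this.fullName = fullName;
        this.email = email;
        this.phone = phone;
        this.address = address;
    }

    String getId() { return id; }
    String getFullName() { return fullName; }
    String getEmail() { return email; }
    String getPhone() { return phone; }
    String getAddress() { return address; }
}

// Same shape as the article's DTO record
record UserDto(String id, String fullName, String email, String phone, String address) { }

// Assumed implementation: a stateless utility with one static method
final class UserMapper {
    private UserMapper() { }

    // Copies each domain field into the immutable DTO
    static UserDto toDto(User user) {
        return new UserDto(user.getId(), user.getFullName(), user.getEmail(),
                user.getPhone(), user.getAddress());
    }
}
```

A static method keeps the method reference in the controller short. A mapping library would work just as well once the model grows.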
And to make our API responses consistent, we wrap everything in an envelope with a timestamp:
import java.time.Instant;

public record ApiResponse<T>(T data, Instant timestamp) {
    // Wraps any payload and stamps it with the server time
    public static <T> ApiResponse<T> of(T data) {
        return new ApiResponse<>(data, Instant.now());
    }
}
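That way, every API call returns data like this (the values are illustrative; with Locale.ENGLISH, as configured in the service, names and addresses come out English-style):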
{
  "data": [
    {
      "id": "e7b1c37a-8b20-43c1-8ff3-b4aef8d89c3a",
      "fullName": "Lina Cordeiro",
      "email": "lina.cordeiro@example.com",
      "phone": "+351 912 345 678",
      "address": "Rua do Comércio 12, Porto"
    }
  ],
  "timestamp": "2025-10-06T13:02:45.321Z"
}
Much cleaner and easier to debug.
Why Timestamp in Responses?
Adding timestamps isn’t just for looks. It’s a simple, useful practice that improves observability.
When debugging requests in distributed systems or when clients log responses, having the server timestamp right in the payload helps you correlate events — it’s a micro detail with macro benefits.
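As one small illustration of that correlation, a client could compute how old a response is when it logs it. This is only a sketch: TimestampSketch and ageOf are made-up names, the record is repeated so the example compiles on its own, and the result is only as accurate as the clock sync between client and server.

```java
import java.time.Duration;
import java.time.Instant;

// Same envelope as in the article, repeated for a self-contained example
record ApiResponse<T>(T data, Instant timestamp) {
    static <T> ApiResponse<T> of(T data) {
        return new ApiResponse<>(data, Instant.now());
    }
}

// Hypothetical client-side helper
class TimestampSketch {

    // Time elapsed between the server stamping the response and "now"
    static Duration ageOf(ApiResponse<?> response, Instant now) {
        return Duration.between(response.timestamp(), now);
    }

    public static void main(String[] args) {
        Instant sent = Instant.parse("2025-10-06T13:02:45Z");
        ApiResponse<String> response = new ApiResponse<>("payload", sent);
        // Pretend the client logs the response five seconds later
        System.out.println(ageOf(response, sent.plusSeconds(5))); // prints PT5S
    }
}
```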
Why Both Libraries Are Better Together
You might wonder: “Why not just use DataFaker alone?”
Good question.
- DataFaker is unbeatable for producing realistic values, but it doesn’t automatically populate deep object structures.
- EasyRandom, on the other hand, is great for object graphs, but its randomness feels too synthetic: string fields such as emails come out as random character sequences.
Together, they give you:
- Realism + Automation
- Ease of integration with tests and APIs
- Consistency through configuration and seeds (use a fixed seed rather than System.currentTimeMillis() when you need repeatable output)
It’s a bit like combining a random word generator with a translator — one provides variation, the other makes sense of it.
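To make the “seeds” point concrete, here is a hedged sketch showing that both libraries can be driven by a fixed seed. It assumes DataFaker's Faker(Locale, Random) constructor and EasyRandom's seed(long) parameter; SeedSketch and the seed value 42 are arbitrary choices for this example.

```java
import java.util.Locale;
import java.util.Random;
import net.datafaker.Faker;
import org.jeasy.random.EasyRandom;
import org.jeasy.random.EasyRandomParameters;

// Hypothetical sketch; not part of the article's project
class SeedSketch {

    // First full name produced by a Faker driven by the given seed
    static String fakerName(long seed) {
        return new Faker(Locale.ENGLISH, new Random(seed)).name().fullName();
    }

    // First random String produced by an EasyRandom with the given seed
    static String easyRandomString(long seed) {
        EasyRandomParameters params = new EasyRandomParameters().seed(seed);
        return new EasyRandom(params).nextObject(String.class);
    }

    public static void main(String[] args) {
        // Two generators built from the same seed should agree value for value
        System.out.println(fakerName(42L).equals(fakerName(42L)));
        System.out.println(easyRandomString(42L).equals(easyRandomString(42L)));
    }
}
```

Repeatability holds only while the library version and the order of calls stay the same, so treat seeded data as stable within a build rather than forever.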
Going Further: Postman, Docker, and CI/CD
Our complete project also includes:
- A Postman collection for quick testing
- A Dockerfile and docker-compose.yml for containerization
- GitHub Actions CI and Dependabot setup for automated builds and dependency updates
That makes this small demo a more complete reference project for testing and learning.
If you’re mentoring junior developers or building internal utilities, this is a great example to show clean architecture and reproducible data generation.
Repo: github.com/wallaceespindola/fake-data-springboot
Practical Ideas for Using This Setup
- Load testing: Generate thousands of fake users to populate a database.
- UI prototyping: Feed your frontend with realistic API data.
- Demo environments: Seed a sandbox with dynamic sample users.
- Unit tests: Replace new User("a","b") with a call to DataGenService.randomUserViaDatafaker().
- Data anonymization: Quickly replace sensitive production data with fake equivalents.
Each of these is a real-world scenario where this combination shines.
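Taking the data anonymization idea as an example, a minimal sketch could look like this. The Customer record and the anonymize helper are hypothetical; the idea is to keep the technical key and swap the personal fields for DataFaker values.

```java
import java.util.Locale;
import net.datafaker.Faker;

// Hypothetical sketch; Customer is not part of the article's project
class AnonymizeSketch {

    record Customer(String id, String fullName, String email) { }

    // Keeps the id so references stay intact, replaces personal data
    static Customer anonymize(Customer real, Faker faker) {
        return new Customer(real.id(),
                faker.name().fullName(),
                faker.internet().emailAddress());
    }

    public static void main(String[] args) {
        Customer real = new Customer("c-1001", "Real Person", "real.person@example.com");
        System.out.println(anonymize(real, new Faker(Locale.ENGLISH)));
    }
}
```

This only shows the field swap. Whether the result counts as anonymized depends on everything else stored with the record, so check that against your own privacy requirements.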
Closing Thoughts
The difference between a “meh” test dataset and a “wow, this looks real!” demo often comes down to how you generate data.
With DataFaker and EasyRandom, you can automate that process elegantly — using modern Java, minimal boilerplate, and libraries that just make sense together. You’ll not only save hours when building tests or mock APIs but also deliver demos that feel alive, diverse, and realistic.
The best part? It’s all open-source, lightweight, and easy to integrate with Spring Boot, Quarkus, Micronaut, or even a plain Java console app.
So next time you need to populate an API or test your system’s resilience, don’t settle for "John Doe" anymore. Give your fake data some personality — and let Java do the heavy lifting.
Need more tech insights?
Check out my GitHub repo and LinkedIn page.
Happy coding!
Opinions expressed by DZone contributors are their own.