Building Realistic Test Data in Java: A Hands-On Guide for Developers

Learn how to build a simple API that delivers believable fake users, perfect for testing, demos, or UI prototyping. No more “John Doe” data: finally, mocks that feel real.

Wallace Espindola

Oct. 10, 25 · Analysis


There’s something that every backend or API developer faces sooner or later: the need for good fake data.

Whether you’re testing a new API, populating a database for demos, or simply trying to make your unit tests less “boring”, fake data is part of your daily routine. The problem? Most fake data feels… fake. You end up with “John Doe” and “123 Main Street” repeated over and over, which doesn’t look great when showing a prototype to your team or client.

So today, let’s fix that.

In this article, we’ll explore two powerful Java libraries that make generating fake yet realistic data a breeze: DataFaker and EasyRandom.

We’ll go beyond just generating names and emails — we’ll learn how to integrate both libraries inside a Spring Boot 3 project, how to combine their strengths, and how to make everything available through a REST API that returns test data.

This isn’t a theoretical overview. We’ll look at real code, and you’ll walk away knowing exactly how to reproduce it in your next project.

Why Bother Generating Fake Data?

Let’s face it: manually crafting test data is time-consuming and error-prone.

Imagine you’re developing a system for managing users. You need to test pagination, filtering, sorting, and edge cases (like missing emails or very long names). Instead of hand-writing 100 lines of sample JSON, wouldn’t it be nicer to generate it automatically and instantly?

Good fake data helps you:

Validate your logic in a more realistic scenario
Showcase prototypes with data that “looks real”
Stress test APIs or UI components with variable inputs
Automate unit tests without boilerplate “mock builders”

So instead of hardcoding “Alice” and “Bob,” we’ll let DataFaker and EasyRandom do the heavy lifting.

DataFaker: The Modern, Improved JavaFaker

If you’ve used JavaFaker in the past, DataFaker is its modern, actively maintained successor.

It’s built for recent Java versions (Java 17+), is fast, and offers hundreds of data categories — including names, addresses, finance, company information, internet data, crypto keys, and even Star Wars characters if you feel nostalgic.

Let’s see a quick example:

Java

import net.datafaker.Faker;

public class FakerDemo {
  public static void main(String[] args) {
    Faker faker = new Faker();
    System.out.println(faker.name().fullName());
    System.out.println(faker.internet().emailAddress());
    System.out.println(faker.address().fullAddress());
  }
}

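Run that, and you’ll get something like this (values differ on every run; this sample was produced with a Portuguese locale, which is set up next):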

Plain Text

Matilde Marques
matilde.marques@example.com
Rua do Carmo 45, 1200-093 Lisboa

Pretty cool, right? The output is also localized when you pass a locale to the constructor:

Java

import java.util.Locale;

Faker faker = new Faker(Locale.forLanguageTag("pt")); // new Locale("pt") is deprecated since Java 19

Now your data fits your language and region — an enjoyable touch for international testing.

EasyRandom: Because We Need More Than Fields

While DataFaker focuses on realistic field-level data, EasyRandom (formerly Random Beans) takes a different approach.

It’s great when you have complex Java objects — like entities or DTOs — and you want them automatically filled with random but valid values.

Think of EasyRandom as a smart “object generator” that knows how to populate your classes, including nested objects, lists, and maps.

Example:

Java

import org.jeasy.random.EasyRandom;

// Person stands for any POJO of your own; it is not defined in this article.
EasyRandom easyRandom = new EasyRandom();
Person randomPerson = easyRandom.nextObject(Person.class);

This will create a fully populated Person instance, with random strings, numbers, and even nested attributes.
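To make the idea of an “object generator” more concrete, here is a toy sketch of reflection-based population using only the JDK. It is not EasyRandom’s implementation; `ToyPopulator`, `Person`, and `Address` are hypothetical names invented for this illustration, and the sketch only handles `String`, `int`, and nested beans with a no-argument constructor.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Random;

// Hypothetical classes used only for this illustration.
class Address {
  String city;
  int number;
}

class Person {
  String name;
  int age;
  Address address; // nested object, populated recursively
}

class ToyPopulator {
  private final Random random;

  ToyPopulator(long seed) {
    this.random = new Random(seed);
  }

  // Creates an instance of the given class and fills every non-static field.
  <T> T nextObject(Class<T> type) {
    try {
      var constructor = type.getDeclaredConstructor();
      constructor.setAccessible(true);
      T instance = constructor.newInstance();
      for (Field field : type.getDeclaredFields()) {
        if (Modifier.isStatic(field.getModifiers())) {
          continue;
        }
        field.setAccessible(true);
        field.set(instance, randomValue(field.getType()));
      }
      return instance;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot populate " + type.getName(), e);
    }
  }

  private Object randomValue(Class<?> fieldType) {
    if (fieldType == String.class) {
      return randomString(5 + random.nextInt(16)); // 5 to 20 characters
    }
    if (fieldType == int.class || fieldType == Integer.class) {
      return random.nextInt(1000);
    }
    // Anything else is treated as a nested bean and populated recursively.
    return nextObject(fieldType);
  }

  private String randomString(int length) {
    StringBuilder sb = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
      sb.append((char) ('a' + random.nextInt(26)));
    }
    return sb.toString();
  }
}
```

Calling `new ToyPopulator(42L).nextObject(Person.class)` returns a `Person` whose `name`, `age`, and nested `address` are all filled in. A real library adds the hard parts on top of this: collections, maps, enums, dates, cyclic references, and configurable randomizers.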

So, where DataFaker gives realism (e.g., “John Smith” with a plausible email address), EasyRandom gives structure and automation (e.g., filling an entire POJO graph).

And the best part? You can combine both — letting EasyRandom create your object and then using DataFaker to polish specific fields with more believable data.

Combining DataFaker and EasyRandom: The Sweet Spot

Here’s where things get fun.

We’ll create a small Spring Boot REST API that exposes endpoints to generate fake users. Each user will have an id, fullName, email, phone, and address. We’ll use DataFaker for realism and EasyRandom for automation.

Our project structure looks like this:

Plain Text

src/main/
 ├─ java/com/example/fakedata/
 │   ├─ Application.java
 │   ├─ config/
 │   ├─ api/
 │   ├─ controller/
 │   ├─ domain/
 │   ├─ dto/
 │   ├─ service/
 │   └─ mapper/
 └─ resources/
     └─ static/index.html
  

The User Domain Class

We’ll keep it simple, using Lombok to avoid boilerplate:

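Java

import lombok.Builder;
import lombok.Data;

// @Data generates getters, setters, equals/hashCode, and toString; @Builder adds User.builder().
@Data
@Builder
public class User {
  private String id;
  private String fullName;
  private String email;
  private String phone;
  private String address;
}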
  

And for the API responses, we’ll use a Java record for immutability and readability:

Java

public record UserDto(String id, String fullName, String email, String phone, String address) { }

The Service: Combining Both Libraries

Here’s the core of our project:

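Java

import java.util.List;
import java.util.Locale;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import net.datafaker.Faker;
import org.jeasy.random.EasyRandom;
import org.jeasy.random.EasyRandomParameters;
import org.springframework.stereotype.Service;

@Service
public class DataGenService {

  private final Faker faker = new Faker(Locale.ENGLISH);
  private final EasyRandom easyRandom;

  public DataGenService() {
    // A time-based seed gives different data on every start; use a fixed seed for reproducible runs.
    EasyRandomParameters params = new EasyRandomParameters()
        .seed(System.currentTimeMillis())
        .stringLengthRange(5, 20);
    this.easyRandom = new EasyRandom(params);
  }

  // Every field comes from DataFaker, so the whole user looks realistic.
  public User randomUserViaDatafaker() {
    return User.builder()
        .id(UUID.randomUUID().toString())
        .fullName(faker.name().fullName())
        .email(faker.internet().emailAddress())
        .phone(faker.phoneNumber().cellPhone())
        .address(faker.address().fullAddress())
        .build();
  }

  // EasyRandom fills all fields with random strings; DataFaker then overwrites name and email.
  // Phone and address keep EasyRandom's random strings.
  public User randomUserViaEasyRandom() {
    User u = easyRandom.nextObject(User.class);
    // Safeguard only: EasyRandom normally populates id with a random string.
    if (u.getId() == null || u.getId().isBlank()) {
      u.setId(UUID.randomUUID().toString());
    }
    u.setFullName(faker.name().fullName());
    u.setEmail(faker.internet().emailAddress());
    return u;
  }

  // Generates count users using either strategy.
  public List<User> manyUsers(int count, boolean easyRandomMode) {
    return IntStream.range(0, count)
        .mapToObj(i -> easyRandomMode ? randomUserViaEasyRandom() : randomUserViaDatafaker())
        .collect(Collectors.toList());
  }
}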
  

You can see how we use DataFaker for realism and EasyRandom for structure — like a two-chef recipe: one creates the base, the other adds seasoning.

The REST Controller

Now, let’s make it accessible through a REST API.

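Java

import java.util.List;
import java.util.stream.Collectors;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/users")
public class UserController {

  private final DataGenService service;

  public UserController(DataGenService service) {
    this.service = service;
  }

  // GET /api/users/{count}?easy=true switches to the EasyRandom-based generator.
  @GetMapping("/{count}")
  public ApiResponse<List<UserDto>> generateUsers(@PathVariable int count,
                                                  @RequestParam(defaultValue = "false") boolean easy) {
    List<UserDto> users = service.manyUsers(count, easy)
                                 .stream().map(UserMapper::toDto)
                                 .collect(Collectors.toList());
    return ApiResponse.of(users);
  }
}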
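The controller relies on `UserMapper::toDto`, which is not shown in this article. A minimal sketch of what such a mapper might look like follows; it is an assumption, not the code from the repository. To keep the block compilable on its own, `User` is written here as a plain class standing in for the Lombok version, and `UserDto` is repeated.

```java
// Plain-Java stand-in for the Lombok-based User class, so the sketch compiles on its own.
class User {
  private final String id;
  private final String fullName;
  private final String email;
  private final String phone;
  private final String address;

  User(String id, String fullName, String email, String phone, String address) {
    this.id = id;
    this.fullName = fullName;
    this.email = email;
    this.phone = phone;
    this.address = address;
  }

  String getId() { return id; }
  String getFullName() { return fullName; }
  String getEmail() { return email; }
  String getPhone() { return phone; }
  String getAddress() { return address; }
}

record UserDto(String id, String fullName, String email, String phone, String address) { }

// Hypothetical mapper: a stateless utility with a single static method.
final class UserMapper {
  private UserMapper() { }

  // Copies each domain field into the immutable DTO, one to one.
  static UserDto toDto(User user) {
    return new UserDto(
        user.getId(),
        user.getFullName(),
        user.getEmail(),
        user.getPhone(),
        user.getAddress());
  }
}
```

Keeping the mapping in one place means the domain class can gain internal fields later without changing the JSON the API returns.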
  

And to make our API responses consistent, we wrap everything in an envelope with a timestamp:

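Java

import java.time.Instant;

// Generic envelope: the payload plus the server time at which the response was built.
public record ApiResponse<T>(T data, Instant timestamp) {
  public static <T> ApiResponse<T> of(T data) {
    return new ApiResponse<>(data, Instant.now());
  }
}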
  

That way, every API call returns data like this:

JSON

{
  "data": [
    {
      "id": "e7b1c37a-8b20-43c1-8ff3-b4aef8d89c3a",
      "fullName": "Lina Cordeiro",
      "email": "lina.cordeiro@example.com",
      "phone": "+351 912 345 678",
      "address": "Rua do Comércio 12, Porto"
    }
  ],
  "timestamp": "2025-10-06T13:02:45.321Z"
}

Much cleaner and easier to debug.

Why Timestamp in Responses?

Adding timestamps isn’t just for looks. It’s a simple, useful practice that improves observability.

When debugging requests in distributed systems or when clients log responses, having the server timestamp right in the payload helps you correlate events — it’s a micro detail with macro benefits.
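As one illustration of that correlation, a client could compare the server timestamp with the moment it received the response. The sketch below repeats the `ApiResponse` record so it compiles on its own; `ResponseLag` is a hypothetical helper, and the result is only meaningful when the client and server clocks are reasonably synchronized.

```java
import java.time.Duration;
import java.time.Instant;

record ApiResponse<T>(T data, Instant timestamp) {
  static <T> ApiResponse<T> of(T data) {
    return new ApiResponse<>(data, Instant.now());
  }
}

// Hypothetical client-side helper, not part of the article's project.
final class ResponseLag {
  private ResponseLag() { }

  // Time between the server stamping the response and the client receiving it.
  static Duration lag(ApiResponse<?> response, Instant receivedAt) {
    return Duration.between(response.timestamp(), receivedAt);
  }
}
```

Logging this duration next to a request id gives a rough view of where time is spent between server and client.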

Why Both Libraries Are Better Together

You might wonder: “Why not just use DataFaker alone?”

Good question.

DataFaker is unbeatable for producing realistic values, but it doesn’t automatically populate deep object structures.
EasyRandom, on the other hand, is great for object graphs, but its randomness feels too synthetic: left alone, it fills an email field with a random string of characters rather than anything that looks like an address.

Together, they give you:

Realism + Automation
Ease of integration with tests and APIs
Consistency through configuration and seeds

It’s a bit like combining a random word generator with a translator — one provides variation, the other makes sense of it.
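On the seeds point: the service shown earlier seeds EasyRandom with `System.currentTimeMillis()`, so every start produces different data. Passing a fixed value to `EasyRandomParameters.seed(long)` makes runs repeatable, and DataFaker can, as far as I know, be constructed with a `java.util.Random` for the same purpose (worth checking against the version you use). The sketch below shows the underlying principle with the JDK alone; `SeededNames` and its name lists are hypothetical.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SeededNames {
  // Hypothetical sample values; a real project would delegate to a library.
  private static final List<String> FIRST = List.of("Lina", "Matilde", "Rui", "Ana");
  private static final List<String> LAST = List.of("Cordeiro", "Marques", "Silva", "Costa");

  private SeededNames() { }

  // The same seed always yields the same list of names.
  static List<String> names(long seed, int count) {
    Random random = new Random(seed);
    return IntStream.range(0, count)
        .mapToObj(i -> FIRST.get(random.nextInt(FIRST.size()))
            + " " + LAST.get(random.nextInt(LAST.size())))
        .collect(Collectors.toList());
  }
}
```

Two calls to `SeededNames.names(42L, 5)` return equal lists, which is what makes a failing test reproducible: log the seed, rerun with it, and you get the same data back.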

Going Further: Postman, Docker, and CI/CD

Our complete project also includes:

A Postman collection for quick testing
A Dockerfile and docker-compose.yml for containerization
GitHub Actions CI and Dependabot setup for automated builds and dependency updates

That makes this small demo a fuller reference project for testing and learning.

If you’re mentoring junior developers or building internal utilities, this is a great example to show clean architecture and reproducible data generation.

Repo: github.com/wallaceespindola/fake-data-springboot

Practical Ideas for Using This Setup

Load testing: Generate thousands of fake users to populate a database.
UI prototyping: Feed your frontend with realistic API data.
Demo environments: Seed a sandbox with dynamic sample users.
Unit tests: Replace hand-built objects such as new User("a","b") with a call to randomUserViaDatafaker() on a DataGenService instance.
Data anonymization: Quickly replace sensitive production data with fake equivalents.

Each of these is a real-world scenario where this combination shines.
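The anonymization idea from the list can be sketched in a few lines. This is a simplified illustration, not a complete anonymization strategy: it keeps each record’s id so relations stay intact and swaps every personal field for a generated one. `Anonymizer` is a hypothetical name, `UserDto` is repeated so the block compiles on its own, and the fake-data source is passed in as a `Supplier`, which in this project could be something like `() -> UserMapper.toDto(service.randomUserViaDatafaker())`.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

record UserDto(String id, String fullName, String email, String phone, String address) { }

// Hypothetical helper, not part of the article's project.
final class Anonymizer {
  private Anonymizer() { }

  // Keeps the original id; takes all other fields from a freshly generated fake user.
  static List<UserDto> anonymize(List<UserDto> realUsers, Supplier<UserDto> fakeSource) {
    return realUsers.stream()
        .map(real -> {
          UserDto fake = fakeSource.get();
          return new UserDto(real.id(), fake.fullName(), fake.email(), fake.phone(), fake.address());
        })
        .collect(Collectors.toList());
  }
}
```

Taking the generator as a parameter keeps the helper independent of any particular library and makes it easy to test with a fixed fake value.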

Closing Thoughts

The difference between a “meh” test dataset and a “wow, this looks real!” demo often comes down to how you generate data.

With DataFaker and EasyRandom, you can automate that process elegantly — using modern Java, minimal boilerplate, and libraries that just make sense together. You’ll not only save hours when building tests or mock APIs but also deliver demos that feel alive, diverse, and realistic.

The best part? It’s all open-source, lightweight, and easy to integrate with Spring Boot, Quarkus, Micronaut, or even a plain Java console app.

So next time you need to populate an API or test your system’s resilience, don’t settle for "John Doe" anymore. Give your fake data some personality — and let Java do the heavy lifting.

Need more tech insights?

Check out my GitHub repo and LinkedIn page.

Happy coding!


Opinions expressed by DZone contributors are their own.
