Event Sourcing Unpacked: The What, Why, and How

Take a deep dive into event sourcing as we explore how it works, why it's important, and the major benefits and challenges it brings to modern systems.

Ammar Husain

Jun. 10, 25 · Analysis

A traditional system keeps its state consistent with the relevant business rules. When queried, it reports only its current state, i.e., where the system is, with no information about how it got there.

A simple way to track how the system's state evolved (how it got there) is to maintain a history. However, this approach only provides information about state changes. Moreover, the record-keeping becomes burdened with process details and has to evolve with every process that affects the state.

As an example, in a typical order management system, the order state could change from Pending to Confirmed. Unless the system captures the process details, such as payment receipt, inventory acknowledgment, etc., the state change information is merely indicative and lacks intent. Additionally, for any new processes or details, the record-keeping must be updated accordingly.

If a system captures every state change as an event object and stores them sequentially as they occur, not only is the intent clear, but the system state can also be reliably (re)constructed any number of times. The state reconstruction can even accommodate retroactive changes, such as fixing faults in processing. Fundamentally, this is how an Event Sourced system works.
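As a minimal sketch of that idea, the snippet below rebuilds an order's status by replaying a sequential event log. The Event fields, the STATUS_AFTER mapping, and the rebuild_status function are hypothetical names chosen for illustration, not part of any particular framework.

```python
from dataclasses import dataclass


# Hypothetical event shape; the field names are assumptions for illustration.
@dataclass(frozen=True)
class Event:
    sequence: int   # position in the event log
    order_id: str
    kind: str       # e.g. "OrderPlaced", "OrderConfirmed"


# Hypothetical mapping from event type to the order status it produces.
STATUS_AFTER = {
    "OrderPlaced": "Pending",
    "OrderConfirmed": "Confirmed",
    "OrderCancelled": "Cancelled",
}


def rebuild_status(events, order_id, up_to=None):
    """Replay events in sequence order and return the order's status.

    If up_to is given, only events with sequence <= up_to are applied,
    which yields the state as of that point in the log.
    """
    status = None
    for event in sorted(events, key=lambda e: e.sequence):
        if up_to is not None and event.sequence > up_to:
            break
        if event.order_id == order_id:
            status = STATUS_AFTER[event.kind]
    return status


log = [
    Event(1, "order-1", "OrderPlaced"),
    Event(2, "order-2", "OrderPlaced"),
    Event(3, "order-1", "OrderConfirmed"),
]

latest = rebuild_status(log, "order-1")             # state after the full log
earlier = rebuild_status(log, "order-1", up_to=2)   # state as of sequence 2
```

Because the log is never modified, the same function can be run any number of times, and fixing a fault in STATUS_AFTER followed by a replay would yield the corrected state.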

Considering the earlier example of an order management system reimagined as event-sourced, the state change of an order can be captured as a ConfirmOrder event. This event, specifically a command, is triggered by the relevant process(es) and carries all the necessary details, eliminating the need to capture these details separately. Moreover, whenever any process commands the system via ConfirmOrder to indicate a state change, it is properly captured without requiring system adaptations.

Event Sourcing

An Event Sourced system records every state change to a domain object as an event. Once the state change occurs, it is broadcast through another event; for example, an OrderConfirmed event is published in response to a ConfirmOrder command when the Order domain object's state changes to Confirmed.

These events exhibit the following characteristics, either individually or at the system level:
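The command-to-event flow can be sketched roughly as follows. Only the names ConfirmOrder, OrderConfirmed, and Order come from the text; the payment_ref field and the handle/apply methods are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfirmOrder:        # command: a request to change state
    order_id: str
    payment_ref: str       # hypothetical detail carried by the command


@dataclass(frozen=True)
class OrderConfirmed:      # event: a fact, named in the past tense
    order_id: str
    payment_ref: str


@dataclass
class Order:
    order_id: str
    status: str = "Pending"

    def handle(self, command: ConfirmOrder) -> OrderConfirmed:
        """Validate the command, change state, and return the resulting event."""
        if self.status != "Pending":
            raise ValueError(f"cannot confirm an order in state {self.status}")
        event = OrderConfirmed(command.order_id, command.payment_ref)
        self.apply(event)
        return event

    def apply(self, event: OrderConfirmed) -> None:
        """Apply an event to the domain object's state (also usable on replay)."""
        self.status = "Confirmed"


event_log = []
order = Order("order-1")
event_log.append(order.handle(ConfirmOrder("order-1", "pay-42")))
```

Keeping handle (decide) separate from apply (mutate) is one common design choice: replay only needs apply, so stored events can be re-applied without re-running validation.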

Events are immutable and represent a business fact of the domain, e.g., OrderPlaced, PaymentReceived.
Events are named in the past tense because they represent a domain object state change that has already occurred, e.g., OrderConfirmed, OrderCancelled.
Events carry the details and metadata the handler needs to process the state change in the relevant domain objects. E.g., on an OrderConfirmed event, the handler could trigger the dispatch of items and broadcast an OrderDispatched event to notify the system of the next steps.
Events are captured sequentially in a separate data store, which acts as the single source of truth. These events are then used to derive the system state, whether the latest or a historic one.
Events can be queried or replayed to derive the system state at any point in history.
Events are reversible. While events are meant to be (re)played forward, it is still useful for them to be able to reverse themselves. This is straightforward if events are modeled as deltas. E.g., an event modeled as "apply discount of 10%" is reversible, in contrast to an event modeled as "set order price at ₹550". This characteristic, however, should be evaluated and incorporated only after careful consideration.
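To illustrate the immutability and reversibility points above, here is a small sketch contrasting a delta event with an absolute one. The class names DiscountApplied and PriceSet and their apply/reverse methods are hypothetical; Fraction is used only to keep the arithmetic exact.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)   # frozen: fields cannot be reassigned after creation
class DiscountApplied:
    """Delta event: records the change itself, so it can be undone."""
    percent: int

    def apply(self, price: Fraction) -> Fraction:
        return price * Fraction(100 - self.percent, 100)

    def reverse(self, price: Fraction) -> Fraction:
        return price / Fraction(100 - self.percent, 100)


@dataclass(frozen=True)
class PriceSet:
    """Absolute event: the previous price is lost unless it is stored elsewhere."""
    new_price: Fraction

    def apply(self, price: Fraction) -> Fraction:
        return self.new_price


price = Fraction(600)
discount = DiscountApplied(percent=10)
discounted = discount.apply(price)        # 600 * 9/10
restored = discount.reverse(discounted)   # back to the original price
```

Note that even a delta is not always reversible: a 100% discount in this sketch cannot be undone, which is one reason the characteristic deserves careful evaluation.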

Event Processing

While modeling events with the discussed characteristics is crucial, it’s equally important to understand the nuances of event processing.

Below are a few considerations to keep in mind to get the most out of an Event Sourced system:

Processor selection can be handled in two ways. In the first, the event itself carries details of its processor, i.e., the processor is predefined. Upon receiving an event, the processor performs the required changes in the domain state. The downside of this approach is that the event and its processor are coupled. Alternatively, a central event processor can select the appropriate handlers and delegate to domain models for the necessary domain state changes. This approach decouples processor selection from domain state changes, providing greater flexibility.
The event processor should be idempotent and able to distinguish between real and replayed events. Otherwise, the system cannot process replayed events correctly, which can result in undesired states during replay.
If the system involves any external updates, such as email/text notifications or payments, it becomes tricky to handle them during event replay. One option is to disable these communications altogether during replay. However, this requires domain processing to be aware of whether it is in replay mode or not. Another approach is to create a gateway with external system(s) that has built-in intelligence to differentiate between real and replayed events. This way, domain processing remains largely decoupled. If external updates are batched, the only safe window for multiple replays is the duration between two consecutive batch triggers.
If the system relies on data fetched from external queries to build its state, it must be able to return historical data from these queries. For example, if the system queries inventory details from an external system, the data should be available for a specified date. Non-availability could lead to incorrect system state reconstruction. Therefore, to ensure reliable state reconstruction, either the external system must respond with data for a specified date, or the system itself must remember the query results.
Like any event processing system, handling out-of-order events is crucial, especially when in replay mode. During real-time processing, it is impossible to predict the sequence in which events arrive, so events are processed as they come, according to domain requirements. However, during replay, since all events are available, special mechanisms can be introduced to reorder events into the expected sequence before applying them. Replay thus provides an excellent opportunity to fix anomalies caused by out-of-order events during live processing.
As the system evolves, event handler logic, essentially its code, changes. These changes may introduce new features, bug fixes, or temporal logic adjustments. New features can be added as long as they do not invalidate existing logic. Bug fixes correct logic that led to incorrect states. Simple replay reconstructs the new state with these updates, accompanied by relevant state changes. However, external systems may add complexity and should be carefully evaluated and handled. Temporal logic changes can complicate processing quickly, as business rules may no longer apply retrospectively. Conditional logic might be required to handle these changes correctly, so temporal changes must be introduced cautiously and only when necessary.
As the system grows, replaying all events from the beginning may become infeasible. Therefore, a checkpointing mechanism can be implemented. Checkpoints can be logical and achieved by reversing events up to a certain state. This means the processor must have the ability to reverse and reprocess events. This capability makes Event Sourcing very powerful, allowing retrospective changes without disrupting other system aspects.
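Several of the considerations above (idempotent processing, suppressing external updates during replay, and reordering events on replay) can be sketched together. This is a simplified illustration under stated assumptions: the NotificationGateway and OrderProcessor classes are hypothetical, and the gateway is told explicitly whether an event is replayed rather than detecting it on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    sequence: int
    kind: str
    order_id: str


class NotificationGateway:
    """Hypothetical gateway in front of an external notification system."""

    def __init__(self):
        self.sent = []

    def notify(self, event, replaying):
        if replaying:
            return   # suppress the external call for replayed events
        self.sent.append(f"{event.kind}:{event.order_id}")


class OrderProcessor:
    def __init__(self, gateway):
        self.gateway = gateway
        self.status = {}       # order_id -> current status
        self.applied = set()   # sequence numbers already applied

    def process(self, event, replaying=False):
        if event.sequence in self.applied:
            return             # idempotency: ignore duplicate deliveries
        if event.kind == "OrderConfirmed":
            self.status[event.order_id] = "Confirmed"
        elif event.kind == "OrderDispatched":
            self.status[event.order_id] = "Dispatched"
        self.applied.add(event.sequence)
        self.gateway.notify(event, replaying)

    def replay(self, events):
        # On replay the full log is available, so it can be put back in order.
        self.status.clear()
        self.applied.clear()
        for event in sorted(events, key=lambda e: e.sequence):
            self.process(event, replaying=True)


gateway = NotificationGateway()
processor = OrderProcessor(gateway)
confirmed = Event(1, "OrderConfirmed", "order-1")
dispatched = Event(2, "OrderDispatched", "order-1")

# Live: events arrive out of order, and one is delivered twice.
for event in (dispatched, confirmed, confirmed):
    processor.process(event)

# Replay: events are reordered, and no external notifications are re-sent.
processor.replay([dispatched, confirmed])
```

In the live pass the order ends up Confirmed because the events arrived in the wrong order; the replay corrects this to Dispatched without sending any additional notifications.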

Usage

The primary use case for building an Event Sourced system is a strong requirement for audit or traceability. Since every domain change is captured as a sequential event, an Event Sourced system inherently provides out-of-the-box auditing. That said, similar audit capabilities can sometimes be achieved through structured application logs, without the added complexity.

Debugging an Event Sourced system is relatively easy. Because events can be replayed, the system state can be (re)constructed any number of times. Once a fix is available, it can also be verified with replays. In addition, data migration becomes much easier: by replaying events, the system state can be constructed on a different data store without going into the nitty-gritty of complex migration processes.

However, given the complexity that comes with an Event Sourced system, it's often advisable to consider simpler alternatives first. That said, Event Sourcing is well-suited for domains such as Finance & Banking, Healthcare, IoT & Telemetry, Supply Chain & Logistics, and Gaming, where auditability, scalability, and complex event-driven workflows are critical.

When Not to Use Event Sourcing

Below are key considerations to be aware of when Event Sourcing is not a suitable option:

Simple CRUD: For systems with basic create, read, update, and delete operations and no need to trace a detailed history, event sourcing adds unnecessary complexity.
Event Modeling: Not every domain or system has frequent changes or translates naturally into business events. In such cases, the difficulty of designing and evolving the event schema might outweigh its benefits.
Strong Consistency: Systems that require strong consistency may suffer from event sourcing's inherent eventual consistency model.
Operational Overhead: Managing an ever-growing event log, ensuring robust monitoring, and handling replay operations require significant infrastructure and maintenance effort. For smaller or latency-critical operations, this extra load might be impractical.

Consider opting for a simpler architecture if your project lacks the demand for deep audit logs, replay capabilities, or high scalability that event sourcing offers.

Related Patterns — DDD, CQRS, EDA, WAL

Folks familiar with Event-Driven Architecture (EDA), Domain-Driven Design (DDD), CQRS, and Write-Ahead Logging (WAL) will already have noticed a lot in common with Event Sourcing. Below is a short comparative summary:

Domain-Driven Design (DDD): The primary focus of DDD is to model complex business domains using aggregates and bounded contexts, while Event Sourcing records all state changes as events. Event Sourcing can complement DDD by ensuring historical traceability of domain entities. Note that Event Storming, often used with DDD, is different from Event Sourcing.
CQRS: CQRS separates read and write operations for scalability and performance. Event Sourcing fits naturally with CQRS, as commands generate events that update the system state, while queries retrieve materialized views optimized for fast reads.
Event-Driven Architecture (EDA): EDA enables loosely coupled systems that react to events asynchronously. Event Sourcing provides a structured way to store and replay events, making it a useful foundation for EDA-based systems.
Write-Ahead Logging (WAL): While both WAL and Event Sourcing involve logging changes, they serve different purposes and operate at different levels of abstraction. WAL is a low-level technique for ensuring data integrity in databases, while Event Sourcing is a higher-level architectural pattern for capturing and utilizing the complete history of a system's state changes. They also differ in lifespan and granularity.
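The pairing with CQRS mentioned above can be sketched as a projection that turns the event log into a materialized view for reads. The OrderStatusView class and its fields are hypothetical and stand in for whatever read store a real system would use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str
    order_id: str


class OrderStatusView:
    """Hypothetical read model: order counts per status, kept for fast queries."""

    STATUS = {"OrderPlaced": "Pending", "OrderConfirmed": "Confirmed"}

    def __init__(self):
        self.status_by_order = {}
        self.count_by_status = Counter()

    def project(self, event):
        new_status = self.STATUS.get(event.kind)
        if new_status is None:
            return   # event is not relevant to this view
        old_status = self.status_by_order.get(event.order_id)
        if old_status is not None:
            self.count_by_status[old_status] -= 1
        self.status_by_order[event.order_id] = new_status
        self.count_by_status[new_status] += 1


# Write side: the event log is the source of truth.
event_log = [
    Event("OrderPlaced", "order-1"),
    Event("OrderPlaced", "order-2"),
    Event("OrderConfirmed", "order-1"),
]

# Read side: a view derived from the log; it can be dropped and rebuilt by replay.
view = OrderStatusView()
for event in event_log:
    view.project(event)
```

Queries read from view.count_by_status or view.status_by_order and never touch the log directly, which is the separation of reads and writes that CQRS describes.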

Each pattern serves a distinct purpose and can be combined as needed to maximize their strengths while balancing their inherent shortcomings.

Conclusion

Event Sourcing offers significant benefits for systems that require persistent audit trails and rich debugging capabilities through event replay. It is especially effective in domains like finance, healthcare, e-commerce, and IoT, where every transaction or state change is critical and must be traceable.

However, its complexity means that it isn’t ideal for every scenario. For applications that primarily engage in basic CRUD operations or demand immediate consistency, the overhead of managing an ever-growing event log, handling event schema evolution, and coping with eventual consistency can outweigh the benefits. In such cases, simpler persistence models may be more appropriate.

When compared with related patterns, Event Sourcing naturally complements CQRS by decoupling read and write operations, and it enhances Domain-Driven Design by providing a historical record of domain events. Additionally, it underpins Event-Driven Architectures by facilitating loosely coupled, scalable communication. The decision to implement Event Sourcing should therefore balance its powerful capabilities against the operational and developmental complexities it introduces, ensuring it aligns with the project’s specific needs and long-term architectural goals.

Published at DZone with permission of Ammar Husain. See the original article here.
