One thing about developing software - it is never dull. I have been lucky to be on projects at the leading edge, with technologies and patterns that go on to become mainstream. Event driven architectures, and particularly distributed event driven architectures using state transfer, feel like one of those trends. Driving applications through events is not new. What makes it interesting is the scale and distribution at which we are now developing them: hundreds or thousands of services redundantly deployed into clusters, all interacting with each other using events. This architecture gets interesting ... fast.

For the last 3 years I have been developing event driven systems based on event-carried state and event sourcing.

The notification event pattern decouples the publisher and subscriber in design (the publisher does not know who is listening). The subscriber receiving the events still needs to know about the publisher and be given an API it can use to read the current publisher state. Publisher and Subscriber are still runtime coupled because the subscriber needs to call the publisher for details of the change. This callback reduces availability and increases load and traffic.
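A minimal sketch of that callback, assuming a hypothetical SalesClient read API on the publisher; only the order id travels in the event, so the subscriber has to fetch everything else:

Notification-style subscriber (Java)
record OrderNotification(String orderId) { }
record OrderDetails(String orderId, String customerId) { }

// Assumed read API exposed by the publisher (Sales).
interface SalesClient {
    OrderDetails getOrder(String orderId);
}

class NotificationSubscriber {
    private final SalesClient sales;

    NotificationSubscriber(SalesClient sales) {
        this.sales = sales;
    }

    void onOrderChanged(OrderNotification event) {
        // Runtime coupling: Sales must be up and reachable before the subscriber can act.
        OrderDetails current = sales.getOrder(event.orderId());
        System.out.println("shipping sees order " + current.orderId()
                + " for customer " + current.customerId());
    }
}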

By adding state to the published events the Event-Carried State Transfer pattern reduces or removes this run-time dependency. A publisher could send an event and immediately quit. Because the event contains state, the subscriber only needs the event to do its work.

Figure 1. Direct Subscription - Shipping subscribes to orders from Sales

Figure 2. Broker based publish subscribe - Shipping subscribes to orders, Sales publishes orders
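With event-carried state the same subscriber can be sketched without the callback. This is only an illustration, assuming a hypothetical new-order event shaped like the JSON examples later in the article:

State-carrying subscriber (Java)
import java.util.List;

record OrderItem(String name, int quantity, double price) { }
record NewOrderEvent(String orderId, String customerId, List<OrderItem> items) { }

class StateCarryingSubscriber {
    void onNewOrder(NewOrderEvent event) {
        // Everything needed is in the event, so Sales can be unavailable (or gone) by now.
        double total = event.items().stream()
                .mapToDouble(item -> item.quantity() * item.price())
                .sum();
        System.out.println("order " + event.orderId() + " totals " + total);
    }
}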

Decoupling the publisher and subscriber is a really nice characteristic of event driven patterns. But what we gain from this decoupling also introduces significant complexity in the form of consistency, or more accurately inconsistency, in the overall state of the system.

The notification event pattern introduces inconsistency that is resolved by rapid processing of the notification event. The system of components is still inconsistent for a period, until the event has been processed, but because the subscriber calls back to the publisher it sees the current state. The system does not need to be concerned with event race conditions and event ordering.

Like any API, the publisher's event design needs careful consideration to provide enough information to be useful to known and unknown subscribers.

Historical reliance

Systems that use events carrying state have to track multiple event streams. To understand orders we need to know about the customers they are for.

JSON example event structure
{
  "order-id" : "1iepyhw1ybdyd",
  "customer-id" : "1h7wi8mam7uxu",
  "items" : [ ]
}

In this simplified model the order has an id and a reference to the customer. With only the customer-id, this event is of little use to the subscriber.

A more complete event might look something like:
{
  "event-type" : "new-order",
  "order-id" : "uvhgbsm7jy9g",
  "customer-id" : "1h7wi8mam7uxu",
  "items" : [ {
    "name" : "pencil",
    "quantity" : 1,
    "price" : 0.5
  } ]
}

Subscribers receive and process the new-order event. Let's say that the customer then updates the order, adding more pencils.

Basic order amendment
{
  "event-type" : "order-amended",
  "order-id" : "1jeiqfw7yuc83",
  "customer-id" : "1h7wi8mam7uxu",
  "items" : [ {
    "name" : "pencil",
    "quantity" : 10,
    "price" : 0.5
  } ]
}

So now we have two events about the same order. In a distributed multi-node environment, processing these events in sequence becomes significant because new-order and order-amended could be received in either order.

Subscribers could be made tolerant of this, perhaps by using an event store of their own to hold events until they can be processed in sequence, but this puts a lot of burden on the subscriber.

Delivering events in the right order is one of the things we can ask a message broker to do. Generating a sequence number in the publisher gives us an ordinal for sequencing.
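As a rough illustration, a publisher could keep a counter per order and stamp each event with the next ordinal before publishing. This sketch assumes a single publisher instance owns a given order; in a redundantly deployed cluster the counter would have to live in shared storage or come from the event store's own sequence:

Publisher-side sequencing (Java)
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class OrderEventSequencer {
    // One counter per order-id; assumes this instance owns the order's event stream.
    private final Map<String, AtomicLong> ordinals = new ConcurrentHashMap<>();

    long nextOrdinal(String orderId) {
        return ordinals.computeIfAbsent(orderId, id -> new AtomicLong()).incrementAndGet();
    }
}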

This helps when data is generated by a single publisher, but let's assume that we have one system managing customers and another managing orders. Customer events are not going to share sequence numbers with the order system. Any system that listens to both the customer and order systems will need to handle events arriving in a non-deterministic sequence.

Back to the real world, where we have to assume that an order can have many (unbounded?) amendments before it is acted on and goods are shipped.

Multiple amendments
[ {
  "event-id" : "t4nd4om4v19y",
  "event-ordinal" : 1,
  "event-type" : "order-amended",
  "order-id" : "qy1m4ydlkojd",
  "customer-id" : "1h7wi8mam7uxu",
  "items" : [ {
    "name" : "pencil",
    "quantity" : 1,
    "price" : 0.5
  } ]
}, {
  "event-id" : "1jicrohqzq7g3",
  "event-ordinal" : 1,
  "event-type" : "order-amended",
  "order-id" : "qy1m4ydlkojd",
  "customer-id" : "1h7wi8mam7uxu",
  "items" : [ {
    "name" : "pencil",
    "quantity" : 10,
    "price" : 0.5
  } ]
} ]

Each event now contains a unique id and an ordinal to indicate publication/processing order. If the events arrive at two subscriber instances in a cluster at the same time, the subscribers can resolve ordering by making sure that the events are processed in ordinal sequence. Most message brokers can be configured to supply events in order, which can really help simplify subscriber processing.
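A minimal sketch of that check on the subscriber side, assuming each amendment carries the full order state so a stale event can simply be dropped (and assuming the broker delivers all events for one order to a single consumer thread, as partitioned brokers typically arrange):

Ordinal-based processing (Java)
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class OrderProjection {
    // Highest ordinal applied so far, per order-id.
    private final Map<String, Long> lastApplied = new ConcurrentHashMap<>();

    boolean apply(String orderId, long ordinal, Runnable update) {
        Long last = lastApplied.get(orderId);
        if (last != null && ordinal <= last) {
            return false;               // stale or duplicate - a newer amendment already applied
        }
        update.run();                   // refresh the local view of the order
        lastApplied.put(orderId, ordinal);
        return true;
    }
}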

But the order references a customer by an opaque id and contains no other details. The shipping system does not know how to address the customer, nor any of the other details essential to a good customer experience.

The sales system could provide customer details as part of the order, e.g. the customer's name and preferred shipping address. By considering these details to be part of the order, the shipping system only needs to worry about order events. This approach adds some coupling: the sales system is second-guessing the subscribers' need for customer information.

The sales system could instead provide a reference to the customer, e.g. http://customer/{id}, as part of the event. The subscriber would then be able to call the customer system to get the required details. This simple addition makes it easy for the subscriber to read the customer details it needs.
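Following that reference might look something like the sketch below, assuming the event carries a customer URL and the customer system answers with JSON:

Following a customer reference (Java)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class CustomerLookup {
    private final HttpClient http = HttpClient.newHttpClient();

    String fetchCustomer(String customerRef) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(customerRef)).GET().build();
        // Runtime coupling again: the customer system must be available when shipping needs it.
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}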

Calling back to the customer management system for the current state of the customer adds some runtime coupling, reducing availability and service independence. To maintain low coupling and higher availability, the shipping system can instead subscribe to customer events, using the customer id to build its own data model specific to shipping orders. This is much more complex: events are now sourced from multiple publishers (customers and orders). While events from a single component can maintain order, we now have race conditions across updates from multiple sources. Often these conditions don't cause a problem because event processing is much faster than event production, but relying on slow event production is risky. Event Sourcing, and particularly replaying events, breaks the slow-producer assumption.
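The shipping-local model could be as simple as the sketch below, assuming hypothetical customer events that carry the details shipping cares about. Shipping keeps only the latest known state per customer and joins it with order events when goods are dispatched; the lookup may come back empty if an order event raced ahead of the customer event:

Shipping's local customer model (Java)
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

record CustomerUpdated(String customerId, String name, String shippingAddress) { }

class ShippingCustomerModel {
    private final Map<String, CustomerUpdated> customers = new ConcurrentHashMap<>();

    void onCustomerUpdated(CustomerUpdated event) {
        customers.put(event.customerId(), event);   // keep only the latest known state
    }

    Optional<CustomerUpdated> forOrder(String customerId) {
        // Empty if the order event arrived before any customer event - shipping must decide
        // whether to wait, retry, or fall back to calling the customer system.
        return Optional.ofNullable(customers.get(customerId));
    }
}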


Events are a statement of fact (point in time) and not a statement of intent.

Events carry state at a point in time - typically when the event was generated. Asynchronous distributed systems can also distribute commands, request/response semantics, etc., but these are not events (IMO).

Wrapping Up

The high availability and loose coupling provided by this pattern are very attractive capabilities. These capabilities come with a cost - much greater complexity.

Events carrying state require careful design. Events need to be received in order, and event processing needs to be idempotent or events must be processed once and only once.
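Idempotency is often handled by de-duplicating on the event id. A minimal sketch, assuming each event carries a unique id; in practice the set of seen ids would be persisted alongside the subscriber's own state rather than held in memory:

De-duplicating on event-id (Java)
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentHandler {
    private final Set<String> seenEventIds = ConcurrentHashMap.newKeySet();

    void handle(String eventId, Runnable effect) {
        if (seenEventIds.add(eventId)) {   // add() returns false if this id was already handled
            effect.run();
        }
    }
}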

System consistency needs to be monitored, managed and controlled to provide good experiences in client UI applications.

Events change over time, making release and maintenance more complex, particularly if Event Sourcing is used.