Why Strong Consistency Matters with Event-driven Architectures
Event-driven architectures are widely adopted patterns that model an application as a series of software components or services. Those services react to commands that represent business and/or user actions and result in events. Event-driven architectures help modern applications solve the challenges of scalability, auditability / traceability and immutability. These architectures become especially powerful if you can partition your data into smaller subsets. This allows event processing to scale horizontally in a large distributed architecture, like microservices.
However, event-driven architectures are not a complete panacea. A critical aspect of an event-driven architecture is the causal relationship between datasets; processing the current event often depends on states derived from previous events. This requires an exact ordering of events. In a distributed system, ordering of events becomes a severe challenge because of what has become known as the “fallacies of distributed computing” where data is obliged to flow over an unreliable network that introduces some level of latency. This leads to the pattern commonly known as eventual consistency, which pushes the complexity of consistency to the application tier.
This becomes especially difficult with the current generation of NoSQL databases that provide global scale by throwing out the features that made prior database systems so attractive, such as flexibility in how the data was modeled and most importantly the ability to enforce the integrity of the application using database transactions. Maintaining these constraints is termed consistency - one of the foundations of ACID transactions featured in most traditional RDBMS. The current generation of NoSQL databases either introduces windows of inconsistency at best (eventual consistency), which must be reconciled (not always feasible), or introduces complex cluster orchestration to partition the stream of events, which must be maintained across all services that process the stream. With consistency often missing in modern NoSQL databases, achieving it can be a major challenge of modern scale-out, event-driven architectures. It’s also important to note that modern NoSQL databases do not keep a history of data revisions and have no way to report on these changes, making it difficult to offer auditability.
Different Patterns of Event-Driven Architecture
Martin Fowler clarifies some of the common event-driven patterns in his article “What do you mean by “Event-Driven”?” The two primary patterns we will discuss in this article are event sourcing and Command Query Responsibility Segregation (CQRS).
Event sourcing captures changes to the state of a system as a series of events. The event store then becomes the primary source of truth, and the current state of the system is derived from this event store. More details on how the pattern is defined can be found in the domain-driven design literature (Implementing Domain-Driven Design), or online (Microsoft, Martin Fowler, InfoQ).
The other common pattern is CQRS, which is the notion of having separate data structures for reading and writing information, allowing for different views or projections of event streams.
[Example] Modeling a Hotel Points Program as Events
A hotel points program can be designed similar to the ledger model used in accounting. In this model, every credit/debit transaction is written as an immutable ledger of events. If a mistake is made, a correcting transaction would need to be made, similar to how accountants never change existing entries.
Using the patterns of event sourcing, we can architect a points program to process the commands (such as a hotel stay) and record them as immutable events in the ledger. For example, when a user earns points, the event is stored as a ledger entry of the event type and the number of points earned. In this example the entry would be (“hotel stay”, 32 points). This provides a full audit log of the events that happened, and a custom view of the ledger is created to provide a current account balance.
Such a simple model might seem like it should easily scale on even the most simplistic key-value storage. Unfortunately, this is not the case. For instance, to ensure that a customer never claims entitlements worth more than their points balance, our model requires snapshot isolation or a primitive compare-and-swap-style operation. If we want to allow customers to transfer balances between accounts, we again need a better isolation level. To do this in a scale-out fashion, without baking our concurrency model into the application, our choices for database backends shrink very rapidly. Finally, the choice of databases narrows even further when we need to provide different views of the data. This requires secondary indices that allow querying the data with access patterns that differ from how the data is stored.
Globally Distributed ACID Transactions in FaunaDB
The most unique capability of FaunaDB is that it provides ACID-compliant, consistent transactions in a partitioned, globally distributed environment.
When it comes to database technology, traditional thinking has held that it’s impossible to provide both global scale and strong consistency. This ideology has led to the pervasive misconception that the only way to achieve global scale is to sacrifice strong consistency and transactions.
FaunaDB upends this thinking by bringing an entirely new approach to database transaction resolution by providing a globally distributed ACID transaction engine. It has been developed over the past several years by technical leaders with experience at Twitter and Couchbase.
The use of globally distributed ACID transactions in FaunaDB allows enforcement of complex business constraints (by enforcing consistency) and linearization of event processing. Offloading this concern to the database frees the developer to focus on application concerns and reduces the complexity of the application code.
Order of Events in FaunaDB
Enforcing the complex business constraints discussed earlier often requires the events be captured in an exact order. For example, the bids on an auction, or the trades on the stock exchange, need to be captured in an exact linear history. This linear history relies on a database that offers a stronger guarantee than eventual consistency.
FaunaDB, with ACID transactions, allows events to be captured while preserving the linear order of their history.
Temporality in FaunaDB
FaunaDB keeps all instances of a dataset. Instead of overwriting, it creates new ones when a write is performed. This is especially useful when auditing event data and verifying its evolution over time. In an event-driven architecture, this would allow you to view the state of the world and all related events at any point of time.
Indexes in FaunaDB
FaunaDB stores data in the form of flexible objects, where each object is an instance of a class. To allow for different access patterns, FaunaDB also allows multiples indexes of these classes so that they can be searched and retrieved by different attributes. Also, an index can contain references to multiple classes. The other unique aspect of FaunaDB is that updates to the index are atomic. This ensures that the index itself also maintains strong consistency alongside the core data and can enforce constraints like uniqueness.
For event-driven architectures, this enables different views or projections of the core data, as well as the CQRS (Command Query Responsibility Segregation) pattern. Also, similar to a relational database, the index allows unique constraints to be put on the events. This enables the ability to provide constraints such as ensuring only one copy of an event is stored in the database.
Capturing the Current State of the World
Event sourcing stores the entire history of a system as a stream of events. This could include banking transactions, auction history, etc. For many businesses, this ledger of events is extremely important in understanding what happened. In addition to this ledger of events, businesses also need to capture the current state of the world. Some examples include an account balance, the owner of a legal contract, the winning bid, etc. This “mutable state” can be calculated in multiple ways depending on the business requirements for how real-time the value needs to be.
If this “mutable state” can have a slight delay, then event sourcing only needs to ensure consistency on the write level. Then, snapshots and/or roll-ups using an eventual consistency model can be used for calculating projections out of the streams of events. Different consumers of events can be used to create read models, syndicate data, and prepare batch jobs that are eventually consistent. Calculating these different projections could take several seconds and could be handled later, as the event write itself only cares about making sure the new events were calculated based on the latest available state.
However, if this “mutable state” needs to be real-time, it can be very difficult to calculate in an eventual consistency model since it’s often calculated after the fact with snapshots and roll-ups. Instead, an exact value can be created using Fauna ACID transactions to capture the exact state of the world at all times. This would allow for real-time data and business logic, such as determining if a customer has enough funds available for a purchase.
Comparison to Other Systems
FaunaDB is not the only database out there that offers global transactional consistency. Two other databases including Spanner (Google’s database-as-a-service) and CockroachDB (the open source stepchild of Spanner) provide similar functionality, but in a radically different way. Both are based on a protocol known as “Spanner’. In contrast, FaunaDB is based on a very different, patent-pending protocol. The distinction among these three databases boils down to a comparison between the different transactional protocols. Read more about the distinction between our protocol and Spanner’s here.
Beyond providing a fundamentally different transactional model, FaunaDB also delivers operational advantages. Baked into the architecture of FaunaDB is a multi-tenant, quality-of-service mechanism. Each workload that runs against FaunaDB has an associated priority. While other databases provide static multi-tenancy with fixed resources for each tenant, FaunaDB’s model is completely dynamic. An entire FaunaDB data infrastructure, with multiple global data centers, can function as a single set of resources such as I/O, compute, and storage. While you won’t notice this when you are coding your first application against FaunaDB, it is extremely beneficial when you need your database to run as the core platform for multiple applications and datasets. The best part is that it completely decouples application development from operational activities. Operations just need to ensure that there is sufficient capacity for all the various applications and data, and can shift the infrastructure topology however needed.
How Fauna Solves these Challenges
FaunaDB solves the major challenges of building an event-driven architecture by providing mission critical ACID transactions at a global scale. By using FaunaDB as the foundation of the application architecture, the complexity of maintaining strong consistency and ordering of events is solved by the database. This greatly simplifies the overall application architecture.
In the next article, we will dive into an example of building an event-driven application, starting with a look at the core Fauna queries and features that form the building blocks.
Thanks to Ben Edwards for helping out with the writing and proofreading of the article!