One of the critical architectural decisions you must make when designing modern applications is selecting the right data storage technology. Not only can this decision be expensive to change later, but it also deeply affects your application’s availability, security, and performance.
Depending on the size and reach of your organization, you might need to replicate your data to multiple geographical locations to serve a global base of users. Multi-region storage can reduce access latency, speed up recovery, and help you manage legal compliance for your data.
There are many services available to provide multi-region storage, including Amazon DynamoDB and Fauna. Amazon DynamoDB is an AWS-managed service that provides a serverless key-value NoSQL database. Fauna is a multi-region, serverless, document-relational database. Because these two services use different data storage models, their terminology and features differ significantly.
This article will compare the behavior of these two NoSQL databases in multi-region deployments. You’ll learn more about how they are configured for multi-region deployments, how their distributed transaction behaviors differ, and how they select the closest replica, so that you can decide which database would be a better choice for your projects.
Configuring multi-region deployments
Running multi-region data stores comes with its own challenges. Due to the geographically distributed nature of these systems, network latency and interruptions can cause major performance and reliability issues.
By default, DynamoDB tables are deployed within a single AWS region. To achieve multi-region deployment, a global DynamoDB table has to be created instead.
A global DynamoDB table is composed of multiple regional replicas. Each of these replicas is created and managed independently (for capacity provisioning and storage class, for example), but all replicas must share the same table name. Queries can be run against any replica, and AWS ensures the data is eventually consistent across all the replicas.
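As a minimal sketch of what this looks like in practice, the following builds the parameters for the DynamoDB UpdateTable API call that adds a regional replica to an existing table, turning it into a global table. The table name "orders" and the regions are illustrative assumptions, not values from the article:

```python
# Sketch: adding a regional replica to an existing DynamoDB table.
# The table name and regions here are illustrative assumptions.

def build_replica_update(table_name, new_region):
    """Build UpdateTable parameters that add a replica in new_region."""
    return {
        "TableName": table_name,
        "ReplicaUpdates": [
            {"Create": {"RegionName": new_region}},
        ],
    }

params = build_replica_update("orders", "eu-west-2")

# With boto3 (not run here), the call would be roughly:
#   import boto3
#   dynamodb = boto3.client("dynamodb", region_name="us-east-1")
#   dynamodb.update_table(**params)

print(params["ReplicaUpdates"][0]["Create"]["RegionName"])
```

Once the replica is active, DynamoDB replicates writes between regions automatically, but each replica's capacity settings remain your responsibility.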
Fauna provides a different level of abstraction. Instead of configuring each regional replica independently, each database is assigned to a specific region group. Replicas can run in the United States, Europe, or across both. Fauna uses a Calvin-inspired transaction engine to provide distributed consistency with high performance.
While global DynamoDB tables are available in more regions than Fauna (such as Asia), additional setup and maintenance are necessary to run them. Fauna is more straightforward to set up across multiple regions, but currently only supports the United States, Europe, or both as available region groups. This means your best choice here depends on which regions your organization serves. If you only do business in the US and Europe, then Fauna will work better for you. If you have users in other regions, you may prefer DynamoDB.
Using database transactions
Database transactions atomically perform database operations that affect multiple data elements, such as tables and fields. A database transaction's atomicity ensures data consistency: either all of its operations take effect, or none do.
With DynamoDB, it’s possible to bundle independent queries via transactions. However, the best practice is to use regular, non-transactional queries wherever possible and use transactions only when necessary. ACID compliance is only guaranteed in DynamoDB transactions performed within a single region; it’s not available for global tables.
Transactions can target up to 100 items (raised from the original limit of 25) in DynamoDB tables in the same region within the same AWS account. Multiple tables can be included in the same transaction. Note that read and write queries can’t be mixed in the same transaction: multiple write operations can be bundled into a transaction, but a read operation has to use a separate transaction. The aggregated size of all items in a transaction can’t exceed 4 MB.
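To make this concrete, here is a minimal sketch of a TransactWriteItems request that bundles two conditional updates, the classic transfer between accounts. The table name, keys, and attributes are illustrative assumptions:

```python
# Sketch: bundling two writes into one DynamoDB transaction.
# Table name, keys, and attributes are illustrative assumptions.

def build_transfer_transaction(from_id, to_id, amount):
    """Two conditional updates that succeed or fail together."""
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": "accounts",
                    "Key": {"account_id": {"S": from_id}},
                    "UpdateExpression": "SET balance = balance - :amt",
                    # Reject the whole transaction if funds are insufficient
                    "ConditionExpression": "balance >= :amt",
                    "ExpressionAttributeValues": {":amt": {"N": str(amount)}},
                }
            },
            {
                "Update": {
                    "TableName": "accounts",
                    "Key": {"account_id": {"S": to_id}},
                    "UpdateExpression": "SET balance = balance + :amt",
                    "ExpressionAttributeValues": {":amt": {"N": str(amount)}},
                }
            },
        ]
    }

tx = build_transfer_transaction("alice", "bob", 25)

# With boto3 (not run here):
#   boto3.client("dynamodb").transact_write_items(**tx)
```

If either update fails, for instance because the condition expression rejects an overdraft, DynamoDB cancels both, but remember this guarantee holds only within the source region.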
Due to the nature of DynamoDB, serializability between transactions and other queries is provided only in some cases. Developers need to read the documentation carefully, because these caveats might apply to their systems.
On the other hand, Fauna was designed to be a fully transactional database, treating transactions as a first-class concept. Fauna’s database engine was built to provide strictly serializable transactions and guarantee short latencies even between geographically distributed replicas. The result is that transactions can include any number of documents and can be up to 16 MB in size.
While DynamoDB is designed to run a high number of small, independent queries, Fauna is designed around transactions that always keep the data in a consistent state. DynamoDB best practices prescribe using transactions sparingly, whereas with Fauna, all queries are considered transactions. If your application heavily depends on transactions and can’t handle incomplete or inconsistent data, Fauna may be the better choice.
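For comparison, the same transfer in Fauna needs no special transaction API, because every query is itself a transaction. The FQL below is illustrative only: the collection name and query shape are assumptions approximating FQL, not verbatim Fauna syntax:

```python
# Illustrative only: a transfer in Fauna, where every query is
# automatically a strictly serializable transaction. The collection
# name and FQL shape are assumptions, not verbatim Fauna syntax.

transfer_query = """
let src = accounts.byId("alice")
let dst = accounts.byId("bob")
src.update({ balance: src.balance - 25 })
dst.update({ balance: dst.balance + 25 })
"""

# With the Fauna Python driver (not run here), this would be sent
# roughly as:
#   from fauna import fql
#   from fauna.client import Client
#   Client(secret="...").query(fql(transfer_query))
# Both updates commit atomically, or neither does.
```

The point of contrast is structural: in DynamoDB you opt in to transactional behavior per request, while in Fauna the whole query body commits or rolls back as one unit.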
Using transactions in multi-region deployments
As noted above, there are transaction limitations that apply to both single-region and multi-region data stores. Because of the distributed nature of multi-region deployments, though, there are several other caveats when using transactions.
DynamoDB transactions, as previously noted, apply only to the region where the transaction was initiated. After a transaction successfully completes in the source region, the changes are propagated to all replicas. Because transactions are region-specific by design, there’s no ACID compliance between different regions.
This means that you can potentially see partially completed transactions in other replicas until DynamoDB finishes the replication. The application needs to be architected to handle such data anomalies and possible transaction conflicts. Your application needs to treat all data read from a replica as if there were no concept of a transaction, just individual, independent queries. This also needs to be addressed if you cache data, as you might have cached inconsistent reads.
With Fauna, transaction guarantees are fully supported in all configurations, including multi-region setups. Fauna’s engine automatically ensures that no replica is exposed to transaction anomalies like dirty or phantom reads.
If your application can’t tolerate inconsistent data, or if ACID compliance is required from all replicas, Fauna will be the better choice.
Routing connections to the nearest replica
Application architectures often use multi-region data stores to reduce latency for users by keeping data geographically close to them.
With DynamoDB, your application must provide a specific endpoint to choose the region for accessing the nearest replica table. Global DynamoDB tables keep data synchronized between replicas, but your application still needs to decide which replica to use and how to access it. That means additional coding and configuration changes are necessary: your application must identify the closest region and set the AWS region accordingly. For example, an EC2 instance in eu-west-1 should reach a DynamoDB replica in eu-west-2 instead of us-east-1 if those are the two regions the global DynamoDB table is deployed to.
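A minimal sketch of that routing decision follows. The preference table mapping an application's region to nearby replica regions is a hypothetical structure your deployment would define; the deployed replicas match the eu-west-2/us-east-1 example above:

```python
# Sketch: picking the nearest DynamoDB replica region for a client.
# The preference table is a hypothetical structure your deployment
# would define; replica regions match the article's example.

REPLICA_REGIONS = ["eu-west-2", "us-east-1"]

# For each region the app may run in, candidate replica regions
# ordered from nearest to farthest (illustrative values).
NEAREST_PREFERENCES = {
    "eu-west-1": ["eu-west-2", "eu-central-1", "us-east-1"],
    "us-west-2": ["us-east-1", "eu-west-2"],
}

def pick_replica(app_region):
    """Return the first deployed replica in the region's preference list."""
    for candidate in NEAREST_PREFERENCES.get(app_region, []):
        if candidate in REPLICA_REGIONS:
            return candidate
    return REPLICA_REGIONS[0]  # fall back to a default replica

region = pick_replica("eu-west-1")
# With boto3 (not run here):
#   boto3.client("dynamodb", region_name=region)
print(region)
```

Keeping this mapping in application code (or configuration) is the extra maintenance burden the article describes: every time you add or remove a replica, the routing logic has to change with it.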
This article demonstrated how DynamoDB and Fauna differ in multi-region setups. Though both services are strong choices, as you saw, they are each best for different use cases.
If your application needs strong ACID compliance across multi-region replicas, Fauna may be the better choice. It eliminates data anomalies in replicas by serializing transactions. Fauna offers fewer regions than DynamoDB, but it requires less configuration and maintenance, and it handles multi-region setup and connection routing transparently.
If you’re interested in learning more about Fauna, its developer-friendly features, and its data API, you can sign up for free to give it a try. If you’d like a free consultation with a Fauna database expert, you can schedule time with one.
Cintia Del Rio helps companies improve their infrastructure in the cloud. An engineering manager at Envato, she has been working with infrastructure and DevOps for more than ten years, and before that she was a developer. She has been the lead in infrastructure for the OpenMRS open source community for the past seven years.