Isolation Levels

A Comparison of Scalable Database Isolation Levels

Evan Weaver | Mar 15th, 2019

Categories: Engineering
It is very difficult to find accurate information about the correctness and isolation levels offered by modern distributed databases and the operational conditions required to achieve them. Developers use different terms for the same thing, the meaning of terms varies or is ambiguous, and sometimes vendors themselves do not actually know.
At Fauna, we care a lot about accurately describing which guarantees different systems provide. This is our effort to centralize a description of which database does what. For consistency’s sake, we will use the terminology from Kyle Kingsbury’s explanation on the Jepsen site. The chart is ranked by the maximum multi-partition isolation level offered.
The data is based on statements about isolation levels from vendor documentation, white papers, and developer commentary, exclusive of aspirational marketing statements. We have tried to be neutral in the characterization of the various systems' architectural properties. Whether the system implementations uphold these guarantees is addressed elsewhere. If you haven't already, please see Fauna's own Jepsen results for confirmation that Fauna upholds its guarantees.

Before we BEGIN

In discussing transactional isolation, we frequently encounter the "worse is better" argument, which essentially goes:
  • This database does what it does
  • Implementing better isolation in the database is impossible or has unacceptable tradeoffs
  • Implementing better isolation in the application is simple and useful
This argument also goes by "it's not a bug, it's a feature". 
The pretense of low maximum isolation levels, eventual consistency, or CRDTs is that application developers are ready and willing to work through every failure and recovery condition of their distributed dataflow. But in practice, moving beyond “works on my machine” correctness testing requires an extraordinary level of investment that product teams simply cannot make.
In my experience, the implications of different isolation levels are very subtle. Pushing the burden onto application developers, especially when there are a lot of distinct applications, as in a microservices architecture, is tremendously detrimental to productivity. And although tunable consistency increases flexibility, it cannot be used to paper over an isolation level that is fundamentally too weak to compose effectively.
After all, /dev/null is serializable, but not very useful as a database. 

Distributed Databases

Distributed databases present a unified topology and do not require operator management of replication, although some, like the Percolator systems, do require management of special nodes.

| Database | Maximum isolation level | Default isolation level | Minimum isolation level | Consensus architecture | Limitations |
|---|---|---|---|---|---|
| Fauna | Strict serializability | Snapshot. Read/write transactions without indexes are always strictly serializable. | Snapshot | Calvin [1] with optimistic concurrency control | Writes must coordinate on local log leaders. Reads can be served from any replica. |
| Google Cloud Spanner [4] | Strict serializability (and "external consistency") | Strict serializability | Snapshot (called “bounded staleness”) | Spanner [2] | Writes must coordinate on partition leaders which may be remote. Reads can be served from any replica. |
| FoundationDB [5] [6] | Strict serializability | Strict serializability | Snapshot | Modified Percolator [3] | All queries must coordinate on the timestamp oracle. Conflict resolution is deferred until commit. |
| CockroachDB [7] [8] | Serializable | Serializable | Serializable | Modified Spanner | All queries must coordinate on the partition leaders for their respective keys. Transactions with shared keys are mutually serializable, but transactions with disjoint keys can suffer “causal reversal”. Isolation is violated under clock skew. |
| Yugabyte [9] | Snapshot | Snapshot | Snapshot | Spanner | Isolation is violated under clock skew. |
| TiDB [10] | Repeatable read | Repeatable read | Repeatable read | Percolator | All queries must coordinate on the timestamp oracle. |
| DynamoDB | Strong partition serializability | Read committed | Read committed | Paxos | Multi-partition two-phase commit offers limited serializability support. Multi-partition transactions limited to 10 primary keys with explicit read dependencies. Indexes are not serializable. Isolation is violated if there are non-transactional queries to the same keys or if global tables are used. |
| CosmosDB [11] [12] | Strong partition serializability | Linearizable for single-region, snapshot for multi-region | Read uncommitted | Paxos | Multi-partition transactions are not supported. |
| Cassandra [13] | Strong partition serializability | Read uncommitted (aka “eventual consistency”) | Read uncommitted | Single-decree Paxos | Multi-partition transactions are not supported. Isolation is violated if there are non-transactional queries to the same keys, or if global secondary indexes are used. |
| MongoDB [14] | Session causality | Read uncommitted | Read uncommitted | Sharded, semi-synchronous replication with automated failover | Multi-partition transactions are not supported. Isolation is violated during partitions and shard leader election. |
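To make the “causal reversal” and clock skew limitations in the chart more concrete, here is a minimal sketch of how commit timestamps taken from skewed node clocks can reorder transactions on disjoint keys. It is a toy model, not any vendor's implementation: the `Node` and `MVCCStore` classes, the clock offsets, and the tick values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical database node whose local clock is offset from true time."""
    name: str
    clock_offset: int  # skew relative to true time, in arbitrary ticks

    def now(self, true_time: int) -> int:
        return true_time + self.clock_offset


@dataclass
class MVCCStore:
    """Toy multi-version store: each key maps to a list of (commit_ts, value)."""
    versions: dict = field(default_factory=dict)

    def write(self, key: str, value: str, commit_ts: int) -> None:
        self.versions.setdefault(key, []).append((commit_ts, value))

    def read_at(self, key: str, snapshot_ts: int):
        """Return the newest value with commit_ts <= snapshot_ts, or None."""
        visible = [(ts, v) for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return max(visible)[1] if visible else None


def demo_causal_reversal():
    store = MVCCStore()
    fast = Node("fast", clock_offset=+5)   # clock runs ahead of true time
    slow = Node("slow", clock_offset=-5)   # clock runs behind true time

    # True time 100: T1 writes key "x" through the fast node and commits.
    t1_ts = fast.now(100)                  # commit timestamp 105
    store.write("x", "t1", t1_ts)

    # True time 102: T2 starts after T1 has finished and writes the
    # disjoint key "y" through the slow node.
    t2_ts = slow.now(102)                  # commit timestamp 97
    store.write("y", "t2", t2_ts)

    # A reader at snapshot timestamp 100 observes T2's write but not T1's,
    # even though T1 completed first in real time.
    snapshot_ts = 100
    return t1_ts, t2_ts, store.read_at("x", snapshot_ts), store.read_at("y", snapshot_ts)
```

In this model both transactions are ordered consistently by timestamp, so the history is still serializable, but the timestamp order contradicts the real-time order. That gap is the difference between serializability and strict serializability in the chart.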

Replicated Databases

Replicated databases require operator management of primaries and secondaries and the associated replication links. Asynchronous replication can improve availability and scale read capacity, but does not offer any distributed consistency guarantees. Semi-synchronous replication further improves availability, but does not improve distributed isolation.
This is the traditional RDBMS scale-out model.

| Database | Maximum isolation level | Default isolation level | Minimum isolation level | Replication architecture | Limitations |
|---|---|---|---|---|---|
| Oracle | Snapshot | Snapshot | Read committed | Asynchronous replication | Oracle's SERIALIZABLE isolation is not serializable, but is actually snapshot isolation with write conflict detection. This allows write skew anomalies. |
| MySQL | Serializable, primary node only | Repeatable read, primary node only | Read uncommitted | Semi-synchronous replication | |
| PostgreSQL | Serializable, primary node only | Read committed | Read committed | Semi-synchronous replication | |
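The write skew anomaly mentioned in the Oracle entry is easy to state but hard to picture, so here is a small sketch of it. It assumes a toy in-memory store with snapshot reads and first-committer-wins write conflict detection; `SnapshotDB`, `Txn`, and the on-call example are hypothetical names for illustration and do not model Oracle's actual implementation.

```python
class SnapshotDB:
    """Toy snapshot-isolation store with first-committer-wins conflict detection."""

    def __init__(self, initial):
        self.data = dict(initial)
        self.commit_version = {k: 0 for k in initial}  # last commit version per key
        self.version = 0

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, db):
        self.db = db
        self.start_version = db.version
        self.snapshot = dict(db.data)  # private snapshot taken at begin
        self.writes = {}

    def read(self, key):
        # Reads see the transaction's own writes, then its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort only if a key in OUR write set was committed after our snapshot.
        # Keys we merely read are not checked, which is what permits write skew.
        for key in self.writes:
            if self.db.commit_version[key] > self.start_version:
                return False
        self.db.version += 1
        for key, value in self.writes.items():
            self.db.data[key] = value
            self.db.commit_version[key] = self.db.version
        return True


def go_off_call(txn, doctor):
    """Application rule: a doctor may go off call only if two are on call."""
    on_call = [d for d in ("alice", "bob") if txn.read(d)]
    if len(on_call) >= 2:
        txn.write(doctor, False)


def demo_write_skew():
    db = SnapshotDB({"alice": True, "bob": True})
    t1, t2 = db.begin(), db.begin()   # both start from the same snapshot
    go_off_call(t1, "alice")
    go_off_call(t2, "bob")
    # The write sets are disjoint, so conflict detection lets both commit.
    return t1.commit(), t2.commit(), db.data
```

Both transactions commit and nobody is left on call, even though each transaction checked the rule against its own snapshot. No serial execution of the two transactions could produce that result, which is why snapshot isolation with write conflict detection is weaker than serializability.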

Conclusion

A good way to think about isolation is in terms of the breadth of potential anomalies. The lower the isolation level, the more types of anomalies can occur, and the harder it is to reason about application behavior both at steady-state and under faults. At Fauna, we encourage you to think critically about whether your current databases really guarantee the level of transactional isolation you need.
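One way to make "breadth of potential anomalies" concrete is to treat each isolation level as the set of anomalies it permits. The sketch below uses a simplified reading of the anomaly table in Berenson et al., "A Critique of ANSI SQL Isolation Levels". That mapping is an assumption of this example, not the Jepsen taxonomy used in the charts, and it omits cases such as dirty writes and the phantoms that snapshot isolation sometimes permits.

```python
# Anomalies each level permits (simplified; an assumption for illustration).
ANOMALIES_ALLOWED = {
    "read uncommitted": {"dirty read", "lost update", "fuzzy read",
                         "phantom", "read skew", "write skew"},
    "read committed": {"lost update", "fuzzy read", "phantom",
                       "read skew", "write skew"},
    "repeatable read": {"phantom"},
    "snapshot": {"write skew"},
    "serializable": set(),
}


def at_least_as_strong(a: str, b: str) -> bool:
    """True if level `a` permits no anomaly that level `b` forbids."""
    return ANOMALIES_ALLOWED[a] <= ANOMALIES_ALLOWED[b]


def comparable(a: str, b: str) -> bool:
    """True if one of the two levels is at least as strong as the other."""
    return at_least_as_strong(a, b) or at_least_as_strong(b, a)


def by_breadth():
    """Levels sorted from fewest to most permitted anomalies."""
    return sorted(ANOMALIES_ALLOWED, key=lambda level: len(ANOMALIES_ALLOWED[level]))
```

Under these assumptions `comparable("repeatable read", "snapshot")` is false: each level permits an anomaly the other forbids, so the levels form a partial order and not a single ladder. That is one reason identical-sounding level names from different vendors deserve a closer look.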

References

  1. Calvin: Fast Distributed Transactions for Partitioned Database Systems
  2. Spanner: Google’s Globally-Distributed Database
  3. Large-scale Incremental Processing Using Distributed Transactions and Notifications
  4. Cloud Spanner: TrueTime and external consistency
  5. FoundationDB Consistency
  6. FoundationDB Record Layer: A Multi-Tenant Structured Datastore
  7. CockroachDB's Consistency Model
  8. Jepsen CockroachDB beta-20160829
  9. YugaByte Isolation Levels
  10. TiDB Transaction Isolation Levels
  11. Consistency levels in Azure Cosmos DB
  12. Transactions and optimistic concurrency control
  13. How are Cassandra transactions different from RDBMS transactions?
  14. Jepsen MongoDB 3.6.4
