Jun 06, 2018


Blog Categories

Categories

Blog Categories

Using ACID Transactions to Combine Queries and Ensure Integrity

ACID transactions are a key element of the success and applicability of RDBMS long term, as they provide the system with the flexibility to model aggregate data structures regardless of the data access pattern. Historically, ACID databases have been able to add new features that work with existing data, in part because of the integrity that can be maintained via transactions.  In this blog post we’ll look at query examples that reflect real-world development challenges.

Updating multiple documents in an ACID transaction is a near-universal pattern when querying FaunaDB. There are several ways to do it, from a bare list of operations, to an explicit Do statement, to using Map and Paginate to iterate over result sets. As you’ll notice, this article is organized by use case where you’ll see how to express each one in FaunaDB’s query language.

Create a New Workspace

Many applications set up new workspaces at various points in the application lifecycle. Whether it is creating a new gameworld for players to populate, a new rich document for users to collaborate on, or a new online store, some default schema and data items need to be provisioned. In a fast moving user-onboard flow, you don’t want users encountering an incoherent state. To prevent this from happening, create your default objects in a single transaction.

In a fast moving user-onboard flow, you don’t want users encountering an incoherent state. To prevent this from happening, create your default objects in a single transaction.

Here is a quick query that creates a few classes with their indexes in a single transaction. It’s from an example game, where it sets up the schema for a new gameworld.

q.Let({
  players: q.CreateClass({name: "players"}),
  items : q.CreateClass({name: "items"}),
  purchases : q.CreateClass({name: "purchases"})},
q.Do(
  q.CreateIndex( {
    name: "players",
    source: q.Var("players")
  }),
  q.CreateIndex( {
    name: "items_for_sale",
    source: q.Var("items"),
    terms: [{
      field: ["data", "for_sale"]
    }]
  }),
  q.CreateIndex( {
    name: "purchases",
    source: q.Var("purchases")
  }),
  q.CreateIndex( {
    name: "items_by_owner",
    source: q.Var("items"),
    terms: [{
      field: ["data", "owner"]
    }]
  })))

Here is another query in the Animal Exchange example source code, where all the animals are created and assigned owners.

const animals = ["🐄","🐆","🐿","🐇","🐈","🐋","🐍","🐎","🐒","🐘",
  "🐙","🐛","🐝","🐞","🐣","🐬","🐯","🐸","🐹","🐩"].map((emoji) => {
    return {
      label : emoji,
      for_sale : Math.random() < 0.2,
      owner : players[Math.floor(Math.random()*players.length)].ref,
      price : Math.ceil(Math.random()*40)
    }
});
return client.query(
  q.Foreach(animals, (animal) =>
    q.Create(q.Class("items"), {data : animal}))
);

You’ll also see queries like this when you are saving a complex UI corresponding to many user fields. Each control or field in the UI can be persisted with its own logic, and all updates composed into a single transaction.

Sell Items Between Accounts

Typically, when implementing an ecommerce transaction, the preconditions include as much logic as the transfer itself. Before an item can be purchased, the system must ensure it’s for sale, in stock, and that the purchaser can afford it. Only after the preconditions have passed does it make sense to move funds from buyer to seller, change item ownership, and/or reduce stock levels.

Before an item can be purchased, the system must ensure it’s for sale, in stock, and that the purchaser can afford it. 

In FaunaDB, complex transactions are described and sent to the server all at once. So, a transaction transfering an items from a buyer to a seller will look something like this:

q.Let({
  buyer : q.Get(player.ref),
  item : q.Get(item.ref)
}, q.Let({
  isForSale : q.Select(["data", "for_sale"], q.Var("item")),
  itemPrice : q.Select(["data", "price"], q.Var("item")),
  buyerBalance : q.Select(["data", "credits"], q.Var("buyer")),
  seller : q.Get(q.Select(["data", "owner"], q.Var("item")))
}, q.If(q.Not(q.Var("isForSale")),
    "purchase failed: item not for sale",
    q.If(q.Equals(q.Select("ref", q.Var("buyer")), q.Select("ref", q.Var("seller"))),
      q.Do(
        q.Update(q.Select("ref", q.Var("item")), {
          data : {
            for_sale : false
          }
        }),
        "item removed from sale"
      ),
      // check balance
      q.If(q.LT(q.Var("buyerBalance"), q.Var("itemPrice")),
        "purchase failed: insufficient funds",
        // all clear! record the purchase, update the buyer, seller and item.
        q.Do(
          q.Create(q.Class("purchases"), {
            data : {
              item : q.Select("ref", q.Var("item")),
              price : q.Var("itemPrice"),
              buyer : q.Select("ref", q.Var("buyer")),
              seller : q.Select("ref", q.Var("seller"))
            }
          }),
          q.Update(q.Select("ref", q.Var("buyer")), {
            data : {
              credits : q.Subtract(q.Var("buyerBalance"), q.Var("itemPrice"))
            }
          }),
          q.Update(q.Select("ref", q.Var("seller")), {
            data : {
              credits : q.Add(q.Select(["data", "credits"], q.Var("seller")), q.Var("itemPrice"))
            }
          }),
          q.Update(q.Select("ref", q.Var("item")), {
            data : {
              owner : q.Select("ref", q.Var("buyer")),
              for_sale : false
            }
          }),
          "purchase success"
        )
      )
    )
   )))
);

To see a full breakdown of a similar query, see my talk at GOTO Berlin about distributed ledgers, or read this article introducing an example distributed ledger application. The principles and concepts behind ledger queries are broadly applicable to other applications, making the code worth studying further.

Update Matching Records

If you need to update multiple matching records in a single query, you can do it by looping over the result set and running update logic. This can be useful any time you want to correct a batch of data, or make special case changes that impact the records which match a query. In the following query, we find customers with a pro plan in the finance industry and update their marketing-qualified-lead level to increase by 50 percent.

q.Map(q.Paginate(q.Match(q.Index("customers-by-plan"), "pro")),
function(row) { 
    return q.Let({customer: q.Get(q.Select(0, row))}, )
        q.If(q.Equals("finance", q.Select(["data","industry"], q.Var("customer")),
            q.Update(q.Var("customer")), {data : {
                mqlScore: q.Multiply(1.5,q.Select(["data","mqlScore"], q.Var("customer")))
        }})
})

In many cases this logic does not be performed transactionally. However, in cases where a discount is required to be available to a subset of customers simultaneously, ACID transactions are necessary.

Maintain Aggregates

An important use for ACID transactions is in maintaining custom aggregates alongside your data changes. If you need to maintain a counter or other aggregate data structure as you write, you can apply changes to that document at the same time as you’re update the other documents. In the example we’ll add a record and update a counter in a single transaction.

An important use for ACID transactions is in maintaining custom aggregates alongside your data changes.

First, we’ll need a class to store our aggregates in. Let’s imagine we want to maintain more than one type of aggregate in our application, so we’ll add a tag field.

q.CreateClass({name:"aggregates"})

We can use this class to store all the aggregates in our system, so we can index the aggregates by tag for reading.

q.CreateIndex(
    {
      name: "aggregates-by-tag",
      source: q.Class("aggregates"),
      terms: [{ field: ["data", "tag"] }],
      values: [{ field: ["data", "value"] },{ field: ["ref"] }]
    })

Now, say we want to create an aggregate to keep stats like counts and averages across our purchases. We can create a document with the tag we plan to use and seed it with an empty counter.

q.Create(q.Class("aggregates"), {data : 
    {tag : "purchase-stats", value : {count : 0, sum: 0}})

Now that we have prepared our aggregate, we need to keep it up to date as we make transactions. To do this, we can cache the Ref of the aggregate document in our application. This query should run once per process start, and the Ref can be kept around.

q.Paginate(q.Match(q.Index("aggregates-by-tag"), "purchase-stats"), 
    function(row) { return q.Select(1, row)})

This will return all of the Refs for aggregates for the "purchase-stats" tag. In a moment we’ll discuss why there might be more than one. For now we can just hang onto one.

var myPurchaseStatsRef = aggregateRefs[Math.floor(Math.random()*aggregateRefs.length)];

On insert, we update a the aggregate document with statistics each time we write new data. In practice it might look like this:

var newPurchase = {amount : 40, items : ["..."]}
client.query(
  q.Do(
    q.Let({"stats": q.Get(myPurchaseStatsRef)}, 
      q.Update(myPurchaseStatsRef, {data : { value : { 
        count : q.Add(1, q.Select(["data","value","count"], q.Var("stats"))),
        sum : q.Add(newPurchase.amount, q.Select(["data","value","sum"], q.Var("stats")))
    }})), 
    q.Create(q.Class("purchases"), {data : newPurchase})
  )
)

It’s important to note that we use q.Do to join multiple statements into a single transaction. This serializes access to the aggregate document, so that it always reflects correct values. However, under heavy concurrent load, this means all requests are waiting for their turn to update the aggregate.  Best practice would be to spread that load across multiple documents. Luckily, just by issuing another create query for a purchase-status document with a count of zero, the code presented above will bind randomly to one of the stats documents. The optimal number of stats documents is probably somewhere between the number of machines in your FaunaDB replica and the number of concurrent application server process you’re running.

On insert, we update a the aggregate document with statistics each time we write new data.

To query your replica, you can fetch the values from the index and reduce them in your application.

q.Paginate(q.Match(q.Index("aggregates-by-tag"), "purchase-stats"), 
  function(row) { return row[0] })

This will return an array of count / sum pairs, and you can add them up and divide to find the total average. If scale is not an issue for your application, you can skip the sharding, and just create a single key, but the solution complexity with a sharded aggregate is low enough that I’d recommend using it as the first option.

Conclusion

FaunaDB transactions can easily accommodate complex logic. Query language features like iteration, the Do statement, and precondition logic like If can be combined to express your business operations in queries that are easy to read and develop as your native programming language.

If you want to learn more about FaunaDB’s query API, read the documentation here. Or read here to learn more about how queries are processed.