The Database that Stays Alive Even When You Issue a Command to Remove the Last Replica
Operational complexity is often the bane of a database administrator’s (DBA) life. Deploying a simple configuration is one thing, but building and maintaining a cluster across multiple data centers can be spine-chilling.
At Fauna, reducing operational complexity is one of the core tenets of product design. In everything we deliver, our goal is to make our users’ lives simpler, more productive, and more efficient. As a result, Fauna is easy to deploy, easy to cluster, and easy to maintain.
The Basics of Fauna Operations
Fauna is deployed and scaled as a collection of nodes, each of which operates autonomously within a cluster. There is no additional management software to deploy, such as a dedicated cluster manager. Each node participates in the cluster. As nodes are added or removed, the nodes communicate with each other to arrive at the necessary state. Figure 1 illustrates a typical Fauna cluster topology.
As discussed, a Fauna cluster consists of individual nodes operating together to self-manage the system.
Every node is a computer with a unique IP address. Nodes are grouped into replicas, with every node belonging to exactly one replica. The main significance of grouping nodes into replicas is that each replica contains a full copy of the data. Within a replica, a particular piece of data is normally found on exactly one node; there is no data redundancy within a replica. Having multiple replicas, each containing the full set of data, is what provides redundancy in a Fauna cluster.
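To make the node and replica relationship concrete, here is a minimal Python sketch of the topology described above. It is an illustration only, not Fauna's implementation: the `Replica` and `Cluster` classes, the replica names, the IP addresses, and the hash-modulo placement rule are all assumptions made for this example, since the post does not describe how Fauna actually assigns data to nodes.

```python
from dataclasses import dataclass, field
from hashlib import sha256


@dataclass
class Replica:
    """A group of nodes that together hold one full copy of the data."""
    name: str
    nodes: list[str] = field(default_factory=list)  # node IP addresses

    def owner_of(self, key: str) -> str:
        """Return the single node in this replica that holds `key`.

        Hash-modulo placement is a stand-in chosen for brevity; it is an
        assumption, not Fauna's real partitioning scheme.
        """
        digest = int.from_bytes(sha256(key.encode()).digest()[:8], "big")
        return self.nodes[digest % len(self.nodes)]


@dataclass
class Cluster:
    replicas: list[Replica]

    def locations(self, key: str) -> dict[str, str]:
        """Map each replica name to the node holding its copy of `key`."""
        return {r.name: r.owner_of(key) for r in self.replicas}


# Hypothetical two-replica cluster.
cluster = Cluster([
    Replica("us-east", ["10.0.0.1", "10.0.0.2", "10.0.0.3"]),
    Replica("eu-west", ["10.1.0.1", "10.1.0.2"]),
])

# One copy per replica: redundancy comes from the number of replicas,
# not from duplicating data inside a replica.
copies = cluster.locations("user:42")
print(copies)
```

In this model a key lives on exactly one node per replica, so the number of copies of any piece of data equals the number of replicas.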
Reducing Operational Complexity with Fauna
Fauna provides a database administrator with first-class commands to manipulate the cluster. These commands are designed around the FOS methodology: Fully online, Operationally simple, and Safe.
Fully online refers to a database’s ability to remain fully operational while a command is executing. Most databases employ a rolling-window operational scheme, which makes portions of the system unavailable while the command runs. This reduces database capacity and availability for the duration of the command, and requires scheduling around business operations to avoid peak usage times.
In contrast, Fauna ensures that your database is online and available before, during, and after the execution of a command. As an example, Fauna’s remove node command (as seen in Figure 2) keeps the cluster 100% globally available when invoked by an administrator. Rest assured, your applications continue to function at full throttle while you manage your cluster!
Operationally simple means that, to make an administrator’s daily life simple, a database operation should be a single command, not a series of steps an administrator must orchestrate.
Introducing a series of commands means database administrators would need to remember or, even worse, look up in the manual when to take such steps and under what conditions. Remembering a few steps is not too complex, but understanding how to react to each possible error along the way requires specific database experience. Lacking this experience leads to incorrect usage, unwanted errors, and complexity. In addition, the command needs to accomplish the steps automatically.
Fauna challenges this traditional approach by using transactional commands. Each operational command is complete in itself, and designed to fit into your devops process.
For example, the remove command (as seen in Figure 2) does all the work required to remove the node from both the cluster and its replica. The command requires no pre- or post-work from the database administrator.
Another complex situation that arises in a production environment involves scale. While removing a single node without going offline is admirable, in many cases database administrators need to change the layout of many nodes at once. Waiting for one node operation to complete before starting the next is slow and wastes resources. Let’s look at this situation in more detail. Suppose you have 50 nodes in a cluster and need to delete 10 nodes and add 2 new nodes. If you do this one at a time (i.e. serially), the first deletion spreads that node’s data across the other 49 nodes. But, as a DBA, you know that 9 of the nodes that just received data are about to be deleted as well. Doing it one at a time would be mind-numbing, slow, and a waste of resources.
The Fauna operational commands are parallel, stackable (i.e. not serial), and optimized immediately for the desired end state of the cluster. You can give the Fauna database server 10 remove node commands and 2 add node commands, and it will not only recognize immediately that the end state is 42 nodes, but also move data directly toward a 42-node layout. No intermediate data state is required. In the middle of moving the data to the 42-node state, the DBA might realize that 11 nodes need to be removed instead of 10. Just issue another remove node command, and the cluster is immediately re-optimized for a 41-node state. Fauna’s cluster management behavior reflects the team’s years of experience scaling high-stakes deployments in fast-moving environments like Twitter.
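The arithmetic in this example can be sketched in a few lines of Python. This is a hedged illustration of the idea of folding stacked commands into a single target state, not Fauna's actual algorithm: the `target_membership` function, the `("remove", node)` tuple format, and the node names are all hypothetical.

```python
def target_membership(current, commands):
    """Fold queued topology commands into one desired end-state node set.

    `commands` is a list of (action, node) tuples; this format is an
    assumption made for the sketch.
    """
    target = set(current)
    for action, node in commands:
        if action == "remove":
            target.discard(node)
        elif action == "add":
            target.add(node)
        else:
            raise ValueError(f"unknown action: {action}")
    return target


current = {f"node-{i:02d}" for i in range(1, 51)}             # 50 nodes
queued = [("remove", f"node-{i:02d}") for i in range(1, 11)]  # 10 removals
queued += [("add", "node-51"), ("add", "node-52")]            # 2 additions

target = target_membership(current, queued)
print(len(target))  # 42

# A late, extra removal just changes the target; earlier work is not redone.
queued.append(("remove", "node-11"))
target = target_membership(current, queued)
print(len(target))  # 41

# Data only needs to move off the leaving nodes and onto nodes in `target`,
# never onto a node that is itself scheduled for removal.
leaving = current - target
print(len(leaving))  # 11
```

Because the target is computed from the whole queue, data is planned to move once, directly to nodes that will still exist in the final layout.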
Oftentimes, systems are complex, with many parts that invite mistakes in a variety of ways. The smallest of mistakes can have a catastrophic impact on your operational data and leave your applications in an undesirable state. Safety implies that the database server protects the administrator from making mistakes that would harm the operational state of the cluster.
For example, let’s consider the removal of a cluster node again. A simple typo in a script could lead to the removal of a node that contains the last copy of live data. In Fauna, we designed the remove node command (and other topology commands) so that they do not allow a DBA to modify the cluster configuration into an unwanted or unrecoverable state. Losing the last copy of data would meet this criterion, and Fauna would prevent such a situation. Another example would be attempting to remove a node with an active transaction log, such that transactional integrity would be compromised. In these ways, and many others, Fauna ensures even the most complex multi-region clusters are safe for the business.
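A guard of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not Fauna's implementation: `remove_node`, `UnsafeTopologyChange`, and the replica-to-nodes dictionary are hypothetical names, and the sketch checks only the "last copy of data" condition (the transaction-log check and data rebalancing within a replica are omitted).

```python
class UnsafeTopologyChange(Exception):
    """Raised when a command would leave the cluster unrecoverable."""


def remove_node(replicas, node):
    """Return the topology after removing `node`, or refuse if unsafe.

    `replicas` maps a replica name to its set of node addresses. Each
    replica holds a full copy of the data, so the cluster must keep at
    least one non-empty replica.
    """
    if not any(node in nodes for nodes in replicas.values()):
        raise KeyError(f"{node} is not in the cluster")

    remaining = {name: nodes - {node} for name, nodes in replicas.items()}
    if not any(remaining.values()):
        raise UnsafeTopologyChange(
            f"removing {node} would delete the last copy of the data"
        )
    # Replicas left with no nodes disappear from the topology.
    return {name: nodes for name, nodes in remaining.items() if nodes}


# Hypothetical cluster with two single-node replicas.
replicas = {"us-east": {"10.0.0.1"}, "eu-west": {"10.1.0.1"}}

after = remove_node(replicas, "10.1.0.1")  # allowed: us-east still has a copy
print(after)

try:
    remove_node(after, "10.0.0.1")  # refused: this is the last replica
    blocked = False
except UnsafeTopologyChange as err:
    blocked = True
    print(f"refused: {err}")
```

The check runs before any state changes, so a mistyped node address in a script results in an error rather than data loss.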
Fauna modernizes database operations by making them simpler, more efficient, and lower-cost. Simplicity addresses not just a technical pain point, but also a critical business issue for all enterprises, large or small. Database operations are often the single largest hurdle to business agility. At Fauna, we built our operational interface with significant input from experienced database administrators managing complex multi-datacenter environments. The Fauna team went the extra mile to ensure cluster operations are fully online, operationally simple, and safe. Rest assured, we are your champions, dear DBA.
To learn more about Fauna’s architecture and capabilities, take a look at our whitepaper.