A fast-loading website can make or break your business, and using a cache is one of the best ways to improve performance. However, if you ask anyone building web apps with GraphQL today, caching keeps coming up as one of the biggest challenges. Why is that? To really understand what’s going on, it’s important to understand precisely which type of caching is at play in the context of GraphQL APIs.
In web applications, data can be cached at several different layers: at the server, in the middle tier, or at the client. For example, an e-commerce application might store session data in a cache on the server side. When it comes to avoiding unnecessary network requests, your web browser's HTTP cache is usually your first line of defense. Although you have limited control over the lifetime of cached responses, it is effective, supported by most browsers, and doesn't take much time to manage.
The GraphQL specification does not clearly outline HTTP caching, which leads many developers to think that GraphQL does not support caching. In practice, however, caching in GraphQL only requires a few additional architectural considerations. Let’s take a closer look at them in this article.
Does GraphQL support caching?
GraphQL is a highly flexible query language used for interacting with APIs. It was originally developed by Facebook as a specification powerful enough to describe everything Facebook does, while also being easy for developers to use and understand. As a result, the GraphQL specification was written to be as general as possible, and it mentions caching only in passing, in a single section. This has led some practitioners to conclude that the technology doesn't support caching or that it wasn't a priority when it was being developed.
However, GraphQL does take caching into account by providing additional recommendations on how to handle client-side caching in its online documentation.
Let’s review some of the basics of HTTP caching.
How HTTP caching works
Requesting content from a server across the network is time-consuming and expensive. In addition, these requests may have large response payloads that require multiple roundtrips between the browser and server. If your application requires a lot of resources to load a single page, it may seem slow and unresponsive to the end-users. With HTTP caching, you can optimize these requests, decreasing the total number of requests you must make to the server.
HTTP requests are first routed to the browser cache to see if a valid response is already available. If a valid cached response is found, it can fulfill the request without contacting the server. In HTTP caching, the API URL is a globally unique identifier that the client can leverage to build a cache. URLs can be embedded with a fingerprint that can be used to validate whether cached data is still valid.
For example, 93jdje93 is the fingerprint (or hash) of the file’s contents in the URL below.
If the file contents are modified, the URL of the resource is changed, and the fingerprint is changed. The client is then forced to download the newly changed file from the server.
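To make the fingerprinting idea concrete, here is a minimal Python sketch. The file name `app.js`, the base URL `https://example.com/static/`, and the helper `fingerprinted_url` are hypothetical and chosen only for illustration; the use of a truncated SHA-256 digest is likewise an assumption, and real build tools follow their own naming schemes.

```python
import hashlib


def fingerprinted_url(base_url: str, filename: str, contents: bytes) -> str:
    """Embed a short hash of the file contents in the file's URL."""
    # Assumption: the first 8 hex characters of a SHA-256 digest act as the fingerprint.
    fingerprint = hashlib.sha256(contents).hexdigest()[:8]
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        # No extension: append the fingerprint to the whole name.
        return f"{base_url}{filename}.{fingerprint}"
    return f"{base_url}{stem}.{fingerprint}.{extension}"


# The same file name yields a different URL as soon as the contents change.
old_url = fingerprinted_url("https://example.com/static/", "app.js", b"console.log('v1');")
new_url = fingerprinted_url("https://example.com/static/", "app.js", b"console.log('v2');")
```

Because `old_url` and `new_url` differ, a cache keyed by URL treats the modified file as a new resource and downloads it, while unchanged files keep their URL and stay cached.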
To ensure that cached data is valid and up to date, HTTP caching lets servers configure how responses are cached through the Cache-Control header, a general field used to specify caching behavior in HTTP requests and responses. To prevent cached data from being served forever, each response stays fresh only for a limited time, known as its freshness lifetime, and stale items are eventually revalidated or removed from the cache. Once a cached response is no longer fresh, the client revalidates it by sending a conditional request built from validators such as the Last-Modified and ETag response headers; the server returns fresh data only if the resource has changed.
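The freshness check can be sketched in a few lines of Python. This is a simplified model rather than a full HTTP cache: it only understands the `max-age` directive and the `ETag` validator, and the names `CachedResponse`, `parse_max_age`, `is_fresh`, and `revalidation_headers` are hypothetical helpers introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedResponse:
    body: str
    stored_at: float      # time the response was cached, in seconds
    max_age: int          # freshness lifetime from "Cache-Control: max-age=N"
    etag: Optional[str]   # validator from the "ETag" response header, if any


def parse_max_age(cache_control: str) -> int:
    """Extract max-age (in seconds) from a Cache-Control value; 0 if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return 0


def is_fresh(entry: CachedResponse, now: float) -> bool:
    """A response is fresh while its age is below its max-age."""
    return (now - entry.stored_at) < entry.max_age


def revalidation_headers(entry: CachedResponse) -> Dict[str, str]:
    """Headers for a conditional request once the entry has gone stale."""
    return {"If-None-Match": entry.etag} if entry.etag else {}


entry = CachedResponse(
    body="<html>...</html>",
    stored_at=1000.0,
    max_age=parse_max_age("public, max-age=60"),
    etag='"abc123"',
)
```

While `is_fresh` returns `True`, the cached body can be served without contacting the server. After that, the client sends the `If-None-Match` header, and a server whose resource is unchanged can answer with `304 Not Modified` instead of resending the payload.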
Recommended GraphQL caching techniques
The same principles that underpin HTTP caching can be applied to GraphQL APIs. In a REST API, the URL (together with its fingerprint) is a globally unique identifier that the client can use to build a cache that maps each URL to stored data. In GraphQL, however, all requests are made to a single endpoint, so the URL cannot serve as a unique identifier. In this case, it is best practice for the API itself to expose an identifier for clients to use.
Reserving a globally unique identifier
The alternative to the URL-based identifier is to reserve a field like ‘id’ to serve as the globally unique identifier. If a response for a specific ID is already stored in the cache, the client can use that data; otherwise, it fetches fresh data from the backend.
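A minimal sketch of such an ID-keyed client cache is shown below. The class `IdCache` and the function `fake_fetch` are hypothetical; `fake_fetch` stands in for a real GraphQL request so that the example runs without a server.

```python
from typing import Any, Callable, Dict, List


class IdCache:
    """Client-side cache keyed by each object's globally unique `id` field."""

    def __init__(self) -> None:
        self._objects: Dict[str, Dict[str, Any]] = {}

    def get_or_fetch(
        self, object_id: str, fetch: Callable[[str], Dict[str, Any]]
    ) -> Dict[str, Any]:
        # Serve from the cache when the ID is already known.
        if object_id in self._objects:
            return self._objects[object_id]
        # Otherwise ask the backend and remember the result under its `id`.
        obj = fetch(object_id)
        self._objects[obj["id"]] = obj
        return obj


calls: List[str] = []


def fake_fetch(object_id: str) -> Dict[str, Any]:
    # Hypothetical stand-in for a GraphQL request to the server.
    calls.append(object_id)
    return {"id": object_id, "name": "Ada"}


cache = IdCache()
first = cache.get_or_fetch("abc-123", fake_fetch)
second = cache.get_or_fetch("abc-123", fake_fetch)  # answered from the cache
```

Only the first lookup reaches `fake_fetch`; the second is served from the cache because the `id` is already known.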
Working with existing APIs
When moving existing REST APIs to GraphQL, there are a few issues that you might want to be aware of.
Existing REST APIs typically use a type-specific ID with a type-aware endpoint. Since GraphQL has a single endpoint, a truly globally unique identifier is needed for every object that can be requested. For example, two queries that expect different data types but use the same type-specific ID could be treated as identical by the cache, causing the wrong data to be returned to the client.
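As an illustration of this collision, assume hypothetical `User` and `Product` types that each have a record with the type-specific ID `1`. The sketch below uses a deliberately naive cache keyed only by that ID; the record contents are invented for the example.

```python
from typing import Any, Dict

# A naive cache keyed only by the type-specific ID.
naive_cache: Dict[str, Dict[str, Any]] = {}

# Hypothetical responses for two different types that share the ID "1".
user = {"id": "1", "__typename": "User", "name": "Ada"}
product = {"id": "1", "__typename": "Product", "title": "Keyboard"}

naive_cache[user["id"]] = user
naive_cache[product["id"]] = product  # silently overwrites the user entry

# A later lookup intended for user "1" now returns the product.
looked_up = naive_cache["1"]
```

The cache ends up with a single entry, and a client asking for user `1` receives product data.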
One solution is to use globally unique identifiers. A global ID can be generated using the following mechanism:
- Reuse the UUID or transaction ID from the backend infrastructure
- Concatenate the type and the ID (for example,
- Apply URL-safe base64-encoding to get the global ID
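The concatenate-and-encode variant of this mechanism can be sketched as follows. The `:` separator and the function names `to_global_id` and `from_global_id` are assumptions made for this example, not a format prescribed by GraphQL.

```python
import base64
from typing import Tuple


def to_global_id(type_name: str, local_id: str) -> str:
    """Concatenate type and ID, then apply URL-safe base64 encoding."""
    # Assumption: a colon separates the type name from the type-specific ID.
    raw = f"{type_name}:{local_id}".encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def from_global_id(global_id: str) -> Tuple[str, str]:
    """Recover the (type, local ID) pair from a global ID."""
    raw = base64.urlsafe_b64decode(global_id.encode("ascii")).decode("utf-8")
    type_name, _, local_id = raw.partition(":")
    return type_name, local_id


# The same type-specific ID now produces distinct cache keys per type.
user_gid = to_global_id("User", "1")
product_gid = to_global_id("Product", "1")
```

Since the type name is part of the encoded value, `user_gid` and `product_gid` differ even though both records share the local ID `1`, and the server can decode the global ID to route the lookup to the right type.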
However, using a global unique identifier raises the question of how the GraphQL API would interact with existing APIs. This problem can be solved by adding extra fields to store previous API IDs. This way, GraphQL clients can rely on a consistent mechanism based on global, unique identifiers, and clients requiring the previous APIs can pull the previousAPIId from the object.
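One way to picture this is a small mapping step that exposes the new global `id` while keeping the legacy ID in a `previousAPIId` field. The helper `to_graphql_object`, the `Type:id` encoding, and the sample record are hypothetical and serve only as an illustration.

```python
import base64
from typing import Any, Dict


def to_graphql_object(type_name: str, legacy_record: Dict[str, Any]) -> Dict[str, Any]:
    """Expose a global `id` while keeping the legacy ID in `previousAPIId`."""
    legacy_id = str(legacy_record["id"])
    # Assumption: the global ID is the URL-safe base64 encoding of "Type:id".
    global_id = base64.urlsafe_b64encode(
        f"{type_name}:{legacy_id}".encode("utf-8")
    ).decode("ascii")
    # Copy the record, replace `id` with the global ID, keep the old ID alongside.
    return {**legacy_record, "id": global_id, "previousAPIId": legacy_id}


legacy_user = {"id": 42, "name": "Ada"}  # hypothetical record from an existing REST API
graphql_user = to_graphql_object("User", legacy_user)
```

GraphQL clients cache on `graphql_user["id"]`, while code that still talks to the previous API can read `graphql_user["previousAPIId"]`.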
Database with native GraphQL
Fauna is a flexible, developer-friendly, transactional database delivered as a secure and scalable cloud API with native GraphQL. Fauna supports relations, documents, and graphs for unmatched modeling flexibility. It also offers query interface features such as complex joins and custom business logic (à la stored procedures), as well as support for real-time streaming and GraphQL.
Fauna is connectionless and is accessible directly from the browser or mobile clients. With Fauna, you can use serverless, multiregional instances in the cloud, accessible via an API. With native support for languages such as Node.js, C#, JVM, Go, and Python, Fauna makes developing applications easy.