Securing your GraphQL API

Part #2 of the GraphQL Patterns Series

8 min read · Oct 13, 2020

GraphQL offers a lot of flexibility, but it also comes with challenges. The two main ones are securing the API and the infamous query depth problem.
In this article I present patterns you can use to secure your API, divided into three parts: authentication, authorisation (on a resolver and field level) and preventing common attacks. All of the examples are based on Apollo Server, but the patterns can be applied to other implementations as well.

Authentication

Unlike many established technologies and frameworks, GraphQL does not prescribe how authentication should be done, so any mechanism you choose is fine. But where do we pass in our data, and how do we get it to our resolvers? For that we use GraphQL’s context.

Where the context resides: everything in it gets passed on to the resolvers.

Context is a value that is provided to every resolver and holds important contextual information like the currently logged in user or access to the database.

The third argument of the resolver is context which is basically an object that contains various data.
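As a minimal sketch of that signature, the snippet below calls a resolver by hand the way the GraphQL executor would. The resolver map, the me field and the user object are assumptions made up for this illustration.

```javascript
// Resolver signature: (parent, args, context, info).
// The resolver map and the context object below are hypothetical.
const resolvers = {
  Query: {
    // `context` is the third argument; it carries per-request data.
    me: (parent, args, context) => context.user,
  },
};

// Calling the resolver by hand, the way the GraphQL executor would:
const context = { user: { id: "1", firstName: "Ada" } };
const me = resolvers.Query.me(null, {}, context);
console.log(me.firstName); // prints "Ada"
```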

In Apollo Server we pass in the context while we create the server instance like this:

The context is passed into the ApolloServer object as a property.
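A minimal sketch of such a context function, written without any dependencies so it runs on its own. The token table and the buildContext name are assumptions for illustration; a real service would verify a JWT or session instead of looking tokens up in a map.

```javascript
// Hypothetical token lookup; a real service would verify a JWT or session here.
const ROLES_BY_TOKEN = {
  "admin-token": ["admin"],
  "blogger-token": ["blogger"],
};

// Apollo Server calls the context function for every request and hands the
// returned object to each resolver as its third argument.
function buildContext({ req }) {
  const header = req.headers.authorization || "";
  const token = header.replace(/^Bearer /, "");
  const roles = ROLES_BY_TOKEN[token] || [];
  return {
    user: {
      roles,
      hasRole: (role) => roles.includes(role),
    },
  };
}

// With Apollo Server 2 this would be wired up roughly as:
//   new ApolloServer({ typeDefs, resolvers, context: buildContext })
const ctx = buildContext({
  req: { headers: { authorization: "Bearer admin-token" } },
});
console.log(ctx.user.hasRole("admin"));   // true
console.log(ctx.user.hasRole("blogger")); // false
```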

In the example above, the req variable contains various properties regarding the request that is being passed in. From this variable we can pull in a token or a cookie (depending on how your authentication is implemented) and return an object that contains the data that is then passed to our resolvers.

The returned value can be either a custom authentication implementation or just a plain object with variables that contain information about the user that is currently logged in. In our example we will just be passing in a plain object that has a method called hasRole which returns true if a user has a certain role, which brings us to the second part of the article, authorisation.

Authorisation

Now that we have covered authentication, we also need to handle permissions for each of the underlying resolvers.

A good pattern is to pass a user object down to all your resolvers (in the context itself) so that they have the data they need. That data is usually stored either in a database or in an external authorisation solution (e.g. Keycloak or Auth0).

In the example above we created a user object that contains a method called hasRole. It is a dummy object created for the purposes of this example; in a real application this is where you would put your business logic for parsing the JWT, fetching the user’s roles and loading any other parameters your authorisation scheme needs.

For our example we will create a simple blogging service which contains users, posts and comments:

A simple diagram showing our connections in our schema.

Our example schema of a blogging service.
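One way such a schema could look is sketched below. Every field name here is an assumption for illustration; only the three types, the firstName field and the deletePost mutation are mentioned in the article itself.

```javascript
// SDL kept in a plain string so the snippet has no dependencies;
// with Apollo Server it would usually be wrapped in gql`...`.
const typeDefs = `
  type User {
    id: ID!
    firstName: String
    posts: [Post!]!
  }

  type Post {
    id: ID!
    title: String!
    author: User!
    comments: [Comment!]!
  }

  type Comment {
    id: ID!
    text: String!
    author: User!
    post: Post!
  }

  type Query {
    posts: [Post!]!
    comments: [Comment!]!
  }

  type Mutation {
    deletePost(id: ID!): Boolean
  }
`;
```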

Let us presume that our system has two roles, administrator and blogger. The administrator is the only one who can delete posts. To ensure this feature we need to add a rule to the mutation deletePost so that only the administrator has access.

In Apollo Server, GraphQL Shield is a very good library for this. It is a middleware that runs before your resolvers and checks whether the user has access. To use it we first create rules (which are usually based on the user in the context) and then attach those rules to the underlying resolvers:

This is a permissions file. For our example schema we will be using two roles, blogger and admin.
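A dependency-free sketch of what such rules boil down to: plain predicates over the resolver arguments. In GraphQL Shield each predicate would be wrapped with the library’s rule() helper; the anyOf combinator and the sample context are assumptions for illustration.

```javascript
// Each rule receives the same arguments as a resolver and returns a boolean.
// `ctx.user.hasRole` is the helper placed on the context during authentication.
const isAdmin = (parent, args, ctx) => ctx.user.hasRole("admin");
const isBlogger = (parent, args, ctx) => ctx.user.hasRole("blogger");

// Small combinator: passes when any of the given rules passes.
const anyOf = (...rules) => (parent, args, ctx) =>
  rules.some((r) => r(parent, args, ctx));

// Hypothetical context for a logged-in blogger.
const bloggerCtx = { user: { hasRole: (role) => role === "blogger" } };
console.log(isAdmin(null, {}, bloggerCtx));                   // false
console.log(anyOf(isAdmin, isBlogger)(null, {}, bloggerCtx)); // true
```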

Our context in this case already contains the user object that we have passed in while creating our authentication.

When we have the permissions we can create the middleware:

We import the rules we created earlier.
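To show the idea without pulling in the library, here is a small stand-in for the middleware: a permission map shaped like the schema, and a wrapper that runs the matching rule before each resolver. With GraphQL Shield the map would be passed to shield() and applied to the schema through graphql-middleware; applyPermissions, the error message and the sample resolver are assumptions for this sketch.

```javascript
const isAdmin = (parent, args, ctx) => ctx.user.hasRole("admin");

// Permission map shaped like the schema: type name -> field name -> rule.
const permissions = { Mutation: { deletePost: isAdmin } };

// Wrap every resolver so that its rule, if one exists, runs first.
function applyPermissions(resolvers, permissionMap) {
  const guarded = {};
  for (const [typeName, fields] of Object.entries(resolvers)) {
    guarded[typeName] = {};
    for (const [fieldName, resolve] of Object.entries(fields)) {
      const check = (permissionMap[typeName] || {})[fieldName];
      guarded[typeName][fieldName] = (parent, args, ctx, info) => {
        if (check && !check(parent, args, ctx, info)) {
          throw new Error("Not authorised!");
        }
        return resolve(parent, args, ctx, info);
      };
    }
  }
  return guarded;
}

const resolvers = {
  Mutation: { deletePost: (parent, { id }) => `deleted ${id}` },
};
const guarded = applyPermissions(resolvers, permissions);

const admin = { user: { hasRole: (role) => role === "admin" } };
const blogger = { user: { hasRole: (role) => role === "blogger" } };
console.log(guarded.Mutation.deletePost(null, { id: "7" }, admin)); // "deleted 7"
// guarded.Mutation.deletePost(null, { id: "7" }, blogger) throws "Not authorised!"
```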

As you can see, this looks similar to defining a schema: you map the rules that need to be executed onto the types and fields of the schema you created earlier. The library also supports more complex rules and caching mechanisms; I suggest you read more about them in the official documentation.

With GraphQL Shield we set permissions on a resolver level, deciding whether a given user role may execute an action. We can also extend our authorisation to a per-field level by using a feature called directives.

Directives are a simple mechanism for triggering logic before resolving and returning the actual data. Let’s see an example on how we can use this mechanism to authorise fields in a type:

Our directive that handles field authorisation.
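A minimal sketch of the wrapping such a directive performs. In Apollo Server 2 the class would extend SchemaDirectiveVisitor, which fills in this.args from the schema; here a plain class and a hand-made field object stand in so the snippet runs without dependencies. The error message is an assumption.

```javascript
// Dependency-free stand-in; with Apollo Server 2 the class would extend
// SchemaDirectiveVisitor, which supplies `this.args` from the schema.
class AuthDirective {
  constructor(args) {
    this.args = args; // e.g. { requiredRoles: ["admin"] }
  }

  // Called for a field that carries the directive; wraps its resolver.
  visitFieldDefinition(field) {
    const { requiredRoles } = this.args;
    // Fall back to reading the property of the same name from the parent.
    const resolve = field.resolve || ((parent) => parent[field.name]);
    field.resolve = (parent, args, ctx, info) => {
      const allowed = requiredRoles.some((role) => ctx.user.hasRole(role));
      if (!allowed) {
        throw new Error(`Not authorised to read ${field.name}`);
      }
      return resolve(parent, args, ctx, info);
    };
  }
}

const field = { name: "firstName" };
new AuthDirective({ requiredRoles: ["admin"] }).visitFieldDefinition(field);

const adminCtx = { user: { hasRole: (role) => role === "admin" } };
console.log(field.resolve({ firstName: "Ada" }, {}, adminCtx)); // "Ada"
```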

A directive has multiple methods that need to be implemented (I won’t go into the details of directives here; you can read more about their implementation in the official documentation).

The visitFieldDefinition method gets triggered for each field the directive is applied to. The requiredRoles value comes from the schema itself, where we declare which roles are needed for each field:

Adding authorisation directive on a field.
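In schema terms this could look roughly like the sketch below. The directive declaration and the admin role are assumptions for illustration; only the auth directive, requiredRoles and the firstName field come from the article.

```javascript
// SDL in a plain string; the directive declaration is an assumed shape.
const typeDefs = `
  directive @auth(requiredRoles: [String!]!) on FIELD_DEFINITION

  type User {
    id: ID!
    firstName: String @auth(requiredRoles: ["admin"])
  }
`;
```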

On the firstName field we added an auth directive which triggers our visitFieldDefinition function. In the code above we just check if the user has the role, otherwise we trigger an error.

Preventing common attacks

With new technology comes new responsibility. These are some of the possible attacks we will be covering:

Injection
Inconsistent authorisation checks
Depth attack (denial of service)
Introspection and entry points
Batching attacks

Injection

Even though GraphQL’s type system offers type safety, that does not mean it is immune to injection attacks.

For our example imagine our GraphQL query looking like this:

The first query gets executed normally, the second one tries an SQL injection attack.

If the input is not sanitised, SQL injection is possible on the underlying infrastructure, because a String argument that passes GraphQL’s type check can still carry an SQL fragment. Make sure you always sanitise your inputs or use parameterised queries (or a library that does this on its own; this official Apollo Server article explains the basics), even when the fields involved are strongly typed.
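The difference can be sketched with two ways of building the same query inside a resolver. The posts table and the title argument are assumptions; the { text, values } shape is the one node-postgres accepts, and other drivers use different placeholders.

```javascript
// UNSAFE: the argument is spliced into the SQL text.
const unsafeQuery = (title) =>
  `SELECT * FROM posts WHERE title = '${title}'`;

// Safer: SQL text and values travel separately; the driver binds $1.
const safeQuery = (title) => ({
  text: "SELECT * FROM posts WHERE title = $1",
  values: [title],
});

// A valid GraphQL String that is also an SQL fragment.
const malicious = "x' OR '1'='1";

console.log(unsafeQuery(malicious));
// SELECT * FROM posts WHERE title = 'x' OR '1'='1'
console.log(safeQuery(malicious).text);
// SELECT * FROM posts WHERE title = $1
```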

Inconsistent authorisation checks

It sometimes happens that we have created security checks on our resolvers (on the root level) but we forgot to add checks when we are querying at a node level. Let’s look at the following query:

Let’s assume that our goal is for the user from the session to see only the comments on posts they have written. Maybe the rules for this nested query are written and everything is checked. But an attacker can try a query that fetches only the comments, i.e. probe for security holes in the permissions themselves. Something like this:

It can happen in certain scenarios that a query like this isn’t secured by the nested query rule that we created above (when fetching the posts and comments). So make sure you always check for these permissions and also write tests for this.
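One way to keep the two paths consistent is to route both through a single check, sketched below. The in-memory data, the ownership rule and the canViewComment helper are assumptions for illustration.

```javascript
// Hypothetical data: u1 wrote p1, u2 wrote p2.
const posts = [
  { id: "p1", authorId: "u1" },
  { id: "p2", authorId: "u2" },
];
const comments = [
  { id: "c1", postId: "p1", text: "Nice post" },
  { id: "c2", postId: "p2", text: "Private" },
];

// Single source of truth for "may this user see this comment?".
function canViewComment(user, comment) {
  const post = posts.find((p) => p.id === comment.postId);
  return Boolean(post) && post.authorId === user.id;
}

const resolvers = {
  Query: {
    // Root entry point: filtered with the same check as the nested field.
    comments: (parent, args, ctx) =>
      comments.filter((c) => canViewComment(ctx.user, c)),
  },
  Post: {
    comments: (post, args, ctx) =>
      comments.filter(
        (c) => c.postId === post.id && canViewComment(ctx.user, c)
      ),
  },
};

const ctx = { user: { id: "u1" } };
console.log(resolvers.Query.comments(null, {}, ctx).map((c) => c.id)); // [ 'c1' ]
```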

Depth attack (denial of service)

To explore the next possible security problem, let us return to our blog example. In the graph we introduced, all of our data nodes are connected bidirectionally (cycles are present in the graph). So in theory, we could write a query like this:

Because author is connected to the post and the post to the author we can just recursively make calls.

Because our author node is connected to the post node and vice versa, we can recursively query the data and in theory trigger a denial of service attack.

We can solve this by applying a depth limit to our queries. For that we can use the GraphQL Depth Limit library and tweak our code to look like this:

We set the maximum query depth to 2, so more deeply nested queries are rejected during validation.
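To make the idea concrete without the library, here is a deliberately naive depth check that counts brace nesting in the query text. It is a simplification: real tools such as GraphQL Depth Limit inspect the parsed query and are plugged into Apollo Server through the validationRules option, for example validationRules: [depthLimit(2)]. The queryDepth helper and the sample query are assumptions.

```javascript
// Naive nesting depth: counts the deepest level of selection sets.
// Assumes no braces inside string arguments; real tools inspect the parsed AST.
function queryDepth(query) {
  let depth = 0;
  let max = 0;
  for (const ch of query) {
    if (ch === "{") {
      depth += 1;
      max = Math.max(max, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  // The outermost braces wrap the operation itself, so they are not counted.
  return Math.max(0, max - 1);
}

// Author -> posts -> author -> ... can be repeated as often as an attacker likes.
const cyclicQuery = `
  query {
    posts {
      author {
        posts {
          author { firstName }
        }
      }
    }
  }
`;

const MAX_DEPTH = 2;
console.log(queryDepth(cyclicQuery));             // 4
console.log(queryDepth(cyclicQuery) > MAX_DEPTH); // true, so the query is rejected
```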

As the schema designer, we know how deep our queries should be nested because we know our data and how the consumers will be fetching it. Set the limit accordingly based on the details you know regarding your schema consumption.

Introspection and entry points

Introspection is one of the most powerful features in GraphQL. It lets us retrieve information about all the queries, mutations, subscriptions and data types. All this information is exposed through the __schema meta field which, by default, is always available on the root query type.

Introspection is used by tools like GraphiQL, Voyager, Hasura to get the schema information they need so that they can show you visualisations of your data and do schema merging.

In production, though, it is best to disable introspection, as it gives anyone who can reach your API valuable information about how your data is structured.

Apollo Server has introspection and the playground environment turned off by default if the NODE_ENV is set to production. You can manually activate or deactivate it in other environments by passing in the introspection and playground variables upon server creation:

Disabling introspection and the playground environment.
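A small sketch of how these switches could be derived from the environment. The securityOptions helper is an assumption; introspection, playground and debug are Apollo Server 2 constructor options.

```javascript
// Build the security-related ApolloServer options for a given environment.
function securityOptions(nodeEnv) {
  const isProduction = nodeEnv === "production";
  return {
    introspection: !isProduction, // hide the schema in production
    playground: !isProduction,    // no in-browser IDE in production
    debug: !isProduction,         // no stack traces in error responses
  };
}

// Intended use:
//   new ApolloServer({ typeDefs, resolvers, ...securityOptions(process.env.NODE_ENV) })
console.log(securityOptions("production"));
// { introspection: false, playground: false, debug: false }
```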

Besides introspection you should also disable debugging (which can be triggered by adding ?debug=1 to the server query) and stack tracing in production. To do that in Apollo Server (example above) pass in the variable debug upon server initialisation and set it to false.

Batching attacks

It’s great that GraphQL allows us to send multiple queries or mutations in one request and receive the data in a single response. While this is very useful it also opens up potential threats. Imagine we have a password reset mutation like this:

In short we enter a username, new password and the reset code that was sent to our email. Now a potential attacker could create an account and see what the reset codes look like and then use a batching attack to send multiple password reset requests in one single request:

Send multiple mutations in one single HTTP request.

While it is not likely that password reset codes are so short, we still need to limit the number of mutations and queries that can be executed in a single HTTP request to prevent possible attacks like this and the overloading of our server. For that we can use GraphQL Rate Limit. In our schema we can use the rateLimit directive:

On any mutation or query where we want to limit the number of requests that can be made in a certain time window, we just add the rateLimit rule. With the rule above we can only execute a single query in a five-minute window (which means that if anything goes wrong, the user can only try to reset their password again five minutes later). This prevents an attacker from packing a large number of queries into a single request.
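The mechanism behind such a rule can be sketched as a fixed-window counter keyed by user and field. This is a simplified stand-in, not the library’s implementation; createRateLimiter and the key format are assumptions.

```javascript
// Fixed-window limiter: at most `max` calls per `windowMs` for each key.
// Kept in memory for illustration; counters are lost on restart.
function createRateLimiter({ max, windowMs }) {
  const windows = new Map(); // key -> { start, count }
  return function isAllowed(key, now = Date.now()) {
    const current = windows.get(key);
    if (!current || now - current.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= max;
  };
}

// One call per five minutes, matching the rule described above.
const allow = createRateLimiter({ max: 1, windowMs: 5 * 60 * 1000 });
const key = "alice:resetPassword"; // hypothetical identity + field name

console.log(allow(key, 0));             // true  (first attempt)
console.log(allow(key, 1000));          // false (same window, e.g. a batched alias)
console.log(allow(key, 5 * 60 * 1000)); // true  (new window)
```

Because every aliased mutation in a batched request resolves the same field for the same user, each alias after the first hits the counter and is rejected.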

By default the library uses the InMemoryStore for storing request data. I suggest replacing it with a shared store such as Redis. The documentation is available here.

Make sure you apply these rules to the queries and mutations where you see fit.

We presented patterns for authentication and authorisation and also showed some possible attacks on a GraphQL server. These are some of the common things we need to be aware of when moving our servers into production. I suggest reading about them in more detail in the references below.