Design Considerations for Secure GraphQL APIs

In this article, we are going to discuss a variety of security risks to GraphQL deployments and migrations that we’ve seen during our client engagements. We will cover familiar high-risk authorization vulnerabilities as well as less familiar server-side request forgery (SSRF) issues that we’ve seen in migrations trying to achieve GraphQL to REST API interoperability.

In addition to vulnerabilities, we will highlight common misconfigurations and risky designs to help you avoid common mistakes and arm you with a set of test cases to help validate your implementation.

GETTING STARTED

Language Choice Matters

Although GraphQL is supported in a wide range of languages, community-supported libraries may be less available or less mature for some of them. Take time to explore the options available for your preferred language and determine whether they will meet your long-term maintenance requirements.

We will focus our own recommendations on JavaScript, as it is the most common language that we encounter.

Secure Baseline Configuration

Before diving into common application-level vulnerabilities, we will first address some common configurations that should be implemented in all GraphQL API designs.

Although one of the key benefits of GraphQL is its expressive query structure, that freedom comes with performance and availability risks due to a lack of default constraints. As with all expensive API operations, establishing a secure constraint and validation configuration reduces the opportunities for denial of service (DoS).

Since this is a well-discussed topic, we won’t cover the most common issues and their well-established solutions for JavaScript implementations in depth, but we’ve listed them below as items to check when establishing a secure baseline:

| Configuration | Constraint | Solution |
|---|---|---|
| Query rate limiting | Limit API calls per user. | graphql-rate-limit |
| Query depth limiting | Limit the complexity of GraphQL queries based on depth. | graphql-depth-limit, express-graphql, or query whitelisting |
| Query amount limiting | Limit the quantity of objects that can be requested in a single query. | Requiring pagination and configuring an upper limit of requested values |
| Query complexity limiting | Declare the cost of various operations, and disallow queries that exceed the precomputed cost. (This is more advanced than depth limiting.) | graphql-validation-complexity, graphql-cost-analysis, express-graphql, or query whitelisting |
| Disabling introspection queries | Prevent the public disclosure of the GraphQL schema. | graphql-disable-introspection |


To check your status on these issues, you can ask yourself the questions in the “Security Test Cases” section below. For an in-depth discussion on constraining query execution, see the excellent article How to Secure a GraphQL API.
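As a rough illustration of the depth constraint in the table above, the sketch below estimates how deeply a query nests its selection sets by counting braces. It is not how graphql-depth-limit works internally: `selectionDepth` and `assertDepthWithin` are hypothetical names, the scan is not a real GraphQL parser, and it does not expand fragments, so a production API should rely on a library that validates the parsed query.

```javascript
// Hypothetical helper: returns the deepest nesting of selection sets ({ ... })
// in a query string. Braces inside strings, comments, and argument lists are ignored.
function selectionDepth(query) {
  let depth = 0;
  let max = 0;
  let parens = 0;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (ch === '"') {
      // Skip a string literal, honoring backslash escapes.
      i++;
      while (i < query.length && query[i] !== '"') {
        if (query[i] === "\\") i++;
        i++;
      }
    } else if (ch === "#") {
      // Skip a comment up to the end of the line.
      while (i < query.length && query[i] !== "\n") i++;
    } else if (ch === "(") {
      parens++;
    } else if (ch === ")") {
      parens--;
    } else if (parens === 0 && ch === "{") {
      depth++;
      max = Math.max(max, depth);
    } else if (parens === 0 && ch === "}") {
      depth--;
    }
  }
  return max;
}

// Rejects a query before execution when it nests deeper than the configured limit.
function assertDepthWithin(query, maxDepth) {
  const depth = selectionDepth(query);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds limit of ${maxDepth}`);
  }
  return depth;
}
```

A server would call something like `assertDepthWithin(requestBody.query, 5)` before handing the query to the executor; the limit of 5 is only an example value.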

COMMON SECURITY ISSUES

Now that we’ve covered baseline configuration, let’s dive into three classes of common security issues that we often see in our clients’ GraphQL APIs.

Improper Authorization Controls

Hands down, the most common high-risk issues that we see in GraphQL (and REST/SOAP API) design relate to improper authorization controls, but they are especially pervasive in GraphQL due to an overreliance on default resolvers and the lack of a centralized authorization layer. Let’s explore a few examples:

#1 Perform node-based access control and leverage an authorization layer

Every node should be responsible for its data and who has access to its data. Edge-based checks are ineffective since there are often multiple edges that lead to a given node. (In other words, just because you have access to a list doesn’t mean you should be able to access every node in the list.)

Additionally, an overreliance on default resolvers and unsafe default field visibility often lead to the introduction of authorization vulnerabilities during development. Check out the graphql-shield project for a fully featured authorization layer for GraphQL; it could be used as a library or as inspiration for your own design.

Finally, be sure to investigate how authorization exceptions are handled with your preferred implementation. In some cases, an exception may disclose the presence of a field, which could be an impactful information disclosure.
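One way to picture node-based access control is a single loader that every edge must go through. The sketch below is a minimal illustration under stated assumptions, not the graphql-shield API: the `documents` data, `canView`, `loadDocument`, and the `Folder` type are all hypothetical, and the resolvers only follow the usual `(parent, args, context)` argument order.

```javascript
// Hypothetical in-memory data; a real resolver would query a data store.
const documents = new Map([
  ["doc-1", { id: "doc-1", ownerId: "alice", title: "Q3 plan" }],
  ["doc-2", { id: "doc-2", ownerId: "bob", title: "Payroll" }],
]);

// The node's own access rule, independent of how the caller reached it.
function canView(session, doc) {
  if (!session) return false;
  return session.roles.includes("admin") || doc.ownerId === session.userId;
}

// Every edge loads documents through this one function.
// Missing and forbidden documents both yield null, so a caller cannot
// distinguish "does not exist" from "not allowed".
function loadDocument(session, id) {
  const doc = documents.get(id);
  return doc && canView(session, doc) ? doc : null;
}

const resolvers = {
  Query: {
    // Direct edge: look up a document by ID.
    document: (_parent, args, context) =>
      loadDocument(context.session, args.id),
  },
  Folder: {
    // Indirect edge: access to the folder does not imply access to each document.
    documents: (folder, _args, context) =>
      folder.documentIds
        .map((id) => loadDocument(context.session, id))
        .filter((doc) => doc !== null),
  },
};
```

Because both edges call `loadDocument`, adding a third path to the same node later does not require remembering to copy the check.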

#2 Leverage visualization tools to help design test cases

Having a well-defined plan allows you to create authorization test cases and develop a clear access control system. Consider using GraphQL visualization tools like the ones below to identify sensitive fields or nodes when developing test cases.

The most popular option is GraphQL Voyager.

As an alternative, consider checking out GraphQL Editor.


#3 Always use the user’s session as the single source of truth for evaluating access

The user’s session should be the only user input that defines a user’s roles or capabilities. Resolve a user’s session and rely on it as the single source of truth for evaluating access.

We often see APIs that rely directly on object lookups in a database (i.e., they rely on the entropy of UUIDs for security), rather than affirming that the requesting session has access to an object. Relying solely on the secrecy of object identifiers for authorization increases risk to your data, because every identifier becomes another secret to manage.
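The idea of a single source of truth can be sketched as a context builder that derives identity only from a server-side session lookup. Everything named here is an assumption for illustration: the `sessionStore` map, the bearer-token format, and the `buildContext` and `hasRole` helpers are hypothetical, and a real deployment would use its own session mechanism.

```javascript
// Hypothetical server-side session store keyed by an opaque token.
const sessionStore = new Map([
  ["tok-alice", { userId: "alice", roles: ["user"] }],
]);

// Builds the per-request context. Identity comes only from the session lookup;
// headers, arguments, or variables that claim a role are never consulted.
function buildContext(request) {
  const header = request.headers["authorization"] || "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : null;
  return { session: sessionStore.get(token) ?? null };
}

// Role checks read the resolved session and nothing else.
function hasRole(context, role) {
  return context.session !== null && context.session.roles.includes(role);
}

const resolvers = {
  Query: {
    // Any userId a client sends in args is ignored in favor of the session.
    me: (_parent, _args, context) =>
      context.session ? { id: context.session.userId } : null,
  },
};
```

A request that adds a header or variable such as `role: "admin"` gains nothing, because no resolver or helper ever reads it.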

Insecure Input Validation

After authorization, insecure input validation is the second most common class of vulnerability we see in GraphQL APIs. It includes all the classic vulnerability classes, such as SQL injection (SQLi), cross-site scripting (XSS), and server-side request forgery (SSRF).

#1 Perform strong input validation using custom scalars

GraphQL provides the following built-in scalar types: String, Int, Float, Boolean, and ID (serialized as a string accepting integer or string inputs). However, GraphQL also allows you to define custom scalars to create types with custom validation and serialization logic (e.g., a DateTime type). Strong input and type validation reduces the attack surface from user input and is a great first line of defense in securing user-input processing for your API.

When implementing custom scalars, consider using and contributing to open source libraries that aim to define a set of commonly used custom scalars. This helps reduce the amount of time and effort required for each team to develop their own validation logic, and helps prevent everyone from repeating the same mistakes. Here are examples of two existing efforts to create a shared library of validated custom scalars:

Urigo’s project includes a basic template for a regex-validated scalar, which eases porting existing regex patterns from a REST API to custom scalars.
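To make the idea of a regex-validated scalar concrete, here is a minimal sketch that is not taken from either library. It builds a plain configuration object with the `name`, `serialize`, `parseValue`, and `parseLiteral` fields that graphql-js accepts in `new GraphQLScalarType(...)`; the `regexScalarConfig` helper, the `Username` scalar, and its pattern are hypothetical.

```javascript
// Hypothetical factory: returns a scalar configuration whose values must be
// strings matching `pattern`. With graphql-js installed, the result could be
// passed to `new GraphQLScalarType(config)`.
function regexScalarConfig(name, pattern) {
  const validate = (value) => {
    if (typeof value !== "string" || !pattern.test(value)) {
      throw new TypeError(`Value is not a valid ${name}`);
    }
    return value;
  };
  return {
    name,
    description: `String matching ${pattern}`,
    // Outgoing values sent to the client.
    serialize: validate,
    // Incoming values supplied through query variables.
    parseValue: validate,
    // Incoming values written inline in the query text.
    parseLiteral: (ast) => {
      if (ast.kind !== "StringValue") {
        throw new TypeError(`${name} must be written as a string`);
      }
      return validate(ast.value);
    },
  };
}

// Example pattern ported from a REST-style validator (an assumption for illustration).
const usernameScalar = regexScalarConfig("Username", /^[A-Za-z0-9_]{3,32}$/);
```

Both input paths (variables and inline literals) run the same check, so a value cannot bypass validation by changing how it is supplied.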

#2 Avoid custom scalars that encapsulate other types (JSON/XML)

Avoid creating custom scalars for complex types, such as JSON (e.g., graphql-type-json).

Complex custom scalars prevent proper validation of nested types and may introduce vulnerabilities, such as GraphQL query injection or NoSQL injection. We will explore these risks more in the section below.
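The NoSQL injection risk can be shown with a toy example. The sketch below assumes a back end whose filters follow MongoDB-style operator semantics, imitated here for `$ne` only; the `accounts` data, `findAccount`, and both resolver functions are hypothetical.

```javascript
// Toy in-memory lookup that mimics MongoDB-style operator semantics for $ne only.
const accounts = [{ email: "admin@example.com", resetToken: "9f2c1e7a" }];

function findAccount(filter) {
  const match = accounts.find((account) =>
    Object.entries(filter).every(([field, condition]) =>
      condition !== null && typeof condition === "object" && "$ne" in condition
        ? account[field] !== condition.$ne
        : account[field] === condition
    )
  );
  return match ?? null;
}

// Argument declared with a JSON scalar: args.token may be any JSON value,
// including an object that the data layer interprets as an operator.
const resetWithJsonScalar = (args) => findAccount({ resetToken: args.token });

// Argument declared as String (or a stricter custom scalar). The explicit
// check stands in for the type validation GraphQL would perform.
function resetWithStringArg(args) {
  if (typeof args.token !== "string") {
    throw new TypeError("token must be a string");
  }
  return findAccount({ resetToken: args.token });
}
```

With the JSON scalar, a caller who sends `{ "$ne": null }` as the token matches the account without knowing the real value; the string-typed version rejects that input before it reaches the data layer.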

REST Translation and Caching Issues

Most GraphQL implementations are introduced into an existing REST ecosystem. In this section, we’ll examine the risks you may encounter during your transition. The process of translating between GraphQL and REST, combined with caching, can introduce a variety of unexpected vulnerabilities.

#1 Avoid input validation issues when translating between REST and GraphQL

REST to GraphQL

Let’s say we introduced a new GraphQL back-end service to support data retrieval for an existing REST API gateway. To query the GraphQL service, our developers might attempt to interpolate incoming query parameters into a back-end GraphQL request:

// Unsafe: request values are interpolated directly into the query text.
const userUpdateQuery = `
  mutation {
    updateUser(
      firstName: "${request['firstname']}",
      lastName: "${request['lastname']}"
    ) {
      User {
        firstName
        lastName
      }
    }
  }
`;


Traditional REST APIs can be weakly typed (GET/POST parameters) or strongly typed (JSON/XML), which can make converting to a strongly typed API prone to error. For example, when attempting to translate this way, there are opportunities to inject additional syntax into the query: a firstname value containing a double quote closes the string literal early, and whatever follows it is parsed as part of the mutation.

To reduce these risks, consider using persisted queries. Persisted queries allow you to transmit a hash corresponding to a stored server-side query, alongside the input variables for the query. This delegates secure interpolation to the library, and limits opportunities for query injection when building the request.
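The following sketch shows the shape of that approach under stated assumptions. The query text is fixed and uses GraphQL variables, and the request carries only a hash plus the variable values. The `persistedQueries` registry, the `id` field, and `buildUpdateUserRequest` are hypothetical; the actual wire format depends on the persisted-query library in use.

```javascript
const { createHash } = require("node:crypto");

// Fixed query text with variables; no request data is ever spliced into it.
const UPDATE_USER = `
mutation UpdateUser($firstName: String!, $lastName: String!) {
  updateUser(firstName: $firstName, lastName: $lastName) {
    User {
      firstName
      lastName
    }
  }
}`;

const sha256 = (text) => createHash("sha256").update(text).digest("hex");

// Hypothetical server-side registry mapping a hash to its stored query.
const persistedQueries = new Map([[sha256(UPDATE_USER), UPDATE_USER]]);

// Builds the body the REST gateway sends to the GraphQL service.
// User input travels only as JSON-encoded variable values.
function buildUpdateUserRequest(request) {
  return JSON.stringify({
    id: sha256(UPDATE_USER),
    variables: {
      firstName: request.firstname,
      lastName: request.lastname,
    },
  });
}
```

A value such as `x", isAdmin: true, y: "` arrives at the GraphQL service as an ordinary string in `variables.firstName`, so it cannot alter the structure of the mutation.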

GraphQL to REST

Now, let’s consider the inverse. As part of our migration strategy, let’s say we are placing a GraphQL service in front of our existing REST APIs.

So, the server-side resolve function for our GraphQL API might perform an internal GET request to an internal REST API using a user-provided filename argument like this:

let myFile = await axios.get(`https://api.product.int/file/${args.filename}`);


By submitting a path traversal payload as an argument, an attacker could control the outgoing API request in order to perform malicious behavior (e.g., ../user/setRoles?roles=[admin,user]). This is just one example of unsafe query building that results in SSRF; there is risk any time user input is used in an outgoing request.

This translation can be more challenging because user input must be sanitized twice: once in the GraphQL front end, for the context in which it is used to build the outgoing request, and again in the back-end REST service.

Ensure that all query parameters are sanitized for the context in which they will be placed in a request (e.g., URI syntax for path parameters and JSON syntax for JSON messages). Rely on standard libraries where possible.
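As a minimal sketch of that advice for the file lookup above, the helper below validates the argument against an allowlist and then builds the URL with the standard `URL` class and `encodeURIComponent`. The `buildFileUrl` name and the allowlist pattern are assumptions about what a legitimate file name looks like; adjust them to your own data.

```javascript
const FILE_API = "https://api.product.int/file/";

// Hypothetical helper: returns the outgoing URL for a single file name,
// or throws if the value could change the path or query of the request.
function buildFileUrl(filename) {
  // Allowlist: must start with a letter or digit, so "." and ".." are rejected,
  // and may not contain "/", "?", "#", "%", or other URI syntax.
  if (
    typeof filename !== "string" ||
    !/^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$/.test(filename)
  ) {
    throw new Error("Invalid filename");
  }
  // Encode for the path-segment context and resolve against the fixed base.
  const encoded = encodeURIComponent(filename);
  const url = new URL(encoded, FILE_API);
  // Defense in depth: the resolved URL must be exactly base + segment.
  if (url.href !== FILE_API + encoded) {
    throw new Error("Unexpected URL after resolution");
  }
  return url.href;
}

// In the resolver, the request would then be made with the validated URL,
// for example: axios.get(buildFileUrl(args.filename))
```

The allowlist does most of the work here; the encoding and the final comparison are there so that a later loosening of the pattern does not silently reopen the traversal.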

#2 Avoid breaking authorization when introducing intermediary caching

Some GraphQL implementations offload all authorization to existing back-end REST infrastructure. This is especially common when creating a GraphQL gateway to an existing REST API. Because this incurs additional latency, it’s common to introduce intermediary caching between the GraphQL server and the REST API server. However, similar to the risks introduced with translation, this approach can also cause issues.

If authorized responses are stored in the cache, unauthorized requests may inappropriately retrieve cache entries without ever reaching the back-end REST server to perform the authorization check. With the authorization logic sitting behind the intermediary caching layer, the GraphQL server needs to incorporate cache retrieval logic to ensure that no back-end access controls are violated.
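One possible form of that cache retrieval logic is to scope every cache entry to the session that was authorized to receive it. The sketch below is an illustration under stated assumptions, not a drop-in design: `SessionScopedCache` and `fetchThroughCache` are hypothetical, the back-end call is passed in as a function, and everything is synchronous for brevity where real code would be asynchronous and would also need expiry and invalidation.

```javascript
// Cache whose keys include the requesting user, so an entry authorized
// for one session is never returned to another.
class SessionScopedCache {
  constructor() {
    this.entries = new Map();
  }
  static key(session, url) {
    return JSON.stringify([session.userId, url]);
  }
  get(session, url) {
    return this.entries.get(SessionScopedCache.key(session, url));
  }
  set(session, url, value) {
    this.entries.set(SessionScopedCache.key(session, url), value);
  }
}

// Returns a cached response only if this same user fetched it before;
// otherwise the request goes to the REST API, which performs the
// authorization check and throws when access is denied.
function fetchThroughCache(cache, session, url, fetchFromRest) {
  const cached = cache.get(session, url);
  if (cached !== undefined) return cached;
  const response = fetchFromRest(session, url);
  cache.set(session, url, response);
  return response;
}
```

The trade-off is a lower hit rate, since users no longer share entries. Sharing is only safe for responses that are known to be identical for every caller allowed to see them.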

SECURITY TEST CASES

Now that we have a good idea of the issues, and the things that can go wrong during migration, let’s define a list of test cases:

  • Is introspection disabled in production?
  • Rate limiting:
    • Is there query rate limiting?
    • Is there a depth limitation?
    • Is there a response limit (i.e., is pagination enabled)?
    • Are there complexity limitations?
  • Authorization:
    • Do queries have proper access controls at the node level?
    • Do all paths to data maintain the same access controls?
    • For fields that are accessible only to limited roles, are these access controls enforced?
    • Do different error responses disclose the presence of fields or nodes?
    • Source code:
      • Do authorization checks rely on a single source of truth (i.e., the user’s session)?
      • Does the code have fallback deny rules for non-whitelisted fields?
  • Input validation:
    • Does the API perform strong input validation (e.g., restricted integer values and limited character sets for names)?
    • Does the API handle null values correctly?
    • When injecting common SQL (e.g., ['], [--], or [#]) or GraphQL (e.g., JSON) syntax into input, do server-side errors occur?
  • Translation and caching:
    • GraphQL to REST: When injecting restricted URI characters, do REST API errors occur?
    • REST to GraphQL: When submitting JSON syntax to query parameters for a front-end REST API, do GraphQL errors occur?
    • Caching: When rapidly submitting requests, does unexpected behavior occur? What about rapidly submitted requests from two separate user accounts?

The following logging and repudiation test cases are not covered in this article, but are worth considering as well:

  • Do server-side logs track the associated session and operation name? Can you determine the user associated with a malicious request?
  • Do you record when an expensive query or other exception occurs?

CONCLUSION

We’ve covered a variety of common GraphQL bugs here, but many more can happen in the unique context of a deployment; misconfigured intermediate caching layers or unsafe server-side query building can lead to harder-to-find bugs.

Great security comes from solid design patterns and easy-to-read code, and the most common flaws still come from the improper design of business logic and authorization controls. We recommend reading Shopify’s GraphQL design tutorial, which shares the lessons they’ve learned and how to leverage third-party libraries for secure configurations and authorization, so that the community can benefit together.