Guarding against N+1 issues in GraphQL

The N+1 query problem is a common one to encounter during software development, particularly with ORMs (Object Relational Mappers) and their capabilities around lazy loading. A quick example looks like

var orders = GetOrders();

foreach(var order in orders)
{
    Console.WriteLine($"{order.Customer.Name} made an order on {order.CreatedOn}");
}

Unless you’re careful, the lazy loading of the Customer of the Order results in a database query, for 100 orders this would result in 101 database calls, one per Customer and one extra for the orders (hence the name N+1).

I won’t go into the ways we can solve this in ORMs as this is a well documented problem with a myriad of approaches to avoid it.

GraphQL and N+1

When we look at a typical GraphQL query we can see the makings of the same problem.

query {
    orders {
        createdOn
        customer {
            name
        }
    }
}

Whether the above query results in N+1 database queries is completely up to the the implementation of the resolvers. Most GraphQL server frameworks support the idea of a DataLoader. A Dataloader allows you to batch the Customer requests into a single query and put the correct Customer into the correct location of the results. For better examples you can view the documentation for Hot Chocolate, a popular .NET GraphQL server framework. This only works when the underlying data source supports a performant batch operation (in these case getting a list of customers based on a list of ids).

When DataLoader won’t work

Sometimes however this batch operation isn’t supported and we’re left with our N+1 performance problem, what can we do?

Let’s assume our Orders and Customers data live in different microservices and that Customers microservice doesn’t support a batch operation for requesting a list of Customers.

In that case a query like the above one wouldn’t be performant, resulting in N requests to the underlying Customer microservice, however a query like the following would only result in a single call.

query {
    order("b3JkZXI6NDI=") {
        createdOn
        customer {
            name
        }
    }
}

So how do we ensure one is valid, but one is invalid? The easiest way would be to define different Order types in our schema, one that has customer field and one doesn’t and never expose the former in a list field. While it works it would cause a long term maintenance problem as we struggle to keep these types in sync and doesn’t create a particularly nice graph.

Instead we’re going to play with the validation rules feature of Hot Chocolate. We’ll create a rule that the customer field can only be included in a query where none of its parent fields are a list. This would ensure the latter query is valid while the former is invalid.

One really nice feature about validation rules are that they’re done during the parsing of the query document. This means the validation result can be cached along with the parsing result. This way if you’re sending constant queries (where only the variables are different) then the validation rule won’t be executed that often.

Creating the Directive

First we need to create a directive, these function in much the same way as C# attributes, a way to attach metadata to our GraphQL schema that other code can introspect and make use of. In this case we’ll define an includeOnce.

public class IncludeOnceDirectiveType : DirectiveType
{
    protected override void Configure(IDirectiveTypeDescriptor descriptor)
    {
        descriptor.Name("includeOnce");
        descriptor.Location(DirectiveLocation.FieldDefinition);
    }
}

This directive doesn’t have any behavior and simply functions as a “marker” for our validation rule. If we’re doing schema first development we register the directive in the schema.

type Order {
    ...
    customer: Customer @includeOnce
}

While in code first it would be in the field definition

descriptor.Field("customer")
    .Directive<IncludeOnceDirectiveType>()
    .Type<NonNullType<CustomerType>>()
    ...

The directive also needs to be registered with the schema itself.

c.RegisterDirective<IncludeOnceDirectiveType>();

Creating the validation rule

The implementation of the validation rule can be quite complex, the code below illustrates the concept but doesn’t cover some of the more complex cases (such as mutations). For a more fully featured example I suggest the MaxComplexityRule and MaxComplexityVisitor from the core repository.

What we need to do is create a QuerySyntaxWalker that supports the traversal of the AST (abstract syntax tree), if you’ve done anything with Roslyn some of this will look familiar.

It can look complicated, but we’re visiting each field within the query and maintaining some context from the parent fields we’ve visited. If we find a field whose definition has the includeOnce directive then we examine the list of parent fields, if any are a list type then we report an error.

public class IncludeOnceVisitor : QuerySyntaxWalker<IncludeOnceContext>
{
    protected override bool VisitFragmentDefinitions => false;

    protected override void VisitField(FieldNode node, IncludeOnceContext context)
    {
        var newContext = context;

        if (context.TypeContext is IComplexOutputType type && type.Fields.TryGetField(node.Name.Value,
            out IOutputField fieldDefinition))
        {
            newContext = newContext.WithField(fieldDefinition);

            if (fieldDefinition.Type.NamedType() is IComplexOutputType ct)
            {
                newContext = newContext.SetTypeContext(ct);

                if (fieldDefinition.Directives.Contains("includeOnce"))
                {
                    var includedMoreThanOnce = context.FieldPath.Any(f => f.Type.IsListType());

                    if (includedMoreThanOnce)
                    {
                        newContext.ReportError(new ValidationError($"The field {node.Name} can only be includes once in a query", node));
                    }
                }
            }

            base.VisitField(node, newContext);
        }
    }
}

public class IncludeOnceContext
{
    public IncludeOnceContext(ISchema schema, Action<IError> reportError)
    {
        Schema = schema;
        FieldPath = new List<IOutputField>();
        ReportError = reportError;
    }

    protected IncludeOnceContext(ISchema schema, IEnumerable<IOutputField> fieldPath, Action<IError> reportError)
    {
        Schema = schema;
        FieldPath = fieldPath.ToList();
        ReportError = reportError;
    }

    public Action<IError> ReportError { get; }

    public IList<IOutputField> FieldPath { get; }
    public ISchema Schema { get; }
    public INamedOutputType TypeContext { get; protected set; }

    public IncludeOnceContext WithField(IOutputField field)
    {
        return new IncludeOnceContext(Schema, FieldPath.Concat(new[] { field }), ReportError);
    }

    public IncludeOnceContext SetTypeContext(INamedOutputType typeContext)
    {
        return new IncludeOnceContext(Schema, FieldPath, ReportError)
        {
            TypeContext = typeContext
        };
    }
}

After all this, the validation rule itself is pretty straightforward. Create a visitor, explore the document, then report if there are any errors.

public class IncludeOnceValidationRule : IQueryValidationRule
{
    private static readonly IncludeOnceVisitor includeOnceVisitor = new IncludeOnceVisitor();

    public QueryValidationResult Validate(ISchema schema, DocumentNode queryDocument)
    {
        var errors = new List<IError>();
        var context = new IncludeOnceContext(schema, errors.Add).SetTypeContext(schema.QueryType);

        includeOnceVisitor.Visit(queryDocument, context);

        return errors.Any() ? new QueryValidationResult(errors) : QueryValidationResult.OK;
    }
}

The one final step is register our new validation rule with the schema

.AddExecutionConfiguration(execution => execution
    .AddValidationRule<IncludeOnceValidationRule>()
)

We can now validate that our poorly performing field is only included in queries that will not exacerbate the issue. Hope this helps.