website/pages/docs/n1-dataloader.mdx
When building a server with GraphQL.js, it's common to encounter performance issues related to the N+1 problem: a pattern that results in many unnecessary database or service calls, especially in nested query structures.
This guide explains what the N+1 problem is, why it's relevant in
GraphQL field resolution, and how to address it using
DataLoader.
The N+1 problem happens when your API fetches a list of items using one query, and then issues an additional query for each item in the list. In GraphQL, this usually occurs in nested field resolvers.
For example, in the following query:
{
  posts {
    id
    title
    author {
      name
    }
  }
}
If the posts field returns 10 items, and each author field fetches
the author by ID with a separate database call, the server performs
11 total queries: one to fetch the posts, and one for each post's author
(10 author queries). The number of database calls grows linearly with the
number of parent items, which can quickly degrade performance on large lists.
Even if several posts share the same author, the server will still issue duplicate queries unless you implement deduplication or batching manually.
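The arithmetic above can be reproduced without a GraphQL server. The following is a minimal sketch that assumes a hypothetical in-memory data source; the names getUserById and queryCount and the sample data are illustrative and not part of GraphQL.js. It counts every simulated query while resolving ten posts the way independent resolvers would.

```javascript
// Hypothetical in-memory "database" that counts every query it receives.
let queryCount = 0;
const users = [
  { id: 'u1', name: 'Ada' },
  { id: 'u2', name: 'Grace' },
];
// Ten posts that alternate between the two authors.
const posts = Array.from({ length: 10 }, (_, i) => ({
  id: `p${i + 1}`,
  title: `Post ${i + 1}`,
  authorId: i % 2 === 0 ? 'u1' : 'u2',
}));

async function getPosts() {
  queryCount += 1; // the "1" in N+1
  return posts;
}

async function getUserById(id) {
  queryCount += 1; // one query per call: the "N" in N+1
  return users.find((user) => user.id === id) ?? null;
}

// Mimics naive resolvers: fetch the list, then fetch each author in isolation.
async function resolveNaively() {
  const list = await getPosts();
  return Promise.all(
    list.map(async (post) => ({
      title: post.title,
      author: await getUserById(post.authorId),
    })),
  );
}

// Resolves with the data and the final counter value (11 for ten posts).
const naiveRun = resolveNaively().then((result) => ({ result, queryCount }));
```

Only two distinct authors exist in this sample, so eight of the ten author queries are duplicates.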
In GraphQL.js, each field resolver runs independently. There's no built-in coordination between resolvers, and no automatic batching. This makes field resolvers composable and predictable, but it also creates the N+1 problem. Nested resolutions, such as fetching an author for each post in the previous example, will each call their own data-fetching logic, even if those calls could be grouped.
DataLoader is a utility library designed
to solve this problem. It batches the individual .load(key) calls made within a single tick of
execution into one batchLoadFn(keys) call, and caches each result for the lifetime of the loader
instance (one request, if you create loaders per request). This lets you eliminate redundant data
fetches and group related lookups into efficient operations.
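To make the batching idea concrete, here is a deliberately simplified sketch of the mechanism. It is not DataLoader's actual source: createTinyLoader is a hypothetical helper, and it schedules work with queueMicrotask, whereas the real library uses its own scheduling and offers many more options. The sketch only shows how keys requested during the same synchronous run can be collected and sent to the batch function once.

```javascript
// Simplified illustration of batching and caching; NOT DataLoader's real code.
function createTinyLoader(batchLoadFn) {
  const cache = new Map(); // key -> Promise, kept for the loader's lifetime
  let queue = []; // pending { key, resolve, reject } entries

  async function dispatch() {
    const batch = queue;
    queue = [];
    try {
      const values = await batchLoadFn(batch.map((entry) => entry.key));
      // Results are matched to keys by position.
      batch.forEach((entry, i) => entry.resolve(values[i]));
    } catch (error) {
      batch.forEach((entry) => entry.reject(error));
    }
  }

  return {
    load(key) {
      // A repeated key reuses the promise created by the first request.
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        // The first key of a new batch schedules a single dispatch that runs
        // after the currently executing synchronous code has finished.
        if (queue.length === 0) queueMicrotask(dispatch);
        queue.push({ key, resolve, reject });
      });
      cache.set(key, promise);
      return promise;
    },
  };
}

// Usage: three load() calls, one of them a repeat, produce one batch call.
const batchCalls = [];
const tinyLoader = createTinyLoader(async (keys) => {
  batchCalls.push(keys);
  return keys.map((key) => `user-${key}`);
});

const tinyResult = Promise.all([
  tinyLoader.load(1),
  tinyLoader.load(2),
  tinyLoader.load(1),
]);
```

Once tinyResult settles, batchCalls holds a single entry with the keys 1 and 2, because the repeated key was served from the cache.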
To use DataLoader in a graphql-js server:
1. Create DataLoader instances for each request.
2. Attach the instances to the contextValue passed to GraphQL execution. You can attach the
loader when calling graphql() directly, or
when setting up a GraphQL HTTP server such as express-graphql.
3. Use .load(id) in resolvers to fetch data through the loader.

Suppose each Post has an authorId, and you have a getUsersByIds(ids)
function that can fetch multiple users in a single call:
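import {
  graphql,
  GraphQLObjectType,
  GraphQLSchema,
  GraphQLString,
  GraphQLList,
  GraphQLID,
} from 'graphql';
import DataLoader from 'dataloader';
import { getPosts, getUsersByIds } from './db.js';

// Build a fresh context (and therefore fresh loaders) for every request.
function createContext() {
  return {
    userLoader: new DataLoader(async (userIds) => {
      const users = await getUsersByIds(userIds);
      // Return one result per key, in the same order as the keys.
      return userIds.map((id) => users.find((user) => user.id === id));
    }),
  };
}

const UserType = new GraphQLObjectType({
  name: 'User',
  fields: () => ({
    id: { type: GraphQLID },
    name: { type: GraphQLString },
  }),
});

const PostType = new GraphQLObjectType({
  name: 'Post',
  fields: () => ({
    id: { type: GraphQLID },
    title: { type: GraphQLString },
    author: {
      type: UserType,
      // Go through the loader instead of querying the database directly.
      resolve(post, args, context) {
        return context.userLoader.load(post.authorId);
      },
    },
  }),
});

const QueryType = new GraphQLObjectType({
  name: 'Query',
  fields: () => ({
    posts: {
      // GraphQLList is a class and must be instantiated with `new`.
      type: new GraphQLList(PostType),
      resolve: () => getPosts(),
    },
  }),
});

const schema = new GraphQLSchema({ query: QueryType });

// Attach the per-request context when executing a query.
// Top-level await requires this file to be an ES module.
const result = await graphql({
  schema,
  source: '{ posts { id title author { name } } }',
  contextValue: createContext(),
});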
With this setup, all .load(authorId) calls are automatically collected and batched
into a single call to getUsersByIds. DataLoader also caches results for the duration
of the request, so repeated .load(id) calls for the same ID don't trigger
additional fetches.
- Create a new DataLoader instance per request. This ensures that caching is scoped
correctly and avoids leaking data between users.
- Return results in the same order as the input keys, with one result per key. This is required
by the DataLoader contract. If a key is not found, return null or throw depending on
your policy.
- Use .loadMany() sparingly. While it's useful when you already have a list of IDs, it's
typically not needed in field resolvers, since .load() already batches individual calls
made within the same execution cycle.

DataLoader GitHub repository: Includes full API docs and usage examples