Dataloader vs. GraphQL-Batch

GraphQL::Dataloader solves the same problem as GraphQL::Batch. There are a few major differences between the two:

  • Concurrency Primitive: GraphQL-Batch uses Promises from promise.rb; GraphQL::Dataloader uses Ruby's Fiber API. These primitives dictate how batch loading code is written (see below for comparisons).
  • Maturity: Frankly, GraphQL-Batch is about as old as GraphQL-Ruby, and it's been in production at Shopify, GitHub, and others for many years. GraphQL::Dataloader is new, and although Ruby has supported Fibers since 1.9, they still aren't widely used.
  • Scope: It's not currently possible to use GraphQL::Dataloader outside GraphQL.

The incentive in writing GraphQL::Dataloader was to leverage Fiber's ability to transparently pause and resume work, which removes the need for Promises (and removes the resulting complexity in the code). Additionally, GraphQL::Dataloader should eventually support Ruby 3.0's Fiber.scheduler API, which runs I/O in the background by default.
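To make "pause and resume" concrete, here is a minimal sketch that uses only Ruby's core Fiber class. It is not taken from GraphQL::Dataloader; the record id and the `"ada"` value are placeholders chosen for illustration. The fiber asks for a record, pauses, and later continues with the value the caller hands back, without any promise object or callback in between.

```ruby
# Illustration only: plain Ruby Fibers, no GraphQL-Ruby involved.
log = []

fiber = Fiber.new do
  log << "need record 1"
  record = Fiber.yield(1)    # pause here and hand the wanted id to the caller
  log << "got #{record}"     # runs only after the caller resumes the fiber
  record.upcase
end

wanted_id = fiber.resume       # runs the block up to Fiber.yield, returns the id
log << "fetching #{wanted_id}" # the caller is free to batch or fetch however it likes
result = fiber.resume("ada")   # Fiber.yield returns "ada"; the block runs to its end
```

From the fiber's point of view, `Fiber.yield` looks like an ordinary method call that returns a value, which is why code written against a Fiber-based loader can read as straight-line Ruby.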

Comparison: Fetching a single object

In this example, a single object is batch-loaded to satisfy a GraphQL field.

  • With GraphQL-Batch, you call a loader, which returns a Promise:

    ```ruby
    record_promise = Loaders::Record.load(1)
    ```

    Then, under the hood, GraphQL-Ruby manages the promise (using its lazy_resolve feature, upstreamed from GraphQL-Batch many years ago). GraphQL-Ruby will call .sync on it when no further execution is possible; promise.rb implements Promise#sync to execute the pending work.

  • With GraphQL::Dataloader, you get a source, then call .load on it. The call may pause the current Fiber, but when it returns, it returns the requested object:

    ```ruby
    dataloader.with(Sources::Record).load(1)
    ```

    Since the requested object is (eventually) returned from .load, nothing else is required.
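The promise-style flow described in the GraphQL-Batch bullet can be sketched with a toy model. Everything below (`ToyBatchLoader`, `ToyLazy`, the `records` hash) is hypothetical and written for this illustration; it is not the code of GraphQL-Batch, promise.rb, or GraphQL-Ruby's lazy_resolve. It only shows the shape of the idea: `load` returns a placeholder right away, and the batch runs when something calls `.sync`.

```ruby
# Toy stand-in for a promise-style batch loader (hypothetical names throughout).
class ToyBatchLoader
  attr_reader :fetch_calls

  def initialize(&fetch)
    @fetch = fetch       # receives an Array of keys, returns a Hash of key => object
    @queue = []          # keys waiting for the next batch
    @cache = {}          # objects already fetched
    @fetch_calls = []    # keys passed to each fetch, kept for inspection
  end

  # Queues the key and immediately returns a lazy placeholder.
  def load(key)
    @queue << key unless @cache.key?(key) || @queue.include?(key)
    ToyLazy.new(self, key)
  end

  # Called by ToyLazy#sync: runs the pending batch (if any), then returns the object.
  def resolve(key)
    unless @queue.empty?
      keys = @queue.dup
      @queue.clear
      @fetch_calls << keys
      @cache.merge!(@fetch.call(keys))
    end
    @cache.fetch(key)
  end
end

ToyLazy = Struct.new(:loader, :key) do
  def sync
    loader.resolve(key)
  end
end

records = { 1 => "record-1", 2 => "record-2" }
loader = ToyBatchLoader.new { |ids| ids.to_h { |id| [id, records.fetch(id)] } }

# Two "fields" each return a placeholder; nothing has been fetched yet.
lazy_1 = loader.load(1)
lazy_2 = loader.load(2)
calls_before_sync = loader.fetch_calls.length

# The "executor" syncs only after every field has had a chance to enqueue its key.
values = [lazy_1, lazy_2].map(&:sync)
```

In this toy, both keys end up in a single fetch because `.sync` is deferred until both placeholders exist. The Fiber-based approach reaches the same batching without exposing a placeholder object to the calling code.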

Comparison: Fetching objects in sequence (dependent)

In this example, one object is loaded, then another object is loaded based on the first one.

  • With GraphQL-Batch, .then { ... } is used to join dependent code blocks:

    ```ruby
    Loaders::Record.load(1).then do |record|
      Loaders::OtherRecord.load(record.other_record_id)
    end
    ```

    That call returns a Promise, which is stored by GraphQL-Ruby and finally resolved by calling .sync on it.

  • With GraphQL::Dataloader, .load(...) returns the requested object (after a potential Fiber pause), so no other method calls are necessary:

    ```ruby
    record = dataloader.with(Sources::Record).load(1)
    dataloader.with(Sources::OtherRecord).load(record.other_record_id)
    ```

Comparison: Fetching objects concurrently (independent)

Sometimes, you need multiple independent records to perform a calculation. Each record is loaded, then they're combined in some bit of work.

  • With GraphQL-Batch, Promise.all(...) is used to wait for several pending loads:

    ```ruby
    promise_1 = Loaders::Record.load(1)
    promise_2 = Loaders::OtherRecord.load(2)
    Promise.all([promise_1, promise_2]).then do |record, other_record|
      do_something(record, other_record)
    end
    ```

    If the objects are loaded from the same loader, then .load_many also works:

    ```ruby
    Loaders::Record.load_many([1, 2]).then do |record, other_record|
      do_something(record, other_record)
    end
    ```

  • With GraphQL::Dataloader, each request is registered with .request(...) (which never pauses the Fiber), then data is loaded with .load (which will pause the Fiber as needed):

    ```ruby
    # first, make some requests
    request_1 = dataloader.with(Sources::Record).request(1)
    request_2 = dataloader.with(Sources::OtherRecord).request(2)
    # then, load the objects and do something
    record = request_1.load
    other_record = request_2.load
    do_something(record, other_record)
    ```

    If the objects come from the same Source, then .load_all will return the objects directly:

    ```ruby
    record, other_record = dataloader.with(Sources::Record).load_all([1, 2])
    do_something(record, other_record)
    ```
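Why register requests before loading? A toy model can illustrate the difference between a call that never pauses and one that does. `ToySource` and `ToyRequest` below are hypothetical classes written for this sketch with plain Ruby Fibers; they mimic the .request / .load naming from the examples above but are not GraphQL::Dataloader's implementation, and a real source has more responsibilities than this.

```ruby
# Toy stand-in for a Fiber-based source (hypothetical; not GraphQL-Ruby's code).
class ToySource
  attr_reader :batches

  def initialize(&fetch)
    @fetch = fetch    # receives an Array of keys, returns a Hash of key => object
    @pending = []     # keys registered since the last batch
    @loaded = {}      # objects already fetched
    @batches = []     # keys sent to each fetch call, kept for inspection
  end

  # Registers a key without pausing and returns a handle for loading it later.
  def request(key)
    @pending << key unless @loaded.key?(key) || @pending.include?(key)
    ToyRequest.new(self, key)
  end

  # Registers a key, pausing the current Fiber if the object is not loaded yet.
  def load(key)
    request(key)
    Fiber.yield unless @loaded.key?(key)
    @loaded.fetch(key)
  end

  # Fetches every pending key in one batch.
  def flush
    return if @pending.empty?
    keys = @pending.dup
    @pending.clear
    @batches << keys
    @loaded.merge!(@fetch.call(keys))
  end

  # Runs the block in a Fiber, flushing pending keys each time it pauses.
  def run(&job)
    result = nil
    fiber = Fiber.new { result = job.call; :done }
    flush until fiber.resume == :done
    result
  end
end

ToyRequest = Struct.new(:source, :key) do
  def load
    source.load(key)
  end
end

records = { 1 => "record-1", 2 => "record-2" }
fetch = ->(ids) { ids.to_h { |id| [id, records.fetch(id)] } }

# Request first, then load: both keys are pending when the Fiber first pauses.
batched = ToySource.new(&fetch)
pair = batched.run do
  request_1 = batched.request(1)
  request_2 = batched.request(2)
  [request_1.load, request_2.load]
end

# Load one after the other: the Fiber pauses before the second key is registered.
sequential = ToySource.new(&fetch)
sequential.run { [sequential.load(1), sequential.load(2)] }
```

In this toy, `batched.batches` holds one batch with both keys, while `sequential.batches` holds two single-key batches. That is the reason for registering independent requests up front: every key is known before the first pause.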