docs/content/guides/operator/indexer-stack-setup.mdx
See GraphQL and General-Purpose Indexer for more information on the stack.
The indexer consists of multiple pipelines that each read, transform, and write checkpoint data to a table. Multiple instances of the indexer can run in parallel, each configured by its own TOML file.
The general-purpose indexer writes to a Postgres database. The storage footprint estimates below are based on the network as of early 2026 and may change as the network grows. Treat these numbers as directional rather than exact.
The bulk of the storage is consumed by obj_versions at 8.2 TB. A pruning strategy is in development.
A 30-day retention adds 1.8 TB on top, while a 90-day retention contributes up to an additional 3.1 TB.
30-day retention:
| Table | Heap (GB) | Idx (GB) |
|---|---|---|
| tx_affected_objects | 64–70 | 276–397 |
| tx_calls | 27–30 | 239–366 |
| ev_struct_inst | 15–19 | 174–267 |
| tx_affected_addresses | 17–18 | 63–92 |
| ev_emit_mod | 8–9 | 54–85 |
| tx_balance_changes | 18–20 | 5–6 |
| tx_digests | 8 | 21–25 |
| tx_kinds | 5 | 14–16 |
| cp_sequence_numbers | 10 | 5 |
| TOTAL | 444–450 | 1,827–1,842 |
90-day retention:
| Table | Heap (GB) | Idx (GB) |
|---|---|---|
| tx_affected_objects | 188–202 | 580–752 |
| tx_calls | 82–87 | 560–715 |
| ev_struct_inst | 45–50 | 432–531 |
| tx_affected_addresses | 50–54 | 151–183 |
| ev_emit_mod | 22–24 | 129–167 |
| tx_balance_changes | 57–63 | 10–16 |
| tx_digests | 24–26 | 45–55 |
| tx_kinds | 15–16 | 30–36 |
| cp_sequence_numbers | 10 | 5 |
| TOTAL | 755–842 | 2,440–3,181 |
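To compare a running deployment against these estimates, you can ask Postgres for per-table heap and index sizes. The following is a minimal sketch rather than part of the indexer tooling: the helper name `size_query` is hypothetical, `DATABASE_URL` is assumed to hold your connection string, and `pg_table_size` also counts TOAST and map data, so the result only roughly corresponds to the Heap column.

```shell
# size_query is a hypothetical helper, not part of sui-indexer-alt.
# It prints a SQL statement that reports heap and index size (decimal GB) for one table,
# using the built-in Postgres functions pg_table_size and pg_indexes_size.
size_query() {
  printf "SELECT '%s' AS table_name, round(pg_table_size('%s') / 1e9) AS heap_gb, round(pg_indexes_size('%s') / 1e9) AS idx_gb;\n" "$1" "$1" "$1"
}

# Print the query for one of the tables listed above.
size_query tx_calls

# To run it against your database (assumes psql is installed and DATABASE_URL is set):
#   size_query tx_calls | psql "$DATABASE_URL"
```

Printing the SQL instead of executing it keeps the helper free of side effects, so you can review the statement before piping it to `psql`.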
sui-indexer-alt

Run an indexer instance using this command for each of the configuration files. The command varies based on whether the pipeline is prunable or unprunable.

Unprunable pipeline:
$ sui-indexer-alt indexer \
--config <CONFIG_FILE> \
--database-url <DATABASE_URL> \
--remote-store-url <REMOTE_STORE_URL>
Prunable pipeline:
$ sui-indexer-alt indexer \
--config <CONFIG_FILE> \
--database-url <DATABASE_URL> \
--remote-store-url <REMOTE_STORE_URL> \
--first-checkpoint <CHECKPOINT_NUMBER>
For prunable pipelines, calculate the first-checkpoint based on your retention period:
- 30-day retention: `current_checkpoint - 10368000`
- 90-day retention: `current_checkpoint - 31104000`

| CLI param | Description |
|---|---|
| `<CONFIG_FILE>` | Path to indexer configuration file. |
| `<DATABASE_URL>` | Postgres database connection string. |
| `<REMOTE_STORE_URL>` | URL of a checkpoint bucket to index from, one of multiple possible data sources. |
| `<CHECKPOINT_NUMBER>` | (Optional) For prunable pipelines only: the checkpoint to start indexing from, based on retention requirements. |
Unprunable pipeline (from genesis):
$ sui-indexer-alt indexer \
--config unpruned.toml \
--database-url postgres://username:password@localhost:5432/database \
--remote-store-url https://checkpoints.mainnet.sui.io
Prunable pipeline with 30-day retention (assuming current checkpoint is 100,000,000):
$ sui-indexer-alt indexer \
--config events.toml \
--database-url postgres://username:password@localhost:5432/database \
--remote-store-url https://checkpoints.mainnet.sui.io \
--first-checkpoint 89632000
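The offsets above imply a rate of 345,600 checkpoints per day (10368000 / 30, or about 4 per second). Under that assumption, the calculation can be sketched for an arbitrary retention period as follows. The function name `first_checkpoint` and the constant `CHECKPOINTS_PER_DAY` are hypothetical and are not part of `sui-indexer-alt`; the real checkpoint rate can drift, so treat the result as an approximation.

```shell
# Assumption: 345600 checkpoints per day, derived from the offsets in this guide
# (10368000 / 30 days). The actual network rate may differ.
CHECKPOINTS_PER_DAY=345600

# first_checkpoint CURRENT_CHECKPOINT RETENTION_DAYS
# Prints the checkpoint to pass to --first-checkpoint (hypothetical helper).
first_checkpoint() {
  first=$(( $1 - $2 * CHECKPOINTS_PER_DAY ))
  # Clamp at genesis if the retention window is longer than the chain history.
  if [ "$first" -lt 0 ]; then
    first=0
  fi
  printf '%s\n' "$first"
}

first_checkpoint 100000000 30   # 30-day retention, matches the example above
first_checkpoint 100000000 90   # 90-day retention
```

With a current checkpoint of 100,000,000, the two calls print 89632000 and 68896000 respectively.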
Use the TOML files below, which are grouped by pipeline speed. The slowest pipeline in an instance limits every pipeline in that instance, so each file contains pipelines that run at approximately the same speed.
:::info
When backfilling, you should set the ingest-concurrency to a higher value, e.g. 200, then reduce it to 20 for normal operation at network tip.
:::
All consistent store pipelines run in the same instance based on a single configuration file. Like the indexer, the pipelines run in parallel and throughput is limited by the slowest pipeline.
Restores one or more pipelines from a snapshot. The snapshot source can be an Azure, GCS, or S3 bucket, or an HTTP endpoint.
$ sui-indexer-alt-consistent-store restore \
--azure <AZURE_BUCKET> \
--database-path <DATABASE_PATH> \
--gcs <GCS_BUCKET> \
--http <HTTP_ENDPOINT> \
--object-file-concurrency <OBJECT_FILE_CONCURRENCY> \
--pipeline <PIPELINE_NAME> \
--remote-store-url <REMOTE_STORE_URL> \
--s3 <S3_BUCKET>
| CLI parameter | Description |
|---|---|
| `<AZURE_BUCKET>` * | Name or URL of Azure bucket containing managed snapshots. |
| `<DATABASE_PATH>` | Path to RocksDB database. |
| `<GCS_BUCKET>` * | Name or URL of GCS bucket containing managed snapshots. |
| `<HTTP_ENDPOINT>` * | URL of formal snapshot API. |
| `<OBJECT_FILE_CONCURRENCY>` | Number of object files to process concurrently during the restore. |
| `<PIPELINE_NAME>` | Name of pipeline to restore. Can be set multiple times, once per pipeline. |
| `<REMOTE_STORE_URL>` | URL of a checkpoint bucket to index from, one of multiple possible data sources. |
| `<S3_BUCKET>` * | Name or URL of AWS S3 bucket containing managed snapshots. |

\* Must specify one of `<AZURE_BUCKET>`, `<GCS_BUCKET>`, `<HTTP_ENDPOINT>`, or `<S3_BUCKET>`.
Example:
$ sui-indexer-alt-consistent-store restore \
--database-path /path/to/rocksdb \
--http https://formal-snapshot.mainnet.sui.io \
--object-file-concurrency 5 \
--pipeline balances \
--pipeline object_by_owner \
--pipeline object_by_type \
--remote-store-url https://checkpoints.mainnet.sui.io
Run a consistent store instance using this command for the configuration file that follows:
$ sui-indexer-alt-consistent-store run \
--config <CONFIG_FILE> \
--database-path <DATABASE_PATH> \
--remote-store-url <REMOTE_STORE_URL>
| CLI param | Description |
|---|---|
| `<CONFIG_FILE>` | Path to consistent store configuration file. |
| `<DATABASE_PATH>` | Path to RocksDB database. |
| `<REMOTE_STORE_URL>` | URL of a checkpoint bucket to index from, one of multiple possible data sources. |
Example:
$ sui-indexer-alt-consistent-store run \
--config consistent-store.toml \
--database-path /path/to/rocksdb \
--remote-store-url https://checkpoints.mainnet.sui.io
GraphQL RPC server reads data from the general-purpose indexer's database (Postgres), the consistent store, and the archival service.
<Tabs className="tabsHeadingCentered--small"> <TabItem value="prereq" label="Prerequisites">

Ensure that the required indexer pipelines (obj_versions.toml and unpruned.toml) have fully caught up to the network tip before starting the GraphQL RPC server. The GraphQL service operates normally only once these pipelines are complete.

Scale the number of nodes based on the read throughput your client applications require.
The GraphQL RPC server relies on multiple backend services to fulfill different types of queries:
- Archival service (--ledger-grpc-url) provides historical data for most queries involving checkpoints, objects, and transactions.
- Consistent store (--consistent-store-url) serves live data for queries related to current object and balance ownership.
- Postgres database (--database-url) is the primary store for most queries, except for direct object and transaction lookups handled by the Archival service.
- Full node (--fullnode-rpc-url) powers transaction simulation and execution.

Set the appropriate service URLs in your run command based on the query types your GraphQL RPC server needs to support.
sui-indexer-alt-graphql

:::info
If you use the Sui Foundation–hosted public good archival service on Testnet or Mainnet, you may encounter performance issues. The team will address these before the GraphQL RPC and Archival Service reach general availability.
:::
Use the following command to run a GraphQL RPC server node:
sui-indexer-alt-graphql rpc \
--config <PATH_TO_GRAPHQL_CONFIG_FILE> \
--indexer-config <PATH_TO_INDEXER_CONFIG_FILE_1> \
--indexer-config <PATH_TO_INDEXER_CONFIG_FILE_2> \
--indexer-config <PATH_TO_INDEXER_CONFIG_FILE_3> \
--ledger-grpc-url <LEDGER_GRPC_URL> \
--consistent-store-url <CONSISTENT_STORE_URL> \
--database-url <DATABASE_URL> \
--fullnode-rpc-url <FULLNODE_RPC_URL>
Multiple --indexer-config parameters can be provided, one for each general-purpose indexer instance.
| CLI parameter | Description |
|---|---|
| `<PATH_TO_GRAPHQL_CONFIG_FILE>` | Path to the optional GraphQL RPC server configuration file |
| `<PATH_TO_INDEXER_CONFIG_FILE_N>` | Path to general-purpose indexer configuration file; can be set multiple times, once per indexer instance |
| `<LEDGER_GRPC_URL>` | URL to Archival service's LedgerService gRPC API |
| `<CONSISTENT_STORE_URL>` | URL to Consistent store API |
| `<DATABASE_URL>` | Postgres database connection string |
| `<FULLNODE_RPC_URL>` | URL to full node RPC |
Example:
sui-indexer-alt-graphql rpc \
--config graphql.toml \
--indexer-config events.toml \
--indexer-config obj_versions.toml \
--indexer-config tx_affected_addresses.toml \
--indexer-config tx_affected_objects.toml \
--indexer-config tx_calls.toml \
--indexer-config tx_kinds.toml \
--indexer-config unpruned.toml \
--ledger-grpc-url https://archive.mainnet.sui.io:443 \
--consistent-store-url https://localhost:7001 \
--database-url postgres://username:password@localhost:5432/database \
--fullnode-rpc-url https://localhost:9000
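Because `--indexer-config` is repeated once per indexer instance, it can be convenient to generate the flags from a list of files. The sketch below only builds and previews the flag string; the helper name `build_indexer_config_flags` is hypothetical and not part of `sui-indexer-alt-graphql`, and it assumes file names contain no whitespace.

```shell
# build_indexer_config_flags is a hypothetical helper, not part of sui-indexer-alt-graphql.
# It prints one "--indexer-config <file>" pair per argument, separated by spaces.
# Assumes file names contain no whitespace.
build_indexer_config_flags() {
  flags=""
  for f in "$@"; do
    flags="$flags --indexer-config $f"
  done
  # Strip the leading space added by the loop.
  printf '%s\n' "${flags# }"
}

# Preview the start of the command line for three of the files from the example above.
printf '%s\n' "sui-indexer-alt-graphql rpc $(build_indexer_config_flags events.toml obj_versions.toml unpruned.toml)"
```

Previewing the command before running it makes it easier to confirm that every indexer instance's configuration file is included.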
You can run the GraphQL RPC server without a configuration file, which will use default values. To customize settings, generate a config file using the command below and edit it as needed:
sui-indexer-alt-graphql generate-config > <PATH_TO_GRAPHQL_CONFIG_FILE>
It will produce output similar to the following:
<details> <summary>`graphql.toml`</summary> <ImportContent source="examples/prod-config/graphql.toml" mode="code" /> </details>

Both the indexer and GraphQL server require a Postgres-compatible database shared between them.
These GraphQL request throughputs were tested against the following recommended specs:
AlloyDB Omni recommends 8 GB of RAM per vCPU. In load testing, allocating less than this caused the database to close indexer and GraphQL connections.
Adding a new pipeline to an existing indexer currently requires these steps:
--first-checkpoint <checkpoint> flag set if you want to start from a checkpoint after genesis.
--first-checkpoint is only respected if no watermark record exists for the pipeline(s), that is, the pipeline has not been run before. To run the pipeline with a different value of --first-checkpoint, you must manually remove the watermark record; otherwise the new value is ignored.

Bloat is the difference between the size of the data in a table or index and the amount of space it takes up on disk. Autovacuum prevents bloat if approximately the same number of rows is continually inserted and deleted, as is the case when pipeline pruning is enabled. However, autovacuum does not handle more rows being deleted than inserted. There are two cases where this can occur:
These tools can be used to reduce bloat:
| Tool | Type | ACCESS EXCLUSIVE locking | Schedulable | Link |
|---|---|---|---|---|
| VACUUM FULL | Built-in | Entire operation | No | https://www.postgresql.org/docs/current/sql-vacuum.html |
| pg_repack | Extension | Briefly during initial and final step | No | https://reorg.github.io/pg_repack/ |
| pg_squeeze | Extension | Briefly during final step | Yes | https://github.com/cybertec-postgresql/pg_squeeze |