backend/onyx/file_store/README.md
The Onyx file store provides a unified interface for storing files and large binary objects in S3-compatible storage systems. It supports AWS S3, MinIO, Azure Blob Storage, Digital Ocean Spaces, and other S3-compatible services.
The file store uses a single database table (file_record) to store file metadata while the actual file content is stored in external S3-compatible storage. This approach provides scalability, cost-effectiveness, and decouples file storage from the database.
The file_record table contains the following columns:
file_id (primary key): Unique identifier for the filedisplay_name: Human-readable name for the filefile_origin: Origin/source of the file (enum)file_type: MIME type of the filefile_metadata: Additional metadata as JSONbucket_name: External storage bucket/container nameobject_key: External storage object key/pathcreated_at: Timestamp when the file was createdupdated_at: Timestamp when the file was last updatedStores files in external S3-compatible storage systems while keeping metadata in the database.
Pros:
Cons:
All configuration is handled via environment variables. The system requires S3-compatible storage to be configured.
S3_FILE_STORE_BUCKET_NAME=your-bucket-name # Defaults to 'onyx-file-store-bucket'
S3_FILE_STORE_PREFIX=onyx-files # Optional, defaults to 'onyx-files'
# AWS credentials (use one of these methods):
# 1. Environment variables
S3_AWS_ACCESS_KEY_ID=your-access-key
S3_AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION_NAME=us-east-2 # Optional, defaults to 'us-east-2'
# 2. IAM roles (recommended for EC2/ECS deployments)
# No additional configuration needed if using IAM roles
S3_FILE_STORE_BUCKET_NAME=your-bucket-name
S3_ENDPOINT_URL=http://localhost:9000 # MinIO endpoint
S3_AWS_ACCESS_KEY_ID=minioadmin
S3_AWS_SECRET_ACCESS_KEY=minioadmin
AWS_REGION_NAME=us-east-1 # Any region name
S3_VERIFY_SSL=false # Optional, defaults to false
S3_FILE_STORE_BUCKET_NAME=your-space-name
S3_ENDPOINT_URL=https://nyc3.digitaloceanspaces.com
S3_AWS_ACCESS_KEY_ID=your-spaces-key
S3_AWS_SECRET_ACCESS_KEY=your-spaces-secret
AWS_REGION_NAME=nyc3
The file store works with any S3-compatible service. Simply configure:
S3_FILE_STORE_BUCKET_NAME: Your bucket/container nameS3_ENDPOINT_URL: The service endpoint URLS3_AWS_ACCESS_KEY_ID and S3_AWS_SECRET_ACCESS_KEY: Your credentialsAWS_REGION_NAME: The region (any valid region name)The system uses the S3BackedFileStore class that implements the abstract FileStore interface. The database uses generic column names (bucket_name, object_key) to maintain compatibility with different S3-compatible services.
The FileStore abstract base class defines the following methods:
initialize(): Initialize the storage backend (create bucket if needed)has_file(file_id, file_origin, file_type): Check if a file existssave_file(content, display_name, file_origin, file_type, file_metadata, file_id): Save a fileread_file(file_id, mode, use_tempfile): Read file contentread_file_record(file_id): Get file metadata from databasedelete_file(file_id): Delete a file and its metadataget_file_with_mime_type(file_id): Get file with parsed MIME typefrom onyx.file_store.file_store import get_default_file_store
from onyx.configs.constants import FileOrigin
# Get the configured file store
file_store = get_default_file_store(db_session)
# Initialize the storage backend (creates bucket if needed)
file_store.initialize()
# Save a file
with open("example.pdf", "rb") as f:
file_id = file_store.save_file(
content=f,
display_name="Important Document.pdf",
file_origin=FileOrigin.OTHER,
file_type="application/pdf",
file_metadata={"department": "engineering", "version": "1.0"}
)
# Check if a file exists
exists = file_store.has_file(
file_id=file_id,
file_origin=FileOrigin.OTHER,
file_type="application/pdf"
)
# Read a file
file_content = file_store.read_file(file_id)
# Read file with temporary file (for large files)
file_content = file_store.read_file(file_id, use_tempfile=True)
# Get file metadata
file_record = file_store.read_file_record(file_id)
# Get file with MIME type detection
file_with_mime = file_store.get_file_with_mime_type(file_id)
# Delete a file
file_store.delete_file(file_id)
When deploying the application, ensure that:
S3_FILE_STORE_BUCKET_NAME exists or the service account has permissions to create itfile_store.initialize() during application startup to ensure the bucket existsThe file store will automatically create the bucket if it doesn't exist and the credentials have sufficient permissions.