This document covers installation, configuration, instance management, storage options, and deployment strategies for LaminDB.
# Install the current validated baseline
uv pip install 'lamindb==2.5.1'
# Minimal namespace-only package for lightweight clients
uv pip install 'lamindb-core==2.5.1'
Install optional dependencies for specific functionality:
# Google Cloud Platform support
uv pip install 'lamindb[gcp]==2.5.1'
# Flow cytometry formats
uv pip install 'lamindb[fcs]==2.5.1'
# Array storage and streaming (Zarr v2 support)
uv pip install 'lamindb[zarr-v2]==2.5.1'
# AWS S3 support (usually included by default)
uv pip install 'lamindb==2.5.1'
# Multiple extras
uv pip install 'lamindb[gcp,zarr-v2,fcs]==2.5.1'
# Biological ontologies (Bionty)
uv pip install 'bionty==2.4.0'
# Wet lab functionality
uv pip install 'lamindb-wetlab==<reviewed-version>'
# Clinical schemas, such as clinicore or OMOP-focused modules
uv pip install '<clinical-module>==<reviewed-version>'
Use a project lock file for production deployments. The pinned LaminDB and Bionty versions above reflect the current refresh baseline; for optional modules, look up the current official package and pin the reviewed version rather than installing a floating latest release.
import lamindb as ln
print(ln.__version__)
# Check available modules
import bionty as bt
print(bt.__version__)
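The version prints above can be turned into an automated check against the pinned baseline. The following is a minimal sketch, not part of LaminDB: `PINS` and `find_pin_mismatches` are hypothetical names, and the pinned versions are copied from the install commands in this document.

```python
from importlib import metadata

# Pins copied from the install commands above (assumed baseline).
PINS = {"lamindb": "2.5.1", "bionty": "2.4.0"}


def find_pin_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for package, (expected, installed) in find_pin_mismatches(PINS).items():
        print(f"{package}: expected {expected}, found {installed}")
```

The `get_version` parameter exists only so the check can be exercised without the packages installed; in normal use the default lookup through `importlib.metadata` is enough.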
# Login with API key
lamin login
# You'll be prompted to enter your API key
# API key is stored locally at ~/.lamin/
Data Privacy: LaminDB authentication only collects basic metadata (email, user information). Your actual data remains private and is not sent to LaminDB servers.
Local vs Cloud: Local lamin init instances do not require LaminHub login. Login is required for LaminHub-managed collaboration and cloud-hosted metadata.
Credential handling: Never ask the agent to print API keys, cloud secrets, or database URLs containing passwords. When debugging, check only whether named variables exist and redact values in terminal output and logs.
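One way to follow the credential-handling rule is to report only whether each named variable is set, never its value. This is a small sketch under that assumption; `secret_presence` and `format_report` are illustrative helper names, not LaminDB functions.

```python
import os


def secret_presence(names, env=None):
    """Map each variable name to True if it is set to a non-empty value."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in names}


def format_report(presence):
    """Render the presence map without ever including a secret value."""
    return "\n".join(
        f"{name}: {'set' if is_set else 'MISSING'}"
        for name, is_set in sorted(presence.items())
    )


if __name__ == "__main__":
    names = ["LAMIN_DB_URL", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
    print(format_report(secret_presence(names)))
```

Because the report is built only from names and booleans, it is safe to paste into logs or a chat transcript.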
For local development and small datasets:
# Initialize in current directory
lamin init --storage ./mydata
# Initialize in specific directory
lamin init --storage /path/to/data
# Initialize with specific modules
lamin init --storage ./mydata --modules bionty
# Initialize with multiple modules
lamin init --storage ./mydata --modules bionty,wetlab
Use cloud storage but local SQLite database:
# AWS S3
lamin init --storage s3://my-bucket/path
# Google Cloud Storage
lamin init --storage gs://my-bucket/path
# S3-compatible (MinIO, Cloudflare R2)
lamin init --storage 's3://bucket?endpoint_url=http://endpoint:9000'
For production deployments:
# S3 + PostgreSQL
export LAMIN_DB_URL='<set-in-secret-manager>'
lamin init --storage s3://my-bucket/path \
--db "$LAMIN_DB_URL" \
--modules bionty
# GCS + PostgreSQL
export LAMIN_DB_URL='<set-in-secret-manager>'
lamin init --storage gs://my-bucket/path \
--db "$LAMIN_DB_URL" \
--modules bionty
Set LAMIN_DB_URL through a secret manager or secure shell session. Do not commit it to configuration files, notebooks, or chat transcripts.
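If a database URL must appear in a log line, masking the password first keeps the secret out of the output. The sketch below assumes the common `scheme://user:password@host:port/database` layout; `redact_db_url` is a hypothetical helper, and it does not handle bracketed IPv6 hosts.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_db_url(url: str) -> str:
    """Return the URL with any password replaced by '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    user = parts.username or ""
    netloc = f"{user}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Example value only; real URLs should come from a secret manager.
    print(redact_db_url("postgresql://user:secret@db.example.com:5432/lamin"))
```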
# Specify instance name
lamin init --storage ./mydata --name my-project
# Default name uses directory name
lamin init --storage ./mydata # Instance name: "mydata"
# By name
lamin connect my-project
# By full path
lamin connect account_handle/my-project
# Connect to someone else's instance
lamin connect other-user/their-project
# Requires appropriate permissions
# Show the current user and instance
lamin info
# Switch instance
lamin connect another-instance
# Close current instance
lamin close
Advantages:
Setup:
lamin init --storage ./data
Advantages:
Setup:
# Prefer IAM roles or workload identity. If local variables are needed,
# set them outside shared scripts and never echo their values.
export AWS_ACCESS_KEY_ID='<redacted>'
export AWS_SECRET_ACCESS_KEY='<redacted>'
export AWS_DEFAULT_REGION='us-east-1'
# Initialize
export LAMIN_DB_URL='<set-in-secret-manager>'
lamin init --storage s3://my-bucket/project-data \
--db "$LAMIN_DB_URL"
S3 Permissions Required:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::my-bucket/*",
"arn:aws:s3:::my-bucket"
]
}
]
}
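The same policy can be generated per bucket rather than edited by hand. This is a minimal sketch; `build_bucket_policy` is an illustrative name, and the actions and resources simply mirror the JSON above. Note that `s3:ListBucket` applies to the bucket ARN, while the object actions apply to the `/*` ARN, which is why both resources are listed.

```python
import json


def build_bucket_policy(bucket: str) -> dict:
    """Build the read/write/list policy shown above for one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket}/*",  # objects in the bucket
                    f"arn:aws:s3:::{bucket}",  # the bucket itself (listing)
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_bucket_policy("my-bucket"), indent=2))
```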
Setup:
# Authenticate
gcloud auth application-default login
# Or use service account
export GOOGLE_APPLICATION_CREDENTIALS=/secure/path/to/service-account.json
# Initialize
export LAMIN_DB_URL='<set-in-secret-manager>'
lamin init --storage gs://my-bucket/project-data \
--db "$LAMIN_DB_URL"
For MinIO, Cloudflare R2, or other S3-compatible services:
# MinIO example
export AWS_ACCESS_KEY_ID='<redacted>'
export AWS_SECRET_ACCESS_KEY='<redacted>'
lamin init --storage 's3://my-bucket?endpoint_url=http://minio.example.com:9000'
# Cloudflare R2 example
export AWS_ACCESS_KEY_ID='<redacted>'
export AWS_SECRET_ACCESS_KEY='<redacted>'
lamin init --storage 's3://bucket?endpoint_url=https://account-id.r2.cloudflarestorage.com'
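The storage strings above combine a bucket root with an optional `endpoint_url` query parameter. The sketch below shows how such a string decomposes, using only the standard library; `describe_storage` is a hypothetical helper for validating configuration before running `lamin init`, not a description of how LaminDB parses the value internally.

```python
from urllib.parse import parse_qs, urlsplit


def describe_storage(storage: str) -> dict:
    """Split a --storage value into kind, root, and optional endpoint URL."""
    parts = urlsplit(storage)
    if parts.scheme not in ("s3", "gs"):
        # Anything else is treated as a local path in this sketch.
        return {"kind": "local", "root": storage, "endpoint_url": None}
    query = parse_qs(parts.query)
    endpoint = query.get("endpoint_url", [None])[0]
    root = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return {"kind": parts.scheme, "root": root, "endpoint_url": endpoint}


if __name__ == "__main__":
    print(describe_storage("s3://my-bucket?endpoint_url=http://minio.example.com:9000"))
```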
Advantages:
Limitations:
Setup:
# SQLite is default
lamin init --storage ./data
# Database stored at ./data/.lamindb/
Advantages:
Setup:
# Full connection string stored in a named secret
export LAMIN_DB_URL='<set-in-secret-manager>'
lamin init --storage s3://bucket/path \
--db "$LAMIN_DB_URL"
# With SSL
export LAMIN_DB_URL='<set-in-secret-manager-with-ssl>'
lamin init --storage s3://bucket/path \
--db "$LAMIN_DB_URL"
PostgreSQL Versions: Compatible with PostgreSQL 12+
# Check current schema version
lamin migrate check
# Upgrade schema
lamin migrate deploy
# View migration history
lamin migrate history
LaminDB maintains a local cache for cloud files:
import lamindb as ln
# View cache location
print(ln.settings.cache_dir)
# Set cache directory
lamin cache set /path/to/cache
# View current cache settings
lamin cache get
For shared systems with multiple users:
# Create system settings file
sudo mkdir -p /system/settings
sudo nano /system/settings/system.env
Add to system.env:
lamindb_cache_path=/shared/cache/lamindb
Ensure permissions:
sudo chmod 755 /shared/cache/lamindb
sudo chown -R shared-user:shared-group /shared/cache/lamindb
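After applying the permissions, it can help to confirm that the directory really has the intended mode. The following sketch reads the permission bits with the standard library; `permission_bits` and `is_mode` are illustrative names, and the `/shared/cache/lamindb` path is the example path from this document.

```python
import stat
from pathlib import Path


def permission_bits(path) -> int:
    """Return only the permission bits of a path, e.g. 0o755."""
    return stat.S_IMODE(Path(path).stat().st_mode)


def is_mode(path, expected: int = 0o755) -> bool:
    """Check whether the path has exactly the expected permission bits."""
    return permission_bits(path) == expected


if __name__ == "__main__":
    cache = Path("/shared/cache/lamindb")  # example path from this document
    if cache.exists():
        print(f"{cache}: {oct(permission_bits(cache))}")
    else:
        print(f"{cache} does not exist")
```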
import lamindb as ln
# Clear cache for specific artifact
artifact = ln.Artifact.get(key="data.h5ad")
artifact.delete_cache()
# Check if artifact is cached
if artifact.is_cached():
    print("Already cached")
# Manually clear entire cache
import shutil
shutil.rmtree(ln.settings.cache_dir)
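Before removing the whole cache, it may be worth measuring how much space it actually uses. This sketch walks a directory with the standard library only; `directory_size_bytes` and `human_readable` are hypothetical helpers, and the path passed in would be whatever `ln.settings.cache_dir` reports.

```python
from pathlib import Path


def directory_size_bytes(root) -> int:
    """Sum the sizes of all regular files below root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


if __name__ == "__main__":
    # Replace "." with the cache directory reported by LaminDB.
    print(human_readable(directory_size_bytes(".")))
```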
import lamindb as ln
# User settings
print(ln.setup.settings.user)
# User(handle='username', email='<email>', name='Full Name')
# Instance settings
print(ln.setup.settings.instance)
# Instance(name='my-project', storage='s3://bucket/path')
# Set development directory for relative keys
lamin settings set dev-dir /path/to/project
# Configure git sync
lamin settings set sync-git-repo https://github.com/user/repo.git
# View all settings
lamin settings
# Cache directory
export LAMIN_CACHE_DIR=/path/to/cache
# Settings directory
export LAMIN_SETTINGS_DIR=/path/to/settings
# Git sync
export LAMINDB_SYNC_GIT_REPO=https://github.com/user/repo.git
# Database URL for setup examples; keep the value secret
export LAMIN_DB_URL='<set-in-secret-manager>'
# Current instance info
lamin info
# List all instances
lamin ls
# View instance details
lamin instance details
# Set instance visibility (requires LaminHub)
lamin instance set-visibility public
lamin instance set-visibility private
# Invite collaborators (requires LaminHub)
lamin instance invite <collaborator-email>
# Backup instance
lamin backup create
# Restore from backup
lamin backup restore backup_id
# Export instance metadata
lamin export instance-metadata.json
# Delete instance (preserves data, removes metadata)
lamin delete --force instance-name
# This only removes the LaminDB metadata
# Actual data in storage location remains
Development:
# Local development
lamin init --storage ./dev-data --modules bionty
Production:
# Cloud production
export LAMIN_DB_URL='<set-in-secret-manager>'
lamin init --storage s3://prod-bucket/data \
--db "$LAMIN_DB_URL" \
--modules bionty \
--name production
Migration: Export artifacts from dev, import to prod
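# Export from dev (Python, while connected to the dev instance)
import lamindb as ln
from pathlib import Path
artifacts = ln.Artifact.filter().all()
for artifact in artifacts:
    artifact.export("/tmp/export/")
# Switch to prod (shell command, run outside Python)
lamin connect production
# Import to prod (Python, in a new session connected to prod)
for file in Path("/tmp/export/").glob("*"):
    ln.Artifact(str(file), key=file.name).save()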
Deploy instances in multiple regions for data sovereignty:
# US instance
export LAMIN_DB_URL_US='<set-in-secret-manager>'
lamin init --storage s3://us-bucket/data \
--db "$LAMIN_DB_URL_US" \
--name us-production
# EU instance
export LAMIN_DB_URL_EU='<set-in-secret-manager>'
lamin init --storage s3://eu-bucket/data \
--db "$LAMIN_DB_URL_EU" \
--name eu-production
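With one instance per region, client code needs a way to pick the right instance and confirm its database secret is available. This is a sketch only: the `REGIONS` table and `resolve_region` are hypothetical, built from the instance and variable names used in the commands above.

```python
import os

# Region table assembled from the example commands above.
REGIONS = {
    "us": {"instance": "us-production", "db_url_var": "LAMIN_DB_URL_US"},
    "eu": {"instance": "eu-production", "db_url_var": "LAMIN_DB_URL_EU"},
}


def resolve_region(region: str, env=None) -> str:
    """Return the instance name for a region, checking its secret is set."""
    env = os.environ if env is None else env
    try:
        config = REGIONS[region]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; expected one of {sorted(REGIONS)}"
        ) from None
    if not env.get(config["db_url_var"]):
        # Report the variable name only, never its value.
        raise RuntimeError(f"{config['db_url_var']} is not set")
    return config["instance"]
```

The returned name can then be passed to `lamin connect`, so that data for a region is only ever written through that region's instance.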
Multiple users, shared data:
# Shared storage with user-specific DB
export LAMIN_DB_URL_USER1='<set-in-secret-manager>'
lamin init --storage s3://shared-bucket/data \
--db "$LAMIN_DB_URL_USER1" \
--name user1-workspace
export LAMIN_DB_URL_USER2='<set-in-secret-manager>'
lamin init --storage s3://shared-bucket/data \
--db "$LAMIN_DB_URL_USER2" \
--name user2-workspace
# Use connection pooling for PostgreSQL
# Configure in database server settings
# Optimize queries with indexes
# LaminDB creates indexes automatically for common queries
# Use appropriate storage classes
# S3: STANDARD for frequent access, INTELLIGENT_TIERING for mixed access
# Configure multipart upload thresholds (applies to AWS CLI transfers)
aws configure set default.s3.multipart_threshold 100MB
# Pre-cache frequently used artifacts
artifacts = ln.Artifact.filter(key__startswith="reference/")
for artifact in artifacts:
    artifact.cache()  # Download to cache
# Use backed mode for large arrays
adata = artifact.backed() # Don't load into memory
Credentials Management:
Access Control:
Network Security:
Data Protection:
import lamindb as ln
from pathlib import Path
# Check database connection
try:
    ln.Artifact.filter().count()
    print("✓ Database connected")
except Exception as e:
    print(f"✗ Database error: {e}")
# Check storage access
try:
    # Create the local file first so there is something to upload
    Path("test.txt").write_text("healthcheck")
    test_artifact = ln.Artifact("test.txt", key="healthcheck.txt").save()
    test_artifact.delete(permanent=True)
    print("✓ Storage accessible")
except Exception as e:
    print(f"✗ Storage error: {e}")
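The two try/except checks above follow the same pattern, so they can be generalized into a small runner that collects results for monitoring. This is a sketch; `run_checks` is a hypothetical helper, and the checks passed to it would wrap the database and storage calls shown above.

```python
def run_checks(checks):
    """Run each named check and record (ok, detail) without raising."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = (True, "")
        except Exception as exc:  # a health check should never crash the runner
            results[name] = (False, f"{type(exc).__name__}: {exc}")
    return results


def all_healthy(results) -> bool:
    """True only if every check passed."""
    return all(ok for ok, _ in results.values())


if __name__ == "__main__":
    # Placeholder check; replace with the database and storage checks.
    report = run_checks({"noop": lambda: None})
    print(report, all_healthy(report))
```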
# Enable debug logging
import logging
logging.basicConfig(level=logging.DEBUG)
# LaminDB operations will produce detailed logs
# Regular database backups (PostgreSQL)
pg_dump -h hostname -U username -d database > backup_$(date +%Y%m%d).sql
# Storage backups (S3 versioning)
aws s3api put-bucket-versioning \
--bucket my-bucket \
--versioning-configuration Status=Enabled
# Metadata export
lamin export metadata_backup.json
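For scheduled backups, the dated `pg_dump` invocation above can be assembled in Python as an argument list, which avoids shell quoting issues. This is a sketch under assumptions: `pg_dump_command` is an illustrative helper, and it writes through `pg_dump`'s `-f` output-file option instead of shell redirection.

```python
from datetime import date


def pg_dump_command(host: str, user: str, database: str, day=None) -> list:
    """Build a pg_dump argument list that writes backup_YYYYMMDD.sql."""
    day = date.today() if day is None else day
    outfile = f"backup_{day:%Y%m%d}.sql"
    return ["pg_dump", "-h", host, "-U", user, "-d", database, "-f", outfile]


if __name__ == "__main__":
    # Placeholder connection details; print the command instead of running it.
    print(" ".join(pg_dump_command("hostname", "username", "database")))
```

The list can be handed to `subprocess.run` once the connection details and password handling (for example a `.pgpass` file) are in place.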
Issue: Cannot connect to instance
# Check instance exists
lamin ls
# Verify authentication
lamin login
# Re-connect
lamin connect instance-name
Issue: Storage permissions denied
# Check AWS credentials
aws s3 ls s3://your-bucket/
# Check GCS credentials
gsutil ls gs://your-bucket/
# Verify IAM permissions
Issue: Database connection error
# Test PostgreSQL connection
psql "$LAMIN_DB_URL"
# Check database version compatibility
lamin migrate check
Issue: Cache full
# Clear cache
import lamindb as ln
import shutil
shutil.rmtree(ln.settings.cache_dir)
# Set larger cache location
lamin cache set /larger/disk/cache
# Upgrade after reviewing changelog and compatibility matrix
uv pip install --upgrade 'lamindb==2.5.1'
# Upgrade database schema
lamin migrate deploy
Check the compatibility matrix to ensure your database schema version is compatible with your installed LaminDB version.
Major version upgrades may require migration:
# Check for breaking changes
lamin migrate check
# Review migration plan
lamin migrate plan
# Execute migration
lamin migrate deploy