scientific-skills/lamindb/references/setup-deployment.md
This document covers installation, configuration, instance management, storage options, and deployment strategies for LaminDB.
# Install LaminDB
pip install lamindb
# Or with pip3
pip3 install lamindb
Install optional dependencies for specific functionality:
# Google Cloud Platform support
pip install 'lamindb[gcp]'
# Flow cytometry formats
pip install 'lamindb[fcs]'
# Array storage and streaming (Zarr support)
pip install 'lamindb[zarr]'
# AWS S3 support (usually included by default)
pip install 'lamindb[aws]'
# Multiple extras
pip install 'lamindb[gcp,zarr,fcs]'
# Biological ontologies (Bionty)
pip install bionty
# Wet lab functionality
pip install lamindb-wetlab
# Clinical data (OMOP CDM)
pip install lamindb-clinical
import lamindb as ln
print(ln.__version__)
# Check available modules
import bionty as bt
print(bt.__version__)
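Before importing anything, it can help to confirm which packages are actually installed. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of LaminDB.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the packages mentioned above without importing them.
for name in ("lamindb", "bionty"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```

Checking distribution metadata avoids triggering any import-time side effects of the packages themselves.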
# Log in with your API key
lamin login
# You'll be prompted to enter your API key
# The API key is stored locally in ~/.lamin/
Data Privacy: LaminDB authentication only collects basic metadata (email, user information). Your actual data remains private and is not sent to LaminDB servers.
Local vs Cloud: Authentication is required even for local-only usage to enable collaboration features and instance management.
For local development and small datasets:
# Initialize in current directory
lamin init --storage ./mydata
# Initialize in specific directory
lamin init --storage /path/to/data
# Initialize with specific modules
lamin init --storage ./mydata --modules bionty
# Initialize with multiple modules
lamin init --storage ./mydata --modules bionty,wetlab
Use cloud storage with a local SQLite database:
# AWS S3
lamin init --storage s3://my-bucket/path
# Google Cloud Storage
lamin init --storage gs://my-bucket/path
# S3-compatible (MinIO, Cloudflare R2)
lamin init --storage 's3://bucket?endpoint_url=http://endpoint:9000'
For production deployments:
# S3 + PostgreSQL
lamin init --storage s3://my-bucket/path \
--db postgresql://user:password@hostname:5432/dbname \
--modules bionty
# GCS + PostgreSQL
lamin init --storage gs://my-bucket/path \
--db postgresql://user:password@hostname:5432/dbname \
--modules bionty
# Specify instance name
lamin init --storage ./mydata --name my-project
# Default name uses directory name
lamin init --storage ./mydata # Instance name: "mydata"
# By name
lamin connect my-project
# By full path
lamin connect account_handle/my-project
# Connect to someone else's instance
lamin connect other-user/their-project
# Requires appropriate permissions
# List available instances
lamin info
# Switch instance
lamin connect another-instance
# Close current instance
lamin close
Advantages:
Setup:
lamin init --storage ./data
Advantages:
Setup:
# Set credentials
export AWS_ACCESS_KEY_ID=your_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1
# Initialize
lamin init --storage s3://my-bucket/project-data \
--db postgresql://user:pwd@host:5432/db
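A quick pre-flight check for the three environment variables exported above might look like the sketch below. The helper `missing_vars` is a hypothetical name, and the check is deliberately narrow: AWS credentials can also come from profiles or instance roles, which this sketch does not inspect.

```python
import os
from typing import List, Mapping, Sequence

REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_vars(env: Mapping[str, str], required: Sequence[str] = REQUIRED_AWS_VARS) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_vars(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All AWS environment variables are set")
```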
S3 Permissions Required:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket/*",
        "arn:aws:s3:::my-bucket"
      ]
    }
  ]
}
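If the same policy is needed for several buckets, it can be generated rather than hand-edited. This is a small sketch that reproduces the document above for an arbitrary bucket name; the function name `s3_policy` is an assumption made for illustration.

```python
import json


def s3_policy(bucket: str) -> dict:
    """Build the policy document shown above for the given bucket name."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                ],
                # Object-level actions apply to "<bucket>/*", ListBucket to the bucket itself.
                "Resource": [
                    f"arn:aws:s3:::{bucket}/*",
                    f"arn:aws:s3:::{bucket}",
                ],
            }
        ],
    }


print(json.dumps(s3_policy("my-bucket"), indent=2))
```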
Setup:
# Authenticate
gcloud auth application-default login
# Or use service account
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
# Initialize
lamin init --storage gs://my-bucket/project-data \
--db postgresql://user:pwd@host:5432/db
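When using a service account, a common failure is a wrong or unreadable path in `GOOGLE_APPLICATION_CREDENTIALS`. The sketch below only inspects the file locally and does not contact Google Cloud; `check_service_account_file` is a hypothetical helper, and it relies on the fact that service-account key files are JSON documents with `"type": "service_account"`.

```python
import json
import os
from pathlib import Path
from typing import Optional


def check_service_account_file(path: Optional[str]) -> str:
    """Return a short status string for a service-account key file path."""
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    file = Path(path)
    if not file.is_file():
        return f"file not found: {path}"
    try:
        data = json.loads(file.read_text())
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        return "file is not valid JSON"
    if not isinstance(data, dict) or data.get("type") != "service_account":
        return "JSON does not look like a service-account key"
    return "ok"


print(check_service_account_file(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")))
```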
For MinIO, Cloudflare R2, or other S3-compatible services:
# MinIO example
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
lamin init --storage 's3://my-bucket?endpoint_url=http://minio.example.com:9000'
# Cloudflare R2 example
export AWS_ACCESS_KEY_ID=your_r2_access_key
export AWS_SECRET_ACCESS_KEY=your_r2_secret_key
lamin init --storage 's3://bucket?endpoint_url=https://account-id.r2.cloudflarestorage.com'
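The storage strings above follow the pattern `s3://<bucket>?endpoint_url=<url>`. A minimal sketch for assembling that string programmatically is shown below; `s3_compatible_storage` is a hypothetical helper, and the result still needs to be quoted when passed on a shell command line.

```python
from urllib.parse import urlencode


def s3_compatible_storage(bucket: str, endpoint_url: str) -> str:
    """Build an s3:// storage string carrying the endpoint as a query parameter."""
    # Keep ':' and '/' unescaped so the endpoint stays readable, as in the examples above.
    query = urlencode({"endpoint_url": endpoint_url}, safe=":/")
    return f"s3://{bucket}?{query}"


print(s3_compatible_storage("my-bucket", "http://minio.example.com:9000"))
```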
Advantages:
Limitations:
Setup:
# SQLite is default
lamin init --storage ./data
# Database stored at ./data/.lamindb/
Advantages:
Setup:
# Full connection string
lamin init --storage s3://bucket/path \
--db postgresql://username:password@hostname:5432/database
# With SSL
lamin init --storage s3://bucket/path \
--db "postgresql://user:pwd@host:5432/db?sslmode=require"
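Passwords containing characters such as `@` or `/` break a hand-written connection string unless they are percent-encoded. The sketch below assembles a URL in the same shape as the examples above; `postgres_url` is a hypothetical helper, not a LaminDB function.

```python
from typing import Optional
from urllib.parse import quote


def postgres_url(user: str, password: str, host: str, port: int, dbname: str,
                 sslmode: Optional[str] = None) -> str:
    """Build a postgresql:// URL, percent-encoding the user name and password."""
    url = f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{dbname}"
    if sslmode:
        url += f"?sslmode={sslmode}"
    return url


print(postgres_url("user", "p@ss/word", "host", 5432, "db", sslmode="require"))
```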
PostgreSQL Versions: Compatible with PostgreSQL 12+
# Check current schema version
lamin migrate check
# Upgrade schema
lamin migrate deploy
# View migration history
lamin migrate history
LaminDB maintains a local cache for cloud files:
import lamindb as ln
# View cache location (Python)
print(ln.settings.cache_dir)
# Set cache directory (shell command)
lamin cache set /path/to/cache
# View current cache settings (shell command)
lamin cache get
For shared systems with multiple users:
# Create system settings file
sudo mkdir -p /system/settings
sudo nano /system/settings/system.env
Add to system.env:
lamindb_cache_path=/shared/cache/lamindb
Ensure permissions:
sudo mkdir -p /shared/cache/lamindb
sudo chown -R shared-user:shared-group /shared/cache/lamindb
sudo chmod 755 /shared/cache/lamindb
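To sanity-check a `system.env` file before rolling it out, its `key=value` lines can be parsed with a few lines of Python. This is only an illustration of the file format shown above, with a hypothetical helper `parse_env_text`; it is not how LaminDB itself reads the file.

```python
def parse_env_text(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks, comments, and malformed lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


example = "# shared settings\nlamindb_cache_path=/shared/cache/lamindb\n"
print(parse_env_text(example))
```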
import lamindb as ln
# Clear cache for specific artifact
artifact = ln.Artifact.get(key="data.h5ad")
artifact.delete_cache()
# Check if artifact is cached
if artifact.is_cached():
    print("Already cached")
# Manually clear entire cache
import shutil
shutil.rmtree(ln.settings.cache_dir)
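Before deleting the whole cache, it may be worth measuring how much space it actually uses. The sketch below is plain standard-library Python with a hypothetical helper `dir_size_bytes`; the demo runs against a temporary directory, and in practice you would pass the cache directory path instead.

```python
import tempfile
from pathlib import Path


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under `root` (0 if it is not a directory)."""
    base = Path(root)
    if not base.is_dir():
        return 0
    return sum(p.stat().st_size for p in base.rglob("*") if p.is_file())


# Demo on a throwaway directory containing two small files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.bin").write_bytes(b"x" * 10)
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "sub" / "b.bin").write_bytes(b"y" * 5)
    demo_size = dir_size_bytes(tmp)

print(f"demo directory size: {demo_size} bytes")
```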
import lamindb as ln
# User settings
print(ln.setup.settings.user)
# User(handle='username', email='user@example.com', name='Full Name')
# Instance settings
print(ln.setup.settings.instance)
# Instance(name='my-project', storage='s3://bucket/path')
# Set development directory for relative keys
lamin settings set dev-dir /path/to/project
# Configure git sync
lamin settings set sync-git-repo https://github.com/user/repo.git
# View all settings
lamin settings
# Cache directory
export LAMIN_CACHE_DIR=/path/to/cache
# Settings directory
export LAMIN_SETTINGS_DIR=/path/to/settings
# Git sync
export LAMINDB_SYNC_GIT_REPO=https://github.com/user/repo.git
# Current instance info
lamin info
# List all instances
lamin ls
# View instance details
lamin instance details
# Set instance visibility (requires LaminHub)
lamin instance set-visibility public
lamin instance set-visibility private
# Invite collaborators (requires LaminHub)
lamin instance invite collaborator@example.com
# Backup instance
lamin backup create
# Restore from backup
lamin backup restore backup_id
# Export instance metadata
lamin export instance-metadata.json
# Delete instance (preserves data, removes metadata)
lamin delete --force instance-name
# This only removes the LaminDB metadata
# Actual data in storage location remains
Development:
# Local development
lamin init --storage ./dev-data --modules bionty
Production:
# Cloud production
lamin init --storage s3://prod-bucket/data \
--db postgresql://user:pwd@db-host:5432/prod-db \
--modules bionty \
--name production
Migration: Export artifacts from dev, import to prod
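# Export from dev (Python, run while connected to the dev instance)
from pathlib import Path
import lamindb as ln
artifacts = ln.Artifact.filter().all()
for artifact in artifacts:
    artifact.export("/tmp/export/")
# Switch to prod (shell command, run outside Python)
# lamin connect production
# Import to prod (Python, run while connected to the prod instance)
for file in Path("/tmp/export/").glob("*"):
    ln.Artifact(str(file), key=file.name).save()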
Deploy instances in multiple regions for data sovereignty:
# US instance
lamin init --storage s3://us-bucket/data \
--db postgresql://user:pwd@us-db:5432/db \
--name us-production
# EU instance
lamin init --storage s3://eu-bucket/data \
--db postgresql://user:pwd@eu-db:5432/db \
--name eu-production
Multiple users, shared data:
# Shared storage with user-specific DB
lamin init --storage s3://shared-bucket/data \
--db postgresql://user1:pwd@host:5432/user1_db \
--name user1-workspace
lamin init --storage s3://shared-bucket/data \
--db postgresql://user2:pwd@host:5432/user2_db \
--name user2-workspace
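When many workspaces follow the same pattern, the `lamin init` command lines can be generated instead of typed by hand. The sketch below only builds and prints the commands, using the `--storage`, `--db`, `--modules`, and `--name` flags shown in this document; `lamin_init_args` is a hypothetical helper.

```python
import shlex
from typing import List, Optional


def lamin_init_args(storage: str, db: Optional[str] = None,
                    modules: Optional[List[str]] = None,
                    name: Optional[str] = None) -> List[str]:
    """Assemble the argument list for a `lamin init` invocation."""
    args = ["lamin", "init", "--storage", storage]
    if db:
        args += ["--db", db]
    if modules:
        args += ["--modules", ",".join(modules)]
    if name:
        args += ["--name", name]
    return args


# Print one command per user, mirroring the two workspaces above.
for user in ("user1", "user2"):
    command = lamin_init_args(
        storage="s3://shared-bucket/data",
        db=f"postgresql://{user}:pwd@host:5432/{user}_db",
        name=f"{user}-workspace",
    )
    print(shlex.join(command))
```

Keeping the arguments as a list (rather than one string) means they can be passed to `subprocess.run` without shell quoting issues.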
# Use connection pooling for PostgreSQL
# Configure in database server settings
# Optimize queries with indexes
# LaminDB creates indexes automatically for common queries
# Use appropriate storage classes
# S3: STANDARD for frequent access, INTELLIGENT_TIERING for mixed access
# Configure the multipart upload threshold (applies to AWS CLI transfers)
aws configure set default.s3.multipart_threshold 100MB
# Pre-cache frequently used artifacts
artifacts = ln.Artifact.filter(key__startswith="reference/")
for artifact in artifacts:
    artifact.cache()  # Download to cache
# Use backed mode for large arrays
adata = artifact.backed() # Don't load into memory
Credentials Management:
Access Control:
Network Security:
Data Protection:
import lamindb as ln
# Check database connection
try:
    ln.Artifact.filter().count()
    print("✓ Database connected")
except Exception as e:
    print(f"✗ Database error: {e}")
# Check storage access
from pathlib import Path
try:
    Path("test.txt").write_text("healthcheck")  # create a small local file to upload
    test_artifact = ln.Artifact("test.txt", key="healthcheck.txt").save()
    test_artifact.delete(permanent=True)
    print("✓ Storage accessible")
except Exception as e:
    print(f"✗ Storage error: {e}")
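The try/except pattern above generalizes to any number of checks. A minimal sketch of a reusable runner follows; `run_checks` and the demo checks are hypothetical, and in practice the callables would wrap the database and storage probes shown above.

```python
from typing import Callable, Dict


def run_checks(checks: Dict[str, Callable[[], object]]) -> Dict[str, str]:
    """Run each check and record 'ok' or the error message, without stopping early."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"error: {exc}"
    return results


def failing_check():
    # Stand-in for a probe that cannot reach its backend.
    raise RuntimeError("storage unreachable")


report = run_checks({"database": lambda: 1, "storage": failing_check})
for name, status in report.items():
    print(f"{name}: {status}")
```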
# Enable debug logging
import logging
logging.basicConfig(level=logging.DEBUG)
# LaminDB operations will produce detailed logs
# Regular database backups (PostgreSQL)
pg_dump -h hostname -U username -d database > backup_$(date +%Y%m%d).sql
# Storage backups (S3 versioning)
aws s3api put-bucket-versioning \
--bucket my-bucket \
--versioning-configuration Status=Enabled
# Metadata export
lamin export metadata_backup.json
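For scheduled backups, the dated file name and the `pg_dump` arguments can be built in Python and handed to a job runner. The sketch below only constructs the arguments and does not execute anything; the helper names are hypothetical, and it uses `pg_dump`'s `-f` option in place of the shell redirection shown above.

```python
import datetime
from typing import List


def backup_filename(day: datetime.date) -> str:
    """Return a name like backup_20240131.sql for the given date."""
    return f"backup_{day:%Y%m%d}.sql"


def pg_dump_args(host: str, user: str, database: str, day: datetime.date) -> List[str]:
    """Build the pg_dump argument list, writing output to a dated file via -f."""
    return ["pg_dump", "-h", host, "-U", user, "-d", database, "-f", backup_filename(day)]


print(pg_dump_args("hostname", "username", "database", datetime.date.today()))
```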
Issue: Cannot connect to instance
# Check instance exists
lamin ls
# Verify authentication
lamin login
# Re-connect
lamin connect instance-name
Issue: Storage permissions denied
# Check AWS credentials
aws s3 ls s3://your-bucket/
# Check GCS credentials
gsutil ls gs://your-bucket/
# Verify IAM permissions
Issue: Database connection error
# Test PostgreSQL connection
psql postgresql://user:pwd@host:5432/db
# Check database version compatibility
lamin migrate check
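When debugging a connection string, it helps to see how it parses without echoing the password into logs or tickets. The following sketch uses the standard library's URL parser; `describe_db_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def describe_db_url(url: str) -> dict:
    """Break a postgresql:// URL into its parts, reporting only whether a password is present."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "has_password": parts.password is not None,
    }


print(describe_db_url("postgresql://user:pwd@host:5432/db"))
```

A wrong host, port, or database name is often visible at a glance in this output.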
Issue: Cache full
# Clear cache (Python)
import lamindb as ln
import shutil
shutil.rmtree(ln.settings.cache_dir)
# Set larger cache location (shell command)
lamin cache set /larger/disk/cache
# Upgrade to latest version
pip install --upgrade lamindb
# Upgrade database schema
lamin migrate deploy
Check the compatibility matrix to ensure your database schema version is compatible with your installed LaminDB version.
Major version upgrades may require migration:
# Check for breaking changes
lamin migrate check
# Review migration plan
lamin migrate plan
# Execute migration
lamin migrate deploy