smoke-test/README.md
This directory contains end-to-end smoke tests for DataHub functionality. These tests can be run locally for faster development and debugging compared to the full CI pipeline.
DataHub must be running locally
# From project root
./gradlew quickstartDebug
Set up Python environment (one-time setup)
# From project root - sets up metadata-ingestion venv
./gradlew :metadata-ingestion:installDev
# Set up smoke-test specific environment
cd smoke-test
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip wheel setuptools
pip install -r requirements.txt
export DATAHUB_VERSION=v1.0.0rc3-SNAPSHOT # or current version
export TEST_STRATEGY=no_cypress_suite0 # for non-Cypress tests
cd smoke-test
source venv/bin/activate
# Set environment variables
export DATAHUB_VERSION=v1.0.0rc3-SNAPSHOT
export TEST_STRATEGY=no_cypress_suite0
# Run all tests (WARNING: Takes a long time, requires full setup)
pytest -vv
# Run specific test file (RECOMMENDED for development)
pytest test_system_info.py -vv
# Run specific test method
pytest test_system_info.py::test_system_info_main_endpoint -vv
# Run multiple specific tests
pytest test_e2e.py::test_healthchecks test_e2e.py::test_gms_usage_fetch -v
test_system_info.py)✅ FAST - Can run independently
Tests the system info API endpoints:
/openapi/v1/system-info - Spring components only/openapi/v1/system-info/properties - Detailed properties/openapi/v1/system-info/spring-components - Component status# Run all system info tests (takes ~30 seconds)
pytest test_system_info.py -vv
test_e2e.py)⚠️ SLOW - Requires full ingestion pipeline
Tests that require data ingestion and full DataHub functionality. Many tests depend on the initial ingestion fixture which can fail if Kafka/Schema Registry aren't properly configured.
# Run health checks only (fast)
pytest test_e2e.py::test_healthchecks -v
# Run authentication tests (fast)
pytest test_e2e.py::test_frontend_auth -v
# Run full e2e tests (slow, requires full setup)
pytest test_e2e.py -vv
After making changes to system info APIs:
Restart DataHub
# Kill existing processes
./gradlew :datahub-frontend:stop :datahub-gms:stop
# Restart
./gradlew quickstartDebug
Run System Info Tests
cd smoke-test
source venv/bin/activate
export DATAHUB_VERSION=v1.0.0rc3-SNAPSHOT
export TEST_STRATEGY=no_cypress_suite0
pytest test_system_info.py -vv
# Check if DataHub is running
curl -s http://localhost:8080/health | head -5
# Test system info endpoint directly
curl -s http://localhost:8080/openapi/v1/system-info | jq . | head -20
❌ "Connection refused" errors
❌ "401 Unauthorized" for direct curl
❌ Kafka/Schema Registry connection errors
❌ Python environment issues
# Recreate virtual environment
rm -rf venv
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Check if services are running
curl -s http://localhost:8080/health
curl -s http://localhost:9092 # Kafka (will show connection refused if not running)
# Verify Python environment
source venv/bin/activate
which python
python --version
pip list | grep datahub
# Check environment variables
echo "DATAHUB_VERSION: $DATAHUB_VERSION"
echo "TEST_STRATEGY: $TEST_STRATEGY"
./gradlew :smoke-test:pytest - full pipeline with Docker containerstest_e2e.py - Main test suite (1387 lines)test_system_info.py - System info API tests (169 lines)conftest.py - Test configuration and fixturestests/utils.py - Test utilities and helpers💡 Pro Tip: For rapid development, use pytest test_system_info.py -vv which runs in ~30 seconds vs the full test suite which can take 30+ minutes.