pkg/clusterversion/runbooks/M3_enable_upgrade_tests.md
This section provides step-by-step instructions for the M.3 task: "Enable upgrade tests" on the master branch after the first RC is published. This task enables roachtest upgrade tests with the newly released version.
Quick Reference: For a streamlined checklist-style guide, see M3_enable_upgrade_tests_QUICK.md. This document provides detailed explanations and troubleshooting.
When to perform: After the first RC (e.g., v25.4.0-rc.1) is published and available on the releases page.
What it does:
- Updates the PreviousRelease constant to enable upgrade testing from the forked release
- Adds the cockroach-go-testserver-25.4 logictest configuration

Critical requirement: Fixture generation MUST be done on a gceworker (amd64 is required; a Mac cannot be used).
Dependencies:
Before starting, ensure:
Based on recent PRs, there are two valid approaches:
Recent examples:
Example: #141765
Recommendation: Use Approach A (two PRs) as it's the most recent pattern and makes review easier.
This PR adds the roachtest fixtures and updates the releases file.
a) Create a new branch:
git checkout master # or your M.2 branch if M.2 isn't merged yet
git pull origin master
git checkout -b enable-upgrade-tests-25.4-fixtures
b) Build and run the release tool:
bazel build //pkg/cmd/release:release
_bazel/bin/pkg/cmd/release/release_/release update-releases-file
What this updates:
- pkg/testutils/release/cockroach_releases.yaml - Adds latest RC version under "25.4"
- pkg/sql/logictest/REPOSITORIES.bzl - Adds RC binaries with checksums for all platforms

c) Verify the changes:
# Check that 25.4.0-rc.1 was added
grep -A 2 '"25.4":' pkg/testutils/release/cockroach_releases.yaml
# Check that the RC binaries were added to REPOSITORIES.bzl
grep "25.4.0-rc.1" pkg/sql/logictest/REPOSITORIES.bzl
⚠️ CRITICAL: Manually Verify REPOSITORIES.bzl
The release update-releases-file tool may incorrectly remove older version configs that are still needed by active testserver configurations.
Check which testserver configs are active:
grep "cockroach-go-testserver-" pkg/sql/logictest/logictestbase/logictestbase.go | grep "Name:"
Verify REPOSITORIES.bzl has binaries for ALL active testserver versions:
# Example: If you see 25.2, 25.3, 25.4 configs, verify all are in REPOSITORIES.bzl
grep -E "^\s+\(\"25\.[0-9]" pkg/sql/logictest/REPOSITORIES.bzl
If a needed version is missing (e.g., tool removed 25.2.7 but cockroach-go-testserver-25.2 still exists):
git show HEAD^:pkg/sql/logictest/REPOSITORIES.bzl | grep -A 4 "25.2.7"
Example from PR #156535:
Pattern: Keep N-2, N-1, and N (current) release binaries until their testserver configs are removed.
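The manual check above can be scripted. The following is a minimal sketch, not part of the release tooling: the function name `missing_testserver_binaries` is hypothetical, and it assumes that configs are declared as `Name: "cockroach-go-testserver-X.Y"` in logictestbase.go and that REPOSITORIES.bzl entries begin with `("X.Y.Z"`, which is what the grep patterns above rely on.

```shell
# Hypothetical helper: print every testserver series (e.g. 25.2) that has an
# active config in logictestbase.go but no matching entry in REPOSITORIES.bzl.
# Usage: missing_testserver_binaries <logictestbase.go> <REPOSITORIES.bzl>
missing_testserver_binaries() {
  local base_go="$1" repos_bzl="$2" series
  grep -o 'Name: *"cockroach-go-testserver-[0-9][0-9]*\.[0-9][0-9]*"' "$base_go" |
    grep -o '[0-9][0-9]*\.[0-9][0-9]*' |
    sort -u |
    while IFS= read -r series; do
      # Assumed entry format: ("25.2.7", { ... }), so search for the
      # literal prefix ("25.2. as a fixed string.
      if ! grep -qF "(\"${series}." "$repos_bzl"; then
        echo "$series"
      fi
    done
}

# Example (run from the repo root):
# missing_testserver_binaries \
#   pkg/sql/logictest/logictestbase/logictestbase.go \
#   pkg/sql/logictest/REPOSITORIES.bzl
```

Empty output means every active config has at least one binary entry. Any series that is printed should be restored from the previous revision as described above.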
d) Commit the releases file update:
git add pkg/testutils/release/cockroach_releases.yaml pkg/sql/logictest/REPOSITORIES.bzl
git commit -m "master: Update pkg/testutils/release/cockroach_releases.yaml

This updates the releases file with the v25.4.0-rc.1 release candidate and
adds the RC binaries to REPOSITORIES.bzl for use in upgrade tests.

Part of M.3 fixtures preparation.

Release note: None
Epic: None"
⚠️ CRITICAL: This CANNOT be done on a Mac. You must use a gceworker (amd64 required).
The roachtest fixtures README states: "the roachtest needs to be run on amd64 (if you are using a Mac it's recommended to use a gceworker)."
IMPORTANT: According to gceworker setup notes, you MUST run update-firewall before create, otherwise you'll get stuck trying to connect and end up with a gceworker that doesn't have anything pre-installed.
a) Set your zone (optional but recommended):
# Add to ~/.zshrc to avoid specifying zone every time
echo 'export CLOUDSDK_COMPUTE_ZONE=us-west1-a' >> ~/.zshrc
source ~/.zshrc
Choose a zone close to you:
- us-west1-a or us-west1-b (Oregon - for SF/West Coast)
- us-east1-b (South Carolina - for East Coast)

b) Update firewall rules (REQUIRED before create):
# From your Mac, in the cockroach repo:
./scripts/gceworker.sh update-firewall
c) Create the gceworker (if it doesn't exist):
./scripts/gceworker.sh create
This will take 5-10 minutes and will auto-install dependencies (bazel, go, etc.).
d) SSH to the gceworker:
./scripts/gceworker.sh start
Common SSH connection errors:
If you see ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255]:
gcloud compute instances list | grep gceworker
./scripts/gceworker.sh start (it will retry)
./scripts/gceworker.sh destroy
./scripts/gceworker.sh update-firewall
./scripts/gceworker.sh create
On the gceworker terminal, run:
# Create directory structure
mkdir -p ~/go/src/github.com/cockroachdb
# Clone the repo (use your fork so you can push later)
cd ~/go/src/github.com/cockroachdb
git clone git@github.com:<your-github-username>/cockroach.git
cd cockroach
# If SSH doesn't work, use HTTPS:
# git clone https://github.com/<your-github-username>/cockroach.git
# Add upstream remote and fetch tags
git remote add upstream https://github.com/cockroachdb/cockroach.git
git fetch upstream --tags
# Checkout the RC tag
git checkout v25.4.0-rc.1
# Set environment variables
export FIXTURE_VERSION=v25.4.0-rc.1
export COCKROACH_DEV_LICENSE="<your-license-key>"
# Verify license is set
echo $COCKROACH_DEV_LICENSE
Expected: Should show the license key starting with crl-0-.
⚠️ Session Disconnection Note:
If you close your laptop or lose the SSH connection, you'll need to:
./scripts/gceworker.sh start
cd ~/go/src/github.com/cockroachdb/cockroach
export FIXTURE_VERSION=v25.4.0-rc.1
export COCKROACH_DEV_LICENSE="<your-license-key>"
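Because these variables are lost on every reconnect, a quick guard before building can save a failed run. This is a minimal sketch: `check_fixture_env` is a hypothetical helper, not part of the repo, and the only facts it relies on are the two variable names above and the `crl-0-` license prefix mentioned earlier.

```shell
# Hypothetical guard: fail early if the fixture-generation environment
# is incomplete. Returns 0 when both variables look plausible.
check_fixture_env() {
  local rc=0
  case "${FIXTURE_VERSION:-}" in
    v[0-9]*.[0-9]*.[0-9]*) ;;  # e.g. v25.4.0-rc.1
    *)
      echo "FIXTURE_VERSION is unset or not a version tag" >&2
      rc=1
      ;;
  esac
  case "${COCKROACH_DEV_LICENSE:-}" in
    crl-0-*) ;;
    *)
      echo "COCKROACH_DEV_LICENSE is unset or does not start with crl-0-" >&2
      rc=1
      ;;
  esac
  return "$rc"
}

# Example: check_fixture_env && ./dev build cockroach short
```

The check is deliberately loose: it validates shape only and cannot tell whether the license key itself is valid.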
On the gceworker terminal, run:
# First time only: run dev doctor to configure build settings
./dev doctor
When prompted, press Enter to accept "dev" and type "n" for lintonbuild.
# Build required binaries
./dev build cockroach short //c-deps:libgeos roachprod workload roachtest
# Clean up any previous local clusters (error "cluster local does not exist" is normal/OK)
./bin/roachprod destroy local
Expected:
- Successfully built binary for target //pkg/cmd/cockroach:cockroach
- roachprod destroy error "cluster local does not exist" is normal - it just means no previous cluster to clean up

Common build errors:
Error: "please run dev doctor to refresh dev status"
Run ./dev doctor and answer the prompts as shown above
Error: "COCKROACH_DEV_LICENSE not set"
export COCKROACH_DEV_LICENSE="<license>"
Error: "bazel: command not found"
Run ./dev doctor to install.
Error: "./bin/roachprod: No such file or directory"
./dev build roachprod workload roachtest
On the gceworker terminal, run:
./bin/roachtest run generate-fixtures --local --debug \
--cockroach ./cockroach --suite fixtures
Expected: The test will FAIL intentionally and print instructions like:
--- FAIL: generate-fixtures (XXs)
fixtures.go:123:
To complete the test, run the following:
for i in 1 2 3 4; do
mkdir -p pkg/cmd/roachtest/fixtures/${i} && \
mv artifacts/generate-fixtures/run_1/logs/${i}.unredacted/checkpoint-*.tgz \
pkg/cmd/roachtest/fixtures/${i}/
done
Common errors:
Error: "license required"
echo $COCKROACH_DEV_LICENSE
Error: "roachprod cluster exists"
./bin/roachprod destroy local

On the gceworker terminal, copy the command from the test output and run it:
# Example command (use the exact command from your test output):
for i in 1 2 3 4; do
mkdir -p pkg/cmd/roachtest/fixtures/${i} && \
mv artifacts/generate-fixtures/run_1/logs/${i}.unredacted/checkpoint-*.tgz \
pkg/cmd/roachtest/fixtures/${i}/
done
Verify the fixtures were created:
ls -lh pkg/cmd/roachtest/fixtures/*/checkpoint-v25.4.tgz
Expected: Should show 4 files (one in each directory 1, 2, 3, 4), each around 3-5 MB.
Example output:
-rw-rw-r-- 1 user user 4.4M Oct 30 12:26 pkg/cmd/roachtest/fixtures/1/checkpoint-v25.4.tgz
-rw-rw-r-- 1 user user 4.2M Oct 30 12:26 pkg/cmd/roachtest/fixtures/2/checkpoint-v25.4.tgz
-rw-rw-r-- 1 user user 3.7M Oct 30 12:26 pkg/cmd/roachtest/fixtures/3/checkpoint-v25.4.tgz
-rw-rw-r-- 1 user user 2.8M Oct 30 12:26 pkg/cmd/roachtest/fixtures/4/checkpoint-v25.4.tgz
Note: If your fixture sizes are significantly different (e.g., 0 bytes or >10MB), compare against existing fixtures from a previous version to verify they're correct:
# On your Mac, check existing fixture sizes
ls -lh pkg/cmd/roachtest/fixtures/*/checkpoint-v25.3.tgz
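The existence and size checks can be combined into one pass. The sketch below is illustrative only: `check_fixtures` is a hypothetical name, and the thresholds simply encode the note above (flag empty files and anything over roughly 10 MB).

```shell
# Hypothetical helper: verify that all four fixture files exist and are
# neither empty nor larger than max_bytes (default 10 MiB).
# Usage: check_fixtures <fixtures-dir> <series> [max_bytes]
check_fixtures() {
  local root="$1" series="$2" max_bytes="${3:-10485760}" rc=0 i f size
  for i in 1 2 3 4; do
    f="$root/$i/checkpoint-v$series.tgz"
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      rc=1
      continue
    fi
    # Arithmetic expansion strips the padding some wc implementations emit.
    size=$(( $(wc -c < "$f") ))
    if [ "$size" -eq 0 ] || [ "$size" -gt "$max_bytes" ]; then
      echo "suspicious size ($size bytes): $f"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check_fixtures pkg/cmd/roachtest/fixtures 25.4
```

A zero exit status only says the files look plausible; comparing against the previous version's fixtures remains the better sanity check.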
From your Mac terminal (NOT the gceworker), run:
# Create tmp directory for fixtures
mkdir -p /tmp/fixtures-25.4
# Copy the fixtures from gceworker to local tmp directory
# Replace <yourname> with your username and adjust zone if needed
scp -r gceworker-<yourname>.us-west1-a.cockroach-workers:~/go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/fixtures /tmp/fixtures-25.4/
# Navigate to your local cockroach repo
cd ~/go/src/github.com/cockroachdb/cockroach
# Switch to your fixtures branch
git checkout enable-upgrade-tests-25.4-fixtures
# Copy the fixtures to the correct locations
for i in 1 2 3 4; do
cp /tmp/fixtures-25.4/fixtures/${i}/checkpoint-v25.4.tgz pkg/cmd/roachtest/fixtures/${i}/
done
# Verify they're in place
ls -lh pkg/cmd/roachtest/fixtures/*/checkpoint-v25.4.tgz
Expected: All 4 fixtures copied successfully, each ~3-5 MB.
Alternative using roachprod:
# If you created the gceworker with roachprod:
roachprod get <gceworker-name>:1 cockroach/pkg/cmd/roachtest/fixtures/ /tmp/fixtures-25.4/
On your Mac:
# Add the fixtures
git add pkg/cmd/roachtest/fixtures/*/checkpoint-v25.4.tgz
# Verify what's being committed
git status
# Create commit
git commit -m "roachtest: add 25.4 fixtures

This adds roachtest fixtures for v25.4.0-rc.1 to enable upgrade testing.

Fixtures were generated on gceworker by running:
./bin/roachtest run generate-fixtures --local --debug \\
--cockroach ./cockroach --suite fixtures

Part of M.3 \"Enable upgrade tests\" checklist.

Release note: None
Epic: None

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"
# Push to your fork
git push -u origin enable-upgrade-tests-25.4-fixtures
# Create PR
gh pr create --repo cockroachdb/cockroach \
--title "master: Update releases file and add 25.4 fixtures" \
--body "Part of M.3: Enable upgrade tests for 25.4.
This PR:
- Updates releases file with v25.4.0-rc.1
- Updates REPOSITORIES.bzl with RC binaries
- Adds roachtest fixtures for v25.4.0-rc.1
Fixtures were generated on gceworker using amd64 architecture as required.
Part of the quarterly release runbook.
Epic: None"
Wait for the fixtures PR to be reviewed and merged before proceeding with the code PR.
This PR updates the code to enable upgrade tests.
# Pull the merged fixtures PR
git checkout master
git pull origin master
# Create new branch for code changes
git checkout -b enable-upgrade-tests-25.4-code
File: pkg/clusterversion/cockroach_versions.go
Change: (around line 332)
// Before
const PreviousRelease Key = V25_3
// After
const PreviousRelease Key = V25_4
File: pkg/sql/logictest/logictestbase/logictestbase.go
a) Add the testserver configuration (after the 25.3 config, around line 560):
{
// This config runs tests using a 25.4 predecessor binary, testing upgrade
// compatibility.
Name: "cockroach-go-testserver-25.4",
NumNodes: 1,
OverrideDistSQLMode: "off",
UseCockroachGoTestserver: true,
CockroachGoTestserverVersion: "v25.4.0",
DeclarativeCorpusCollection: true,
},
b) Add to cockroach-go-testserver-configs set (around line 615):
"cockroach-go-testserver-configs": makeConfigSet(
"cockroach-go-testserver-25.2",
"cockroach-go-testserver-25.3",
"cockroach-go-testserver-25.4", // Add this line
),
File: pkg/sql/logictest/BUILD.bazel
Update cockroach_predecessor_version visibility (around line 160):
cockroach_predecessor_version(
name = "cockroach_predecessor_version",
visibility = [
"//pkg/ccl/logictestccl:__subpackages__",
"//pkg/sql/logictest/tests/cockroach-go-testserver-25.2:__pkg__",
"//pkg/sql/logictest/tests/cockroach-go-testserver-25.3:__pkg__",
"//pkg/sql/logictest/tests/cockroach-go-testserver-25.4:__pkg__", # Add this line
"//pkg/sql/sqlitelogictest:__subpackages__",
],
)
File: pkg/cmd/roachtest/roachtestutil/mixedversion/mixedversion.go
Check the supportsSkipUpgradeTo function (around line 850):
func supportsSkipUpgradeTo(v *version.Version) bool {
return v.Major() == 25 && v.Minor() == 4 && v.Patch() == 0 && v.PreRelease() == ""
}
The condition checks v.Minor() == 4, which should already cover 25.4. Verify this against the function in your checkout: if the Ordinal for 25.4 is 4, no changes are needed.
If changes are needed, update the condition to include the new version.
./dev gen bazel
This generates:
- pkg/sql/logictest/tests/cockroach-go-testserver-25.4/BUILD.bazel
- pkg/sql/logictest/tests/cockroach-go-testserver-25.4/generated_test.go

# Add all changes
git add -A
# Verify what's being committed
git status
# Commit
git commit -m "clusterversion: bump PreviousRelease to V25_4

This change updates the PreviousRelease constant from V25_3 to V25_4
and adds the cockroach-go-testserver-25.4 logictest configuration to
enable upgrade tests for version 25.4.

Changes include:
- Updated PreviousRelease constant in pkg/clusterversion/cockroach_versions.go
- Added cockroach-go-testserver-25.4 test configuration
- Generated test files for cockroach-go-testserver-25.4 config
- Updated BUILD.bazel files via ./dev gen bazel

The supportsSkipUpgradeTo logic already correctly handles 25.4 (Ordinal == 4)
so no changes were needed there.

Part of M.3 \"Enable upgrade tests\" checklist.

Release note: None
Epic: None

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"
# Push to your fork
git push -u origin enable-upgrade-tests-25.4-code
# Create PR
gh pr create --repo cockroachdb/cockroach \
--title "clusterversion: bump PreviousRelease to V25_4" \
--body "Part of M.3: Enable upgrade tests for 25.4.
This PR:
- Updates PreviousRelease constant to V25_4
- Adds cockroach-go-testserver-25.4 logictest configuration
- Generates test files for the new configuration
Depends on the fixtures PR being merged first.
Part of the quarterly release runbook.
Epic: None"
If you prefer to do everything in one PR:
Step 1: Follow all steps from Approach A Phase 1 and Phase 2, but do them on a single branch.
Step 2: Commit everything together with a combined commit message.
Expected files: ~15 files total (6 from fixtures + 9 from code changes).
- pkg/testutils/release/cockroach_releases.yaml - Updated by release tool
- pkg/sql/logictest/REPOSITORIES.bzl - Updated by release tool with RC binaries
- pkg/cmd/roachtest/fixtures/1/checkpoint-v25.4.tgz - Generated fixture
- pkg/cmd/roachtest/fixtures/2/checkpoint-v25.4.tgz - Generated fixture
- pkg/cmd/roachtest/fixtures/3/checkpoint-v25.4.tgz - Generated fixture
- pkg/cmd/roachtest/fixtures/4/checkpoint-v25.4.tgz - Generated fixture
- pkg/clusterversion/cockroach_versions.go - PreviousRelease constant
- pkg/sql/logictest/logictestbase/logictestbase.go - Add testserver config
- pkg/sql/logictest/BUILD.bazel - Visibility update
- pkg/BUILD.bazel - Updated by ./dev gen bazel
- pkg/cli/testdata/declarative-rules/deprules - May be regenerated
- pkg/sql/logictest/tests/cockroach-go-testserver-25.4/BUILD.bazel - Generated
- pkg/sql/logictest/tests/cockroach-go-testserver-25.4/generated_test.go - Generated
- pkg/ccl/logictestccl/tests/cockroach-go-testserver-25.4/BUILD.bazel - Generated
- pkg/ccl/logictestccl/tests/cockroach-go-testserver-25.4/generated_test.go - Generated

Note: Additional test expectation files may need updates based on CI test results.
All of the above files in one PR (~15 files).
This task is performed every quarter. Before creating the M.3 PR(s), validate that changes follow the expected pattern.
Find the most recent M.3 PRs for reference:
Recent M.3 PRs (Two-PR approach):
Compare file lists:
For Fixtures PR:
# Get files from reference PR
gh pr view 150712 --json files --jq '.files[].path' | sort > /tmp/ref_fixtures.txt
# Get your current files
git diff --name-only master | sort > /tmp/my_fixtures.txt
# Compare
echo "=== Files ONLY in your PR (investigate!) ==="
comm -13 /tmp/ref_fixtures.txt /tmp/my_fixtures.txt
echo "=== Files ONLY in reference PR (might be missing!) ==="
comm -23 /tmp/ref_fixtures.txt /tmp/my_fixtures.txt
Expected differences:
For Code PR:
# Get files from reference PR
gh pr view 152080 --json files --jq '.files[].path' | sort > /tmp/ref_code.txt
# Get your current files
git diff --name-only master | sort > /tmp/my_code.txt
# Compare
echo "=== Files ONLY in your PR (investigate!) ==="
comm -13 /tmp/ref_code.txt /tmp/my_code.txt
echo "=== Files ONLY in reference PR (might be missing!) ==="
comm -23 /tmp/ref_code.txt /tmp/my_code.txt
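Most of the differences reported by these comparisons are just the version number embedded in a path. One way to suppress that noise is to normalize version-specific path components before running comm. This is a sketch under assumptions: `normalize_paths` is a hypothetical helper, and the two patterns cover only the fixture and testserver paths named in this document.

```shell
# Hypothetical filter: replace version-specific path components with a
# placeholder so file lists from different quarters compare cleanly.
# Reads paths on stdin, writes sorted unique normalized paths to stdout.
normalize_paths() {
  sed -E \
    -e 's/checkpoint-v[0-9]+\.[0-9]+\.tgz/checkpoint-vX.Y.tgz/' \
    -e 's/cockroach-go-testserver-[0-9]+\.[0-9]+/cockroach-go-testserver-X.Y/' |
    sort -u
}

# Example:
# normalize_paths < /tmp/ref_code.txt > /tmp/ref_code_norm.txt
# normalize_paths < /tmp/my_code.txt  > /tmp/my_code_norm.txt
# comm -3 /tmp/ref_code_norm.txt /tmp/my_code_norm.txt
```

Whatever remains after normalization is a real difference in the set of touched files and is worth investigating.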
# Check file sizes (should be 3-5 MB each)
ls -lh pkg/cmd/roachtest/fixtures/*/checkpoint-v25.4.tgz
# Verify all 4 fixtures exist
for i in 1 2 3 4; do
if [ ! -f "pkg/cmd/roachtest/fixtures/${i}/checkpoint-v25.4.tgz" ]; then
echo "Missing fixture $i"
fi
done
# Compare with previous version fixtures to verify sizes are reasonable
ls -lh pkg/cmd/roachtest/fixtures/*/checkpoint-v25.3.tgz
# Check that the RC was added
grep -A 5 '"25.4":' pkg/testutils/release/cockroach_releases.yaml
# Check that REPOSITORIES.bzl has the binaries
grep -c "25.4.0-rc.1" pkg/sql/logictest/REPOSITORIES.bzl
Expected: Should show multiple entries (one for each platform).
For Fixtures PR:
# No specific tests - the fixtures are binary artifacts
# Just verify they exist and have reasonable sizes
For Code PR:
# Test that the new config is recognized
./dev testlogic base --config=cockroach-go-testserver-25.4 --files=cluster_settings -v
# Bootstrap tests
./dev test pkg/sql/catalog/bootstrap -f TestInitialKeys -v
# Version tests
./dev test pkg/clusterversion pkg/roachpb -v
Expected: All tests should pass.
Before creating Fixtures PR:
Before creating Code PR:
./dev gen bazel run successfully

Cause: REPOSITORIES.bzl not updated or RC binaries not available yet.
Fix:
release update-releases-file

Cause: COCKROACH_DEV_LICENSE not set on gceworker.
Fix:
# On gceworker:
export COCKROACH_DEV_LICENSE="<your-license-key>"
echo $COCKROACH_DEV_LICENSE # Verify it's set
Cause: Fixture generation didn't complete or files weren't moved correctly.
Fix:
ls artifacts/generate-fixtures/
find artifacts -name "checkpoint-*.tgz"

Cause: Path to fixtures incorrect or gceworker hostname wrong.
Fix:
# Verify the fixtures exist on gceworker first
ssh gceworker-<yourname>.us-west1-a.cockroach-workers \
"ls -lh ~/go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/fixtures/*/checkpoint-v25.4.tgz"
# Use the correct full path in scp
Cause: Fixture generation failed but created empty files.
Fix:
cat artifacts/generate-fixtures/run_1/logs/*/roachtest.log

Cause: M.1 and M.2 not completed or working on wrong branch.
Fix:
pkg/clusterversion/cockroach_versions.go

Cause: The version skip logic doesn't recognize 25.4.
Fix:
supportsSkipUpgradeTo function in mixedversion.go

Cause: Build didn't complete or binary not in expected location.
Fix:
# On gceworker, verify the binary exists:
ls -lh ./cockroach
# If missing, rebuild:
./dev build cockroach short
Cause: CockroachDB repo is large (~10GB).
Solution:
git clone --depth 1 --branch v25.4.0-rc.1 \
https://github.com/cockroachdb/cockroach.git
After creating the M.3 PR, CI will typically reveal test failures. These are expected and follow patterns from previous quarters. Use this strategy to triage and fix them:
Step 1: Compare with previous M.3 PRs to understand the pattern
# Find which test files were modified in previous M.3 code PR
gh pr view 152080 --json files --jq '.files[] | select(.path | contains("test")) | .path'
# Example output shows:
# pkg/crosscluster/logical/logical_replication_job_test.go ← V25_2 → V25_3 fix
# pkg/sql/logictest/testdata/logic_test/vector_index_mixed ← Config restriction
Step 2: Examine the actual fixes from previous PRs
# See what changed in a specific test file
gh api repos/cockroachdb/cockroach/pulls/152080/files | \
jq -r '.[] | select(.filename == "pkg/crosscluster/logical/logical_replication_job_test.go") | .patch'
# This shows the V25_2 → V25_3 pattern, which you'll replicate as V25_3 → V25_4
Step 3: Apply the same patterns to your failures
Common test failure patterns documented below (in order of frequency):
- MakeTestingClusterSettingsWithVersions with old version + PreviousRelease

For each failed test:
Cause: After updating PreviousRelease to V25_4, the CLI test expects declarative rules test data to reference version 25.4 instead of 25.3.
Symptoms:
output didn't match expected:
@@ -1,6 +1,6 @@
-debug declarative-print-rules 1000025.3 dep
+debug declarative-print-rules 1000025.4 dep
Fix:
Regenerate the test data using the --rewrite flag:
./dev test pkg/cli -f TestDeclarativeRules --rewrite
This updates pkg/cli/testdata/declarative-rules/deprules with the correct version references and new schema changer rules for 25.4.
Cause: After updating PreviousRelease from V25_3 to V25_4, tests that use PreviousRelease in combination with hardcoded older versions (like V25_3) may create invalid cluster configurations where the minimum supported version is higher than the binary version.
Example symptom:
F251030 16:52:35.743665 61030 settings/cluster/cluster_settings.go:183
Fatal error: minimum supported version 25.4 cannot be greater than binary version 25.3
Real-world example from PR #156535:
// This test FAILED after changing PreviousRelease from V25_3 to V25_4:
st := cluster.MakeTestingClusterSettingsWithVersions(
clusterversion.V25_3.Version(), // Binary version (25.3)
clusterversion.PreviousRelease.Version(), // Min supported (now 25.4!)
true,
)
// ERROR: Min supported (25.4) > binary version (25.3) is invalid
CRITICAL: Context-Aware Fix Required
⚠️ DO NOT blindly replace all V25_3 with V25_4 - you must understand the test's purpose first.
When TO update V25_3 → V25_4:
Binary/cluster version in relation to PreviousRelease:
// CORRECT FIX: Binary version must be >= min supported version
st := cluster.MakeTestingClusterSettingsWithVersions(
clusterversion.V25_4.Version(), // Updated to match new PreviousRelease
clusterversion.PreviousRelease.Version(), // V25_4
true,
)
Testing current release functionality:
Testing version gates that should be active at V25_4:
When NOT to update V25_3 → V25_4:
Testing upgrade paths FROM older versions:
// KEEP V25_3: Testing upgrade from 25.3 → 26.1
testUpgrade(
from: clusterversion.V25_3.Version(), // Don't change!
to: clusterversion.Latest.Version(),
)
Testing backward compatibility with specific old versions:
// KEEP V25_3: Testing that 25.3 clusters can read new data
testBackwardCompat(clusterversion.V25_3) // Don't change!
Testing mixed-version behavior for specific versions:
// KEEP V25_3: Testing 25.3 + 26.1 mixed cluster
testMixedVersion(
old: clusterversion.V25_3, // Don't change!
new: clusterversion.Latest,
)
How to analyze and fix:
Read the test name and context:
# Understand what the test is doing
# Example: "TestGetWriterType/immediate-mode" with cluster settings
Check if test uses PreviousRelease:
grep -A 5 "PreviousRelease" <test_file.go>
Determine the relationship:
Look for these patterns that need fixing:
// PATTERN 1: Binary version as first argument to MakeTestingClusterSettingsWithVersions
cluster.MakeTestingClusterSettingsWithVersions(
clusterversion.V25_3.Version(), // ← FIX: Update to V25_4
clusterversion.PreviousRelease.Version(), // V25_4
true,
)
// PATTERN 2: MaxVersion in version range checks
if version >= V25_3 && version < PreviousRelease { // ← FIX: Update to V25_4
// PATTERN 3: Feature availability checks
featureAvailableAt := V25_3 // ← Maybe update to V25_4 if it's "current release"
Validate your fix:
# Run the specific test
./dev test <package> -f <TestName> -v
Search for affected tests:
# Find tests that might have this issue (binary version < PreviousRelease)
grep -r "V25_3.*PreviousRelease\|PreviousRelease.*V25_3" --include="*_test.go" pkg/
# Review each match carefully - don't automatically change all!
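Note that the single-line grep above only matches when both identifiers appear on the same line, whereas the failing call shown earlier passes them as separate arguments on separate lines. A per-file search catches that case. This is a minimal sketch: `files_mixing_versions` is a hypothetical helper, and it assumes tests reference versions as `clusterversion.<Key>`.

```shell
# Hypothetical helper: list *_test.go files that mention both a hardcoded
# version key (e.g. V25_3) and PreviousRelease anywhere in the file.
# Usage: files_mixing_versions <root-dir> <version-key>
files_mixing_versions() {
  local root="$1" old_key="$2" f
  grep -rlF --include='*_test.go' "clusterversion.${old_key}" "$root" |
    sort |
    while IFS= read -r f; do
      if grep -qF 'PreviousRelease' "$f"; then
        echo "$f"
      fi
    done
}

# Example: files_mixing_versions pkg/ V25_3
```

The result is a list of candidates for manual review, not a list of files to change: the same context-aware rules above still apply to every match.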
Example fix from PR #156535:
// File: pkg/crosscluster/logical/logical_replication_job_test.go
// Test: TestGetWriterType/immediate-mode
// BEFORE (BROKEN after PreviousRelease → V25_4):
st := cluster.MakeTestingClusterSettingsWithVersions(
clusterversion.V25_3.Version(), // Binary: 25.3
clusterversion.PreviousRelease.Version(), // Min: 25.4 ✗
true,
)
// AFTER (FIXED - binary version must be >= min supported):
st := cluster.MakeTestingClusterSettingsWithVersions(
clusterversion.V25_4.Version(), // Binary: 25.4 ✓
clusterversion.PreviousRelease.Version(), // Min: 25.4 ✓
true,
)
// RATIONALE: This test is checking immediate-mode writer behavior on a
// current-version cluster, not testing upgrade from 25.3. The binary
// version should match the new PreviousRelease.
Cause: After adding the cockroach-go-testserver-25.4 configuration, tests that verify "feature X is NOT supported before version Y" will fail when running on the 25.4 testserver because the feature is now available.
Example symptom:
--- FAIL: TestLogic_mixed_version_partial_stats (24.32s)
logic.go:4491: expected "pq: creating partial statistics with a WHERE clause is not yet supported",
but no error occurred
Real-world example from PR #156535:
# File: pkg/sql/logictest/testdata/logic_test/mixed_version_partial_stats
# LogicTest: cockroach-go-testserver-configs ← Runs on ALL testserver configs
# Test expects feature to NOT be available initially
statement error pq: creating partial statistics with a WHERE clause is not yet supported
CREATE STATISTICS pstat ON a FROM t WHERE a > 2;
# When this test runs on testserver-25.4, the feature IS available,
# so the expected error doesn't occur → test fails
Fix: Restrict the test to run only on the previous testserver config (not the new one):
-# LogicTest: cockroach-go-testserver-configs
+# LogicTest: cockroach-go-testserver-25.3
IMPORTANT: Multi-version testing pattern for upgrade tests
For tests with LogicTest: cockroach-go-testserver-configs that have comments mentioning they test upgrade behavior to a specific series, update them to use the two previous series starting from MinSupported:
Rule:
LogicTest: cockroach-go-testserver-configs
↓
LogicTest: cockroach-go-testserver-{MinSupported} cockroach-go-testserver-{MinSupported+0.1}
How to find MinSupported:
Look in pkg/clusterversion/cockroach_versions.go:
const (
V25_2 // MinSupported
V25_3
V26_1 // Latest
)
Example from PR #156535 (M.3 for 25.4):
Given MinSupported = V25_2, update tests like:
-# LogicTest: cockroach-go-testserver-configs
+# LogicTest: cockroach-go-testserver-25.2 cockroach-go-testserver-25.3
 # Sanity check that partial statistics with WHERE clause is only allowed to
 # be used once the cluster is upgraded to 25.4.
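The "MinSupported+0.1" shorthand in the rule breaks down at a year boundary (the series after 25.4 is 26.1, as the constant list above shows). The sketch below derives the directive from a MinSupported series. It is illustrative only: both function names are hypothetical, and it assumes four release series per year, numbered 1 through 4.

```shell
# Hypothetical helper: given a series such as 25.2, print the next one.
# Assumes four series per year (X.1 through X.4), then a new year starts.
next_series() {
  local major="${1%%.*}" minor="${1##*.}"
  if [ "$minor" -ge 4 ]; then
    echo "$((major + 1)).1"
  else
    echo "$major.$((minor + 1))"
  fi
}

# Hypothetical helper: print the LogicTest directive covering MinSupported
# and the series immediately after it.
logictest_directive() {
  echo "# LogicTest: cockroach-go-testserver-$1 cockroach-go-testserver-$(next_series "$1")"
}

# Example: logictest_directive 25.2
```

Confirm that testserver configs actually exist for both series before using the generated directive.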
Why two versions? This ensures the test validates upgrade behavior from:
Tests that follow this pattern:
- mixed_version_partial_stats - Comment mentions "upgraded to 25.4"
- mixed_version_ltree - Comment mentions "not supported until version 25.4"

Pattern from PR #152080:
# In previous M.3 for 25.3, vector_index_mixed was changed from:
# LogicTest: cockroach-go-testserver-configs
# To:
# LogicTest: cockroach-go-testserver-25.2
How to identify tests that need this fix:
- mixed_version_* that verify features are gated
- upgrade commands: Tests that upgrade from an old version and verify behavior changes

When TO restrict tests:
When NOT to restrict tests:
- upgrade (generic) vs mixed_version_feature_x (specific feature)

Search for potentially affected tests:
# Find tests running on all testserver configs
grep -r "# LogicTest: cockroach-go-testserver-configs" pkg/sql/logictest/testdata/logic_test/
# Check each for version-specific behavior
# Examples of tests that likely need restriction:
# - mixed_version_<feature_name> - if testing feature gated at 25.4
# - Tests with "statement error" expecting version-gated errors
# Examples of tests that are fine on all configs:
# - mixed_version_can_login - generic upgrade behavior
# - cross_version_tenant_backup - generic backup behavior during upgrade
Example from PR #156535:
mixed_version_partial_stats → Changed to cockroach-go-testserver-25.3
mixed_version_can_login → Still uses cockroach-go-testserver-configs
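To narrow the search, the two signals described above (the all-configs directive and an expected error) can be combined. This is a rough sketch: `restriction_candidates` is a hypothetical helper, and a `statement error` line is only a heuristic for version-gated behavior.

```shell
# Hypothetical helper: list logic test files that run on all testserver
# configs AND contain at least one "statement error" expectation.
# Usage: restriction_candidates <logic_test-dir>
restriction_candidates() {
  local dir="$1" f
  grep -rlF '# LogicTest: cockroach-go-testserver-configs' "$dir" |
    sort |
    while IFS= read -r f; do
      if grep -q '^statement error' "$f"; then
        echo "$f"
      fi
    done
}

# Example:
# restriction_candidates pkg/sql/logictest/testdata/logic_test/
```

Files in the output still need to be read: an expected error may be unrelated to the new version, in which case the test should keep running on all configs.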
Cause: The mixed_version_bootstrap_tenant test compares bootstrap data between old and new executables to ensure identical system catalog initialization. When system tables evolve between versions (adding columns, changing descriptors), the test fails because table descriptors at keys /Table/3/1/<tableID>/2/1 differ between versions.
Example symptom from PR #156535:
--- FAIL: TestLogic_mixed_version_bootstrap_tenant
logic.go:4491: expected 0 rows, but found differences at:
/Table/3/1/12/2/1 old_executable <descriptor_hash_1>
/Table/3/1/12/2/1 new_executable <descriptor_hash_2>
/Table/3/1/15/2/1 old_executable <descriptor_hash_3>
/Table/3/1/15/2/1 new_executable <descriptor_hash_4>
Root cause: System tables changed between versions. For example:
- payload column in PR #146130

Before choosing a fix, consult code owners:
This test has complex trade-offs between test coverage and maintenance burden. Before implementing a fix, consult:
You can find current contributors using:
git log --format='%an <%ae>' --follow pkg/sql/logictest/testdata/logic_test/mixed_version_bootstrap_tenant | sort | uniq -c | sort -rn
Two possible approaches:
Approach A: Restrict test to older testserver config
Limit the test to run only on previous version configs where schema hasn't evolved yet:
-# LogicTest: cockroach-go-testserver-configs
+# LogicTest: cockroach-go-testserver-25.2
✓ Pros:
✗ Cons:
Use this when: Schema evolution is expected and intentional, and validating older version consistency is more important than newer version coverage.
Approach B: Update test to exclude table descriptors
Modify the test's WHERE clause to exclude table descriptor keys that are allowed to differ:
SELECT k, is_present_in, v FROM bootstrapped_tenant_data
WHERE is_present_in <> 'both'
AND k <> '/Table/6/1/"version"/0'
+ AND k NOT LIKE '/Table/3/1/%/2/1'
ORDER BY 1, 2
Also update the duplicate assertion:
SELECT 1/count(*) FROM bootstrapped_tenant_data
WHERE is_present_in <> 'both'
AND k <> '/Table/6/1/"version"/0'
+ AND k NOT LIKE '/Table/3/1/%/2/1'
✓ Pros:
✗ Cons:
- /Table/3/1/%/2/1 is broad (excludes all table descriptors, not just evolved ones)

Use this when: You want to maintain broad coverage but accept that system table schemas evolve between versions as part of normal development.
Historical context:
This test has evolved over time:
- cockroach-go-testserver-configs to cockroach-go-testserver-24.1 due to schema evolution
- cockroach-go-testserver-configs with comment: "Check that the bootstrapped data for tenants remains the same... Only the 'version' value in system.settings and table descriptors in system.descriptor are allowed to differ"

The comment in commit 11d257111aa suggests that Approach B (excluding descriptors) may align with the original test author's intent, but consult with the code owners to confirm.
Decision matrix:
| Scenario | Recommended Approach | Rationale |
|---|---|---|
| System table changes are intentional for 25.4 | Approach B | Test comment already acknowledges descriptors can differ |
| You're unsure if schema changes are intentional | Consult code owners first | Don't mask potential bugs |
| Test keeps breaking every quarter | Approach B | Reduce maintenance burden |
| Coverage of all configs is critical | Approach B | Maintains broader test execution |
This is a quarterly task - validate your changes against previous M.3 PRs to ensure consistency and catch any missing or unexpected files.
Compare your changes against these previous M.3 PRs:
Use the combined PR as the primary reference if you're doing a single-PR approach, or the separate PRs if you're doing the two-PR approach.
Single combined PR approach (~14 files):
- pkg/cmd/roachtest/fixtures/{1,2,3,4}/checkpoint-v25.4.tgz
- pkg/testutils/release/cockroach_releases.yaml
- pkg/sql/logictest/REPOSITORIES.bzl
- pkg/clusterversion/cockroach_versions.go
- pkg/cli/testdata/declarative-rules/deprules
- pkg/BUILD.bazel (visibility update)
- pkg/sql/logictest/logictestbase/logictestbase.go
- pkg/sql/logictest/BUILD.bazel
- pkg/sql/logictest/tests/cockroach-go-testserver-25.4/{BUILD.bazel,generated_test.go}

Two-PR approach:
1. Get file lists:
# Your current PR files
gh pr view <YOUR_PR_NUMBER> --json files --jq '.files[].path' | sort > /tmp/current_m3_files.txt
# Reference PR files (use #141765 for combined, or #152080 for code-only)
gh pr view 141765 --json files --jq '.files[].path' | sort > /tmp/ref_m3_combined.txt
# Compare
comm -3 /tmp/current_m3_files.txt /tmp/ref_m3_combined.txt
2. Expected differences (version-specific files):
- checkpoint-v25.4.tgz files, reference has checkpoint-v25.1.tgz (or v25.3 for #141765)
- cockroach-go-testserver-25.4/ directory, reference has cockroach-go-testserver-25.1/ (or 25.3)
- CLAUDE.md updates with new documentation

3. Investigate if:
4. Review each unexpected file:
# For each file only in your PR, understand why:
git diff master -- path/to/unexpected/file
# Is it:
# - A necessary test fix due to PreviousRelease bump? → OK
# - Documentation update? → OK
# - An unrelated change that should be in a separate PR? → Remove
Before creating/updating the M.3 PR:
Example validation from PR #156535:
$ comm -3 /tmp/current_m3_files.txt /tmp/ref_m3_combined.txt
# Only in current (14 files):
CLAUDE.md # ✓ OK: New documentation
checkpoint-v25.4.tgz (×4) # ✓ OK: Version-specific
cockroach-go-testserver-25.4/ (×2) # ✓ OK: Version-specific
# Only in reference (12 files):
README.md # ✓ OK: Not needed in this PR
mixed_version_stats, mixed_version_ttl # ✓ OK: Bootstrap updates from M.2
cockroach-go-testserver-25.1/ (×2) # ✓ OK: Version-specific to that quarter
# Conclusion: All differences explained and expected ✓
In the release cycle:
./scripts/gceworker.sh update-firewall BEFORE creating gceworker

Setup gceworker - On Mac (one-time):
# Optional: Set default zone
echo 'export CLOUDSDK_COMPUTE_ZONE=us-west1-a' >> ~/.zshrc
source ~/.zshrc
# REQUIRED: Update firewall before creating
./scripts/gceworker.sh update-firewall
# Create gceworker
./scripts/gceworker.sh create
# Connect to gceworker
./scripts/gceworker.sh start
Fixtures PR - On Mac:
# Update releases file
bazel build //pkg/cmd/release:release
_bazel/bin/pkg/cmd/release/release_/release update-releases-file
# Verify changes
grep -A 2 '"25.4":' pkg/testutils/release/cockroach_releases.yaml
Fixtures Generation - On gceworker:
# Setup (after SSH'ing in)
cd ~/go/src/github.com/cockroachdb/cockroach
git checkout v25.4.0-rc.1
export FIXTURE_VERSION=v25.4.0-rc.1
export COCKROACH_DEV_LICENSE="<license>"
# Run dev doctor (first time only)
./dev doctor
# Press Enter for "dev", type "n" for lintonbuild
# Build and generate
./dev build cockroach short //c-deps:libgeos roachprod workload roachtest
./bin/roachprod destroy local # Error "cluster local does not exist" is OK
./bin/roachtest run generate-fixtures --local --debug \
--cockroach ./cockroach --suite fixtures
# Move fixtures (use command from test output)
for i in 1 2 3 4; do
mkdir -p pkg/cmd/roachtest/fixtures/${i} && \
mv artifacts/generate-fixtures/run_1/logs/${i}.unredacted/checkpoint-*.tgz \
pkg/cmd/roachtest/fixtures/${i}/
done
# Verify (should be 3-5 MB each)
ls -lh pkg/cmd/roachtest/fixtures/*/checkpoint-v25.4.tgz
Copy Fixtures - On Mac:
# Create tmp directory
mkdir -p /tmp/fixtures-25.4
# Copy from gceworker (adjust username and zone)
scp -r gceworker-<name>.us-west1-a.cockroach-workers:~/go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/fixtures /tmp/fixtures-25.4/
# Copy to local repo
for i in 1 2 3 4; do
cp /tmp/fixtures-25.4/fixtures/${i}/checkpoint-v25.4.tgz pkg/cmd/roachtest/fixtures/${i}/
done
Code PR - On Mac:
# After fixtures PR merges
git checkout master && git pull origin master
git checkout -b enable-upgrade-tests-25.4-code
# Make code changes (PreviousRelease, testserver config, BUILD.bazel visibility)
# Then:
./dev gen bazel
# Verify
./dev testlogic base --config=cockroach-go-testserver-25.4 --files=cluster_settings