agentic/port_eugene_rules.md
This document tracks the progress of porting lint rules from Eugene to the postgres-language-server analyzer.
Eugene is a PostgreSQL migration linter that detects dangerous operations. We are porting its rules to the postgres-language-server to provide similar safety checks within the language server environment.
Eugene source location: eugene/eugene/src/lints/rules.rs
Hint metadata location: eugene/eugene/src/hint_data.rs
Example SQL files: eugene/eugene/examples/*/
Read Eugene's implementation in eugene/eugene/src/lints/rules.rs
added_serial_column)Read the hint metadata in eugene/eugene/src/hint_data.rs
Review example SQL in eugene/eugene/examples/<ID>/
bad.sql - Invalid cases that should trigger the rulegood.sql - Valid cases that should NOT trigger# Create rule with appropriate severity (error/warn)
just new-lintrule safety <ruleName> <severity>
# Example:
just new-lintrule safety addSerialColumn error
This generates:
crates/pgls_analyser/src/lint/safety/<rule_name>.rscrates/pgls_analyser/tests/specs/safety/<ruleName>/basic.sqlFile: crates/pgls_analyser/src/lint/safety/<rule_name>.rs
use pgls_analyse::{context::RuleContext, declare_lint_rule, Rule, RuleDiagnostic, RuleSource};
use pgls_console::markup;
use pgls_diagnostics::Severity;
declare_lint_rule! {
/// Brief one-line description (shown in lists).
///
/// Detailed explanation of what the rule detects and why it's problematic.
/// Explain the PostgreSQL behavior and performance/safety implications.
///
/// ## Examples
///
/// ### Invalid
///
/// ```sql,expect_diagnostic
/// -- SQL that should trigger the rule
/// ALTER TABLE users ADD COLUMN id serial;
/// ```
///
/// ### Valid
///
/// ```sql
/// -- SQL that should NOT trigger
/// CREATE TABLE users (id serial PRIMARY KEY);
/// ```
///
pub RuleName {
version: "next",
name: "ruleName",
severity: Severity::Error, // or Warning
recommended: true, // or false
sources: &[RuleSource::Eugene("<ID>")], // e.g., "E11"
}
}
impl Rule for RuleName {
type Options = ();
fn run(ctx: &RuleContext<Self>) -> Vec<RuleDiagnostic> {
let mut diagnostics = Vec::new();
// Pattern match on the statement type
if let pgls_query::NodeEnum::AlterTableStmt(stmt) = &ctx.stmt() {
// Rule logic here
if /* condition */ {
diagnostics.push(
RuleDiagnostic::new(
rule_category!(),
None,
markup! {
"Error message with "<Emphasis>"formatting"</Emphasis>"."
},
)
.detail(None, "Additional context about the problem.")
.note("Suggested fix or workaround."),
);
}
}
diagnostics
}
}
Accessing previous statements (for rules like multipleAlterTable):
let file_ctx = ctx.file_context();
let previous = file_ctx.previous_stmts();
Schema normalization (treating empty schema as "public"):
let schema_normalized = if schema.is_empty() {
"public"
} else {
schema.as_str()
};
Checking for specific ALTER TABLE actions:
for cmd in &stmt.cmds {
if let Some(pgls_query::NodeEnum::AlterTableCmd(cmd)) = &cmd.node {
if cmd.subtype() == pgls_query::protobuf::AlterTableType::AtAddColumn {
// Handle ADD COLUMN
}
}
}
Extracting column type:
if let Some(pgls_query::NodeEnum::ColumnDef(col_def)) = &cmd.def.as_ref().and_then(|d| d.node.as_ref()) {
if let Some(type_name) = &col_def.type_name {
let type_str = get_type_name(type_name);
}
}
fn get_type_name(type_name: &pgls_query::protobuf::TypeName) -> String {
type_name
.names
.iter()
.filter_map(|n| {
if let Some(pgls_query::NodeEnum::String(s)) = &n.node {
Some(s.sval.as_str())
} else {
None
}
})
.collect::<Vec<_>>()
.join(".")
}
Directory: crates/pgls_analyser/tests/specs/safety/<ruleName>/
Create multiple test files covering:
-- expect_lint/safety/<ruleName>
-- Description of what this tests
<SQL that should trigger>
-- Description of what this tests
-- expect_no_diagnostics
<SQL that should NOT trigger>
tests/specs/safety/addSerialColumn/
├── basic.sql # Basic case triggering the rule
├── bigserial.sql # Variant (bigserial type)
├── generated_stored.sql # Another variant (GENERATED)
├── valid_regular_column.sql # Valid: regular column
└── valid_create_table.sql # Valid: CREATE TABLE context
Run tests and accept snapshots:
cargo insta test -p pgls_analyser --accept
# Check compilation
cargo check
# Generate lint code and documentation
just gen-lint
# Run all tests
cargo test -p pgls_analyser --test rules_tests
# Final verification
just ready
Create a test SQL file:
-- test.sql
ALTER TABLE users ADD COLUMN id serial;
Run the CLI:
cargo run -p pgls_cli -- check /path/to/test.sql
Problem: Need to match tables across statements with different schema notations.
Solution: Normalize schema names:
let schema_normalized = if schema.is_empty() { "public" } else { schema.as_str() };
Problem: Eugene uses a simplified AST (StatementSummary), but pgt uses the full PostgreSQL AST.
Solution: Use pattern matching and helper functions. Look at existing rules for examples.
Problem: Rules that need to track state across statements (like multipleAlterTable).
Solution: Use ctx.file_context() to access AnalysedFileContext:
let file_ctx = ctx.file_context();
let previous_stmts = file_ctx.previous_stmts();
Problem: Some Eugene rules check transaction state (e.g., RUNNING_STATEMENT_WHILE_HOLDING_ACCESS_EXCLUSIVE).
Solution: This is more complex and may require extending the AnalysedFileContext to track transaction state. Consider implementing simpler rules first.
Problem: expect_lint expects exactly ONE diagnostic, but rule generates multiple.
Solution: Either:
Some Eugene rules may overlap with existing pgt rules:
| Eugene Rule | Potential PGT Overlap | Action |
|---|---|---|
SET_COLUMN_TYPE_TO_JSON | preferJsonb | Review both, may enhance existing |
CREATE_INDEX_NONCONCURRENTLY | requireConcurrentIndexCreation | Review both |
CHANGE_COLUMN_TYPE | changingColumnType | Review both |
ADD_NEW_UNIQUE_CONSTRAINT_WITHOUT_USING_INDEX | disallowUniqueConstraint | Review both |
When overlap exists:
These rules require tracking transaction state across multiple statements:
RUNNING_STATEMENT_WHILE_HOLDING_ACCESS_EXCLUSIVE (E4)LOCKTIMEOUT_WARNING (E9)Approach:
AnalysedFileContextEugene uses StatementSummary enum with simplified representations:
enum StatementSummary {
AlterTable { schema: String, name: String, actions: Vec<AlterTableAction> },
CreateIndex { schema: String, idxname: String, concurrently: bool, target: String },
// ...
}
enum AlterTableAction {
AddColumn { column: String, type_name: String, stored_generated: bool, ... },
SetType { column: String, type_name: String },
// ...
}
We use the full PostgreSQL protobuf AST:
pgls_query::NodeEnum::AlterTableStmt(stmt)
-> stmt.cmds: Vec<Node>
-> NodeEnum::AlterTableCmd(cmd)
-> cmd.subtype: AlterTableType
-> cmd.def: Option<Node>
-> NodeEnum::ColumnDef(col_def)
Translation Strategy:
pgls_query::protobuf for available types/enumseugene/eugene/src/lints/rules.rscrates/pgls_analyser/src/lint/safety/crates/pgls_analyser/CONTRIBUTING.mdcrates/pgls_query/src/lib.rspgls_query::protobuf moduleWhen porting a new rule, ensure:
src/lint/safety/<rule>.rssources: &[RuleSource::Eugene("<ID>")] attributioncargo insta test --acceptcargo test -p pgls_analyser --test rules_testscargo checkjust gen-lint