v3/implementation/adrs/ADR-028-neural-attention-mechanisms.md
Status: Proposed Date: 2026-01-16 Author: System Architecture Designer Version: 1.0.0
Claude-Flow v3 requires advanced attention mechanisms for several critical operations:
Agent Memory Retrieval: Current memory lookups use basic vector similarity (cosine/dot product). More sophisticated attention mechanisms can improve retrieval quality by weighing relevance, recency, and contextual importance.
Context Window Optimization: With growing context sizes, efficient attention computation becomes critical. Sparse and linear attention variants can reduce O(n^2) complexity to O(n) or O(n log n).
Pattern Matching in Code: Code analysis benefits from structural attention that respects AST hierarchies, function boundaries, and import relationships.
Swarm Coordination Signals: Multi-agent coordination requires cross-attention mechanisms to share relevant context between agents while filtering noise.
The RuVector intelligence system provides 39 attention mechanism implementations optimized for AI agent workloads. Integrating these into Claude-Flow v3 will significantly improve memory retrieval, context management, and agent coordination.
Integrate RuVector's 39 attention mechanism types into Claude-Flow v3 via a unified AttentionService that provides:
| Type | Description | Use Case | Complexity |
|---|---|---|---|
standard-mha | Standard multi-head attention (Vaswani et al.) | General purpose | O(n^2) |
rotary-mha | Multi-head with rotary position embeddings (RoPE) | Long sequences | O(n^2) |
alibi-mha | Multi-head with ALiBi position bias | Extrapolation | O(n^2) |
grouped-query-attention | GQA (fewer KV heads) | Memory efficient | O(n^2) |
multi-query-attention | Single KV head shared | Fast inference | O(n^2) |
differential-attention | Attention with derivative signals | Change detection | O(n^2) |
mixture-attention | Multiple attention patterns combined | Hybrid tasks | O(n^2) |
// Standard multi-head attention configuration
interface MultiHeadAttentionConfig {
numHeads: number; // Number of attention heads (default: 8)
headDim: number; // Dimension per head (default: 64)
dropout: number; // Dropout rate (default: 0.1)
qkvBias: boolean; // Include bias in QKV projections
outputBias: boolean; // Include bias in output projection
scaleFactor?: number; // Custom scaling (default: 1/sqrt(headDim))
}
| Type | Description | Use Case | Complexity |
|---|---|---|---|
causal-self-attention | Masked for autoregressive generation | Code completion | O(n^2) |
bidirectional-self-attention | Full context both directions | Understanding | O(n^2) |
relative-position-attention | Relative position encodings | Sequence modeling | O(n^2) |
disentangled-attention | Separate content/position (DeBERTa) | NLU tasks | O(n^2) |
talking-heads-attention | Inter-head communication | Complex reasoning | O(n^2) |
synthesizer-attention | Learned attention patterns | Fast inference | O(n) |
// Causal self-attention for autoregressive tasks
interface CausalSelfAttentionConfig {
blockSize: number; // Maximum sequence length
numHeads: number; // Attention heads
embedDim: number; // Embedding dimension
maskFuture: boolean; // Mask future tokens (default: true)
useSlidingWindow?: number; // Optional sliding window size
}
| Type | Description | Use Case | Complexity |
|---|---|---|---|
cross-attention | Attend to different sequence | Agent coordination | O(n*m) |
perceiver-attention | Latent bottleneck | Large inputs | O(n*l) |
gated-cross-attention | Gated fusion | Multi-modal | O(n*m) |
memory-attention | Attend to memory bank | RAG retrieval | O(n*k) |
hierarchical-cross-attention | Multi-level attention | Document analysis | O(n*m) |
// Cross-attention for agent coordination
interface CrossAttentionConfig {
queryDim: number; // Query dimension (agent A)
keyValueDim: number; // Key/Value dimension (agent B)
numHeads: number; // Attention heads
outputDim: number; // Output dimension
useGating?: boolean; // Gated fusion (default: false)
maxMemorySlots?: number; // Memory bank size (for memory-attention)
}
| Type | Description | Use Case | Complexity |
|---|---|---|---|
bigbird-attention | Block sparse + global + random | Long documents | O(n) |
longformer-attention | Sliding window + global | Long sequences | O(n) |
local-attention | Fixed window only | Local context | O(n*w) |
strided-attention | Strided patterns | Structured data | O(n*s) |
sparse-transformer-attention | Factorized patterns | Images/sequences | O(n*sqrt(n)) |
star-attention | Hub-and-spoke topology | Swarm coordination | O(n*h) |
blockwise-attention | Chunked computation | Memory efficient | O(n^2) chunked |
random-attention | Random sparse sampling | Approximation | O(n*k) |
// Longformer-style attention for long contexts
interface LongformerAttentionConfig {
windowSize: number; // Local attention window (default: 512)
globalTokens: number[]; // Token indices with global attention
dilationRates?: number[]; // Dilation for each layer
numHeads: number;
embedDim: number;
}
// BigBird attention combining multiple sparse patterns
interface BigBirdAttentionConfig {
blockSize: number; // Block size for sparse attention
numGlobalTokens: number; // Number of global attention tokens
numRandomBlocks: number; // Random blocks per row
windowSize: number; // Sliding window size
numHeads: number;
embedDim: number;
}
| Type | Description | Use Case | Complexity |
|---|---|---|---|
linear-attention | Kernel feature maps | Real-time inference | O(n) |
performer-attention | FAVOR+ random features | Large scale | O(n) |
cosformer-attention | Cos-based re-weighting | Efficient softmax | O(n) |
rfa-attention | Random feature attention | Memory efficient | O(n) |
nystromer-attention | Nystrom approximation | Matrix approximation | O(n) |
linformer-attention | Low-rank projection | Compression | O(n) |
// Linear attention with kernel approximation
interface LinearAttentionConfig {
featureDim: number; // Random feature dimension
kernelType: 'elu' | 'relu' | 'softmax' | 'exp';
numHeads: number;
embedDim: number;
useNormalization?: boolean; // Normalize outputs
epsNumericalStability?: number;
}
// Performer FAVOR+ attention
interface PerformerAttentionConfig {
numRandomFeatures: number; // Number of random features (default: 256)
orthogonalFeatures: boolean; // Use orthogonal random features
redraw: boolean; // Redraw features each forward pass
numHeads: number;
embedDim: number;
}
| Type | Description | Use Case | Complexity |
|---|---|---|---|
flash-attention-v2 | IO-aware exact attention | GPU optimization | O(n^2) IO-optimized |
flash-attention-v3 | Hopper architecture optimized | H100/H200 | O(n^2) IO-optimized |
flash-decoding | Optimized for inference | Fast generation | O(n^2) cached |
// Flash Attention v2 configuration
interface FlashAttentionConfig {
causal: boolean; // Causal masking
softmaxScale?: number; // Custom softmax scaling
windowSize?: [number, number]; // Sliding window (left, right)
returnSoftmax: boolean; // Return attention weights
// Hardware optimization
blockSizeQ: number; // Block size for queries (default: 128)
blockSizeKV: number; // Block size for keys/values (default: 64)
numWarps: number; // CUDA warps per block
numStages: number; // Pipeline stages
}
| Type | Description | Use Case | Complexity |
|---|---|---|---|
moe-attention | Expert-routed attention | Task specialization | O(n^2/E) |
soft-moe-attention | Soft expert mixing | Smooth routing | O(n^2) |
switch-attention | Single expert selection | Sparse activation | O(n^2/E) |
expert-choice-attention | Expert selects tokens | Load balanced | O(n^2/E) |
// Mixture-of-Experts attention configuration
interface MoEAttentionConfig {
numExperts: number; // Number of expert attention modules
topK: number; // Experts selected per token (default: 2)
capacityFactor: number; // Expert capacity (default: 1.25)
routerType: 'softmax' | 'sigmoid' | 'top-k';
loadBalancingLoss: boolean; // Add load balancing auxiliary loss
expertDim: number; // Dimension per expert
numHeads: number;
embedDim: number;
}
Problem: Current memory retrieval uses simple cosine similarity which misses contextual relevance.
Solution: Use memory-attention with learned relevance weighting.
// Memory retrieval with attention
class AttentionEnhancedMemory {
private memoryAttention: MemoryAttention;
private memoryBank: Float32Array[];
async retrieve(query: Float32Array, k: number): Promise<MemoryEntry[]> {
// Step 1: Get top-k candidates via HNSW (fast)
const candidates = await this.hnsw.search(query, k * 3);
// Step 2: Rerank with attention (accurate)
const attentionScores = await this.memoryAttention.attend(
query, // Query vector
candidates.map(c => c.embedding), // Key vectors
candidates.map(c => c.content) // Value vectors
);
// Step 3: Return top-k by attention score
return attentionScores
.sort((a, b) => b.score - a.score)
.slice(0, k);
}
}
Configuration:
{
attention: {
memoryRetrieval: {
type: 'memory-attention',
numHeads: 4,
useGating: true,
maxMemorySlots: 1000
}
}
}
Problem: Large context windows cause quadratic memory/compute growth.
Solution: Use longformer-attention or bigbird-attention for O(n) scaling.
// Context window manager with sparse attention
class OptimizedContextManager {
private attention: LongformerAttention;
constructor(config: LongformerAttentionConfig) {
this.attention = new LongformerAttention({
windowSize: 512, // Local window
globalTokens: [0], // CLS token global
numHeads: 8,
embedDim: 768
});
}
async processLongContext(
tokens: Float32Array,
maxLength: number = 32768
): Promise<Float32Array> {
// O(n) attention instead of O(n^2)
return this.attention.forward(tokens, {
slidingWindowMask: true,
globalMask: this.buildGlobalMask(tokens)
});
}
}
Performance Comparison:
| Context Length | Standard MHA | Longformer | Speedup |
|---|---|---|---|
| 4K tokens | 16ms | 4ms | 4x |
| 16K tokens | 256ms | 16ms | 16x |
| 64K tokens | 4096ms | 64ms | 64x |
| 128K tokens | OOM | 128ms | - |
Problem: Code patterns have hierarchical structure (files > classes > functions > statements).
Solution: Use hierarchical-cross-attention with AST-aware masking.
// AST-aware code attention
class CodePatternAttention {
private hierarchicalAttention: HierarchicalCrossAttention;
async matchPatterns(
codeEmbedding: Float32Array,
patternBank: CodePattern[]
): Promise<PatternMatch[]> {
// Build hierarchical representation
const levels = [
this.getFileLevel(codeEmbedding),
this.getClassLevel(codeEmbedding),
this.getFunctionLevel(codeEmbedding),
this.getStatementLevel(codeEmbedding)
];
// Hierarchical attention across all patterns
const matches = await this.hierarchicalAttention.forward({
queries: levels,
keys: patternBank.map(p => p.embedding),
values: patternBank.map(p => p.metadata),
levelMasks: this.buildASTMasks(codeEmbedding)
});
return matches.filter(m => m.confidence > 0.7);
}
}
Problem: Multi-agent communication generates noise; agents need selective attention.
Solution: Use star-attention for hub-and-spoke coordination or cross-attention for peer-to-peer.
// Swarm coordinator with star attention topology
class SwarmAttentionCoordinator {
private starAttention: StarAttention;
constructor() {
this.starAttention = new StarAttention({
numSpokes: 8, // Max worker agents
hubDim: 768, // Coordinator embedding dim
spokeDim: 512, // Worker embedding dim
numHeads: 4
});
}
async coordinateAgents(
coordinatorState: Float32Array,
agentStates: Map<string, Float32Array>
): Promise<CoordinationSignals> {
// Hub attends to all spokes
const signals = await this.starAttention.forward({
hub: coordinatorState,
spokes: Array.from(agentStates.values())
});
// Return selective signals per agent
return {
broadcast: signals.hubOutput,
targeted: signals.spokeOutputs.map((s, i) => ({
agentId: Array.from(agentStates.keys())[i],
signal: s
}))
};
}
}
Topology Options:
| Topology | Attention Type | Best For |
|---|---|---|
| Hierarchical | star-attention | Coordinator + workers |
| Mesh | cross-attention | Peer-to-peer |
| Hierarchical-Mesh | moe-attention | Mixed coordination |
| Adaptive | switch-attention | Dynamic topology |
// v3/@claude-flow/memory/src/attention-enhanced-memory.ts
import { AttentionService } from '@claude-flow/attention';
import { UnifiedMemoryService } from './unified-memory-service.js';
export class AttentionEnhancedMemoryService extends UnifiedMemoryService {
private attentionService: AttentionService;
constructor(config: MemoryConfig & AttentionConfig) {
super(config);
this.attentionService = new AttentionService({
defaultMechanism: config.attention?.type || 'memory-attention',
fallbackMechanism: 'linear-attention',
gpuAcceleration: config.attention?.useGPU ?? true
});
}
async searchSemantic(
query: string,
k: number,
options?: { useAttention?: boolean }
): Promise<MemoryEntry[]> {
// Standard HNSW retrieval
const candidates = await super.searchSemantic(query, k * 3);
if (!options?.useAttention) {
return candidates.slice(0, k);
}
// Rerank with attention
const queryEmbedding = await this.embed(query);
const reranked = await this.attentionService.rerank(
queryEmbedding,
candidates.map(c => c.embedding!),
candidates
);
return reranked.slice(0, k);
}
}
// v3/@claude-flow/intelligence/src/sona-attention.ts
import { AttentionService, FlashAttentionConfig } from '@claude-flow/attention';
export class SONAWithAttention {
private flashAttention: AttentionService;
private moeAttention: AttentionService;
constructor() {
// Use Flash Attention for SONA's self-attention layers
this.flashAttention = new AttentionService({
mechanism: 'flash-attention-v2',
config: {
causal: false,
blockSizeQ: 128,
blockSizeKV: 64
} as FlashAttentionConfig
});
// Use MoE attention for expert routing
this.moeAttention = new AttentionService({
mechanism: 'moe-attention',
config: {
numExperts: 8,
topK: 2,
loadBalancingLoss: true
}
});
}
async adapt(pattern: Pattern): Promise<AdaptationResult> {
// Fast adaptation with Flash Attention (2.49x-7.47x speedup)
const attended = await this.flashAttention.forward(
pattern.embedding,
this.expertWeights
);
// Route to specialized expert
const expertOutput = await this.moeAttention.forward(attended);
return {
adaptedPattern: expertOutput,
expertUsage: this.moeAttention.getExpertUsage()
};
}
}
// v3/@claude-flow/cli/src/hooks/attention-hooks.ts
import { AttentionService } from '@claude-flow/attention';
export const attentionHooks = {
/**
* Pre-retrieval hook: configure attention for query
*/
'pre-retrieval': async (ctx: HookContext) => {
const queryComplexity = await analyzeQueryComplexity(ctx.query);
// Select attention mechanism based on complexity
ctx.attentionConfig = {
mechanism: queryComplexity > 0.7
? 'memory-attention' // Complex queries need full attention
: 'linear-attention', // Simple queries use O(n) attention
numHeads: Math.ceil(queryComplexity * 8),
useGating: queryComplexity > 0.5
};
return ctx;
},
/**
* Post-retrieval hook: log attention patterns
*/
'post-retrieval': async (ctx: HookContext) => {
if (ctx.attentionWeights) {
await storeAttentionPattern({
query: ctx.query,
weights: ctx.attentionWeights,
mechanism: ctx.attentionConfig.mechanism,
timestamp: Date.now()
});
}
return ctx;
},
/**
* Intelligence hook: route to attention mechanism
*/
'intelligence': async (ctx: HookContext) => {
const taskType = detectTaskType(ctx.task);
const mechanisms: Record<string, string> = {
'code-analysis': 'hierarchical-cross-attention',
'memory-retrieval': 'memory-attention',
'long-context': 'longformer-attention',
'swarm-coordination': 'star-attention',
'default': 'standard-mha'
};
ctx.recommendedAttention = mechanisms[taskType] || mechanisms.default;
return ctx;
}
};
// Automatic GPU detection and fallback
interface AttentionAccelerationConfig {
// GPU settings
preferGPU: boolean; // Prefer GPU when available
gpuMemoryLimit?: number; // Max GPU memory (MB)
gpuDeviceId?: number; // Specific GPU device
// Fallback settings
fallbackToLinear: boolean; // Use linear attention on CPU
fallbackThreshold: number; // Sequence length to trigger fallback
// Mixed precision
useFP16: boolean; // Use half precision
useBF16: boolean; // Use bfloat16 (Ampere+)
}
// Runtime acceleration selection
class AttentionAccelerator {
async selectBackend(
mechanism: string,
sequenceLength: number,
config: AttentionAccelerationConfig
): Promise<AttentionBackend> {
// Check GPU availability
const gpuAvailable = await this.checkGPU();
if (gpuAvailable && config.preferGPU) {
// Use Flash Attention on GPU
if (mechanism.includes('flash') || sequenceLength > 1024) {
return new FlashAttentionGPU(config);
}
return new CUDAAttention(mechanism, config);
}
// CPU fallback
if (sequenceLength > config.fallbackThreshold) {
// Switch to linear attention for long sequences on CPU
return new LinearAttentionCPU(config);
}
return new StandardAttentionCPU(mechanism, config);
}
}
// Attention weight quantization for memory efficiency
interface QuantizationConfig {
weightsQuantization: 'none' | 'int8' | 'int4' | 'fp16';
activationsQuantization: 'none' | 'int8' | 'fp16';
dynamicQuantization: boolean; // Quantize per-batch
calibrationSamples?: number; // Samples for static quantization
}
// Memory savings with quantization
// | Precision | Memory | Speedup | Accuracy Loss |
// |-----------|--------|---------|---------------|
// | FP32 | 100% | 1x | 0% |
// | FP16 | 50% | 1.5x | <0.1% |
// | INT8 | 25% | 2x | <0.5% |
// | INT4 | 12.5% | 3x | <2% |
// KV-cache for autoregressive generation
interface KVCacheConfig {
maxCacheSize: number; // Maximum cached tokens
slidingWindow?: number; // Sliding window for bounded cache
compressionRatio?: number; // Compress old cache entries
evictionPolicy: 'lru' | 'fifo' | 'attention-score';
}
class AttentionKVCache {
private keyCache: Map<number, Float32Array>;
private valueCache: Map<number, Float32Array>;
async getCachedAttention(
query: Float32Array,
newKeys: Float32Array,
newValues: Float32Array,
position: number
): Promise<Float32Array> {
// Append new KV to cache
this.keyCache.set(position, newKeys);
this.valueCache.set(position, newValues);
// Compute attention with full cached history
const allKeys = this.getAllKeys();
const allValues = this.getAllValues();
return this.attention.forward(query, allKeys, allValues);
}
async evict(): Promise<void> {
if (this.keyCache.size <= this.config.maxCacheSize) return;
switch (this.config.evictionPolicy) {
case 'lru':
this.evictLRU();
break;
case 'attention-score':
await this.evictByAttentionScore();
break;
default:
this.evictFIFO();
}
}
}
@claude-flow/attention packageAttentionService interfacev3/@claude-flow/attention/
├── package.json
├── src/
│ ├── index.ts # Public API exports
│ ├── attention-service.ts # Main service class
│ ├── types.ts # TypeScript interfaces
│ │
│ ├── mechanisms/
│ │ ├── multi-head/
│ │ │ ├── standard-mha.ts
│ │ │ ├── rotary-mha.ts
│ │ │ ├── alibi-mha.ts
│ │ │ ├── grouped-query.ts
│ │ │ ├── multi-query.ts
│ │ │ ├── differential.ts
│ │ │ └── mixture.ts
│ │ │
│ │ ├── self-attention/
│ │ │ ├── causal.ts
│ │ │ ├── bidirectional.ts
│ │ │ ├── relative-position.ts
│ │ │ ├── disentangled.ts
│ │ │ ├── talking-heads.ts
│ │ │ └── synthesizer.ts
│ │ │
│ │ ├── cross-attention/
│ │ │ ├── standard-cross.ts
│ │ │ ├── perceiver.ts
│ │ │ ├── gated-cross.ts
│ │ │ ├── memory.ts
│ │ │ └── hierarchical.ts
│ │ │
│ │ ├── sparse/
│ │ │ ├── bigbird.ts
│ │ │ ├── longformer.ts
│ │ │ ├── local.ts
│ │ │ ├── strided.ts
│ │ │ ├── sparse-transformer.ts
│ │ │ ├── star.ts
│ │ │ ├── blockwise.ts
│ │ │ └── random.ts
│ │ │
│ │ ├── linear/
│ │ │ ├── linear.ts
│ │ │ ├── performer.ts
│ │ │ ├── cosformer.ts
│ │ │ ├── rfa.ts
│ │ │ ├── nystrom.ts
│ │ │ └── linformer.ts
│ │ │
│ │ ├── flash/
│ │ │ ├── flash-v2.ts
│ │ │ ├── flash-v3.ts
│ │ │ └── flash-decoding.ts
│ │ │
│ │ └── moe/
│ │ ├── moe.ts
│ │ ├── soft-moe.ts
│ │ ├── switch.ts
│ │ └── expert-choice.ts
│ │
│ ├── acceleration/
│ │ ├── gpu-backend.ts
│ │ ├── cpu-backend.ts
│ │ ├── quantization.ts
│ │ └── kv-cache.ts
│ │
│ └── utils/
│ ├── masking.ts
│ ├── position-encoding.ts
│ └── scaling.ts
│
└── tests/
├── multi-head.test.ts
├── sparse.test.ts
├── linear.test.ts
├── flash.test.ts
├── moe.test.ts
└── integration.test.ts
// claude-flow.config.json
{
"attention": {
// Default mechanism for general use
"defaultMechanism": "standard-mha",
// Task-specific overrides
"taskOverrides": {
"memoryRetrieval": "memory-attention",
"longContext": "longformer-attention",
"codeAnalysis": "hierarchical-cross-attention",
"swarmCoordination": "star-attention"
},
// Performance settings
"acceleration": {
"preferGPU": true,
"fallbackToLinear": true,
"fallbackThreshold": 4096
},
// Quantization
"quantization": {
"weights": "fp16",
"activations": "fp16",
"dynamicQuantization": false
},
// Caching
"cache": {
"enabled": true,
"maxSize": 8192,
"evictionPolicy": "attention-score"
}
}
}
# List available attention mechanisms
npx @claude-flow/cli@latest attention list
# Benchmark attention mechanism
npx @claude-flow/cli@latest attention benchmark --mechanism longformer --sequence-length 16384
# Set default attention mechanism
npx @claude-flow/cli@latest attention set-default --mechanism flash-attention-v2
# Show attention statistics
npx @claude-flow/cli@latest attention stats
# Clear attention cache
npx @claude-flow/cli@latest attention cache clear
| Metric | Target | Mechanism |
|---|---|---|
| Memory retrieval latency | <10ms | memory-attention |
| Long context (32K tokens) | <100ms | longformer-attention |
| Swarm coordination | <5ms | star-attention |
| Flash Attention speedup | 2.49x-7.47x | flash-attention-v2/v3 |
| Linear attention throughput | 10x standard | performer-attention |
| Memory overhead (KV cache) | <500MB | configurable |
The @ruvector/[email protected] package now provides native WASM implementations of several attention and intelligence components described in this ADR. These run at near-native speed without GPU requirements.
| This ADR Concept | ruvllm-wasm Class | Status |
|---|---|---|
| HNSW Search (150x-12,500x) | HnswRouterWasm | Published, working (v2.0.1) |
| SONA Adaptation (<0.05ms) | SonaInstantWasm | Published, working |
| KV Cache Management | KvCacheWasm | Published, working |
| LoRA Adaptation | MicroLoraWasm | Published, working (ranks 1-4, <10KB) |
| Memory Pool / Buffer | BufferPoolWasm | Published, working |
| Inference Arena | InferenceArenaWasm | Published, working |
| Component | JS Implementation | WASM Implementation | Expected Speedup |
|---|---|---|---|
| HNSW Search | FlashAttention (JS) | HnswRouterWasm | 5-50x |
| SONA Adapt | SemanticRouter (JS) | SonaInstantWasm | 10-100x |
| LoRA Train | LoRAAdapter (JS) | MicroLoraWasm | 3-10x |
| KV Cache | Manual Map<> | KvCacheWasm | 2-5x |
// Detection and fallback
import { isRuvllmWasmAvailable, initRuvllmWasm } from '../ruvector/ruvllm-wasm.js';
if (await isRuvllmWasmAvailable()) {
await initRuvllmWasm();
// Use WASM-native HNSW, SONA, LoRA
} else {
// Fallback to existing JS implementations
}
HnswRouterWasm.addPattern() bugwasm_random()).{__wbg_ptr: number}) — use .toJson() or named accessors.initSync requires object form: initSync({ module: bytes }) (not raw bytes).@ruvector/[email protected] on npmStatus: Proposed (WASM implementations partially available) Priority: High Estimated Effort: 5 weeks (reduced with WASM components) Dependencies: ADR-006 (Memory), ADR-017 (RuVector) Updated: 2026-03-17