hadoop-yarn-project/hadoop-yarn/hadoop-yarn-capacity-scheduler-ui/src/main/webapp/docs/design_doc.md
JIRA: YARN-11885
The YARN Capacity Scheduler UI is a modern web interface for managing Apache Hadoop YARN Capacity Scheduler configurations. It provides visual tools for queue management, placement rules, capacity planning, and staged configuration changes with validation before applying them to a live YARN cluster.
Visual Queue Tree Management
Placement Rules Editor
Staged Changes System
Node Labels & Partitions
Real-Time Validation System
Global Scheduler Settings
Read-Only Mode Support (yarn.webapp.scheduler-ui.read-only.enable)
The application follows a client-side SPA architecture with clear separation of concerns:
┌─────────────────────────────────────────────────────────────┐
│ React UI Layer │
│ (Routes, Components, Forms, Tree Visualization) │
│ - Queue tree with XYFlow │
│ - Property editors with React Hook Form │
│ - Placement rules builder │
│ - Node labels management │
└──────────────────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ State Management Layer │
│ (Zustand Store with Immer - Sliced Architecture) │
│ - 8 feature slices sharing single state tree │
│ - Staged changes before apply │
│ - Cross-queue validation │
│ - Shared API client instance │
└──────────────────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API Client Layer │
│ (YarnApiClient + MSW for development mocking) │
│ - Auto-detects YARN security mode │
│ - Handles authentication (simple vs kerberos) │
│ - MSW: static fixtures / cluster proxy / off │
└──────────────────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ YARN ResourceManager REST API │
│ /ws/v1/cluster/* endpoints │
│ - Scheduler data and configuration │
│ - Mutation via PUT /scheduler-conf │
│ - Node labels management │
└─────────────────────────────────────────────────────────────┘
Key Architectural Principles:
Location: src/stores/schedulerStore.ts
The application uses a single Zustand store with Immer middleware for immutable updates. The store is composed of multiple feature slices that share a single state tree and API client instance:
Store Slices (src/stores/slices/):
schedulerDataSlice - Core scheduler data (SchedulerInfo)
queueDataSlice - Queue hierarchy utilities
queueSelectionSlice - UI selection state
stagedChangesSlice - Pending changes management
placementRulesSlice - Placement rule authoring
nodeLabelsSlice - Node label operations
capacityEditorSlice - Capacity editing
searchSlice - Queue search
Pattern: All slices use Immer middleware, allowing direct mutation syntax:
set((state) => {
state.stagedChanges.push(newChange);
state.configData.set(key, value);
});
Location: src/lib/api/YarnApiClient.ts
HTTP client for YARN ResourceManager REST APIs with the following features:
Key Capabilities:
Simple authentication via a user name query parameter (?user.name=yarn)
Read-only mode detection (yarn.scheduler.capacity.ui.readonly)
Primary Methods:
getScheduler() - Fetch queue hierarchy with live metrics
getSchedulerConf() - Fetch configuration properties
updateSchedulerConf(updateInfo) - Update configuration (mutation)
validateSchedulerConf(updateInfo) - Validate changes before applying
getNodeLabels() / addNodeLabels() / removeNodeLabels()
getNodeToLabels() / replaceNodeToLabels()
getNodes() - Cluster nodes information
YARN API Endpoints Used:
GET /ws/v1/cluster/scheduler
GET /ws/v1/cluster/scheduler-conf
PUT /ws/v1/cluster/scheduler-conf
POST /ws/v1/cluster/scheduler-conf/validate
GET /ws/v1/cluster/scheduler-conf/version
GET /ws/v1/cluster/get-node-labels
POST /ws/v1/cluster/add-node-labels
POST /ws/v1/cluster/remove-node-labels
GET /ws/v1/cluster/get-node-to-labels
POST /ws/v1/cluster/replace-node-to-labels
GET /ws/v1/cluster/nodes
GET /conf?name=<property>
Property Descriptors (src/config/properties/):
The UI has extensive metadata about YARN scheduler properties:
Each property descriptor includes:
{
name: 'capacity', // Short name without prefix
displayName: 'Capacity', // UI label
description: 'Queue capacity...', // Help text
type: 'number' | 'string' | 'enum' | 'boolean',
category: 'capacity', // Property category
defaultValue: '0', // Default if not set
required: false, // Validation
validationRules: [ // Schema validation
{ type: 'range', min: 0, max: 100, message: 'Must be 0-100' }
],
showWhen: (context) => boolean, // Conditional visibility
enableWhen: (context) => boolean, // Conditional enable
enumValues: [...], // For enum types
enumDisplay: 'choiceCard' | 'toggle', // UI style
displayFormat: { // Numeric formatting
suffix: ' MB',
decimals: 2
}
}
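To illustrate how a `validationRules` entry of type `range` might be evaluated, here is a minimal sketch. The `RangeRule` interface and the `checkRangeRules` helper are hypothetical names introduced for this example, not the UI's actual implementation; only the rule shape (`type`, `min`, `max`, `message`) is taken from the descriptor above.

```typescript
// Hypothetical, simplified rule shape; only the 'range' rule is modelled here.
interface RangeRule {
  type: 'range';
  min: number;
  max: number;
  message: string;
}

// Returns the message of every rule that the raw (string) value violates.
function checkRangeRules(rawValue: string, rules: RangeRule[]): string[] {
  const value = Number(rawValue);
  // Number('') is 0, so empty input is rejected explicitly.
  if (rawValue.trim() === '' || Number.isNaN(value)) {
    return ['Must be a number'];
  }
  return rules
    .filter((rule) => value < rule.min || value > rule.max)
    .map((rule) => rule.message);
}

const capacityRules: RangeRule[] = [
  { type: 'range', min: 0, max: 100, message: 'Must be 0-100' },
];
```

Values are passed as strings because the configuration map described in this document stores string values.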
Property Key Format: Hierarchical yarn.scheduler.capacity.<queue-path>.<property>
Examples:
yarn.scheduler.capacity.root.production.capacity
yarn.scheduler.capacity.maximum-applications
yarn.scheduler.capacity.root.dev.accessible-node-labels.gpu.capacity
Schemas (src/config/schemas/): Zod schemas for common formats (capacities, ACLs, percentages)
Validation Rules (src/config/validation-rules.ts): Business validation rules with cross-queue logic
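The hierarchical key format can be sketched as a small helper. The document names `buildPropertyKey` (in `src/utils/propertyUtils.ts`), but the body below is an assumption about how such a function could work, and `buildNodeLabelKey` is a hypothetical name added for illustration.

```typescript
const PREFIX = 'yarn.scheduler.capacity';

// Assumed behaviour: an empty path or 'global' produces a scheduler-wide key.
function buildPropertyKey(queuePath: string, property: string): string {
  if (queuePath === '' || queuePath === 'global') {
    return `${PREFIX}.${property}`;
  }
  return `${PREFIX}.${queuePath}.${property}`;
}

// Hypothetical helper for label-specific keys such as
// yarn.scheduler.capacity.root.dev.accessible-node-labels.gpu.capacity
function buildNodeLabelKey(queuePath: string, label: string, property: string): string {
  return buildPropertyKey(queuePath, `accessible-node-labels.${label}.${property}`);
}
```

With these definitions, the three example keys listed above correspond to `buildPropertyKey('root.production', 'capacity')`, `buildPropertyKey('global', 'maximum-applications')`, and `buildNodeLabelKey('root.dev', 'gpu', 'capacity')`.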
Core Principle: Changes are never applied immediately. All modifications go through staging.
Workflow:
1. User edits property
↓
2. stageQueueChange() creates StagedChange object
↓
3. Validation runs (property + cross-queue)
↓
4. User reviews in "Staged Changes" panel
↓
5. User clicks "Apply"
↓
6. applyStagedChanges() converts to YARN mutations
↓
7. POST /scheduler-conf/validate (optional)
↓
8. PUT /scheduler-conf with SchedConfUpdateInfo
StagedChange Structure (src/types/staged-change.ts):
{
id: string; // Unique identifier
type: 'add' | 'update' | 'remove';
queuePath: string | 'global'; // Queue path or 'global'
property: string; // Property name
oldValue?: string; // Previous value
newValue?: string; // New value
timestamp: number; // When staged
label?: string; // For node label changes
validationErrors?: ValidationIssue[];
}
Mutation Builder (src/features/staged-changes/utils/mutationBuilder.ts):
Translates staged changes into YARN's SchedConfUpdateInfo format:
{
"add-queue": [{
"queue-name": "root.new-queue",
"params": {
"entry": [
{ "key": "capacity", "value": "50" },
{ "key": "state", "value": "RUNNING" }
]
}
}],
"update-queue": [{
"queue-name": "root.existing",
"params": {
"entry": [{ "key": "capacity", "value": "75" }]
}
}],
"remove-queue": "root.old-queue",
"global-updates": [{
"entry": [
{ "key": "yarn.scheduler.capacity.maximum-applications", "value": "5000" }
]
}]
}
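The translation from staged changes to this payload can be sketched roughly as follows. `buildMutation` and the trimmed-down interfaces are hypothetical and are not the code in `mutationBuilder.ts`. Two assumptions are made: a staged change of type `remove` removes the whole queue, and `remove-queue` is emitted as an array so that several removals fit in one request (the example above shows a single string).

```typescript
// Trimmed-down StagedChange; only the fields this sketch needs.
interface StagedChange {
  id: string;
  type: 'add' | 'update' | 'remove';
  queuePath: string; // queue path or 'global'
  property: string;
  newValue?: string;
}

interface Entry { key: string; value: string }
interface QueueUpdate { 'queue-name': string; params: { entry: Entry[] } }
interface SchedConfUpdateInfo {
  'add-queue'?: QueueUpdate[];
  'update-queue'?: QueueUpdate[];
  'remove-queue'?: string[]; // assumption: array form
  'global-updates'?: { entry: Entry[] }[];
}

function buildMutation(changes: StagedChange[]): SchedConfUpdateInfo {
  const adds = new Map<string, Entry[]>();
  const updates = new Map<string, Entry[]>();
  const removals = new Set<string>();
  const globals: Entry[] = [];

  for (const change of changes) {
    const entry: Entry = { key: change.property, value: change.newValue ?? '' };
    if (change.queuePath === 'global') {
      globals.push(entry);
    } else if (change.type === 'remove') {
      removals.add(change.queuePath);
    } else {
      // Group all entries for the same queue into one add/update element.
      const bucket = change.type === 'add' ? adds : updates;
      bucket.set(change.queuePath, [...(bucket.get(change.queuePath) ?? []), entry]);
    }
  }

  const toQueueUpdates = (grouped: Map<string, Entry[]>): QueueUpdate[] =>
    Array.from(grouped, ([name, entry]) => ({ 'queue-name': name, params: { entry } }));

  // Only emit the sections that actually contain changes.
  const result: SchedConfUpdateInfo = {};
  if (adds.size > 0) result['add-queue'] = toQueueUpdates(adds);
  if (updates.size > 0) result['update-queue'] = toQueueUpdates(updates);
  if (removals.size > 0) result['remove-queue'] = Array.from(removals);
  if (globals.length > 0) result['global-updates'] = [{ entry: globals }];
  return result;
}
```

Grouping by queue path keeps the payload compact: several edits to the same queue become one `update-queue` element with multiple entries.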
Multi-Layered Architecture (src/features/validation/):
┌────────────────────────────────────────────────────┐
│ Layer 1: Schema Validation │
│ (Property descriptors - format, range, regex) │
└──────────────────┬─────────────────────────────────┘
▼
┌────────────────────────────────────────────────────┐
│ Layer 2: Business Validation Rules │
│ (validation-rules.ts - cross-field logic) │
└──────────────────┬─────────────────────────────────┘
▼
┌────────────────────────────────────────────────────┐
│ Layer 3: Cross-Queue Validation │
│ (crossQueue.ts - dependency-aware validation) │
│ - Detects affected queues │
│ - Validates parent/children/siblings │
│ - Ensures capacity sums, mode consistency │
└────────────────────────────────────────────────────┘
Key Components:
service.ts - Main validation orchestration
  validateField() - Single property validation with context
  validateQueue() - All properties in a queue
  hasBlockingIssues() - Check for blocking errors
crossQueue.ts - Cross-queue validation engine
  validatePropertyChange() - Validates a property change with cross-queue awareness
  validateStagedChanges() - Validates all staged changes (or a filtered subset)
ruleCategories.ts - Rule categorization
  CROSS_QUEUE_RULES - Affects multiple queues (re-validate dependencies)
  QUEUE_SPECIFIC_RULES - Only validates single queue
  WARNING_ONLY_RULES - Never blocks applying changes
utils/affectedQueues.ts - Dependency detection
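As an illustration of how blocking errors and warning-only rules might interact, consider the sketch below. The body of `hasBlockingIssues` is an assumption rather than the code in `service.ts`, and `example-informational-rule` is a placeholder identifier, not a real rule.

```typescript
interface ValidationIssue {
  queuePath: string;
  field: string;
  message: string;
  severity: 'error' | 'warning';
  rule: string;
}

// Placeholder content; the real list lives in ruleCategories.ts.
const WARNING_ONLY_RULES = new Set<string>(['example-informational-rule']);

// An issue blocks "Apply" only if it is an error and its rule is not warning-only.
function hasBlockingIssues(issues: ValidationIssue[]): boolean {
  return issues.some(
    (issue) => issue.severity === 'error' && !WARNING_ONLY_RULES.has(issue.rule),
  );
}
```

Under this reading, a warning never blocks, and an error raised by a warning-only rule is downgraded to non-blocking.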
Validation Rules Examples (from validation-rules.ts):
CAPACITY_SUM - Sibling capacities must sum correctly
MAX_CAPACITY_CONSTRAINT - Maximum capacity >= capacity
CONSISTENT_CAPACITY_MODE - Siblings use same mode (legacy mode)
PARENT_CHILD_CAPACITY_CONSTRAINT - Child resources ≤ parent
PARENT_CHILD_CAPACITY_MODE - Absolute mode inheritance (legacy)
WEIGHT_MODE_TRANSITION_FLEXIBLE_AQC - Auto-queue compatibility
Validation Context:
{
queuePath: string;
fieldName: string;
fieldValue: unknown;
config: Map<string, string>; // Full config + staged
schedulerData?: SchedulerInfo;
stagedChanges: StagedChange[];
legacyModeEnabled: boolean;
}
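To show how a cross-queue rule could use the `config` map from this context, here is a minimal sketch of a capacity-sum check. It assumes percentage-mode capacities that must total 100 for the children of one parent; `checkCapacitySum` is a hypothetical helper, and the real CAPACITY_SUM rule likely handles more cases (weights, absolute resources, node labels).

```typescript
const PREFIX = 'yarn.scheduler.capacity';

// Returns null when the children's percentage capacities total 100,
// otherwise a human-readable message. Percentage mode only (assumption).
function checkCapacitySum(
  config: Map<string, string>,
  parentPath: string,
  childPaths: string[],
): string | null {
  let sum = 0;
  for (const child of childPaths) {
    // A missing capacity is treated as 0 in this sketch.
    sum += parseFloat(config.get(`${PREFIX}.${child}.capacity`) ?? '0');
  }
  // Small tolerance to absorb floating-point rounding.
  return Math.abs(sum - 100) < 0.01
    ? null
    : `Children of ${parentPath} sum to ${sum}%, expected 100%`;
}
```

Because the map holds the live configuration merged with staged changes, the check sees the values that would result from applying the pending edits.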
Data Model (src/types/scheduler.ts):
Queue Path Format: Dot-separated hierarchical identifiers
Root queue: root
Nested queues: root.production, root.production.critical
Every path starts with root
Queue Traversal Utilities (queueDataSlice.ts):
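Because queue paths are dot-separated, parent and ancestor lookups reduce to string operations. The sketch below is illustrative only: `getParentQueuePath` is referenced elsewhere in this document, but its signature here is an assumption, and `getQueueName` and `getAncestorPaths` are hypothetical names.

```typescript
// Parent of 'root.production.critical' is 'root.production'; 'root' has none.
function getParentQueuePath(queuePath: string): string | null {
  const lastDot = queuePath.lastIndexOf('.');
  return lastDot === -1 ? null : queuePath.slice(0, lastDot);
}

// Last path segment, e.g. 'critical' for 'root.production.critical'.
function getQueueName(queuePath: string): string {
  return queuePath.slice(queuePath.lastIndexOf('.') + 1);
}

// Ancestors ordered from nearest parent up to 'root'.
function getAncestorPaths(queuePath: string): string[] {
  const ancestors: string[] = [];
  let current = getParentQueuePath(queuePath);
  while (current !== null) {
    ancestors.push(current);
    current = getParentQueuePath(current);
  }
  return ancestors;
}
```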
Framework: React Router v7 in SPA mode (ssr: false)
Routes (file-based in src/app/routes/):
layout.tsx - Root layout with navigation
home.tsx - Queue tree visualization (XYFlow)
placement-rules.tsx - Placement rules editor
node-labels.tsx - Node labels management
global-settings.tsx - Global scheduler settings
Location: src/lib/api/mocks/handlers.ts
MSW enables three modes controlled by VITE_API_MOCK_MODE:
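static (default in dev) - Serves static fixtures from public/mock/ws/v1/cluster/*.json
cluster - Proxies requests to the cluster set in VITE_CLUSTER_PROXY_TARGET (for example http://rm-host:8088)
off - Mocking disabled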
MSW boots automatically in dev mode via src/app/entry.client.tsx.
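The mode selection can be pictured with a small resolver. `resolveMockMode` is a hypothetical helper, not the project's bootstrap code; it takes the environment as a parameter so it can run outside Vite, and the fallback to `static` for missing or unrecognized values is an assumption based on `static` being the development default.

```typescript
type MockMode = 'static' | 'cluster' | 'off';

// Maps the raw VITE_API_MOCK_MODE value onto one of the three modes.
function resolveMockMode(env: Record<string, string | undefined>): MockMode {
  const raw = (env.VITE_API_MOCK_MODE ?? 'static').toLowerCase();
  if (raw === 'static' || raw === 'cluster' || raw === 'off') {
    return raw;
  }
  // Assumption: unknown values fall back to the development default.
  return 'static';
}
```

In a Vite application the argument would typically be `import.meta.env`.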
React Compiler (babel-plugin-react-compiler) - Automatic memoization and optimization
Path alias (~/)
Documentation: docs/development/extending-scheduler-properties.md
Steps:
Add the property descriptor to src/config/properties/queue-properties.ts:
{
name: 'my-new-property', // Short name without prefix
displayName: 'My New Property', // UI label
description: 'What this property controls and how it affects scheduling',
type: 'number', // 'string' | 'number' | 'boolean' | 'enum'
category: 'scheduling', // See PropertyCategory type
defaultValue: '10',
required: false,
validationRules: [
{
type: 'range',
message: 'Must be between 0 and 100',
min: 0,
max: 100
}
],
// Optional: Conditional visibility
showWhen: (context) => {
return context.legacyModeEnabled;
},
// Optional: Conditional enable
enableWhen: (context) => {
const state = context.config.get(
buildPropertyKey(context.queuePath, 'state')
);
return state === 'RUNNING';
},
}
Property automatically appears in the property editor UI - no additional wiring needed
Add validation rules if needed (see section 4.2)
Update tests in src/config/__tests__/propertyDefinitions.test.ts
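The optional `showWhen` and `enableWhen` callbacks could be consumed by the editor along the following lines. This is a sketch under assumptions: `resolveFieldState`, `PropertyContext`, and `Descriptor` are hypothetical names, and treating a missing callback as "always true" is a guess at the default behaviour.

```typescript
interface PropertyContext {
  queuePath: string;
  config: Map<string, string>;
  legacyModeEnabled: boolean;
}

interface Descriptor {
  name: string;
  showWhen?: (context: PropertyContext) => boolean;
  enableWhen?: (context: PropertyContext) => boolean;
}

// A field with no callbacks is visible and enabled; a hidden field is never enabled.
function resolveFieldState(
  descriptor: Descriptor,
  context: PropertyContext,
): { visible: boolean; enabled: boolean } {
  const visible = descriptor.showWhen ? descriptor.showWhen(context) : true;
  const enabled = visible && (descriptor.enableWhen ? descriptor.enableWhen(context) : true);
  return { visible, enabled };
}
```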
Available Categories:
resource - CPU, memory limits
scheduling - Ordering, priorities
security - ACLs, user/group permissions
core - Fundamental queue settings
application-limits - Application count limits
placement - Queue selection rules
container-allocation - Container sizing
async-scheduling - Async mode settings
capacity - Capacity and max-capacity
dynamic-queues - Auto-queue creation
node-labels - Label-specific settings
preemption - Preemption policies
For enum types:
{
name: 'ordering-policy',
type: 'enum',
enumValues: [
{ value: 'fifo', label: 'FIFO', description: 'First-in, first-out' },
{ value: 'fair', label: 'Fair', description: 'Fair sharing' },
{ value: 'priority', label: 'Priority', description: 'By priority' }
],
enumDisplay: 'toggle', // 'toggle' (pills) or 'choiceCard' (large cards)
}
For numeric inputs with formatting:
{
name: 'maximum-allocation-mb',
type: 'number',
displayFormat: {
suffix: ' MB',
decimals: 0
},
}
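A `displayFormat` like this one might be applied when rendering a value roughly as follows. `formatNumericValue` is a hypothetical helper for illustration; returning non-numeric input unchanged is an assumption.

```typescript
interface DisplayFormat {
  suffix?: string;
  decimals?: number;
}

// Formats a raw config string for display; non-numeric input is returned as-is.
function formatNumericValue(raw: string, format: DisplayFormat = {}): string {
  const value = Number(raw);
  if (raw.trim() === '' || Number.isNaN(value)) {
    return raw;
  }
  const text = format.decimals === undefined ? String(value) : value.toFixed(format.decimals);
  return `${text}${format.suffix ?? ''}`;
}
```

For example, `formatNumericValue('1024', { suffix: ' MB', decimals: 0 })` yields `1024 MB`.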
Documentation: docs/development/adding-validation-rules.md
Steps:
Add the rule definition to src/config/validation-rules.ts:
{
id: 'MY_NEW_RULE',
description: 'Brief explanation of the constraint being enforced',
level: 'error', // or 'warning'
triggers: ['my-new-property', 'related-property'], // Properties that trigger
evaluate: (context) => evaluateMyNewRule(context),
}
function evaluateMyNewRule(context: ValidationContext): ValidationIssue[] {
const issues: ValidationIssue[] = [];
// Skip for template queues if not applicable
if (isTemplateQueuePath(context.queuePath)) {
return issues;
}
// Skip if rule only applies in legacy mode
if (!context.legacyModeEnabled) {
return issues;
}
// Get current value
const myValue = context.fieldValue as number;
// Get related value from config
const relatedKey = buildPropertyKey(context.queuePath, 'related-property');
const relatedValue = parseFloat(context.config.get(relatedKey) || '0');
// Validation logic
if (myValue > relatedValue) {
issues.push({
queuePath: context.queuePath,
field: context.fieldName,
message: `My property (${myValue}) must not exceed related property (${relatedValue})`,
severity: 'error',
rule: 'my-new-rule', // lowercase kebab-case
});
}
return issues;
}
Categorize the rule in src/features/validation/ruleCategories.ts (pick the list that applies):
// If rule affects multiple queues (parent, children, or siblings)
export const CROSS_QUEUE_RULES = [
// ... existing
'my-new-rule',
];
// If rule only validates the single queue being edited
export const QUEUE_SPECIFIC_RULES = [
// ... existing
'my-new-rule',
];
// If rule should never block applying changes (informational only)
export const WARNING_ONLY_RULES = [
// ... existing
'my-new-rule',
];
For cross-queue rules, extend dependency detection (src/features/validation/utils/affectedQueues.ts):
export function getAffectedQueuesForRule(
  rule: string,
  changedQueuePath: string,
  schedulerData?: SchedulerInfo,
): string[] {
  switch (rule) {
    case 'my-new-rule': {
      // Return list of queue paths that need re-validation
      // Example: parent and all siblings
      // (braces scope the const declarations to this case)
      const parent = getParentQueuePath(changedQueuePath);
      const siblings = getSiblingQueues(changedQueuePath, schedulerData);
      return [parent, ...siblings.map((q) => q.queuePath)];
    }
    // ... other rules
    default:
      // Assumed fallback: only the edited queue needs re-validation
      return [changedQueuePath];
  }
}
Update tests in src/config/__tests__/validation-rules.test.ts
Key Validation Patterns:
Check context.legacyModeEnabled for rules that apply only in legacy mode
Use isTemplateQueuePath() to skip if not applicable
Validation Issue Structure:
{
queuePath: string; // Which queue has the issue
field: string; // Which property is invalid
message: string; // User-friendly error message
severity: 'error' | 'warning';
rule: string; // Rule identifier (kebab-case)
}
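Since the same issue can be reported more than once when several staged changes trigger one rule, issues are deduplicated (the file reference lists `dedupeIssues.ts`). The sketch below is an assumption about how that might work, keyed on queue path, field, rule, and severity; it is not the project's implementation.

```typescript
interface ValidationIssue {
  queuePath: string;
  field: string;
  message: string;
  severity: 'error' | 'warning';
  rule: string;
}

// Keeps the first occurrence of each (queuePath, field, rule, severity) combination.
function dedupeIssues(issues: ValidationIssue[]): ValidationIssue[] {
  const seen = new Set<string>();
  const unique: ValidationIssue[] = [];
  for (const issue of issues) {
    const key = [issue.queuePath, issue.field, issue.rule, issue.severity].join('|');
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(issue);
    }
  }
  return unique;
}
```

Input order is preserved, so the first message reported for a given combination is the one shown to the user.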
Similar to queue properties, but in src/config/properties/global-properties.ts:
{
name: 'yarn.scheduler.capacity.my-global-setting', // Full property key
displayName: 'My Global Setting',
description: 'What this global setting controls',
type: 'boolean',
category: 'core',
defaultValue: 'false',
required: false,
}
Property automatically appears in the Global Settings page (src/app/routes/global-settings.tsx).
Configuration System:
src/config/properties/queue-properties.ts - Queue property definitions
src/config/properties/global-properties.ts - Global property definitions
src/config/validation-rules.ts - Business validation rules
src/config/schemas/ - Zod validation schemas
Validation System:
src/features/validation/service.ts - Main validation entry point
src/features/validation/crossQueue.ts - Cross-queue validation engine
src/features/validation/ruleCategories.ts - Rule categorization
src/features/validation/utils/affectedQueues.ts - Dependency detection
src/features/validation/utils/dedupeIssues.ts - Issue deduplication
State Management:
src/stores/schedulerStore.ts - Main Zustand store
src/stores/slices/ - Feature slices (8 slices)
API & Types:
src/lib/api/YarnApiClient.ts - YARN REST API client
src/types/ - TypeScript type definitions
Utilities:
src/utils/propertyUtils.ts - Property key construction
src/utils/capacityUtils.ts - Capacity parsing and validation
src/utils/treeUtils.ts - Queue tree traversal (flattenQueueTree, traverseQueueTree, findQueueByPath)
src/utils/nodeLabelUtils.ts - Node label name normalization
src/lib/errors/readOnlyGuard.ts - Read-only mode enforcement helpers
npm install
# Create .env file (see .env.example)
npm run dev # Starts at http://localhost:5173
# Development
npm run dev # Start dev server
# Testing
npm run test # Run tests in watch mode
npm run test:run # Single test run (CI mode)
npm run test:coverage # Generate coverage report
# Type Checking & Building
npm run typecheck # Generate types and type check
npm run build # Production build to ./build
npm start # Serve production build
# Code Quality
npm run lint # Lint TypeScript files
npm run lint:fix # Auto-fix linting issues
npm run format # Format code with Prettier
npm run format:check # Check formatting
# API Mock Mode (static, cluster, or off)
VITE_API_MOCK_MODE=static
# For cluster mode: proxy target
VITE_CLUSTER_PROXY_TARGET=http://rm-host:8088
# YARN username for simple auth
VITE_YARN_USER_NAME=yarn
# Read-only mode testing (development only)
VITE_READONLY_MODE=false
yarn-scheduler-ui/
├── src/
│ ├── app/ # React Router entry and routes
│ ├── components/ # Shared UI components
│ ├── config/ # Configuration system
│ │ ├── properties/ # Property descriptors
│ │ ├── schemas/ # Zod schemas
│ │ └── validation-rules.ts # Business validation
│ ├── features/ # Feature modules
│ │ ├── queue-management/ # Queue tree visualization
│ │ │ ├── components/ # QueueCardNode, CapacityEditorDialog, etc.
│ │ │ ├── hooks/ # useCapacityEditor, useQueueActions
│ │ │ └── utils/ # capacityDisplay, capacityEditor, etc.
│ │ ├── property-editor/ # Queue property editing
│ │ │ └── components/ # PropertyPanel, PropertyFormField, etc.
│ │ ├── staged-changes/ # Change review and mutation
│ │ ├── validation/ # Cross-queue validation engine
│ │ ├── placement-rules/ # Placement rule builder
│ │ ├── node-labels/ # Node label management
│ │ ├── template-config/ # Auto-queue templates
│ │ ├── queue-comparison/ # Queue comparison tool
│ │ └── global-settings/ # Global scheduler settings
│ ├── hooks/ # Shared React hooks
│ ├── lib/ # Libraries and utilities
│ │ ├── api/ # YarnApiClient + MSW
│ │ └── errors/ # Error handling + readOnlyGuard
│ ├── stores/ # Zustand state management
│ │ ├── schedulerStore.ts # Main store
│ │ └── slices/ # 8 feature slices
│ ├── types/ # TypeScript definitions
│ ├── utils/ # Utility functions (treeUtils, nodeLabelUtils, etc.)
│ └── testing/ # Test utilities
├── public/mock/ # Mock API responses
├── docs/ # Documentation
└── [config files] # Vite, React Router, Vitest, etc.
The YARN Scheduler UI is a modern web application that provides a visual interface for managing YARN Capacity Scheduler configurations. Key highlights: