:::danger
Container Image Issue (v4.1.0)
Due to an unstable load order issue in the v4.1.0 container image, BE processes may fail to start reliably in container environments. Container environment users should NOT upgrade to v4.1.0. Please wait for v4.1.1, which includes the fix (#71825).
:::
:::warning
Downgrade Notes
After upgrading StarRocks to v4.1, DO NOT downgrade to any v4.0 version below v4.0.6.
Due to internal changes in data layout introduced in v4.1 (related to tablet splitting and distribution mechanisms), clusters upgraded to v4.1 may generate metadata and storage structures that are not fully compatible with earlier versions. As a result, downgrading from v4.1 is supported only to v4.0.6 or later. This limitation stems from backward compatibility constraints in how earlier versions interpret tablet layout and distribution metadata.
:::
Release Date: April 13, 2026
New Multi-Tenant Data Management
Shared-data clusters now support range-based data distribution and automatic splitting and merging of tablets. Tablets are split automatically when they become oversized or turn into hotspots, without requiring schema changes, SQL modifications, or data re-ingestion. This improves usability and directly addresses data skew and hotspot issues in multi-tenant workloads. #65199 #66342 #67056 #67386 #68342 #68569 #66743 #67441 #68497 #68591 #66672 #69155
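To make the idea of range-based splitting concrete, here is a minimal toy sketch in Python. It is not StarRocks code: the `Tablet` class, the `split_if_oversized` helper, and the row-count threshold are hypothetical names invented for illustration, and the real split policy (size metrics, hotspot detection, merge rules) is not described in these notes.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Tablet:
    """Toy tablet that owns the half-open key range [low, high)."""
    low: int
    high: int
    keys: list = field(default_factory=list)  # sorted keys stored in the tablet


def split_if_oversized(tablet, max_rows):
    """Return [tablet] unchanged, or two tablets split at the median key."""
    if len(tablet.keys) <= max_rows:
        return [tablet]
    pivot = tablet.keys[len(tablet.keys) // 2]
    cut = bisect_left(tablet.keys, pivot)  # first position holding the pivot key
    if cut == 0:
        # The lower half is one repeated key, so no range boundary separates it.
        return [tablet]
    left = Tablet(tablet.low, pivot, tablet.keys[:cut])
    right = Tablet(pivot, tablet.high, tablet.keys[cut:])
    return [left, right]


# Hypothetical threshold of 4 rows, chosen only to trigger a split.
parts = split_if_oversized(Tablet(0, 100, [1, 2, 3, 50, 60, 70]), max_rows=4)
for part in parts:
    print(part.low, part.high, part.keys)
```

Because the split only moves a range boundary, the table schema and the queries against it stay the same, which matches the claim that no schema changes or SQL modifications are needed.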
Large-Capacity Tablet Support (Phase 1)
Supports significantly larger per-tablet data capacity for shared-data clusters, with a long-term target of 100 GB per tablet. Phase 1 focuses on enabling parallel Compaction and parallel MemTable finalization within a single Lake tablet, reducing ingestion and Compaction overhead as tablet size grows. #66586 #68677
Fast Schema Evolution V2
Shared-data clusters now support Fast Schema Evolution V2, which completes DDL execution for schema operations within seconds, and extends this support to materialized views. #65726 #66774 #67915
[Beta] Inverted Index on shared-data
Enables built-in inverted indexes for shared-data clusters to accelerate text filtering and full-text search workloads. #66541
Cache Observability
Query-level cache hit ratio is now exposed in audit logs and the monitoring system for better cache transparency and latency diagnosis. Additional Data Cache metrics include memory and disk quota usage, and page cache statistics. #63964
Added segment metadata filter for Lake tables to skip irrelevant segments based on sort key range during scans, reducing I/O for range-predicate queries. #68124
Supports fast cancel for Lake DeltaWriter, reducing latency for cancelled ingestion jobs in shared-data clusters. #68877
Added support for interval-based scheduling for automated cluster snapshots. #67525
Supports pipeline execution for MemTable flush and merge, improving ingestion throughput for cloud-native tables in shared-data clusters. #67878
Supports dry_run mode for repairing cloud-native tables, allowing users to preview repair actions before execution. #68494
Added a thread pool for publish transactions in shared-nothing clusters, improving publish throughput. #67797
Supports dynamically modifying the datacache.enable property for cloud-native tables. #69011
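The segment metadata filter listed above skips segments by sort key range. The general technique (min/max range pruning) can be sketched as follows. This is a simplified illustration under stated assumptions, not the StarRocks implementation: `SegmentMeta`, `segments_to_scan`, and the integer sort keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentMeta:
    """Hypothetical per-segment metadata: the smallest and largest sort key."""
    name: str
    min_key: int
    max_key: int


def segments_to_scan(segments, lo, hi):
    """Keep only segments whose [min_key, max_key] overlaps the predicate range [lo, hi]."""
    return [s for s in segments if s.max_key >= lo and s.min_key <= hi]


segments = [
    SegmentMeta("seg_0", 0, 99),
    SegmentMeta("seg_1", 100, 199),
    SegmentMeta("seg_2", 200, 299),
]

# A predicate such as `key BETWEEN 150 AND 220` can skip seg_0 without reading it.
selected = segments_to_scan(segments, 150, 220)
print([s.name for s in selected])
```

The check uses only metadata, so a skipped segment costs no data I/O, which is where the saving for range-predicate queries comes from.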
Iceberg DELETE Support
Supports writing position delete files for Iceberg tables, enabling DELETE operations on Iceberg tables directly from StarRocks. The support covers the full pipeline of Plan, Sink, Commit, and Audit. #67259 #67277 #67421 #67567
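For readers unfamiliar with position deletes: in the Iceberg table format, a position delete file records pairs of data file path and row position, and readers mask those rows at scan time. The sketch below models that read-side behavior in plain Python. The helper name `apply_position_deletes` and the in-memory data layout are assumptions made for illustration; they do not reflect how StarRocks plans or commits the delete.

```python
def apply_position_deletes(data_files, position_deletes):
    """Return the live rows of each data file after masking deleted positions.

    data_files: dict mapping file path -> list of rows (position = list index)
    position_deletes: iterable of (file_path, pos) pairs, as in a position delete file
    """
    deleted = {}
    for file_path, pos in position_deletes:
        deleted.setdefault(file_path, set()).add(pos)
    return {
        path: [row for pos, row in enumerate(rows) if pos not in deleted.get(path, ())]
        for path, rows in data_files.items()
    }


# Hypothetical file names and rows.
data_files = {
    "data/a.parquet": ["r0", "r1", "r2"],
    "data/b.parquet": ["r3", "r4"],
}
position_deletes = [("data/a.parquet", 1), ("data/b.parquet", 0)]

live = apply_position_deletes(data_files, position_deletes)
print(live)
```

The data files themselves are never rewritten; a DELETE only adds the small delete file, and the masking happens when the table is read.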
TRUNCATE for Hive and Iceberg Tables
Supports TRUNCATE TABLE on external Hive and Iceberg tables. #64768 #65016
Incremental materialized view on Iceberg
Extends the support for incremental materialized view refresh to Iceberg append-only tables, enabling query acceleration without full table refresh. #65469 #62699
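The benefit of incremental refresh on append-only tables can be illustrated with a toy model: because existing rows never change, an aggregate can be brought up to date by processing only the snapshots appended since the last refresh. The class `IncrementalSumView` and its snapshot bookkeeping are hypothetical and much simpler than a real materialized view refresh.

```python
from collections import Counter


class IncrementalSumView:
    """Toy SUM(amount) GROUP BY key view over an append-only list of snapshots."""

    def __init__(self):
        self.totals = Counter()
        self.applied = 0  # number of snapshots already folded into totals

    def refresh(self, snapshots):
        """Fold in only the snapshots appended since the previous refresh."""
        new_snapshots = snapshots[self.applied:]
        for batch in new_snapshots:
            for key, amount in batch:
                self.totals[key] += amount
        self.applied = len(snapshots)
        return len(new_snapshots)  # how many snapshots this refresh had to read


snapshots = [[("a", 1), ("b", 2)]]
view = IncrementalSumView()
first = view.refresh(snapshots)

snapshots.append([("a", 5)])  # a new append-only snapshot arrives
second = view.refresh(snapshots)
print(first, second, dict(view.totals))
```

The second refresh reads one snapshot rather than the whole table, which is the saving the feature description refers to.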
VARIANT Type for Semi-Structured Data in Iceberg
Supports the VARIANT data type in Iceberg Catalog for flexible, schema-on-read storage and querying of semi-structured data. Supports read, write, type casting, and Parquet integration. #63639 #66539
Iceberg v3 Support
Added support for the Iceberg v3 default value feature and row lineage. #69525 #69633
Iceberg Table Maintenance Procedures
Added support for the rewrite_manifests procedure, and extended the expire_snapshots and remove_orphan_files procedures with additional arguments for finer-grained table maintenance. #68817 #68898
Iceberg $properties Metadata Table
Added support for querying Iceberg table properties via the $properties metadata table. #68504
Supports reading file path and row position metadata columns from Iceberg tables. #67003
Supports reading _row_id from Iceberg v3 tables, and supports global late materialization for Iceberg v3. #62318 #64133
Supports creating Iceberg views with custom properties, and displays properties in SHOW CREATE VIEW output. #65938
Supports querying Paimon tables with a specific branch, tag, version, or timestamp. #63316
Supports complex types (ARRAY, MAP, STRUCT) for Paimon tables. #66784
Supports Paimon views. #56058
Supports TRUNCATE for Paimon tables. #67559
Supports Partition Transforms with parentheses syntax when creating Iceberg tables. #68945
Supports ALTER TABLE REPLACE PARTITION COLUMN for Iceberg tables. #70508
Supports Iceberg global shuffle based on Transform Partition for improved data organization. #70009
Supports dynamically enabling global shuffle for Iceberg table sink. #67442
Introduced a Commit queue for Iceberg table sink to avoid concurrent Commit conflicts. #68084
Added host-level sorting for Iceberg table sink to improve data organization and reading performance. #68121
Enabled additional optimizations in ETL execution mode by default, improving performance for INSERT INTO SELECT, CREATE TABLE AS SELECT, and similar batch operations without explicit configuration. #66841
Added commit audit information for INSERT and DELETE operations on Iceberg tables. #69198
Supports enabling or disabling view endpoint operations in Iceberg REST Catalog. #66083
Optimized cache lookup efficiency in CachingIcebergCatalog. #66388
Supports EXPLAIN on various Iceberg catalog types. #66563
Supports partition projection for tables in AWS Glue Catalog. #67601
Added resource share type support for AWS Glue GetDatabases API. #69056
Supports Azure ABFS/WASB path mapping with endpoint injection (azblob/adls2). #67847
Added a database metadata cache for JDBC catalog to reduce remote RPC overhead and the impact of external system failures. #68256
Added schema_resolver property for JDBC catalog to support custom schema resolution. #68682
Supports column comments for PostgreSQL tables in information_schema. #70520
Improved Oracle and PostgreSQL JDBC type mapping. #70315 #70566
Recursive CTE
Supports Recursive Common Table Expressions (CTEs) for hierarchical traversals, graph queries, and iterative SQL computations. #65932
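As an illustration of a hierarchical traversal with a recursive CTE, the following sketch runs a standard `WITH RECURSIVE` query through Python's built-in `sqlite3` module. It executes on SQLite, not StarRocks. The `employees` table is hypothetical, and the exact syntax, limits, or session settings that StarRocks requires are not stated in these notes, so treat the SQL as generic rather than StarRocks-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Alice', NULL),
        (2, 'Bob', 1),
        (3, 'Carol', 2),
        (4, 'Dave', 1);
    """
)

# Anchor member: the root of the hierarchy. Recursive member: direct reports
# of every row produced so far, one level deeper on each iteration.
rows = conn.execute(
    """
    WITH RECURSIVE org (id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, org.depth + 1
        FROM employees AS e
        JOIN org ON e.manager_id = org.id
    )
    SELECT name, depth FROM org ORDER BY depth, id
    """
).fetchall()

for name, depth in rows:
    print("  " * depth + name)
```

The recursion stops on its own once an iteration produces no new rows, which here happens after the deepest level of the reporting chain.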
Improved Skew Join v2 rewrite with statistics-based skew detection, histogram support, and NULL-skew awareness. #68680 #68886
Improved COUNT DISTINCT over windows and added support for fused multi-distinct aggregations. #67453
Supports an explicit skew hint for window functions, and automatically optimizes window functions with skewed partition keys by splitting them into a UNION. #68739 #67944
Supports materialization hints for CTEs. #70802
Enabled Global Lazy Materialization by default, improving query performance by deferring column reads until needed. #70412
Supports EXPLAIN and EXPLAIN ANALYZE for INSERT statements in Trino Parser. #70174
Supports EXPLAIN for query queue visibility. #69933
array_top_n: Returns the top N elements from an array ranked by value. #63376
arrays_zip: Combines multiple arrays element-wise into an array of structs. #65556
json_pretty: Formats a JSON string with indentation. #66695
json_set: Sets a value at a specified path within a JSON string. #66193
initcap: Converts the first letter of each word to uppercase. #66837
sum_map: Sums MAP values across rows with the same key. #67482
current_timezone: Returns the current session timezone. #63653
current_warehouse: Returns the name of the current warehouse. #66401
sec_to_time: Converts the number of seconds to a TIME value. #62797
ai_query: Calls an external AI model from SQL for inference workloads. #61583
min_n / max_n: Aggregate functions that return the top N minimum/maximum values. #63807
regexp_position: Returns the position of a regular expression match in a string. #67252
is_json_scalar: Returns whether a JSON value is a scalar. #66050
get_json_scalar: Extracts a scalar value from a JSON string. #68815
raise_error: Raises a user-defined error in SQL expressions. #69661
uuid_v7: Generates time-ordered UUID v7 values. #67694
STRING_AGG: Syntactic sugar for GROUP_CONCAT. #64704
array_sort for custom sort ordering. #66607
lead/lag/first_value/last_value window functions. #63547
MULTIPLY/DIVIDE for interval operations. #68407
ALTER TASK statements for task management. #68675
CREATE FUNCTION ... AS <sql_body>. #67558
STRUCT_CAST_BY_NAME SQL mode for name-based struct field matching. #69845
last_query_id() in ANALYZE PROFILE for easy query profile analysis. #64557
warehouses, cpu_weight_percent, and exclusive_cpu_weight attributes for resource groups to improve multi-warehouse CPU resource isolation. #66947
information_schema.fe_threads system view to inspect the FE thread state. #65431
ClusterSummaryActionV2 API endpoint to provide a structured cluster overview. #68836
@@run_mode to query the current cluster run mode (shared-data or shared-nothing). #69247
query_queue_v2 by default for improved query queue management. #67462
skip_black_list session variable to bypass backend blacklist verification when needed. #67467
enable_table_metrics_collect option for the metrics API. #68691
table_query_timeout as a table-level property. #67547
information_schema.loads for better load job visibility. #67879
spark-core_2.12 to 3.5.7. #70862
The following issues have been fixed:
DefaultValueColumnIterator for complex types. #71142
shared_ptr cycle between BatchUnit and FetchTaskContext. #71126
set_finishing. #70851
visitDictionaryGetExpr when dictionary backing table is dropped. #71109
IcebergCatalog.getPartitionLastUpdatedTime when snapshot is expired. #68925
null_counts empty. #68463
CatalogRecycleBin.asyncDeleteForTables for shared-nothing clusters. #68275
DROP FUNCTION IF EXISTS ignoring ifExists flag. #69216
array_map crash when processing null literal array. #70629
to_base64. #70623
proc_file. #68997
lag/lead window functions now supports column references in addition to constant values. #60209
query_queue_v2 is now enabled by default. #67462
enable_sql_transaction by default. #63535