docs/en/release_notes/release-3.2.md
Release date: April 30, 2025
Fixed the following issues:
- files() function. #56606

Release date: February 14, 2025
- max_by and min_by. #54961

Fixed the following issues:
Release date: January 8, 2025
Fixed the following issues:
- max(count(distinct)) when low-cardinality optimization is enabled. #53403

Release date: December 13, 2024
Fixed the following issues:
- loadRowsRate field returned 0 after executing SHOW ROUTINE LOAD. #52151
- Files() function read columns that were not queried. #52210
- array_map function caused BE to crash. #52909
- storage_cooldown_time property for materialized views did not take effect when set to maximum. #52079

Release date: October 23, 2024
- information_schema.routine_load_jobs from Follower FE nodes. #51763

Fixed the following issues:
- array_map function causes a crash when excessive constant parameters are used. #51244
- information_schema.fe_locks causes a crash. #51742

Release date: September 9, 2024
Fixed the following issues:
Release date: August 23, 2024
- BYTE_ARRAY data with a logical_type of JSON in Parquet files to the JSON type in StarRocks. #49385
- information_schema.columns supports the GENERATION_EXPRESSION field. #49734

Fixed the following issues:
- "persistent_index_type" = "CLOUD_NATIVE" causes a crash. #48149
- page_off information causes an array out-of-bounds crash. #48230
- TINYINT values in ORC format files return None on the aarch64 architecture. #49517
- l0 snapshots for Primary Key Persistent Index may cause data loss. #48045
- partition_live_number does not take effect. #49213
- partition_live_number using ALTER TABLE do not take effect. #49437
- meta directory in the FE startup script. If the directory does not exist, it will be automatically created. #48940
- load_process_max_memory_hard_limit_ratio for data loading. If memory usage exceeds the limit, subsequent loading tasks will fail. #48495

Release date: July 11, 2024
- user_admin role users from resetting the password of the root user. #47801
- \t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
- skip.header.line.count property. #47001
- JAVA_OPTS parameters. If versions other than JDK_9 or JDK_11 are used, users need to configure JAVA_OPTS directly. #47495
- 16 (instead of 2 based on the formula 2*BE or CN count). If users want to set a smaller bucket number when creating a small table, they must set it explicitly. #47005
- max(2*BE or CN count, bucket number calculated based on the largest historical partition data volume). The previous rule was to calculate the bucket number based on the largest historical partition data volume. #47949

Fixed the following issues:
Release date: June 7, 2024
Fixed the following issues:
- Multiple entries with same key is returned to queries with non-deterministic functions. #46602

Release date: May 24, 2024
- compression and Content-Encoding. Supported compression algorithms include GZIP, BZIP2, LZ4_FRAME, and ZSTD. #43732
- query_pool_spill_mem_limit_threshold. Once the threshold is reached, intermediate results of queries will be spilled to disks to reduce memory usage, thus avoiding OOM.

Fixed the following issues:
Release date: April 18, 2024
Fixed the following issue:
Release date: April 12, 2024
:::tip
This version has been taken offline due to privilege issues in querying external tables in external catalogs such as Hive and Iceberg.
Problem: When a user queries data from an external table in an external catalog, access to this table is denied even when the user has the SELECT privilege on this table. SHOW GRANTS also shows that the user has this privilege.
Impact scope: This problem only affects queries on external tables in external catalogs. Other queries are not affected.
Temporary workaround: The query succeeds after the SELECT privilege on this table is granted to the user again. But SHOW GRANTS will return duplicate privilege entries. After an upgrade to v3.2.6, users can run REVOKE to remove one of the privilege entries.
:::
- IS NULL operator, they are considered NULL values, consistent with the SQL language. For example, true is returned for SELECT parse_json('{"a": null}') -> 'a' IS NULL (before this behavior change, false was returned). #42765

Fixed the following issues:
- enable_persistent_index property of a Primary Key table. #42890

Release date: March 12, 2024
:::tip
This version has been taken offline due to privilege issues in querying external tables in external catalogs such as Hive and Iceberg.
Problem: When a user queries data from an external table in an external catalog, access to this table is denied even when the user has the SELECT privilege on this table. SHOW GRANTS also shows that the user has this privilege.
Impact scope: This problem only affects queries on external tables in external catalogs. Other queries are not affected.
Temporary workaround: The query succeeds after the SELECT privilege on this table is granted to the user again. But SHOW GRANTS will return duplicate privilege entries. After an upgrade to v3.2.6, users can run REVOKE to remove one of the privilege entries.
:::
- milliseconds_diff. #38171
- catalog, which specifies the catalog to which the session belongs. #41329
- information_schema.partitions_meta, which records detailed metadata of partitions. #39265
- sys.fe_memory_usage, which records the memory usage for StarRocks. #40464
- cbo_decimal_cast_string_strict is used to control how CBO converts data from the DECIMAL type to the STRING type. The default value true indicates that the logic built in v2.5.x and later versions prevails and the system implements strict conversion (namely, the system truncates the generated string and fills 0s based on the scale length). The DECIMAL type is not strictly filled in earlier versions, causing different results when comparing the DECIMAL type and the STRING type. #40619
- enable_iceberg_metadata_cache has been changed to false. From v3.2.1 to v3.2.3, this parameter is set to true by default, regardless of which metastore service is used. In v3.2.4 and later, if the Iceberg cluster uses AWS Glue as metastore, this parameter still defaults to true. However, if the Iceberg cluster uses another metastore service, such as Hive metastore, this parameter defaults to false. #41826
- root user to the user who creates the materialized views. This change does not affect existing materialized views. #40670
- cbo_eq_base_type to adjust the rule used for the comparison. For example, users can set cbo_eq_base_type to decimal, and StarRocks then compares the columns as numeric values. #40619
- s3_compatible_fs_list to specify which S3-compatible object storage can be accessed via AWS SDK, and supports using the parameter fallback_to_hadoop_fs_list to specify non-S3-compatible object storages that require access via HDFS Schema (this method requires the use of vendor-provided JAR packages). #41123
- agg_type of BITMAP-type columns in an Aggregate table can be set to replace_if_not_null in order to support updates only to a few columns of the table. #42034

Fixed the following issues:
Release date: February 8, 2024
- enable_strict_order_by. When this variable is set to the default value TRUE, an error is reported for the following query pattern: a duplicate alias is used in different expressions of the query and this alias is also a sorting field in ORDER BY, for example, select distinct t1.* from tbl1 t1 order by t1.k1;. The logic is the same as that in v2.3 and earlier. When this variable is set to FALSE, a loose deduplication mechanism is used, which processes such queries as valid SQL queries. #37910
- enable_materialized_view_for_insert, which controls whether materialized views rewrite the queries in INSERT INTO SELECT statements. The default value is false. #37505
- query_mem_limit instead of exec_mem_limit. Setting the value of query_mem_limit to 0 indicates no limit. #34120
- http_worker_threads_num, which specifies the number of threads that the HTTP server uses to handle HTTP requests. The default value is 0. If this parameter is set to a negative value or 0, the actual number of threads is twice the number of CPU cores. #37530
- lake_pk_compaction_max_input_rowsets, which controls the maximum number of input rowsets allowed in a Primary Key table compaction task in a shared-data StarRocks cluster. This helps optimize resource consumption for compaction tasks. #39611
- connector_sink_compression_codec, which specifies the compression algorithm used for writing data into Hive tables or Iceberg tables, or exporting data with Files(). Valid algorithms include GZIP, BROTLI, ZSTD, and LZ4. #37912
- routine_load_unstable_threshold_second. #36222
- pindex_major_compaction_limit_per_disk to configure the maximum concurrency of compaction on a disk. This addresses the issue of uneven I/O across disks due to compaction, which can cause excessively high I/O for certain disks. The default value is 1. #36681
- enable_lazy_delta_column_compaction. The default value is true, indicating that StarRocks does not perform frequent compaction operations on delta columns. #36654
- default_mv_refresh_immediate, which specifies whether to immediately refresh the materialized view after the materialized view is created. The default value is true. #37093
- default_mv_refresh_partition_num to 1. This indicates that when multiple partitions need to be updated during a materialized view refresh, the task will be split in batches, refreshing only one partition at a time. This helps reduce resource consumption during each refresh. #36560
- starlet_use_star_cache to true. This indicates that Data Cache is enabled by default in shared-data clusters. If, before upgrading, you have manually configured the BE/CN configuration item starlet_cache_evict_high_water to X, you must configure the BE/CN configuration item starlet_star_cache_disk_size_percent to (1.0 - X) * 100. For example, if you have set starlet_cache_evict_high_water to 0.3 before upgrading, you must set starlet_star_cache_disk_size_percent to 70. This ensures that both file data cache and Data Cache will not exceed the disk capacity limit. #38200
- yyyy-MM-ddTHH:mm and yyyy-MM-dd HH:mm to support TIMESTAMP partition fields in Apache Iceberg tables. #39986
- storage_medium to the view information_schema.be_tablets. #37070
- SET_VAR in multiple sub-queries. #36871
- LatestSourcePosition is added to the return result of SHOW ROUTINE LOAD to record the position of the latest message in each partition of the Kafka topic, helping check the latencies of data loading. #38298
- % or _, the LIKE operator is converted into the = operator. #37515

Fixed the following issues:
- storage_page_cache_limit in certain circumstances. #37740
- bitmap_to_string may return incorrect results due to data type overflow. #37405
- SELECT ... FROM ... INTO OUTFILE is executed to export data into CSV files, the error "Unmatched number of columns" is reported if the FROM clause contains multiple constants. #38045

Release date: December 30, 2023
Fixed the following issue:
Release date: December 21, 2023
- object_dependencies to the database sys. It contains the lineage information of asynchronous materialized views. #35060
- max_tablet_rowset_num for setting the maximum allowed number of rowsets. This metric helps detect possible compaction issues and thus reduces the occurrences of the error "too many versions". #36539
- enable_stream_load_verbose_log is added. The default value is false. With this parameter set to true, StarRocks can record the HTTP requests and responses for Stream Load jobs, making troubleshooting easier. #36113
- GROUP_CONCAT_LEGACY is added to the session variable sql_mode to provide compatibility with the implementation logic of the group_concat function in versions earlier than v2.5. #36150
- aws.s3.access_key and aws.s3.access_secret for AWS S3 in Broker Load jobs are hidden in audit logs. #36571
- be_tablets view in the information_schema database provides a new field INDEX_DISK, which records the disk usage (measured in bytes) of persistent indexes. #35615
- OtherMsg, which shows information about the last failed task. #35806

Fixed the following issues:
Release date: December 1, 2023
Asynchronous materialized view
- columns_from_path.

Added the following functions:
StarRocks supports access control through Apache Ranger, providing a higher level of data security and allowing the reuse of existing services of external data sources. After integrating with Apache Ranger, StarRocks enables the following access control methods:
For more information, see Manage permissions with Apache Ranger.
Asynchronous materialized view
- query_rewrite_consistency for asynchronous materialized view creation. This property defines the query rewrite rules based on the consistency check.
- force_external_table_query_rewrite for external catalog-based asynchronous materialized view creation. This property defines whether to allow forced query rewrite for asynchronous materialized views created upon external catalogs.
- fast_schema_evolution. After this feature is enabled, the execution efficiency of adding or dropping columns is significantly improved. This mode is disabled by default (the default value is false). You cannot modify this property for existing tables using ALTER TABLE.
- SET_VAR. #35283
- large_decimal_underlying_type = "panic"|"double"|"decimal" to set the rules to deal with DECIMAL type overflow. panic indicates returning an error immediately, double indicates converting the data to DOUBLE type, and decimal indicates converting the data to DECIMAL(38,s).

To be updated.
- catalog_metadata_cache_size
- enable_backup_materialized_view
- enable_colocate_mv_index
- enable_fast_schema_evolution
- json_file_size_limit
- lake_enable_ingest_slowdown
- lake_ingest_slowdown_threshold
- lake_ingest_slowdown_ratio
- lake_compaction_score_upper_bound
- mv_auto_analyze_async
- primary_key_disk_schedule_time
- statistic_auto_collect_small_table_rows
- stream_load_task_keep_max_num
- stream_load_task_keep_max_second
- enable_pipeline_load.
- enable_sync_publish is changed from false to true.
- enable_persistent_index_by_default is changed from false to true.

Data Cache-related configuration changes.
- datacache_enable to replace block_cache_enable.
- datacache_mem_size to replace block_cache_mem_size.
- datacache_disk_size to replace block_cache_disk_size.
- datacache_disk_path to replace block_cache_disk_path.
- datacache_meta_path to replace block_cache_meta_path.
- datacache_block_size to replace block_cache_block_size.
- datacache_checksum_enable to replace block_cache_checksum_enable.
- datacache_direct_io_enable to replace block_cache_direct_io_enable.
- datacache_max_concurrent_inserts to replace block_cache_max_concurrent_inserts.
- datacache_max_flying_memory_mb.
- datacache_engine to replace block_cache_engine.
- block_cache_max_parcel_memory_mb.
- block_cache_report_stats.
- block_cache_lru_insertion_point.

After renaming Block Cache to Data Cache, StarRocks introduced a new set of BE parameters prefixed with datacache to replace the original parameters prefixed with block_cache. After an upgrade to v3.2, the original parameters remain effective. Once enabled, the new parameters override the original ones. Mixing new and original parameters is not supported, because some configurations may not take effect. StarRocks plans to deprecate the original parameters with the block_cache prefix in the future, so we recommend that you use the new parameters with the datacache prefix.
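The renaming described above can be sketched as a small helper that rewrites a set of BE settings from the block_cache prefix to the datacache prefix. This is only an illustration, not a StarRocks tool: the function name `migrate_be_config` and the idea of holding the settings in a Python dict are assumptions made for this example, and the mapping contains only the "to replace" pairs listed above.

```python
# Old-to-new BE parameter names, taken from the "to replace" pairs listed above.
BLOCK_CACHE_TO_DATACACHE = {
    "block_cache_enable": "datacache_enable",
    "block_cache_mem_size": "datacache_mem_size",
    "block_cache_disk_size": "datacache_disk_size",
    "block_cache_disk_path": "datacache_disk_path",
    "block_cache_meta_path": "datacache_meta_path",
    "block_cache_block_size": "datacache_block_size",
    "block_cache_checksum_enable": "datacache_checksum_enable",
    "block_cache_direct_io_enable": "datacache_direct_io_enable",
    "block_cache_max_concurrent_inserts": "datacache_max_concurrent_inserts",
    "block_cache_engine": "datacache_engine",
}


def migrate_be_config(config):
    """Return a copy of `config` with mapped block_cache_* keys renamed.

    Hypothetical helper: `config` is a plain dict of parameter name to value.
    Keys without a listed replacement are copied unchanged.
    Raises ValueError if old and new names are mixed, since the notes above
    state that mixed usage is not supported.
    """
    has_old = any(key in BLOCK_CACHE_TO_DATACACHE for key in config)
    has_new = any(key.startswith("datacache_") for key in config)
    if has_old and has_new:
        raise ValueError("block_cache_* and datacache_* parameters are mixed")
    return {BLOCK_CACHE_TO_DATACACHE.get(key, key): value
            for key, value in config.items()}
```

For example, calling `migrate_be_config({"block_cache_enable": "true"})` returns `{"datacache_enable": "true"}`, while a dict that already contains both prefixes is rejected instead of being silently merged.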
Added the following BE configuration items:
- spill_max_dir_bytes_ratio
- streaming_agg_limited_memory_size
- streaming_agg_chunk_buffer_size

Removed the following BE configuration items:
- tc_use_memory_min
- tc_free_memory_rate
- tc_gc_period
- tc_max_total_thread_cache_byte

Default value modifications:
- disable_column_pool is changed from false to true.
- thrift_port is changed from 9060 to 0.
- enable_load_colocate_mv is changed from false to true.
- enable_pindex_minor_compaction is changed from false to true.
- enable_per_bucket_optimize
- enable_write_hive_external_table
- hive_temp_staging_dir
- spill_revocable_max_bytes
- thrift_plan_protocol
- enable_pipeline_query_statistic
- enable_deliver_batch_fragments
- enable_scan_block_cache is renamed as enable_scan_datacache.
- enable_populate_block_cache is renamed as enable_populate_datacache.

Added reserved keywords OPTIMIZE and PREPARE.
Fixed the following issues:
- information_schema.columns. #33431
- msg:Fail to parse columnsFromPath, expected: [rec_dt]. #32720
- DATA_TYPE and COLUMN_TYPE for BINARY or VARBINARY data types are displayed as unknown in the information_schema.columns view. #32678
- bucket_size when creating tables. This allows the system to dynamically adjust the number of tablets based on cluster information and the size of loaded data. Please note that once this optimization is enabled, if you need to roll back your cluster to v3.1 or earlier, you must delete tables with this optimization enabled and manually execute a metadata checkpoint (by executing ALTER SYSTEM CREATE IMAGE). Otherwise, the rollback will fail.
- enable_pipeline_engine=true in the FE configuration file fe.conf). Failure to do so will result in errors for non-Pipeline queries.