docs/en/release_notes/release-3.1.md
Release date: January 3, 2025
Fixed the following issues:
Release date: December 16, 2024
Fixed the following issues:
EXPORT with Broker to file:// resulted in a file rename error, causing the export to fail. #52544
Release date: September 4, 2024
Fixed the following issues:
count(*) on certain tables returns NULL. #49288
partition_live_number does not take effect. #49213
Release date: July 29, 2024
\t and \n as row and column delimiters. Users do not need to convert them to their hexadecimal ASCII codes. #47302
Fixed the following issues:
COM_CHANGE_USER does not support conn_attr. #47796
16 (instead of 2 based on the formula 2*BE or CN count). If users want to set a smaller bucket number when creating a small table, they must set it explicitly. #47005
Release date: June 26, 2024
Fixed the following issues:
Release date: May 30, 2024
call frontend service failed reason=xxx, making it unclear what the specific issue was. The error messages are now optimized to include specific reasons, such as timeout or server busy. #44153
Fixed the following issues:
information_schema.task_runs fails frequently when many asynchronous tasks exist. #45520
Release date: April 28, 2024
information_schema using DROP TABLE. #43556
Fixed the following issues:
str_to_map may cause BEs to crash. #43930
show proc '/routine_loads' is stuck due to deadlock. #44249
pending_task_run_count displayed on the page of leaderFE_IP:8030 is incorrect. The displayed number is the sum of Pending and Running tasks, not Pending tasks. In addition, the information of the metric refresh_pending cannot be displayed using followerFE_IP:8030. #43052
information_schema.task_runs fails frequently. #43052
Invalid plan: PhysicalTopNOperator error. #44185
:::tip
This version has been taken offline due to privilege issues in querying external tables in external catalogs such as Hive and Iceberg.
Problem: When a user queries data from an external table in an external catalog, access to this table is denied even when the user has the SELECT privilege on this table. SHOW GRANTS also shows that the user has this privilege.
Impact scope: This problem only affects queries on external tables in external catalogs. Other queries are not affected.
Temporary workaround: The query succeeds after the SELECT privilege on this table is granted to the user again. However, SHOW GRANTS then returns duplicate privilege entries. After an upgrade to v3.1.11, users can run REVOKE to remove one of the privilege entries.
:::
Release date: March 29, 2024
regexp_extract_all. #42178
IS NULL operator, they are considered NULL values, consistent with SQL semantics. For example, true is returned for SELECT parse_json('{"a": null}') -> 'a' IS NULL (before this behavior change, false was returned). #42815
Fixed the following issues:
orc_use_column_names to true, which specifies to read ORC files from Hive based on mapping by column name. #42905
Release date: March 8, 2024
information_schema.partitions_meta, which records detailed metadata of partitions. #41101
sys.fe_memory_usage, which records the memory usage for StarRocks. #41083
root user to the user who creates the materialized views. This change does not affect existing materialized views. #40698
cbo_eq_base_type to adjust the default rule used for the comparison. For example, users can set cbo_eq_base_type to decimal, and StarRocks then compares the columns as numeric values. #41712
s3_compatible_fs_list to specify which S3-compatible object storage can be accessed via AWS SDK, and supports using the parameter fallback_to_hadoop_fs_list to specify non-S3-compatible object storage that requires access via HDFS Schema (this method necessitates the use of vendor-provided JAR packages). #41612
current_catalog, current_schema, to_char, from_hex, to_date, to_timestamp, and index. #41505 #41270 #40838
cbo_materialized_view_rewrite_related_mvs_limit is added to control the maximum number of candidate materialized views allowed during query planning. The default value of this session variable is 64. This session variable helps mitigate the excessive resource consumption caused by a large number of candidate materialized views for a query during query planning. #39829
agg_type of BITMAP-type columns in an Aggregate table can be set to replace_if_not_null to support updates only to a few columns of the table. #42102
cbo_eq_base_type is optimized to support specifying the implicit conversion rule applied to the comparison of data that contains both string and numeric data types. By default, such data is compared as strings. #40619
path parameter in the SQL statement for creating a file external table supports wildcards (*). However, like the DATA INFILE parameter in the SQL statement for creating a Broker Load job, the path parameter supports using wildcards (*) to match at most one level of directory or file. #40844
Fixed the following issues:
Release date: February 5, 2024
lake_pk_compaction_max_input_rowsets, which controls the maximum number of input rowsets allowed in a Primary Key table compaction task in a shared-data StarRocks cluster. This helps optimize resource consumption for compaction tasks. #39611
datacache.partition_duration property for cloud-native tables created with the list partitioning strategy. This property controls the validity period of the data cache and can be dynamically configured. #35681 #38509
update_compaction_per_tablet_min_interval_seconds. This parameter was originally used only to control the frequency of compaction tasks on Primary Key tables. After the optimization, it can also be used to control the frequency of major compaction tasks on Primary Key table indexes. #39640
Fixed the following issues:
Release date: January 12, 2024
unnest_bitmap. #38136
enable_materialized_view_for_insert, which controls whether materialized views rewrite the queries in INSERT INTO SELECT statements. The default value is false. #37505
enable_new_publish_mechanism is changed to a static parameter. You must restart the FE after you modify the parameter settings. #35338
enable_strict_order_by. When this variable is set to the default value TRUE, an error is reported for such a query pattern: Duplicate alias is used in different expressions of the query and this alias is also a sorting field in ORDER BY, for example, select distinct t1.* from tbl1 t1 order by t1.k1;. The logic is the same as that in v2.3 and earlier. When this variable is set to FALSE, a loose deduplication mechanism is used, which processes such queries as valid SQL queries. #37910
routine_load_unstable_threshold_second. #36222
http_worker_threads_num, which specifies the number of threads for HTTP server to deal with HTTP requests. The default value is 0. If the value for this parameter is set to a negative value or 0, the actual thread number is twice the number of CPU cores. #37530
pindex_major_compaction_limit_per_disk to configure the maximum concurrency of compaction on a disk. This addresses the issue of uneven I/O across disks due to compaction. This issue can cause excessively high I/O for certain disks. The default value is 1. #36681
transaction_read_only and tx_read_only to specify the transaction access mode, which are compatible with MySQL versions 5.7.20 and above. #37249
default_mv_refresh_immediate, which specifies whether to immediately refresh the materialized view after the materialized view is created. The default value is true. #37093
lake_enable_vertical_compaction_fill_data_cache, which specifies whether to allow compaction tasks to cache data on local disks in a shared-data cluster. The default value is false. #37296
datacache.partition_duration property, which controls the validity period of the hot data in the data cache. #35681
date_trunc, adddate, and time_slice functions support setting the interval parameter to values that are accurate to the millisecond and microsecond. #36386
% or _, the LIKE operator is converted into the = operator. #37515
LatestSourcePosition is added to the return result of SHOW ROUTINE LOAD to record the position of the latest message in each partition of the Kafka topic, helping check the latencies of data loading. #38298
spill_mem_limit_threshold, to control the memory usage threshold (percentage) at which a resource group triggers the spilling of intermediate results when the system variable spill_mode is set to auto. The valid range is (0, 1). The default value is 1, indicating the threshold does not take effect. #37707
Fixed the following issues:
storage_page_cache_limit in certain circumstances. #37740
bitmap_to_string may return incorrect results due to data type overflow. #37405
enable_sync_publish is set to TRUE, queries on data that is written after the BEs crash and then restart may fail. #37398
TABLE_CATALOG field in views of the StarRocks Information Schema is null. #37570
SELECT ... FROM ... INTO OUTFILE is executed to export data into CSV files, the error "Unmatched number of columns" is reported if the FROM clause contains multiple constants. #38045
Release date: December 18, 2023
p is not specified, this function returns only date and time accurate to the second. #36676
max_tablet_rowset_num for setting the maximum allowed number of rowsets. This metric helps detect possible compaction issues and thus reduces the occurrences of the error "too many versions". #36539
enable_stream_load_verbose_log is added. The default value is false. With this parameter set to true, StarRocks can record the HTTP requests and responses for Stream Load jobs, making troubleshooting easier. #36113
enable_lazy_delta_column_compaction is added. The default value is true, indicating that StarRocks does not perform frequent compaction operations on delta columns. #36654
enable_mv_automatic_active_check is added to control whether the system automatically checks and re-activates the asynchronous materialized views that are set inactive because their base tables (views) had undergone Schema Change or had been dropped and re-created. The default value is true. #36463
GROUP_CONCAT_LEGACY is added to the session variable sql_mode to provide compatibility with the implementation logic of the group_concat function in versions earlier than v2.5. #36150
OtherMsg, which shows information about the last failed task. #35806
aws.s3.access_key and aws.s3.access_secret for AWS S3 in Broker Load jobs are hidden in audit logs. #36571
be_tablets view in the information_schema database provides a new field INDEX_DISK, which records the disk usage (measured in bytes) of persistent indexes. #35615
Fixed the following issues:
enable_collect_query_detail_info is set to true. #35945
Release date: November 28, 2023
COLUMNS view in the system database INFORMATION_SCHEMA can display ARRAY, MAP, and STRUCT columns. #33431
Fixed the following issues:
show proc '/current_queries'; is being executed and meanwhile a query begins to be executed, BEs may crash. #34316
INFORMATION_SCHEMA is queried by using the database driver MariaDB ODBC, the CATALOG_NAME column returned in the schemata view holds only null values. #34627
/) at the end of the HDFS storage path causes the backup and restore of the data from HDFS to fail. #34601
enable_load_profile to true makes Stream Load jobs prone to fail. #34544
partition_live_number property added by using the ALTER TABLE statement does not take effect. #34842
recover_with_empty_tablet to true may cause FEs to crash. #33071
enable_statistics_collect_profile, which controls whether to generate profiles for statistics queries. The default value is false. #33815
mysql_server_version is now mutable. The new setting can take effect for the current session without requiring an FE restart. #34033
update_compaction_ratio_threshold, which controls the maximum proportion of data that a compaction can merge for a Primary Key table in a StarRocks shared-data cluster. The default value is 0.5. We recommend shrinking this value if a single tablet becomes excessively large. For a StarRocks shared-nothing cluster, the proportion of data that a compaction can merge for a Primary Key table is still automatically adjusted. #35129
cbo_decimal_cast_string_strict, which controls how the CBO converts data from the DECIMAL type to the STRING type. If this variable is set to true, the logic built in v2.5.x and later versions prevails and the system implements strict conversion (namely, the system truncates the generated string and fills 0s based on the scale length). If this variable is set to false, the logic built in versions earlier than v2.5.x prevails and the system processes all valid digits to generate a string. The default value is true. #34208
cbo_eq_base_type, which specifies the data type used for data comparison between DECIMAL-type data and STRING-type data. The default value is VARCHAR, and DECIMAL is also a valid value. #34208
big_query_profile_second_threshold. When the session variable enable_profile is set to false and the amount of time taken by a query exceeds the threshold specified by the big_query_profile_second_threshold variable, a profile is generated for that query. #33825
Release date: November 2, 2023
enable_query_tablet_affinity, which controls whether to direct multiple queries against the same tablet to a fixed replica. This session variable is set to false by default. #33049
is_role_in_session, which is used to check whether the specified roles are activated in the current session. It supports checking nested roles granted to a user. #32984
enable_group_level_query_queue (default value: false). When the global-level or resource group-level resource consumption reaches a predefined threshold, new queries are placed in a queue and run only when both the global-level resource consumption and the resource group-level resource consumption fall below their thresholds.
concurrency_limit for each resource group to limit the maximum number of concurrent queries allowed per BE.
max_cpu_cores for each resource group to limit the maximum CPU consumption allowed per BE.
plan_cpu_cost_range and plan_mem_cost_range, for resource group classifiers.
plan_cpu_cost_range: the CPU consumption range estimated by the system. The default value NULL indicates no limit is imposed.
plan_mem_cost_range: the memory consumption range estimated by the system. The default value NULL indicates no limit is imposed.
concurrency_limit, new queries are rejected or placed in a queue.
Fixed the following issues:
information_schema.COLUMNS. As a result, DELETE operations cannot be performed when data is loaded by using Flink Connector. #31458
DATA_TYPE and COLUMN_TYPE for BINARY or VARBINARY data types are displayed as unknown in the information_schema.columns view. #32678
enable_sync_publish, which is set to true by default, is added. When this parameter is set to true, the Publish phase of a data load into a Primary Key table returns the execution result only after the Apply task finishes. As such, the loaded data can be queried immediately after the load job returns a success message. However, setting this parameter to true may cause data loads into Primary Key tables to take longer. (Before this parameter was added, the Apply task was asynchronous with the Publish phase.) #27055
:::tip
This version has been taken offline.
:::
Release date: September 25, 2023
Executing SQL commands with invalid comments now returns results consistent with MySQL. #30210
Fixed the following issues:
max_broker_load_job_concurrency using the ADMIN SET FRONTEND CONFIG command does not take effect. #29964 #29720
Unexpected exception: Unknown properties: {persistent_index_type=LOCAL} is thrown. #30255
null. #30647
NOT NULL but have no default value specified, an error "Unsupported dataFormat value is : \N" is thrown. #30799
group_concat_max_len, which controls the default maximum length of the string returned by the group_concat function, is changed from unlimited to 1024.
Release date: August 25, 2023
Fixed the following issues:
show_data for cloud-native tables are incorrect. #29473
Default field values returned by the SHOW FULL COLUMNS statement for columns of the BITMAP or HLL data type are incorrect. #29510
For a newly deployed StarRocks v3.1 cluster, you must have the USAGE privilege on the destination external catalog if you want to run SET CATALOG to switch to that catalog. You can use GRANT to grant the required privileges.
For a v3.1 cluster upgraded from an earlier version, you can run SET CATALOG with inherited privilege.
Release date: August 18, 2023
Supports implicit conversions for all compound predicates and for all expressions in the WHERE clause. You can enable or disable implicit conversions by using the session variable enable_strict_type. The default value of this session variable is false.
Fixed the following issues:
Release date: August 7, 2023
loads to the Information_schema database. Users can query the results of Broker Load and Insert jobs from the loads table.
log_rejected_record_num parameter in their load job to specify the maximum number of data rows that can be logged.
BUCKETS) that StarRocks has provided since v2.5.7, users no longer need to consider bucket configurations, and table creation statements are greatly simplified. In big data and high performance-demanding scenarios, however, we recommend that users continue using hash bucketing, because this way they can use bucket pruning to accelerate queries.
Added the following storage volume-related statements: CREATE STORAGE VOLUME, ALTER STORAGE VOLUME, DROP STORAGE VOLUME, SET DEFAULT STORAGE VOLUME, DESC STORAGE VOLUME, SHOW STORAGE VOLUMES.
Supports altering table comments using ALTER TABLE. #21035
Added the following functions:
ORDER BY, array_generate, element_at, cardinality
Added privilege items related to storage volumes and privilege items related to external catalogs, and supports using GRANT and REVOKE to grant and revoke these privileges.
Optimized the data cache in shared-data StarRocks clusters. The optimized data cache allows for specifying the range of hot data. It can also prevent queries against cold data from occupying the local disk cache, thereby ensuring the performance of queries against hot data.
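One way to picture the hot-data range is as a time window. The Python sketch below is only an illustration, not StarRocks code: it assumes the range is a duration counted back from the current time, and the function name `is_hot` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def is_hot(data_time: datetime, now: datetime, hot_range: timedelta) -> bool:
    """Return True if data_time falls inside the hot-data window.

    Assumption: the window covers [now - hot_range, now]. Anything older
    is treated as cold and should not be loaded into the local disk cache.
    """
    return now - hot_range <= data_time <= now


now = datetime(2023, 8, 7, 12, 0, 0)
window = timedelta(days=7)  # hypothetical hot-data range

print(is_hot(datetime(2023, 8, 5), now, window))  # True: within the last 7 days
print(is_hot(datetime(2023, 7, 1), now, window))  # False: older than the window
```

Under this model, a query that touches only cold data reads from remote storage without evicting hot data from the local cache.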
ORDER BY to specify a sort key.
colocate_group, storage_medium, and storage_cooldown_time.
properties("session.<variable_name>" = "<value>") syntax to flexibly adjust view refreshing strategies.
mv_rewrite_staleness_second property at materialized view creation.
REFRESH MATERIALIZED VIEW WITH SYNC MODE syntax to synchronously invoke materialized view refresh tasks.
ALTER MATERIALIZED VIEW {ACTIVE | INACTIVE} to enable or disable a materialized view. Materialized views that are disabled (in the INACTIVE state) cannot be refreshed or used for query rewrite, but can be directly queried.
ALTER MATERIALIZED VIEW SWAP WITH to swap two materialized views. Users can create a new materialized view and then perform an atomic swap with an existing materialized view to implement schema changes on the existing materialized view.
[_SYNC_MV_], which makes it possible to work around rare cases in which some queries cannot be properly rewritten.
CASE-WHEN, CAST, and mathematical operations, which make materialized views suitable for more business scenarios.
Fixed the following issues:
WHERE clause, if these SQL queries have the same semantics but the order of the tables in each SQL query is different, some of these SQL queries may fail to be rewritten to benefit from the related materialized views. #22875
GROUP BY clause. #19640
\) and a semicolon (;) cannot be properly parsed. #16552
storage_cache_ttl parameter is deleted from the table creation syntax used for shared-data StarRocks clusters. Now the data in the local cache is evicted based on the LRU algorithm.
disable_storage_page_cache and alter_tablet_worker_count and the FE configuration item lake_compaction_max_tasks are changed from immutable parameters to mutable parameters.
block_cache_checksum_enable is changed from true to false.
enable_new_load_on_memory_limit_exceeded is changed from false to true.
max_running_txn_num_per_db is changed from 100 to 1000.
http_max_header_size is changed from 8192 to 32768.
tablet_create_timeout_second is changed from 1 to 10.
max_routine_load_task_num_per_be is changed from 5 to 16, and error information will be returned if a large number of Routine Load tasks are created.
quorom_publish_wait_time_ms is renamed as quorum_publish_wait_time_ms, and the FE configuration item async_load_task_pool_size is renamed as max_broker_load_job_concurrency.
routine_load_thread_pool_size is deprecated. Now the routine load thread pool size per BE node is controlled only by the FE configuration item max_routine_load_task_num_per_be.
txn_commit_rpc_timeout_ms and the system variable tx_visible_wait_timeout are deprecated.
max_broker_concurrency and load_parallel_instance_num are deprecated.
max_routine_load_job_num is deprecated. Now StarRocks dynamically infers the maximum number of Routine Load tasks supported by each individual BE node based on the max_routine_load_task_num_per_be parameter and provides suggestions on task failures.
thrift_port is renamed as be_port.
task_consume_second and task_timeout_second, are added to control the maximum amount of time to consume data and the timeout duration for individual load tasks within a Routine Load job, making job adjustment more flexible. If users do not specify these two properties in their Routine Load job, the FE configuration items routine_load_task_consume_second and routine_load_task_timeout_second prevail.
enable_resource_group is deprecated because the Resource Group feature is enabled by default since v3.1.0.
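The fallback rule for the Routine Load properties task_consume_second and task_timeout_second can be sketched in Python as follows. This is an illustrative model, not StarRocks code: the function `effective_task_settings` is hypothetical, and the FE-level values passed in (15 and 60) are placeholders chosen for the example, not documented defaults.

```python
def effective_task_settings(
    job_properties: dict,
    fe_consume_second: int,
    fe_timeout_second: int,
) -> dict:
    """Resolve the per-task limits for one Routine Load job.

    A value set in the job's own properties takes precedence; otherwise the
    FE-level value (routine_load_task_consume_second or
    routine_load_task_timeout_second) is used.
    """
    return {
        "task_consume_second": job_properties.get(
            "task_consume_second", fe_consume_second
        ),
        "task_timeout_second": job_properties.get(
            "task_timeout_second", fe_timeout_second
        ),
    }


# The job overrides only the consume time; the timeout falls back to the FE value.
settings = effective_task_settings(
    {"task_consume_second": 30},
    fe_consume_second=15,  # placeholder FE value
    fe_timeout_second=60,  # placeholder FE value
)
print(settings)  # {'task_consume_second': 30, 'task_timeout_second': 60}
```

Resolving each property independently means a job can tune one limit without having to restate the other.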