2025 Changelog

docs/changelogs/v25.4.1.2934-stable.md

ClickHouse release v25.4.1.2934-stable (589918f385f) as compared to v25.4.1.1-new (6849013e378)

Backward Incompatible Change

  • Check all columns in a materialized view match the target table if allow_materialized_view_with_bad_select is false. #74481 (Christoph Wurm).
  • Fixes cases where dateTrunc is used with negative date/datetime arguments. #77622 (Yarik Briukhovetskyi).
  • The legacy MongoDB integration has been removed. Server setting use_legacy_mongodb_integration became obsolete and now does nothing. #77895 (Robert Schulze).
  • Enhance SummingMergeTree validation to skip aggregation for columns used in partition or sort keys. #78022 (Pervakov Grigorii).

New Feature

  • Backported in #79232: Add a new option to MergeTree SETTINGS which specifies a default compression codec in case the CREATE query does not explicitly define one for the given columns. This closes #42005. #66394 (gvoelfin).
  • Serialize query plan for Distributed queries. A new setting serialize_query_plan is added. When enabled, queries from Distributed table will use a serialized query plan for remote query execution. This introduces a new packet type to TCP protocol, <process_query_plan_packet>true</process_query_plan_packet> should be added to the server config to allow processing this packet. #69652 (Nikolai Kochetov).
  • Add setting to query Iceberg tables as of a specific timestamp. #71072 (Brett Hoerner).
  • Support low cardinality decimal data types, fix #72256. #72833 (zhanglistar).
  • Support DeltaLake table engine for AzureBlobStorage. Fixes #68043. #74541 (Smita Kulkarni).
  • Add bind_host setting to set the source IP address for clickhouse client connections. #74741 (Todd Yocum).
  • Introduce parametrized_view_parameters in system.tables. Closes https://github.com/clickhouse/clickhouse/issues/66756. #75112 (NamNguyenHoai).
  • Allow altering a database comment. Closes #73351. #75622 (NamNguyenHoai).
  • Backported in #79361: Support correlated subqueries as an argument of EXISTS expression in the WHERE clause. Closes #72459. #76078 (Dmitry Novik).
  • Support SCRAM-SHA-256 and update Postgres wire authentication. #76839 (scanhex12).
  • Support IcebergMetadataFilesCache, which will store manifest files/list and metadata.json in one cache. #77156 (Han Fei).
  • Add functions arrayLevenshteinDistance, arrayLevenshteinDistanceWeighted, and arraySimilarity. #77187 (Mikhail f. Shiryaev).
  • Allows a user to query the state of an Iceberg table as it existed at a previous point in time. #77439 (Daniil Ivanik).
  • Added CPU slot scheduling for workloads, see https://clickhouse.com/docs/operations/workload-scheduling#cpu_scheduling for details. #77595 (Sergei Trifonov).
  • The hasAll() function can now take advantage of the tokenbf_v1, ngrambf_v1 full-text skipping indices. #77662 (UnamedRus).
  • The JSON data type is production-ready. See https://jsonbench.com/. The Dynamic and Variant data types are production-ready. #77785 (Alexey Milovidov).
  • Added an in-memory cache for deserialized vector similarity indexes. This should make repeated approximate nearest neighbor (ANN) search queries faster. The size of the new cache is controlled by server settings vector_similarity_index_cache_size and vector_similarity_index_cache_max_entries. This feature supersedes the skipping index cache feature of earlier releases. #77905 (Shankar Iyer).
  • Backported in #79343: TBD. #78041 (Igor Nikonov).
  • Functions sparseGrams and sparseGramsHashes with UTF8 versions added. Author: scanhex12. #78176 (Pervakov Grigorii).
  • Introduce toInterval function. This function accepts 2 arguments (value and unit), and converts the value to a specific Interval type. #78723 (Andrew Davis).
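
To illustrate the kind of computation behind the arrayLevenshteinDistance entry above, here is a minimal Python sketch of the classic Levenshtein (edit) distance applied to arrays instead of strings. It is not ClickHouse's implementation, and the function name array_levenshtein_distance is a hypothetical helper. It assumes unit cost for every insertion, deletion, and substitution, and it does not model the weighted or similarity variants.

```python
from typing import Hashable, Sequence


def array_levenshtein_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Edit distance between two sequences, comparing whole elements (unit costs)."""
    # prev[j] is the distance between the elements of `a` seen so far
    # (before the current one) and the first j elements of `b`.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning i elements of `a` into an empty array takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y, or keep it
            ))
        prev = curr
    return prev[-1]


# One substitution (4 -> 3) turns the first array into the second.
print(array_levenshtein_distance([1, 2, 4], [1, 2, 3]))
```

Only two rows of the dynamic-programming table are kept at a time, so memory use grows with the length of the second array rather than with the product of both lengths.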

Performance Improvement

  • Optimize performance with lazy projection to avoid reading unused columns. #55518 (Xiaozhe Yu).
  • Support async IO prefetch for NativeORCBlockInputFormat, which improves overall performance by hiding remote IO latency. The speedup ratio could reach 1.47x in the author's test case. #70534 (李扬).
  • Preallocate memory used by async inserts to improve performance. #74945 (Ilya Golshtein).
  • Trivial optimization of wrapInNullable to avoid unnecessary null map allocation. #76489 (李扬).
  • Optimize arraySort. #76850 (李扬).
  • Speed-up building JOIN result by de-virtualizing calls to col->insertFrom(). #77350 (Alexander Gololobov).
  • Merge marks of the same part and write them to the query condition cache at one time to reduce the consumption of locks. #77377 (zhongyuankai).
  • Optimize s3Cluster performance for queries with one bracket expansion. #77686 (Tomáš Hromada).
  • Optimize order by single nullable or low-cardinality columns. #77789 (李扬).
  • Disable filesystem_cache_prefer_bigger_buffer_size when the cache is used passively, such as for merges. #77898 (Kseniia Sumarokova).
  • Implement trivial count optimization for Iceberg. Now queries with count() and without any filters should be faster. Closes #77639. #78090 (alesapin).
  • Support Iceberg data pruning based on lower_bound and upper_bound values for columns. Fixes #77638. #78242 (alesapin).
  • Optimize memory usage for NativeReader. #78442 (Azat Khuzhin).
  • Trivial optimization: do not rewrite count(if()) to countIf if CAST is required. Close #78564. #78565 (李扬).
  • Backported in #79163: Now we use number of replicas to determine task size for reading with parallel replicas enabled. This provides better work distribution between replicas when the amount of data to read is not really big. #78695 (Nikita Taranov).
  • Backported in #79190: Merge equality conditions from filter query plan step into JOIN condition if possible to allow using them as hash table keys. #78877 (Dmitry Novik).
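
The Iceberg pruning entry above relies on a general technique, min-max pruning, which can be sketched in a few lines of Python. This is an illustration only, not ClickHouse code: the DataFileStats class, the file names, and the prune_for_range helper are hypothetical, and a single integer column is assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataFileStats:
    # Hypothetical per-file statistics for one integer column.
    path: str
    lower_bound: int
    upper_bound: int


def prune_for_range(files: List[DataFileStats], query_min: int, query_max: int) -> List[DataFileStats]:
    """Keep only files whose [lower_bound, upper_bound] overlaps [query_min, query_max]."""
    return [
        f for f in files
        if f.upper_bound >= query_min and f.lower_bound <= query_max
    ]


files = [
    DataFileStats("a.parquet", 0, 99),
    DataFileStats("b.parquet", 100, 199),
    DataFileStats("c.parquet", 200, 299),
]

# A filter such as `x BETWEEN 150 AND 220` can only match rows in b and c,
# so a.parquet never has to be opened.
kept = prune_for_range(files, 150, 220)
print([f.path for f in kept])
```

A file is skipped only when its value range cannot intersect the filter, so pruning never changes query results; it only reduces the amount of data read.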

Improvement

  • Decrease the amount of Keeper requests by eliminating the use of single get requests, which could have caused a significant load on Keeper with the increased number of replicas, in places where multiRead is available. #56862 (Nikolay Degterinsky).
  • Reject queries when the server is overloaded. The decision is made based on the ratio of wait time (OSCPUWaitMicroseconds) to busy time (OSCPUVirtualTimeMicroseconds). The query is dropped with some probability, when this ratio is between min_os_cpu_wait_time_ratio_to_throw and max_os_cpu_wait_time_ratio_to_throw (those are query level settings). #63206 (Alexey Katsman).
  • Refreshes of refreshable materialized views now appear in system.query_log. #71333 (Michael Kolupaev).
  • Backported in #79175: clickhouse-local will retain its databases after restart if you specify the --path command line argument. This closes #50647. This closes #49947. #71722 (Alexey Milovidov).
  • Enabled a backoff logic for all types of replicated tasks. It will provide the ability to reduce CPU usage, memory usage, and log file sizes. Added new settings max_postpone_time_for_failed_replicated_fetches_ms, max_postpone_time_for_failed_replicated_merges_ms and max_postpone_time_for_failed_replicated_tasks_ms which are similar to max_postpone_time_for_failed_mutations_ms. #74576 (MikhailBurdukov).
  • Use dynamic sharding for JOIN if the JOIN key is a prefix of PK for both parts. This optimization is enabled with query_plan_join_shard_by_pk_ranges setting (disabled by default). #74733 (Nikolai Kochetov).
  • Re-enabled the SSH protocol. Fixed some critical vulnerabilities so that it is no longer possible to use a custom pager or specify server-logs-file. Disabled the ability to pass client options through environment variables by default (it is still possible via ssh-server.enable_client_options_passing in config.xml). Added support for the progress table, query cancellation, completion, profile events progress, stdin, and the send_logs_level option. This closes #74340. #74989 (Nikita Mikhaylov).
  • Support for a refresh in readonly MergeTree tables. #76467 (Alexey Milovidov).
  • Add query_id to system.errors. Related ticket #75815. #76581 (Vladimir Baikov).
  • Support JSON type and subcolumns reading from View. #76903 (Pavel Kruglov).
  • Add support for converting UInt128 to IPv6. This allows bitAnd and arithmetic operations on IPv6 values, with the result converted back to IPv6. Closes #76752. See: https://github.com/ClickHouse/ClickHouse/pull/57707. #76928 (Muzammil Abdul Rehman).
  • Don't parse special Bool values in text formats inside Variant type by default. It can be enabled using setting allow_special_bool_values_inside_variant. #76974 (Pavel Kruglov).
  • Support configurable per task waiting time of low priority query in session level and in server level. #77013 (VicoWu).
  • Use FixedString for PostgreSQL's CHARACTER, CHAR and BPCHAR. #77304 (Pablo Marcos).
  • Support using a remote disk for databases to store metadata files. #77365 (Tuan Pham Anh).
  • Implement comparison for values of JSON data type. Now JSON objects can be compared similarly to Maps. #77397 (Pavel Kruglov).
  • Support ALTER TABLE ... ATTACH|DETACH|MOVE|REPLACE PARTITION for the plain_rewritable disk. #77406 (Julia Kartseva).
  • Support query parameters inside the additional_table_filters setting. #77680 (wxybear).
  • Better permission support by system.kafka_consumers. Internal librdkafka errors are presented as exceptions. #77700 (Ilya Golshtein).
  • User-defined functions (UDFs) can now be marked as deterministic via a new tag in their XML definition. Also, the query cache now checks if UDFs called within a query are deterministic. If this is the case, it caches the query result. (Issue #59988). #77769 (Jimmy Aguilar Mena).
  • Added Buffer table engine parameters validation. #77840 (Pervakov Grigorii).
  • Add config enable_hdfs_pread to enable or disable hdfs pread. #77885 (kevinyhzou).
  • Add profile events for number of zookeeper 'multi' read and write requests. #77888 (JackyWoo).
  • Allow creating and inserting into temp table when disable_insertion_and_mutation is on. #77901 (Xu Jia).
  • Decrease max_insert_delayed_streams_for_parallel_write (to 100). #77919 (Azat Khuzhin).
  • Add the ability to configure the number of columns that merges can flush in parallel using max_merge_delayed_streams_for_parallel_write (this should reduce memory usage for vertical merges to S3 by about 25 times). #77922 (Azat Khuzhin).
  • Fix year parsing in joda syntax like 'yyy'. #77973 (李扬).
  • Attaching parts of MergeTree tables will be performed in their block order, which is important for special merging algorithms, such as ReplacingMergeTree. This closes #71009. #77976 (Alexey Milovidov).
  • Query masking rules can now throw a LOGICAL_ERROR if a match occurs. This helps to check whether a pre-defined password is leaking anywhere in logs. #78094 (Nikita Mikhaylov).
  • Added column index_length_column to information_schema.tables for better compatibility with MySQL. #78119 (Paweł Zakrzewski).
  • Introduce two new metrics: TotalMergeFailures and NonAbortedMergeFailures. These metrics are needed to detect the cases where too many merges fail within a short period. #78150 (Miсhael Stetsyuk).
  • Fix incorrect S3 uri parsing when key is not specified on path style. #78185 (Arthur Passos).
  • Fix incorrect values of BlockActiveTime, BlockDiscardTime, BlockWriteTime, BlockQueueTime, and BlockReadTime asynchronous metrics (before the change 1 second was incorrectly reported as 0.001). #78211 (filimonov).
  • Respect loading_retries limit for errors during push to materialized view for StorageS3(Azure)Queue. Before that such errors were retried indefinitely. #78313 (Kseniia Sumarokova).
  • In StorageDeltaLake with delta-kernel-rs implementation, fix performance and progress bar. #78368 (Kseniia Sumarokova).
  • Vector similarity index could over-allocate main memory by up to 2x. This fix reworks the memory allocation strategy, reducing the memory consumption and improving the effectiveness of the vector similarity index cache. (issue #78056). #78394 (Shankar Iyer).
  • Introduce a setting schema_type for system.metric_log table with schema type. There are three allowed schemas: wide -- current schema, each metric/event in a separate column (most effective for reads of separate columns), transposed -- similar to system.asynchronous_metric_log, metrics/events are stored as rows, and the most interesting transposed_with_wide_view -- create underlying table with transposed schema, but also introduce a view with wide schema which translates queries to underlying table. In transposed_with_wide_view subsecond resolution for view is not supported, event_time_microseconds is just an alias for backward compatibility. #78412 (alesapin).
  • Support include, from_env, from_zk for runtime disks. Closes #78177. #78470 (Kseniia Sumarokova).
  • Add several convenient ways to resolve root metadata.json file in an iceberg table function and engine. Closes #78455. #78475 (Daniil Ivanik).
  • Support partition pruning in delta lake. #78486 (Kseniia Sumarokova).
  • Support password based auth in SSH protocol in ClickHouse. #78586 (Nikita Mikhaylov).
  • Add a dynamic warning to the system.warnings table for long running mutations. #78658 (Bharat Nallan).
  • Backported in #79298: Added field condition to system table system.query_condition_cache. It stores the plaintext condition whose hash is used as a key in the query condition cache. #78671 (Robert Schulze).
  • Drop connections if the CPU is massively overloaded. The decision is made based on the ratio of wait time (OSCPUWaitMicroseconds) to busy time (OSCPUVirtualTimeMicroseconds). The query is dropped with some probability, when this ratio is between min_os_cpu_wait_time_ratio_to_drop_connection and max_os_cpu_wait_time_ratio_to_drop_connection. #78778 (Alexey Katsman).
  • Backported in #79156: Add table settings for SASL configuration and credentials to the Kafka table engine. This allows configuring SASL-based authentication to Kafka and Kafka-compatible systems directly in the CREATE TABLE statement rather than having to use configuration files or named collections. #78810 (Christoph Wurm).
  • Allow empty value on hive partitioning. #78816 (Arthur Passos).
  • Fix IN clause type coercion for BFloat16 (i.e. SELECT toBFloat16(1) IN [1, 2, 3]; now returns 1). Closes #78754. #78839 (Raufs Dunamalijevs).
  • Do not check parts on other disks for MergeTree if disk= is set. #78855 (Azat Khuzhin).
  • Make data types in used_data_type_families in system.query_log canonical. #78972 (Kseniia Sumarokova).
  • Backported in #79063: Support for a refresh in readonly MergeTree tables. #79033 (Alexey Milovidov).
  • Backported in #79239: Enabled the query condition cache by default. #79080 (Alexey Milovidov).
  • Backported in #79139: Remove settings during recoverLostReplica same as it was done in: https://github.com/ClickHouse/ClickHouse/pull/78637. #79113 (Nikita Mikhaylov).
  • Backported in #79260: Support altering database on cluster. #79242 (Tuan Pham Anh).
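
The overload-related entries above (rejecting queries and dropping connections) describe a probabilistic decision driven by the ratio of CPU wait time to busy time. The Python sketch below shows one way such a decision could look. It rests on several assumptions that the changelog does not confirm: the probability ramps linearly between the two thresholds, nothing is rejected below the minimum, and everything is rejected above the maximum. The function names are hypothetical.

```python
import random


def drop_probability(wait_us: float, busy_us: float, min_ratio: float, max_ratio: float) -> float:
    """Probability of rejecting work, given OS CPU wait and busy time in microseconds."""
    if busy_us <= 0:
        return 0.0  # assumption: without measured busy time, never reject
    ratio = wait_us / busy_us
    if ratio <= min_ratio:
        return 0.0  # assumption: below the lower threshold nothing is rejected
    if ratio >= max_ratio:
        return 1.0  # assumption: above the upper threshold everything is rejected
    # Assumption: linear ramp between the two thresholds.
    return (ratio - min_ratio) / (max_ratio - min_ratio)


def should_reject(wait_us, busy_us, min_ratio, max_ratio, rng=random.random) -> bool:
    """Draw a random number in [0, 1) and reject if it falls under the drop probability."""
    return rng() < drop_probability(wait_us, busy_us, min_ratio, max_ratio)


# With thresholds 2 and 6, a wait/busy ratio of 4 sits halfway up the ramp.
print(drop_probability(wait_us=400, busy_us=100, min_ratio=2, max_ratio=6))
```

In this sketch the thresholds stand in for settings such as min_os_cpu_wait_time_ratio_to_throw and max_os_cpu_wait_time_ratio_to_throw. Passing the random source as a parameter keeps the decision deterministic in tests.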

Bug Fix (user-visible misbehavior in an official stable release)

  • Fix incorrect projection analysis when count(nullable) is used in aggregate projections. This fixes #74495. This PR also adds some logs around projection analysis to clarify why a projection is or is not used. #74498 (Amos Bird).
  • Fix "Part <...> does not contain in snapshot of previous virtual parts. (PART_IS_TEMPORARILY_LOCKED)" during DETACH PART. #76039 (Aleksei Filatov).
  • Fix skip indexes not working with expressions containing literals in the analyzer, and remove trivial casts during index analysis. #77229 (Pavel Kruglov).
  • Fix a bug where the close_session query parameter had no effect, leading to named sessions being closed only after session_timeout. #77336 (Alexey Katsman).
  • Fixed receiving messages from nats server without attached mv. #77392 (Dmitry Novikov).
  • Fix logical error while reading from empty FileLog via merge table function, close #75575. #77441 (Vladimir Cherkasov).
  • Use default format settings in Dynamic serialization from shared variant. #77572 (Pavel Kruglov).
  • Fix checking if the table data path exists on the local disk. #77608 (Tuan Pham Anh).
  • Fix sending constant values to remote for some types. #77634 (Pavel Kruglov).
  • Fix crash because of expired context in StorageS3(Azure)Queue. #77720 (Kseniia Sumarokova).
  • Hide credentials in RabbitMQ, Nats, Redis, AzureQueue table engines. #77755 (Kseniia Sumarokova).
  • Fix undefined behaviour on NaN comparison in ArgMin/ArgMax. #77756 (Raúl Marín).
  • Regularly check if merges and mutations were cancelled even in case when the operation doesn't produce any blocks to write. #77766 (János Benjamin Antal).
  • Backported in #79280: Fixed refreshable materialized view in Replicated database not working on newly added replicas. #77774 (Michael Kolupaev).
  • Reverted. #77843 (Vladimir Cherkasov).
  • Fix possible crash when NOT_FOUND_COLUMN_IN_BLOCK error occurs. #77854 (Vladimir Cherkasov).
  • Fix crash that happens in the StorageSystemObjectStorageQueueSettings while filling data. #77878 (Bharat Nallan).
  • Disable fuzzy search for history in SSH server (since it requires skim). #78002 (Azat Khuzhin).
  • Fix a bug where a vector search query on a non-indexed column returned incorrect results if the table had another vector column with a defined vector similarity index. (Issue #77978). #78069 (Shankar Iyer).
  • Fix "The requested output format {} is binary... Do you want to output it anyway? [y/N]" prompt. #78095 (Azat Khuzhin).
  • Fix a bug in toStartOfInterval with a zero origin argument. #78096 (Yarik Briukhovetskyi).
  • Disallow specifying an empty session_id query parameter for HTTP interface. #78098 (Alexey Katsman).
  • Fix metadata override in Database Replicated which could have happened due to a RENAME query executed right after an ALTER query. #78107 (Nikolay Degterinsky).
  • Fix crash in NATS engine. #78108 (Dmitry Novikov).
  • Do not try to create history_file in embedded client for SSH. #78112 (Azat Khuzhin).
  • Fix system.detached_tables displaying incorrect information after RENAME DATABASE or DROP TABLE queries. #78126 (Nikolay Degterinsky).
  • Fix for checks for too many tables with Database Replicated after https://github.com/ClickHouse/ClickHouse/pull/77274. Also, perform the check before creating the storage to avoid creating unaccounted nodes in ZooKeeper in the case of RMT or KeeperMap. #78127 (Nikolay Degterinsky).
  • Fix possible crash due to concurrent S3Queue metadata initialization. #78131 (Azat Khuzhin).
  • groupArray* functions now produce BAD_ARGUMENTS error for Int-typed 0 value of max_size argument, like it's already done for UInt one, instead of trying to execute with it. #78140 (Eduard Karacharov).
  • Prevent crash on recoverLostReplica if the local table is removed before it's detached. #78173 (Raúl Marín).
  • Fix "alterable" column in system.s3_queue_settings returning always false. #78187 (Kseniia Sumarokova).
  • Mask azure access signature to be not visible to user or in logs. #78189 (Kseniia Sumarokova).
  • Fix prefetch of substreams with prefixes in Wide parts. #78205 (Pavel Kruglov).
  • Fixed crashes / incorrect result for mapFromArrays in case of LowCardinality(Nullable) type of keys array. #78240 (Eduard Karacharov).
  • Fix delta-kernel auth options. #78255 (Kseniia Sumarokova).
  • Do not schedule the RefreshMV task if a replica's disable_insertion_and_mutation is true. The task performs an insertion, so it will fail if disable_insertion_and_mutation is true. #78277 (Xu Jia).
  • Validate access to underlying tables for the Merge engine. #78339 (Pervakov Grigorii).
  • Fix the FINAL modifier possibly being lost for Distributed engine tables. #78428 (Yakov Olkhovskiy).
  • bitmapMin now returns uint32_max when the bitmap is empty (uint64_max when input type >= 8bits), which matches the behavior of empty roaring_bitmap's minimum(). #78444 (wxybear).
  • Revert "Apply preserve_most attribute at some places in code" since it may lead to crashes. #78449 (Azat Khuzhin).
  • Use insertion columns for INFILE schema inference. #78490 (Pervakov Grigorii).
  • Disable parallelized query processing right after reading FROM when distributed_aggregation_memory_efficient is enabled, since it may lead to a logical error. Closes #76934. #78500 (flynn).
  • Set at least one stream for reading in case there are zero planned streams after applying max_streams_to_max_threads_ratio setting. #78505 (Eduard Karacharov).
  • In storage S3Queue fix logical error "Cannot unregister: table uuid is not registered". Closes #78285. #78541 (Kseniia Sumarokova).
  • ClickHouse is now able to figure out its cgroup v2 on systems with both cgroups v1 and v2 enabled. #78566 (Grigory Korolev).
  • Fix ObjectStorage cluster table functions failing when used with table-level settings. #78587 (Daniil Ivanik).
  • Better checks that transactions are not supported by ReplicatedMergeTree on INSERTs. #78633 (Azat Khuzhin).
  • Apply query settings during attach. #78637 (Raúl Marín).
  • Fix crash when invalid path was specified in iceberg_metadata_file_path. #78688 (alesapin).
  • In DeltaLake table engine with delta-kernel implementation fix case when read schema is different from table schema and there are partition columns at the same time leading to not found column error. #78690 (Kseniia Sumarokova).
  • Fix a problem when after scheduling to close a named session (but before timeout expiration), creation of a new named session with the same name led to it being closed at a time point when the first session was scheduled to close. #78698 (Alexey Katsman).
  • Backported in #79077: Fixed several types of SELECT queries that read from tables with the MongoDB engine or the mongodb table function: queries with implicit conversion of a const value in the WHERE clause (e.g. WHERE datetime = '2025-03-10 00:00:00'); queries with LIMIT and GROUP BY. Previously, they could return the wrong result. #78777 (Anton Popov).
  • Don't block table shutdown while running CHECK TABLE. #78782 (Raúl Marín).
  • Keeper fix: fix ephemeral count in all cases. #78799 (Antonio Andelic).
  • Fix bad cast in StorageDistributed when using table functions other than view(). Closes #78464. #78828 (Konstantin Bogdanov).
  • Fix formatting for tupleElement(*, 1). Closes #78639. #78832 (Konstantin Bogdanov).
  • Dictionaries of type ssd_cache now reject zero or negative block_size and write_buffer_size parameters (issue #78314). #78854 (Elmi Ahmadov).
  • Fix crash in REFRESHABLE MV in case of ALTER after incorrect shutdown. #78858 (Azat Khuzhin).
  • Fix parsing of bad DateTime values in CSV format. #78919 (Pavel Kruglov).
  • Backported in #79274: Keeper fix: Avoid triggering watches on failed multi requests. #79247 (Antonio Andelic).
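
The max_streams_to_max_threads_ratio fix listed above amounts to clamping a computed stream count to at least one. A minimal Python sketch of that idea follows. The planned_streams helper is hypothetical, and scaling the thread count by the ratio and truncating to an integer is an assumption about how the count is derived.

```python
def planned_streams(max_threads: int, ratio: float) -> int:
    """Number of read streams after applying the ratio, never less than one."""
    # Assumption: the planned count is the thread count scaled by the ratio,
    # truncated to an integer. A small ratio can therefore yield zero.
    streams = int(max_threads * ratio)
    # The fix described above: always keep at least one stream for reading.
    return max(1, streams)


print(planned_streams(4, 0.1))  # would be zero streams without the clamp
```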

Build/Testing/Packaging Improvement

BugFix

  • Fix a failure when reading an Iceberg table whose min-max value is NULL. Closes #78740. #78764 (flynn).

NO CL ENTRY

NOT FOR CHANGELOG / INSIGNIFICANT