2025 Changelog

docs/changelogs/v25.8.1.5101-lts.md

ClickHouse release v25.8.1.5101-lts (4f2b50b8c92) FIXME as compared to v25.8.1.1-new (d4d4c9f77fa)

Backward Incompatible Change

  • Disable quoting 64-bit integers in JSON formats by default. #74079 (Pavel Kruglov).
  • Infer Array(Dynamic) instead of unnamed Tuple for arrays of values with different types in JSON. To use previous behaviour, disable setting input_format_json_infer_array_of_dynamic_from_array_of_different_types. #80859 (Pavel Kruglov).
  • Move S3 latency metrics to histograms for homogeneity and simplicity. #82305 (Miсhael Stetsyuk).
  • Require backticks around identifiers with dots in default expressions to prevent them from being parsed as compound identifiers. #83162 (Pervakov Grigorii).
  • Lazy materialization is now enabled only with the analyzer. This avoids maintaining it without the analyzer, which, in our experience, has some issues (for example, when using indexHint() in conditions). #83791 (Igor Nikonov).
  • Write values of Enum type as BYTE_ARRAY with ENUM logical type in Parquet output format by default. #84169 (Pavel Kruglov).
  • Enable MergeTree setting write_marks_for_substreams_in_compact_parts by default. It significantly improves the performance of reading subcolumns from newly created Compact parts. Servers with a version lower than 25.5 won't be able to read new Compact parts. #84171 (Pavel Kruglov).
  • The previous concurrent_threads_scheduler default value was round_robin, which proved unfair in the presence of a high number of single-threaded queries (e.g., INSERTs). This change makes the safer alternative, the fair_round_robin scheduler, the default. #84747 (Sergei Trifonov).
  • ClickHouse supports PostgreSQL-style heredoc syntax: $tag$ string contents... $tag$, also known as dollar-quoted string literals. In previous versions, there were fewer restrictions on tags: they could contain arbitrary characters, including punctuation and whitespace. This introduced parsing ambiguity with identifiers that can also start with a dollar character. At the same time, PostgreSQL only allows word characters for tags. To resolve the problem, heredoc tags are now restricted to contain only word characters. Closes #84731. #84846 (Alexey Milovidov).
  • The functions azureBlobStorage, deltaLakeAzure, and icebergAzure have been updated to properly validate AZURE permissions. All cluster-variant functions (-Cluster functions) now verify permissions against their corresponding non-clustered counterparts. Additionally, the icebergLocal and deltaLakeLocal functions now enforce FILE permission checks. #84938 (Nikita Mikhaylov).
  • Enables allow_dynamic_metadata_for_data_lakes setting (Table Engine level setting) by default. #85044 (Daniil Ivanik).
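
The heredoc tag restriction above can be illustrated with a small Python sketch. It is not ClickHouse's parser: parse_heredoc is a hypothetical helper, "word characters" is assumed to mean ASCII letters, digits, and underscore, and the empty tag ($$) is accepted because PostgreSQL accepts it (this changelog does not say whether ClickHouse does).

```python
import re

# Assumption: "word characters" means ASCII letters, digits, and underscore.
# The empty tag ($$) is allowed here, as in PostgreSQL.
HEREDOC_OPEN = re.compile(r"\$([A-Za-z0-9_]*)\$")


def parse_heredoc(text: str):
    """Return (tag, body) if text is one complete $tag$ ... $tag$ literal, else None."""
    opening = HEREDOC_OPEN.match(text)
    if opening is None:
        return None  # tag contains punctuation, whitespace, or is not closed
    delimiter = opening.group(0)
    end = text.find(delimiter, opening.end())
    # The closing delimiter must exist and must terminate the literal.
    if end == -1 or end + len(delimiter) != len(text):
        return None
    return opening.group(1), text[opening.end():end]


print(parse_heredoc("$tag$ string contents... $tag$"))
print(parse_heredoc("$my tag$ x $my tag$"))  # whitespace in the tag is rejected
```

With tags limited to word characters, a leading `$` followed by anything else can be treated as something other than a heredoc opener, which is the ambiguity the entry describes.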

New Feature

  • All tables, not only Merge tables, now support the _table virtual column. #63665 (Xiaozhe Yu).
  • Add a system table that keeps erroneous incoming messages from engines like Kafka. #68873 (Ilya Golshtein).
  • Introduce the restore database replica functionality for replicated databases, similar to the existing restore functionality in ReplicatedMergeTree. #73100 (Konstantin Morozov).
  • Implement support for the ArrowFlight RPC protocol by adding a new table function arrowflight(): SELECT * FROM arrowflight('host:port', 'dataset_name'). #74184 (zakr600).
  • PostgreSQL protocol COPY command support. #74344 (Konstantin Vedernikov).
  • Basic support for the PromQL dialect is added. To use it, set dialect='promql' in clickhouse-client, point it to the TimeSeries table using the setting promql_table_name='X' and execute queries like rate(ClickHouseProfileEvents_ReadCompressedBytes[1m])[5m:1m]. In addition you can wrap the PromQL query with SQL: SELECT * FROM prometheusQuery('up', ...);. So far only functions rate, delta and increase are supported. No unary/binary operators. No HTTP API. #75036 (Vitaly Baranov).
  • Add support for hive partition style writes and refactor reads implementation (hive partition columns are no longer virtual). #76802 (Arthur Passos).
  • Add zookeeper_connection_log system table to store historical information about ZooKeeper connections. #79494 (János Benjamin Antal).
  • Server setting cpu_slot_preemption enables preemptive CPU scheduling for workloads and ensures max-min fair allocation of CPU time among workloads. New workload settings for CPU throttling are added: max_cpus, max_cpu_share and max_burst_cpu_seconds. More details: https://clickhouse.com/docs/operations/workload-scheduling#cpu_scheduling. #80879 (Sergei Trifonov).
  • Drop TCP connection after configured number of queries or time threshold. Resolves #68000. #81472 (Kenny Sun).
  • Reading from projections is implemented for parallel replicas. A new setting parallel_replicas_support_projection has been added to control whether projection support is enabled. To simplify the implementation, support for projection is only enabled when parallel_replicas_local_plan is active. #82807 (zoomxi).
  • Support DESCRIBE SELECT in addition to DESCRIBE (SELECT ...). #82947 (Yarik Briukhovetskyi).
  • Force secure connection for mysql_port and postgresql_port. #82962 (tiandiwonder).
  • Support position deletes for Iceberg TableEngine. #83094 (Daniil Ivanik).
  • Users can now do case-insensitive JSON key lookups using JSONExtractCaseInsensitive (and other variants of JSONExtract). #83770 (Alistair Evans).
  • AI-powered SQL generation can now pick up the environment variables ANTHROPIC_API_KEY and OPENAI_API_KEY if available, which provides a zero-config option for using this feature. #83787 (Kaushik Iska).
  • Introduction of system.completions table. Closes #81889. #83833 (|2ustam).
  • Iceberg writes for create. Closes #83927. #83983 (Konstantin Vedernikov).
  • Glue catalogs for writes. #84136 (Konstantin Vedernikov).
  • Added a new function nowInBlock64. Example usage: SELECT nowInBlock64(6) returns 2025-07-29 17:09:37.775725. #84178 (Halersson Paris).
  • Add extra_credentials to AzureBlobStorage to authenticate with client_id and tenant_id. #84235 (Pablo Marcos).
  • Added function dateTimeToUUIDv7 to convert a DateTime value to a UUIDv7. Example usage: SELECT dateTimeToUUIDv7(toDateTime('2025-08-15 18:57:56')) returns 0198af18-8320-7a7d-abd3-358db23b9d5c. #84319 (samradovich).
  • Added the timeSeriesDerivToGrid and timeSeriesPredictLinearToGrid aggregate functions, which re-sample data to a time grid defined by the specified start timestamp, end timestamp, and step, and calculate the PromQL-like deriv and predict_linear, respectively. #84328 (Stephen Chi).
  • Support C# client for mysql protocol. This closes #83992. #84397 (Konstantin Vedernikov).
  • Added new syntax: GRANT READ ON S3('s3://foo/.*') TO user. #84503 (pufit).
  • Added Hash as a new output format. It calculates a single hash value for all columns and rows of the result. This is useful for calculating a "fingerprint" of the result, for example, in use cases where data transfer is a bottleneck. Example: SELECT arrayJoin(['abc', 'def']), 42 FORMAT Hash returns e5f9e676db098fdb9530d2059d8c23ef. #84607 (Robert Schulze).
  • Iceberg Rest catalogs for writes. #84684 (Konstantin Vedernikov).
  • Add the ability to set up arbitrary watches in Keeper Multi queries. #84964 (Mikhail Artemenko).
  • Merge all Iceberg position delete files into data files. This reduces the number and size of Parquet files in Iceberg storage. Syntax: OPTIMIZE TABLE table_name. #85250 (Konstantin Vedernikov).
  • Support partially aggregated metrics. #85328 (Mikhail Artemenko).
  • Support drop table for iceberg (Removing from REST/Glue catalogs + removing metadata about table). #85395 (Konstantin Vedernikov).
  • Support alter delete mutations for iceberg in merge-on-read format. #85549 (Konstantin Vedernikov).
  • Support writes into DeltaLake. Closes #79603. #85564 (Kseniia Sumarokova).
  • Write more iceberg statistics (column sizes, lower and upper bounds) in metadata (manifest entries) for min-max pruning. #85746 (Konstantin Vedernikov).
  • Support add/drop/modify columns in iceberg for simple types. #85769 (Konstantin Vedernikov).
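
The dateTimeToUUIDv7 example above can be cross-checked against the UUIDv7 layout from RFC 9562, in which the 48 most significant bits hold the Unix timestamp in milliseconds. The Python sketch below is not ClickHouse's implementation: uuid7_from_datetime is a hypothetical helper, and it assumes the DateTime in the changelog example is interpreted as UTC.

```python
import os
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_from_datetime(dt: datetime) -> uuid.UUID:
    """Build a UUIDv7 whose timestamp field encodes dt (RFC 9562 bit layout)."""
    unix_ms = (dt - EPOCH) // timedelta(milliseconds=1)  # exact integer milliseconds
    rand = int.from_bytes(os.urandom(10), "big")         # 80 random bits
    rand_a = rand >> 68                                  # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)                      # 62 bits after the variant
    value = (unix_ms & ((1 << 48) - 1)) << 80            # bits 127..80: unix_ts_ms
    value |= 0x7 << 76                                   # bits 79..76: version 7
    value |= rand_a << 64                                # bits 75..64: rand_a
    value |= 0b10 << 62                                  # bits 63..62: RFC variant
    value |= rand_b                                      # bits 61..0: rand_b
    return uuid.UUID(int=value)


# Hypothetical usage with the timestamp from the changelog example (assumed UTC).
u = uuid7_from_datetime(datetime(2025, 8, 15, 18, 57, 56, tzinfo=timezone.utc))
print(u)  # the random tail differs on every run; the leading 0198af18-8320 does not
```

Only the timestamp prefix and the version digit are deterministic; the remaining bits are random, so the tail of the value shown in the changelog entry is not expected to reproduce.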

Experimental Feature

Performance Improvement

  • azureBlobStorage table engine: cache and reuse managed identity authentication tokens when possible to avoid throttling. #79860 (Nick Blakely).
  • Added new logic (controlled by the setting enable_producing_buckets_out_of_order_in_aggregation, enabled by default) that allows sending some buckets out of order during memory-efficient aggregation. When some aggregation buckets take significantly longer to merge than others, it improves performance by allowing the initiator to merge buckets with higher bucket id-s in the meantime. The downside is potentially higher memory usage (shouldn't be significant). #80179 (Nikita Taranov).
  • New parquet reader implementation. It's generally faster and supports page-level filter pushdown and PREWHERE. Currently experimental. Use setting input_format_parquet_use_native_reader_v3 to enable. #82789 (Michael Kolupaev).
  • Process max_joined_block_rows outside of hash JOIN main loop. Slightly better performance for ALL JOIN. #83216 (Nikolai Kochetov).
  • Replace the curl HTTP client with the Poco HTTP client for Azure Blob Storage. Introduced multiple settings for this client which mirror the settings for S3. Introduced aggressive connect timeouts for both Azure and S3. Improved introspection into Azure profile events and metrics. The new client is enabled by default and provides much better latencies for cold queries on top of Azure Blob Storage. The old curl client can be restored by setting azure_sdk_use_native_client=false. #83294 (alesapin).
  • Significantly improve performance of JSON subcolumns reading from shared data in MergeTree by implementing new serializations for JSON shared data in MergeTree. #83777 (Pavel Kruglov).
  • Process higher granularity min-max indexes first. Closes #75381. #83798 (Maruth Goyal).
  • Vector search queries using a vector similarity index complete with lower latency due to reduced storage reads and reduced CPU usage. #83803 (Shankar Iyer).
  • Implement addManyDefaults for If combinators. #83870 (Raúl Marín).
  • Calculate the serialized key column-wise when grouping by multiple string or number columns. #83884 (李扬).
  • Try -falign-functions=64 in an attempt to get more stable perf tests. #83920 (Azat Khuzhin).
  • The bloom filter index is now used for conditions like has([c1, c2, ...], column), where column is not of an Array type. This improves performance for such queries, making them as efficient as the IN operator. #83945 (Doron David).
  • Reduce unnecessary memcpy calls in CompressedReadBufferBase::readCompressedData. #83986 (Raúl Marín).
  • All LEFT/INNER JOINs will be automatically converted to RightAny if the right side is functionally determined by the join key columns (all rows have unique join key values). #84010 (Nikita Taranov).
  • Processes indexes in increasing order of file size. The net index ordering prioritizes minmax and vector indexes (due to simplicity and selectivity respectively), and small indexes thereafter. Within the minmax/vector indexes smaller indexes are also preferred. #84094 (Maruth Goyal).
  • Optimize largestTriangleThreeBuckets by removing temporary data. #84479 (Alexey Milovidov).
  • Optimize string deserialization by simplifying the code. Closes #38564. #84561 (Alexey Milovidov).
  • Previously, the text index data would be separated into multiple segments (each segment size by default was 256 MiB). This might reduce memory consumption while building the text index; however, it increases the disk space requirement and the query response time. #84590 (Elmi Ahmadov).
  • Fixed the calculation of the minimal task size for parallel replicas. #84752 (Nikita Taranov).
  • Improved performance of applying patch parts in Join mode. #85040 (Anton Popov).
  • Remove zero byte. Closes #85062. A few minor bugs were fixed. Functions structureToProtobufSchema, structureToCapnProtoSchema didn't correctly put a zero-terminating byte and were using a newline instead of it. That was leading to a missing newline in the output, and could lead to buffer overflows while using other functions that depend on the zero byte (such as logTrace, demangle, extractURLParameter, toStringCutToZero, and encrypt/decrypt). The regexp_tree dictionary layout didn't support processing strings with zero bytes. The formatRowNoNewline function, called with Values format or with any other format without a newline at the end of rows, erroneously cut the last character of the output. Function stem contained an exception-safety error that could lead to a memory leak in a very rare scenario. The initcap function worked incorrectly for FixedString arguments: it didn't recognize the start of the word at the start of the string if the previous string in a block ended with a word character. Fixed a security vulnerability of the Apache ORC format, which could lead to the exposure of uninitialized memory. Changed behavior of the function replaceRegexpAll and the corresponding alias, REGEXP_REPLACE: now it can do an empty match at the end of the string even if the previous match processed the whole string, such as in the case of ^a*|a*$ or ^|.* - this corresponds to the semantics of JavaScript, Perl, Python, PHP, Ruby, but differs from the semantics of PostgreSQL. Implementation of many functions has been simplified and optimized. Documentation for several functions was wrong and has now been fixed. Keep in mind that the output of byteSize for String columns and complex types, which consisted of String columns, has changed (from 9 bytes per empty string to 8 bytes per empty string), and this is expected. #85063 (Alexey Milovidov).
  • Optimize the materialization of constants in cases when we do this materialization only to return a single row. #85071 (Alexey Milovidov).
  • Improve parallel files processing with delta-kernel-rs backend. #85642 (Azat Khuzhin).
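
The replaceRegexpAll entry above states that the new empty-match behavior corresponds to the semantics of Python, among other languages. The short sketch below shows what those semantics look like in Python's own re module (3.7 or later) for the first pattern mentioned in the entry. It demonstrates Python only and makes no claim about ClickHouse output.

```python
import re

# Pattern from the changelog entry: the first alternative consumes the whole
# string, then the second alternative matches the empty string at the end.
pattern = r"^a*|a*$"

# Spans of every match, including the trailing empty one at position 3.
matches = [m.span() for m in re.finditer(pattern, "aaa")]

# Both the full match and the trailing empty match are replaced.
replaced = re.sub(pattern, "X", "aaa")

print(matches)   # [(0, 3), (3, 3)]
print(replaced)  # XX
```

The second `X` comes from the empty match at the end of the string, which is the case the entry describes as newly allowed.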

Improvement

  • Show the number of ranges to be read in the output of EXPLAIN indexes = 1. #79938 (Christoph Wurm).
  • Introduce settings to set the ORC compression block size, and update its default value from 64KB to 256KB to be consistent with Spark or Hive. #80602 (李扬).
  • Add columns_substreams.txt file to Wide part to track all substreams stored in the part. It helps to track dynamic streams in JSON and Dynamic types and so avoid reading sample of these columns to get the list of dynamic streams (for example for columns sizes calculation). Also now all dynamic streams are reflected in system.parts_columns. #81091 (Pavel Kruglov).
  • Add a CLI flag --show_secrets to clickhouse format to hide sensitive data by default. #81524 (Nikolai Ryzhov).
  • S3 read and write requests are throttled on the HTTP socket level (instead of whole S3 requests) to avoid issues with max_remote_read_network_bandwidth_for_server and max_remote_write_network_bandwidth_for_server throttling. #81837 (Sergei Trifonov).
  • A new setting, enable_add_distinct_to_in_subqueries, has been introduced. When enabled, ClickHouse will automatically add DISTINCT to subqueries in IN clauses for distributed queries. This can significantly reduce the size of temporary tables transferred between shards and improve network efficiency. Note that this is a trade-off: network transfer is reduced, but additional merging (deduplication) work is required on each node. Enable this setting when network transfer is a bottleneck and the merging cost is acceptable. #81908 (fhw12345).
  • Introduced the optimize_rewrite_regexp_functions setting (enabled by default), which allows the optimizer to rewrite certain replaceRegexpAll, replaceRegexpOne, and extract calls into simpler and more efficient forms when specific regular expression patterns are detected. (issue #81981). #81992 (Amos Bird).
  • Use rendezvous hashing to improve cache locality. #82511 (Anton Ivashkin).
  • Allow to mix different collations for the same column in different windows. #82877 (Yakov Olkhovskiy).
  • Add support of remote-() table functions with parallel replicas if cluster is provided in address_expression argument. Also, fixes #73295. #82904 (Igor Nikonov).
  • Set all log messages for writing backup files to TRACE. #82907 (Hans Krutzer).
  • Fix inconsistent formatting of user-defined functions with unusual names and codecs by the SQL formatter. This closes #83092. #83644 (Alexey Milovidov).
  • Users can now use Time and Time64 types inside the JSON type. #83784 (Yarik Briukhovetskyi).
  • Joins with parallel replicas now use the join logical step. In case of any issues with join queries using parallel replicas, try SET query_plan_use_new_logical_join_step=0 and report an issue. #83801 (Vladimir Cherkasov).
  • Add max_joined_block_size_bytes in addition to max_joined_block_size_rows to limit the memory usage of JOINs with heavy columns. #83869 (Nikolai Kochetov).
  • Reduce query memory tracking overhead for executable user-defined functions. #83929 (Eduard Karacharov).
  • Fix compatibility for cluster_function_process_archive_on_multiple_nodes. #83968 (Kseniia Sumarokova).
  • Support changing mv insert settings on S3Queue table level. Added new S3Queue level settings: min_insert_block_size_rows_for_materialized_views and min_insert_block_size_bytes_for_materialized_views. By default profile level settings will be used and S3Queue level settings will override those. #83971 (Kseniia Sumarokova).
  • Added profile event MutationAffectedRowsUpperBound that shows the number of affected rows in a mutation (e.g., the total number of rows that satisfy the condition in an ALTER UPDATE or ALTER DELETE query). #83978 (Anton Popov).
  • Use information from cgroup (if applicable, i.e. memory_worker_use_cgroup and cgroups are available) to adjust memory tracker (memory_worker_correct_memory_tracker). #83981 (Azat Khuzhin).
  • Split FormatParserGroup into two independent structs: the first one is responsible for shared compute and IO resources, the second one is responsible for shared filter resources (filter ActionDag, KeyCondition). This is done for more flexible shared usage of these structures by different threads. #83997 (Daniil Ivanik).
  • Implement internal delta-kernel-rs filtering (statistics and partition pruning) in storage DeltaLake. #84006 (Kseniia Sumarokova).
  • Implement AWS S3 authentication with an explicitly provided IAM role. Implement OAuth for GCS. These features were previously available only in ClickHouse Cloud and are now open-sourced. Synchronize some interfaces such as serialization of the connection parameters for object storages. #84011 (Alexey Milovidov).
  • Made the table columns in the web UI (play) resizable. #84012 (Doron David).
  • MongoDB: Implicit parsing of strings to numeric types. Previously, if a string value was received from a MongoDB source for a numeric column in a ClickHouse table, an exception was thrown. Now, the engine attempts to parse the numeric value from the string automatically. Closes #81167. #84069 (Kirill Nikiforov).
  • Highlight digit groups in Pretty formats for Nullable numbers. #84070 (Alexey Milovidov).
  • Dashboard: the tooltip will not overflow the container at the top. #84072 (Alexey Milovidov).
  • Slightly better-looking dots on the dashboard. #84074 (Alexey Milovidov).
  • Dashboard now has a slightly better favicon. #84076 (Alexey Milovidov).
  • All the allocations done by external libraries are now visible to ClickHouse's memory tracker and accounted properly. This may result in "increased" reported memory usage for certain queries or failures with MEMORY_LIMIT_EXCEEDED. #84082 (Nikita Mikhaylov).
  • Web UI: Give browsers a chance to save the password. Also, it will remember the URL values. #84087 (Alexey Milovidov).
  • Add support for applying extra ACL on specific Keeper nodes using apply_to_children config. #84137 (Antonio Andelic).
  • Fix usage of "compact" Variant discriminators serialization in MergeTree. Previously it wasn't used in some cases when it could be used. #84141 (Pavel Kruglov).
  • Added a server setting, logs_to_keep, to the database replicated settings, which allows changing the default logs_to_keep parameter for replicated databases. Lower values reduce the number of ZNodes (especially if there are many databases), while higher values allow a missing replica to catch up after a longer period of time. #84183 (Alexey Khatskevich).
  • Add a setting json_type_escape_dots_in_keys to escape dots in JSON keys during JSON type parsing. The setting is disabled by default. #84207 (Pavel Kruglov).
  • Check if connection is cancelled before checking for EOF to prevent reading from closed connection. Fixes #83893. #84227 (Raufs Dunamalijevs).
  • Disable skipping indexes that depend on columns updated on the fly or by patch parts more granularly. Now, skipping indexes are not used only in parts affected by on-the-fly mutations or patch parts; previously, those indexes were disabled for all parts. #84241 (Anton Popov).
  • Slightly better colors of text selection in Web UI. The difference is significant only for selected table cells in the dark mode. In previous versions, there was not enough contrast between the text and the selection background. #84258 (Alexey Milovidov).
  • Improved server shutdown handling for client connections by simplifying internal checks. #84312 (Raufs Dunamalijevs).
  • Added a setting delta_lake_enable_expression_visitor_logging to turn off expression visitor logs as they can be too verbose even for test log level when debugging something. #84315 (Kseniia Sumarokova).
  • Cgroup-level and system-wide metrics are now reported together. Cgroup-level metrics have names CGroup<Metric> and OS-level metrics (collected from procfs) have names OS<Metric>. #84317 (Nikita Taranov).
  • Slightly better charts in Web UI. Not much, but better. #84326 (Alexey Milovidov).
  • Change the default of the Replicated database setting max_retries_before_automatic_recovery to 10, so it will recover faster in some cases. #84369 (Alexander Tokmakov).
  • Fix formatting of CREATE USER with query parameters (i.e. CREATE USER {username:Identifier} IDENTIFIED WITH no_password). #84376 (Azat Khuzhin).
  • Replace tab characters with spaces when pasting in interactive clickhouse-client. Closes #83922. #84412 (xiaohuanlin).
  • Introduce backup_restore_s3_retry_initial_backoff_ms, backup_restore_s3_retry_max_backoff_ms, backup_restore_s3_retry_jitter_factor to configure the S3 retry backoff strategy used during backup and restore operations. #84421 (Julia Kartseva).
  • Allocate the minimum amount of memory needed for encrypted_buffer for encrypted named collections. #84432 (Pablo Marcos).
  • S3Queue ordered mode fix: quit earlier if shutdown was called. #84463 (Kseniia Sumarokova).
  • Support iceberg writes to read from pyiceberg. #84466 (Konstantin Vedernikov).
  • Allow set values type casting when pushing down IN / GLOBAL IN filters over KeyValue storage primary keys (e.g., EmbeddedRocksDB, KeeperMap). #84515 (Eduard Karacharov).
  • Bump chdig to 25.7.1. #84521 (Azat Khuzhin).
  • Low-level errors during UDF execution now fail with error code UDF_EXECUTION_FAILED, whereas previously different error codes could be returned. #84547 (Xu Jia).
  • Add get_acl command to KeeperClient. #84641 (Antonio Andelic).
  • Adds snapshot version to data lake table engines. #84659 (Pete Hampton).
  • This change adds a dimensional metric for the size of ConcurrentBoundedQueue, labelled by the queue type (i.e. what the queue is there for) and queue id (i.e. randomly generated id for the current instance of the queue). #84675 (Miсhael Stetsyuk).
  • The system.columns table now provides column as an alias for the existing name column. #84695 (Yunchi Pang).
  • Improved support for bloom filter indexes (regular, ngram, and token) to be utilized when the first argument is a constant array (the set) and the second is the indexed column (the subset), enabling more efficient query execution. #84700 (Doron David).
  • Added a new MergeTree setting, search_orphaned_parts_drives, to limit the scope of the search for parts, e.g. to disks with local metadata. #84710 (Ilya Golshtein).
  • Add 4LW in Keeper, lgrq, for toggling request logging of received requests. #84719 (Antonio Andelic).
  • Reduce contention on storage lock in Keeper. #84732 (Antonio Andelic).
  • Allow to use any storage policy (i.e. object storage, such as S3) for external aggregation/sorting. #84734 (Azat Khuzhin).
  • Match external auth forward_headers in case-insensitive way. #84737 (ingodwerust).
  • Views, created by ephemeral users, will now store a copy of an actual user and will no longer be invalidated after the ephemeral user is deleted. #84763 (pufit).
  • The encrypt_decrypt tool now supports encrypted ZooKeeper connections. #84764 (Roman Vasin).
  • Add format string column to system.errors. This column is needed to group by the same error type in alerting rules. #84776 (Miсhael Stetsyuk).
  • Updated clickhouse-format to accept --highlight as an alias for --hilite. Updated clickhouse-client to accept --hilite as an alias for --highlight. Updated clickhouse-format documentation to reflect the change. #84806 (Rishabh Bhardwaj).
  • Fix iceberg reading by field ids for complex types. #84821 (Konstantin Vedernikov).
  • Add missing support of read_in_order_use_virtual_row for WHERE. It allows to skip reading more parts for queries with filters that were not fully pushed to PREWHERE. #84835 (Nikolai Kochetov).
  • Introduce a new backup_slow_all_threads_after_retryable_s3_error setting to reduce pressure on S3 during retry storms caused by errors such as SlowDown, by slowing down all threads once a single retryable error is observed. #84854 (Julia Kartseva).
  • Skip creating and renaming the old temp table of non-append RMV DDLs in Replicated DBs. #84858 (Tuan Pham Anh).
  • Limit Keeper log entry cache size by number of entries using keeper_server.coordination_settings.latest_logs_cache_entry_count_threshold and keeper_server.coordination_settings.commit_logs_cache_entry_count_threshold. #84877 (Antonio Andelic).
  • Allow using simdjson on unsupported architectures (previously this led to CANNOT_ALLOCATE_MEMORY errors). #84966 (Azat Khuzhin).
  • Eliminated full scans for the cases when index analysis results in empty ranges for parallel replicas reading. #84971 (Eduard Karacharov).
  • The vector similarity index now supports binary quantization. Binary quantization significantly reduces the memory consumption and speeds up the process of building a vector index (due to faster distance calculation). Also, the existing setting vector_search_postfilter_multiplier was made obsolete and replaced by a more general setting: vector_search_index_fetch_multiplier. #85024 (Shankar Iyer).
  • Async log: Make limits tuneable and add introspection. #85105 (Raúl Marín).
  • Enable correlated subqueries support by default. #85107 (Dmitry Novik).
  • Add database_replicated settings defining the default values of DatabaseReplicatedSettings. If the setting is not present in the Replicated DB create query, the value from this setting is used. #85127 (Tuan Pham Anh).
  • Iceberg: support writing version-hint file. This closes #85097. #85130 (Konstantin Vedernikov).
  • Allow key-value arguments in the s3 or s3Cluster table engine/function, for example s3('url', CSV, structure = 'a Int32', compression_method = 'gzip'). #85134 (Kseniia Sumarokova).
  • Support compressed .metadata.json file via iceberg_metadata_compression_method setting. It supports all ClickHouse compression methods. This closes #84895. #85196 (Konstantin Vedernikov).
  • Added setting delta_lake_snapshot_version to allow reading specific snapshot version in table engine DeltaLake. #85295 (Kseniia Sumarokova).
  • Collect all removed objects to execute single object storage remove operation. #85316 (Mikhail Artemenko).
  • The previous implementation of Iceberg positional delete files kept all data in RAM. This can be quite expensive if the positional delete files are large, which is often the case. The new implementation keeps only the last row-group of Parquet delete files in RAM, which is significantly cheaper. #85329 (Konstantin Vedernikov).
  • Chdig: fix leftovers on the screen, fix crash after edit query in editor, search in path for editor, update to 25.8.1. #85341 (Azat Khuzhin).
  • Allows asynchronously iterating objects from Iceberg table without storing objects for each data file explicitly. #85369 (Daniil Ivanik).
  • Add missing partition_columns_in_data_file to azure configuration. #85373 (Arthur Passos).
  • Allow zero step in functions timeSeries*ToGrid(). This is part #3 of https://github.com/ClickHouse/ClickHouse/pull/75036. #85390 (Vitaly Baranov).
  • Added show_data_lake_catalogs_in_system_tables flag to manage adding data lake tables in system.tables. Resolves #85384. #85411 (Smita Kulkarni).
  • Add 2 new TimeSeries functions: timeSeriesRange(start_timestamp, end_timestamp, step) and timeSeriesFromGrid(start_timestamp, end_timestamp, step, values). #85435 (Vitaly Baranov).
  • Added support for macro expansion in remote_fs_zero_copy_zookeeper_path. #85437 (Mikhail Koviazin).
  • AI in clickhouse-client will look slightly better. #85447 (Alexey Milovidov).
  • Enable trace_log.symbolize for old deployments by default. #85456 (Azat Khuzhin).
  • Execute non-correlated EXISTS as a scalar subquery. This allows using a scalar subquery cache and constant-folding the result, which is helpful for indexes. For compatibility, the new setting execute_exists_as_scalar_subquery=1 is added. #85481 (Nikolai Kochetov).
  • Support resolution of more cases for compound identifiers. Particularly, it improves the compatibility of ARRAY JOIN with the old analyzer. Introduce a new setting analyzer_compatibility_allow_compound_identifiers_in_unflatten_nested to keep the old behaviour. #85492 (Nikolai Kochetov).
  • Adds an option --max-concurrency for the clickhouse-benchmark tool that enables a mode with a gradual increase in the number of parallel queries. #85623 (Sergei Trifonov).
  • Ignore UNKNOWN_DATABASE while obtaining table columns sizes for system.columns. #85632 (Azat Khuzhin).
  • Added limit (table setting max_uncompressed_bytes_in_patches) for total uncompressed bytes in patch parts. It prevents significant slowdowns of SELECT queries after lightweight updates and prevents possible misuse of lightweight updates. #85641 (Anton Popov).
  • Add a parameter column to system.grants to determine source type for GRANT READ/WRITE and the table engine for GRANT TABLE ENGINE. #85643 (MikhailBurdukov).
  • Fix parsing of a trailing comma in columns of the CREATE DICTIONARY query after a column with parameters, for example, Decimal(8). Closes #85586. #85653 (Nikolay Degterinsky).
  • Support inner arrays for the function nested. #85719 (Nikolai Kochetov).
  • Support Iceberg Equality Deletes. #85843 (Han Fei).
  • Approximate vector search with vector similarity indexes is now GA. #85888 (Robert Schulze).
  • Backported in #86221: Slow down S3 client threads on retryable errors in S3 Object Storage. This extends the previous setting backup_slow_all_threads_after_retryable_s3_error to S3 disks and renames it to the more general s3_slow_all_threads_after_retryable_error. #85918 (Julia Kartseva).
  • Backported in #86239: Mark settings allow_experimental_variant/dynamic/json and enable_variant/dynamic/json as obsolete. Now all three types are enabled unconditionally. #85934 (Pavel Kruglov).
  • Backported in #86377: Improved S3(Azure)Queue table engine to allow it to survive zookeeper connection loss without potential duplicates. Requires enabling S3Queue setting use_persistent_processing_nodes (changeable by ALTER TABLE MODIFY SETTING). #85995 (Kseniia Sumarokova).
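
The rendezvous hashing entry above does not describe how ClickHouse applies the technique, so the following is only a generic sketch of rendezvous (highest-random-weight) hashing in Python. The node names, the file keys, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def score(node: str, key: str) -> int:
    """Pseudo-random weight of a (node, key) pair, stable across processes."""
    digest = hashlib.sha256(f"{node}\x00{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_node(nodes, key):
    """Rendezvous hashing: the node with the highest weight for this key wins."""
    return max(nodes, key=lambda node: score(node, key))


# Hypothetical replicas and object keys.
nodes = ["replica-1", "replica-2", "replica-3", "replica-4"]
keys = [f"data/file-{i}.parquet" for i in range(200)]

before = {k: pick_node(nodes, k) for k in keys}
survivors = [n for n in nodes if n != "replica-3"]
after = {k: pick_node(survivors, k) for k in keys}

# Only keys that lived on the removed node are reassigned.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Because each key's choice depends only on per-node weights, removing a node leaves every other key on the node it already used, which is why the technique tends to preserve cache contents when the set of nodes changes.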

Bug Fix (user-visible misbehavior in an official stable release)

  • Make DISTINCT window aggregates run in linear time and fix a bug in sumDistinct. Closes #79792. Closes #52253. #79859 (Nihal Z. Miaji).
  • Fix metadata resolution when querying Iceberg tables through a REST catalog. ... #80562 (Saurabh Kumar Ojha).
  • Fix markReplicasActive in DDLWorker and DatabaseReplicatedDDLWorker. #81395 (Tuan Pham Anh).
  • Fix rollback of Dynamic column on parsing failure. #82169 (Pavel Kruglov).
  • The function trim, when called with all-constant inputs, now produces a constant output string. (Bug #78796). #82900 (Robert Schulze).
  • Fix logical error with duplicate subqueries when optimize_syntax_fuse_functions is enabled. Closes #75511. #83300 (Vladimir Cherkasov).
  • Fixed incorrect results of queries with a WHERE ... IN (<subquery>) clause when the query condition cache is enabled (setting use_query_condition_cache). #83445 (LB7666).
  • Historically, the gcs function did not require any access rights. It now checks the GRANT READ ON S3 permission. Closes #70567. #83503 (pufit).
  • Skip unavailable nodes during INSERT SELECT from s3Cluster() into replicated MergeTree. #83676 (Igor Nikonov).
  • Fix writes with append (used in MergeTree for experimental transactions) for the plain_rewritable/plain metadata types; previously they were simply ignored. #83695 (Tuan Pham Anh).
  • Mask Avro schema registry authentication details so that they are not visible to the user or in logs. #83713 (János Benjamin Antal).
  • Fix an issue where a MergeTree table created with add_minmax_index_for_numeric_columns=1 or add_minmax_index_for_string_columns=1, whose index is later materialized during an ALTER operation, prevented the Replicated database from initializing correctly on a new replica. #83751 (Nikolay Degterinsky).
  • Fixed parquet writer outputting incorrect statistics (min/max) for Decimal types. #83754 (Michael Kolupaev).
  • Fix sort of NaN values in LowCardinality(Float32|Float64|BFloat16) type. #83786 (Pervakov Grigorii).
  • When restoring from backup, the definer user may not be backed up, which will cause the whole backup to be broken. To fix this, we postpone the permissions check on the target table's creation during restore and only check it during runtime. #83818 (pufit).
  • Fix crash in client due to connection left in disconnected state after bad INSERT. #83842 (Azat Khuzhin).
  • Allow referencing any table in view(...) argument of remote table function with enabled analyzer. Fixes #78717. Fixes #79377. #83844 (Dmitry Novik).
  • The onProgress call in JSONEachRowWithProgress is now synchronized with finalization. #83879 (Sema Checherinda).
  • This closes #81303. #83892 (Konstantin Vedernikov).
  • Fix colorSRGBToOKLCH/colorOKLCHToSRGB for mix of const and non-const args. #83906 (Azat Khuzhin).
  • Fix writing JSON paths with NULL values in RowBinary format. #83923 (Pavel Kruglov).
  • Fixed overflow of large values (>2106-02-07) when casting from Date to DateTime64. #83982 (Yarik Briukhovetskyi).
  • Always apply filesystem_prefetches_limit (not only from MergeTreePrefetchedReadPool). #83999 (Azat Khuzhin).
  • Fix a rare bug where a MATERIALIZE COLUMN query could lead to unexpected files in checksums.txt and eventually to detached data parts. #84007 (alesapin).
  • Fix the logical error Expected single dictionary argument for function while doing JOIN on an inequality condition when one of the columns is LowCardinality and the other is a constant. Closes #81779. #84019 (Alexey Milovidov).
  • Fix crash with clickhouse client when used in interactive mode with syntax highlighting. #84025 (Bharat Nallan).
  • Fixed wrong results when the query condition cache is used in conjunction with recursive CTEs (issue #81506). #84026 (zhongyuankai).
  • Handle exceptions properly in periodic parts refresh. #84083 (Azat Khuzhin).
  • Fix filter merging into JOIN condition in cases when equality operands have different types or they reference constants. Fixes #83432. #84145 (Dmitry Novik).
  • Fix a rare ClickHouse crash when a table has a projection, lightweight_mutation_projection_mode = 'rebuild', and the user executes a lightweight delete that deletes ALL rows from any block in the table. #84158 (alesapin).
  • Fix deadlock caused by background cancellation checker thread. #84203 (Antonio Andelic).
  • Fix infinite recursive analysis of invalid WINDOW definitions. Fixes #83131. #84242 (Dmitry Novik).
  • Fixed a bug that was causing incorrect Bech32 encoding and decoding. The bug wasn't caught originally because an online implementation of the algorithm used for testing had the same issue. #84257 (George Larionov).
  • Fixed incorrect construction of empty tuples in the array() function. This fixes #84202. #84297 (Amos Bird).
  • Fix LOGICAL_ERROR for queries with parallel replicas and multiple INNER joins followed by RIGHT join. Do not use parallel replicas for such queries. #84299 (Vladimir Cherkasov).
  • Previously, set indexes didn't consider Nullable columns while checking if granules passed the filter (issue #75485). #84305 (Elmi Ahmadov).
  • ClickHouse can now read tables from Glue Catalog whose table type is specified in lower case. #84316 (alesapin).
  • Do not try to substitute table functions with their cluster alternatives in the presence of a JOIN or subquery. #84335 (Konstantin Bogdanov).
  • Fix logger usage in IAccessStorage. #84365 (Konstantin Bogdanov).
  • Fixed a logical error in lightweight updates that update all columns in the table. #84380 (Anton Popov).
  • The DoubleDelta codec can now only be applied to columns of numeric type. In particular, FixedString columns can no longer be compressed using DoubleDelta. (Fixes #80220). #84383 (Jimmy Aguilar Mena).
  • Comparisons against NaN values did not use the correct ranges during MinMax index evaluation. #84386 (Elmi Ahmadov).
  • Fix reading Variant column with lazy materialization. #84400 (Pavel Kruglov).
  • Treat ZOUTOFMEMORY as a hardware error; otherwise it throws a logical error. See https://github.com/clickhouse/clickhouse-core-incidents/issues/877. #84420 (Han Fei).
  • Fixed a server crash when a user created with no_password attempts to log in after the server setting allow_no_password was changed to 0. #84426 (Shankar Iyer).
  • Fix out-of-order writes to Keeper changelog. Previously, we could have in-flight writes to changelog, but rollback could cause concurrent change of the destination file. This would lead to inconsistent logs, and possible data loss. #84434 (Antonio Andelic).
  • If all TTLs are removed from a table, MergeTree now does nothing related to TTL. #84441 (alesapin).
  • Parallel distributed INSERT SELECT with LIMIT was incorrectly allowed, which led to data duplication in the target table. #84477 (Igor Nikonov).
  • Fix pruning files by virtual column in data lakes. #84520 (Kseniia Sumarokova).
  • Fix leaks in Keeper with RocksDB storage (iterators were not destroyed). #84523 (Azat Khuzhin).
  • Fix ALTER MODIFY ORDER BY not validating TTL columns in sorting keys. TTL columns are now properly rejected when used in ORDER BY clauses during ALTER operations, preventing potential table corruption. #84536 (xiaohuanlin).
  • Change pre-25.5 value of allow_experimental_delta_kernel_rs to false for compatibility. #84587 (Kseniia Sumarokova).
  • Stop taking the schema from manifest files; instead, store relevant schemas for each snapshot independently and infer the relevant schema for each data file from its corresponding snapshot. The previous behaviour violated the Iceberg specification for manifest file entries with existing status. #84588 (Daniil Ivanik).
  • Fixed an issue where the Keeper setting rotate_log_storage_interval = 0 would cause ClickHouse to crash (issue #83975). #84637 (George Larionov).
  • Fix logical error from S3Queue "Table is already registered". Closes #84433. Broken after https://github.com/ClickHouse/ClickHouse/pull/83530. #84677 (Kseniia Sumarokova).
  • Lock 'mutex' when getting zookeeper from 'view' in RefreshTask. #84699 (Tuan Pham Anh).
  • Fix CORRUPTED_DATA error when lazy columns are used with external sort. #84738 (János Benjamin Antal).
  • Fix column pruning with delta-kernel in storage DeltaLake. Closes #84543. #84745 (Kseniia Sumarokova).
  • Refresh credentials in delta-kernel in storage DeltaLake. #84751 (Kseniia Sumarokova).
  • Fix starting superfluous internal backups after connection problems. #84755 (Vitaly Baranov).
  • Fixed issue where querying a delayed remote source could result in vector out of bounds. #84820 (George Larionov).
  • The ngram and no_op tokenizers no longer crash the (experimental) text index for empty input tokens. #84849 (Robert Schulze).
  • Fixed lightweight updates for tables with ReplacingMergeTree and CollapsingMergeTree engines. #84851 (Anton Popov).
  • Correctly store all settings in table metadata for tables using object queue engine. #84860 (Antonio Andelic).
  • Fix total watches count returned by Keeper. #84890 (Antonio Andelic).
  • Fixed lightweight updates for tables with ReplicatedMergeTree engine created on servers with a version lower than 25.7. #84933 (Anton Popov).
  • Fixed lightweight updates for tables with non-replicated MergeTree engine after running an ALTER TABLE ... REPLACE PARTITION query. #84941 (Anton Popov).
  • Fixes column name generation for boolean literals to use "true"/"false" instead of "1"/"0", preventing column name conflicts between boolean and integer literals in queries. #84945 (xiaohuanlin).
  • Fix memory tracking drift from background schedule pool and executor. #84946 (Azat Khuzhin).
  • Fix potential inaccurate sorting issues in the Merge table engine. #85025 (Xiaozhe Yu).
  • Implement missing APIs for DiskEncrypted. #85028 (Azat Khuzhin).
  • Add a check if a correlated subquery is used in a distributed context to avoid a crash. Fixes #82205. #85030 (Dmitry Novik).
  • Iceberg no longer tries to cache the relevant snapshot version between SELECT queries and always resolves the snapshot from scratch. The earlier attempt to cache the Iceberg snapshot led to problems when using Iceberg tables with time travel. #85038 (Daniil Ivanik).
  • Fixed double-free in AzureIteratorAsync. #85064 (Nikita Taranov).
  • Improve error message on attempt to create user identified with JWT. #85072 (Konstantin Bogdanov).
  • Fixed cleanup of patch parts in ReplicatedMergeTree. Previously, the result of a lightweight update may temporarily not be visible on the replica until the merged or mutated part that materializes the patch parts is downloaded from another replica. #85121 (Anton Popov).
  • Fix ILLEGAL_TYPE_OF_ARGUMENT in materialized views when types are different. #85135 (Sema Checherinda).
  • Fix segfault in delta-kernel implementation. #85160 (Kseniia Sumarokova).
  • Fix recovering replicated databases when moving the metadata file takes a long time. #85177 (Tuan Pham Anh).
  • Fix Not-ready Set for IN (subquery) inside additional_table_filters expression setting. #85210 (Nikolai Kochetov).
  • Get rid of unnecessary getStatus() calls during SYSTEM DROP REPLICA queries. Fixes the case when a table is dropped in the background, and the Shutdown for storage is called exception is thrown. #85220 (Nikolay Degterinsky).
  • Fix race in DeltaLake engine delta-kernel implementation. #85221 (Kseniia Sumarokova).
  • Fix reading partitioned data with disabled delta-kernel in DeltaLake engine. It was broken in 25.7 (https://github.com/ClickHouse/ClickHouse/pull/81136). #85223 (Kseniia Sumarokova).
  • Added missing table name length checks in CREATE OR REPLACE and RENAME queries. #85326 (Michael Kolupaev).
  • Fix the creation of RMV on a new replica of the Replicated database if DEFINER is dropped. #85327 (Nikolay Degterinsky).
  • Fix iceberg writes for complex types. #85330 (Konstantin Vedernikov).
  • Writing lower and upper bounds is not supported for complex types. #85332 (Konstantin Vedernikov).
  • Fix logical error while reading from object storage functions through a Distributed table or the remote table function. Fixes #84658. Fixes #85173. Fixes #52022. #85359 (alesapin).
  • Fix backup of parts with broken projections. #85362 (Antonio Andelic).
  • Forbid using _part_offset column in projection in releases until it is stabilized. #85372 (Sema Checherinda).
  • Fix crash and data corruption during ALTER UPDATE for JSON. #85383 (Pavel Kruglov).
  • Queries with parallel replicas that use the reverse in-order reading optimization could produce incorrect results. #85406 (Igor Nikonov).
  • Fix possible UB (crashes) in case of MEMORY_LIMIT_EXCEEDED during String deserialization. #85440 (Azat Khuzhin).
  • Fix incorrect metrics KafkaAssignedPartitions and KafkaConsumersWithAssignment. #85494 (Ilya Golshtein).
  • Fixed processed bytes stat being underestimated when PREWHERE (explicit or automatic) is used. #85495 (Michael Kolupaev).
  • Fix early return condition for S3 request rate slowdown: require either s3_slow_all_threads_after_network_error or backup_slow_all_threads_after_retryable_s3_error to be true to enable slowdown behavior when all threads are paused due to a retryable error, instead of requiring both. #85505 (Julia Kartseva).
  • Fix metadata resolution when querying Iceberg tables through a REST catalog. ... #85531 (Saurabh Kumar Ojha).
  • Backported in #86136: (1) LowCardinality for hive columns; (2) fill hive columns before virtual columns (required for https://github.com/ClickHouse/ClickHouse/pull/81040); (3) LOGICAL_ERROR on empty format for hive #85528; (4) fix check for hive partition columns being the only columns; (5) assert all hive columns are specified in the schema; (6) partial fix for parallel_replicas_cluster with hive; (7) use ordered container in extractkeyValuePairs for hive utils (required for https://github.com/ClickHouse/ClickHouse/pull/81040). #85538 (Arthur Passos).
  • Fixed rare crash in asynchronous inserts that change settings log_comment or insert_deduplication_token. #85540 (Anton Popov).
  • Parameters like date_time_input_format were ignored when using HTTP with multipart/form-data. #85570 (Sema Checherinda).
  • Fix secrets masking in icebergS3Cluster and icebergAzureCluster table functions. #85658 (MikhailBurdukov).
  • Fix precision loss in JSONExtract when converting JSON numbers to Decimal types. Now numeric JSON values preserve their exact decimal representation, avoiding floating-point rounding errors. #85665 (ssive7b).
  • Fixed LOGICAL_ERROR when using COMMENT COLUMN IF EXISTS in the same ALTER statement after DROP COLUMN. The IF EXISTS clause now correctly skips the comment operation when the column has been dropped within the same statement. #85688 (xiaohuanlin).
  • Fix reading count from cache for delta lake. #85704 (Kseniia Sumarokova).
  • Fix coalescing merge tree segfault for large strings. This closes #84582. #85709 (Konstantin Vedernikov).
  • Update metadata timestamp in iceberg writes. #85711 (Konstantin Vedernikov).
  • Using distributed_depth as an indicator of a *Cluster function was incorrect and could lead to data duplication; use client_info.collaborate_with_initiator instead. #85734 (Konstantin Bogdanov).
  • Spark can't read position delete files. #85762 (Konstantin Vedernikov).
  • Fix send_logs_source_regexp (after async logging refactoring in #85105). #85797 (Azat Khuzhin).
  • Fix possible inconsistency for dictionaries with update_field on MEMORY_LIMIT_EXCEEDED errors. #85807 (Azat Khuzhin).
  • Support global constants from WITH statement for the parallel distributed INSERT SELECT with the Distributed destination table. Before, the query could throw an Unknown expression identifier error. #85811 (Nikolai Kochetov).
  • Mask credentials for deltaLakeAzure, deltaLakeCluster, icebergS3Cluster and icebergAzureCluster. #85889 (Julian Maicher).
  • Fix logical error on attempt to CREATE ... AS (SELECT * FROM s3Cluster(...)) with DatabaseReplicated. #85904 (Konstantin Bogdanov).
  • Fixes HTTP requests made by the url() table function to properly include port numbers in the Host header when accessing non-standard ports. This resolves authentication failures when using presigned URLs with S3-compatible services like MinIO running on custom ports, which is common in development environments. (Fixes #85898). #85921 (Tom Quist).
  • Unity Catalog now ignores schemas with weird data types for non-Delta tables. Fixes #85699. #85950 (alesapin).
  • Recover the Replicated Database forcefully after restoring the database metadata in Keeper. #85960 (Tuan Pham Anh).
  • Fix nullability of fields in iceberg. #85977 (Konstantin Vedernikov).
  • Fixed a bug in Replicated database recovery: if a table name contains the % symbol, it could re-create the table with a different name during recovery. #85987 (Alexander Tokmakov).
  • Fix backup restores failing due to BACKUP_ENTRY_NOT_FOUND error when restoring an empty Memory table. #86012 (Julia Kartseva).
  • Add checks for sharding_key during ALTER of a Distributed table. Previously, an incorrect ALTER would break the table definition and the server restart. #86015 (Nikolay Degterinsky).
  • Don't create empty Iceberg delete files. #86061 (Konstantin Vedernikov).
  • Backported in #86201: Fix using wrong default values for path with Enum hint inside JSON. #86065 (Pavel Kruglov).
  • Fix large setting values breaking S3Queue tables and replica restart. #86074 (Nikolay Degterinsky).
  • Backported in #86220: Fix logical error during filesystem cache dynamic resize. Closes #86122. Closes https://github.com/ClickHouse/clickhouse-core-incidents/issues/473. #86130 (Kseniia Sumarokova).
  • Backported in #86149: Use NonZeroUInt64 for logs_to_keep in DatabaseReplicatedSettings. #86142 (Tuan Pham Anh).
  • Backported in #86293: Fixed an exception thrown by a FINAL query with a skip index if the table (e.g. ReplacingMergeTree) was created with the setting index_granularity_bytes = 0. #86147 (Shankar Iyer).
  • Backported in #86310: Fix crash in case of const and non-const blocks in one INSERT. #86230 (Azat Khuzhin).
  • Backported in #86320: Fix crash with replaceRegex, a FixedString haystack and an empty needle. #86270 (Raúl Marín).
  • Backported in #86350: Fix crash during ALTER UPDATE Nullable(JSON). #86281 (Pavel Kruglov).
  • Backported in #86328: Fix missing column definer in system.tables. #86295 (Raúl Marín).
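
The url() Host header fix listed above (#85921) concerns a general HTTP convention: the Host header carries the port whenever it differs from the scheme's default, which matters for presigned URLs because the host value is typically part of the signature. The following is a minimal Python sketch of that rule only, not ClickHouse's implementation; the names host_header and DEFAULT_PORTS are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports per scheme (assumption: only http and https are of interest here).
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header(url: str) -> str:
    """Return the Host header value for a URL (hypothetical helper).

    The port is included only when it is explicit and differs from the
    scheme's default port.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals must be bracketed in the Host header.
        host = f"[{host}]"
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        return host
    return f"{host}:{port}"


if __name__ == "__main__":
    # A MinIO-style endpoint on a custom port keeps its port in Host.
    print(host_header("http://localhost:9000/bucket/key"))
    # A standard HTTPS endpoint does not.
    print(host_header("https://example.com/data.csv"))
```

If the port were dropped for the first URL, a server validating a presigned request against `localhost:9000` would see a different host and could reject the request, which matches the authentication failures described in that entry.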

Build/Testing/Packaging Improvement

NO CL ENTRY

NOT FOR CHANGELOG / INSIGNIFICANT