2025 Changelog

docs/changelogs/v25.7.1.3997-stable.md

ClickHouse release v25.7.1.3997-stable (28a03cae61b) as compared to v25.7.1.1-new (c5372bebfd3)

Backward Incompatible Change

  • Introduce a new argument unexpected_quoting_character_strategy to extractKeyValuePairs that controls what happens when a quoting_character is unexpectedly found while reading a non-quoted key or value. The value can be one of invalid, accept, or promote: invalid discards the key and goes back to the waiting-for-key state; accept treats the character as part of the key; promote discards the previous character and starts parsing as a quoted key. The author considers invalid the preferable default (and possibly the only sensible behavior), but the promote logic was implemented in 2023 in https://github.com/ClickHouse/ClickHouse/pull/56423, so this argument was added to keep the function roughly backward compatible. In addition, after parsing a quoted value, the next key is parsed only if a pair delimiter is found. #80657 (Arthur Passos).
  • Previously, function countMatches would stop counting at the first empty match even if the pattern accepts it. To overcome this issue, countMatches now continues execution by advancing by a single character when an empty match occurs. Users who want to retain the old behavior can enable the setting count_matches_stop_at_empty_match. #81676 (Elmi Ahmadov).
  • Previously, BACKUP queries, merges and mutations did not use the server-wide throttlers for local (max_local_read_bandwidth_for_server and max_local_write_bandwidth_for_server) and remote (max_remote_read_network_bandwidth_for_server and max_remote_write_network_bandwidth_for_server) traffic; instead, they were only throttled by dedicated server settings (max_backup_bandwidth_for_server, max_mutations_bandwidth_for_server and max_merges_bandwidth_for_server). Now, they use both types of throttlers simultaneously. #81753 (Sergei Trifonov).
  • Forbid the creation of a table without insertable columns. #81835 (Pervakov Grigorii).
  • Cluster functions with archives used to send whole archives to replicas, so reading within an archive could not be parallelized across the cluster (for example, a single archive would be sent as a whole to one replica for processing while all other replicas sat idle, which is inefficient). Added a new setting cluster_function_process_archive_on_multiple_nodes, equal to true by default. If set to true, it increases the performance of processing archives in cluster functions. It should be set to false for compatibility and to avoid errors during an upgrade to 25.7+ if cluster functions with archives are used on earlier versions. #82355 (Kseniia Sumarokova).
  • SYSTEM RESTART REPLICAS query led to the wakeup of tables in the Lazy database, even without access to that database, and it happened while these tables were being concurrently dropped. Note: Now SYSTEM RESTART REPLICAS will only restart replicas in the databases where you have permission to SHOW TABLES, which is natural. #83321 (Alexey Milovidov).
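
The countMatches change above can be illustrated with a small Python sketch. This is not ClickHouse's implementation: the function count_matches and its stop_at_empty_match parameter are hypothetical stand-ins for countMatches and the count_matches_stop_at_empty_match setting, Python's re engine is used instead of ClickHouse's regex engine, and the sketch assumes that empty matches themselves are not counted (the entry does not say).

```python
import re


def count_matches(text: str, pattern: str, stop_at_empty_match: bool = False) -> int:
    """Count non-empty regex matches while scanning left to right (illustrative only)."""
    regex = re.compile(pattern)
    count = 0
    pos = 0
    while pos <= len(text):
        match = regex.search(text, pos)
        if match is None:
            break
        if match.end() > match.start():
            # Non-empty match: count it and continue right after it.
            count += 1
            pos = match.end()
        elif stop_at_empty_match:
            # Old behavior: stop at the first empty match.
            break
        else:
            # New behavior: step over one character and keep scanning.
            pos = match.end() + 1
    return count


print(count_matches("baab", "a*"))                            # prints 1
print(count_matches("baab", "a*", stop_at_empty_match=True))  # prints 0
```

With the pattern a* and the input baab, the first attempt at position 0 yields an empty match. Under the old behavior counting ends there; under the new behavior the scan moves one character forward and finds aa.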

New Feature

  • Add SZ3 as a lossy yet error-bounded compression codec for columns of type Float32 and Float64. #67161 (Konstantin Vedernikov).
  • Support complex types in iceberg schema evolution. #73714 (Konstantin Vedernikov).
  • NumericIndexedVector: new vector data-structure backed by bit-sliced, Roaring-bitmap compression, together with more than 20 functions for building, analysing and point-wise arithmetic. Can cut storage and speed up joins, filters and aggregations on sparse data. Implements #70582 and “Large-Scale Metric Computation in Online Controlled Experiment Platform” paper by T. Xiong and Y. Wang from VLDB 2024. #74193 (FriendLey).
  • Add 'format_schema_source' setting which defines the source of 'format_schema'. #80874 (Tuan Pham Anh).
  • New data types: Time ([H]HH:MM:SS) and Time64 ([H]HH:MM:SS[.fractional]), and some basic cast functions and functions to interact with other data types. Added settings for compatibility with a legacy function ToTime. #81217 (Yarik Briukhovetskyi).
  • The workload setting max_waiting_queries is now supported. It can be used to limit the size of the query queue. If the limit is reached, all subsequent queries will be terminated with the SERVER_OVERLOADED error. #81250 (Oleg Doronin).
  • It's now possible to write USE DATABASE {name}. #81307 (Yarik Briukhovetskyi).
  • Add financial functions: financialInternalRateOfReturnExtended (XIRR), financialInternalRateOfReturn (IRR), financialNetPresentValueExtended (XNPV), financialNetPresentValue (NPV). #81599 (Joanna Hulboj).
  • Added a new system table system.codecs to introspect the available codecs. (issue #81525). #81600 (Jimmy Aguilar Mena).
  • Add the geospatial functions polygonIntersectsCartesian and polygonIntersectsSpherical to check if two polygons intersect. #81882 (Paul Lamb).
  • Added support for lightweight updates for MergeTree-family tables. Lightweight updates can be used by a new syntax: UPDATE <table> SET col1 = val1, col2 = val2, ... WHERE <condition>. Added implementation of lightweight deletes via lightweight updates. It can be enabled by setting lightweight_delete_mode = 'lightweight_update'. #82004 (Anton Popov).
  • Support lag and lead window functions. Closes #9887. #82108 (Dmitry Novik).
  • Support _part_granule_offset virtual column in MergeTree-family tables. This column indicates the zero-based index of the granule/mark each row belongs to within its data part. This addresses #79572. #82341 (Amos Bird).
  • Introduce Iceberg writes for insert queries. #82692 (Konstantin Vedernikov).
  • Add SZ3 as a lossy yet error-bounded compression codec for columns of type Float32 and Float64. #83088 (Konstantin Vedernikov).
  • Add AI-powered SQL generation to ClickHouse client. Users can now generate SQL queries from natural language descriptions by prefixing their query with "??". Supports OpenAI and Anthropic providers with automatic schema discovery. #83314 (Kaushik Iska).
  • Read iceberg data files by field ids. This closes #83065. #83653 (Konstantin Vedernikov).
  • Added SQL functions colorSRGBToOkLCH and colorOkLCHToSRGB for converting colours between the sRGB and OkLCH colour spaces. #83679 (Fgrtue).
  • Backported in #84005: AI-powered SQL generation can now read the API key from the ANTHROPIC_API_KEY and OPENAI_API_KEY environment variables if they are set, which provides a zero-config way to use this feature. #83787 (Kaushik Iska).
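
For readers unfamiliar with the financial functions listed above, here is a minimal Python sketch of the textbook definitions of NPV, XNPV and IRR. It makes no claim about the ClickHouse signatures: the function names npv, xnpv and irr, the 365-day year, the convention that the first cash flow is undiscounted, and the bisection search are all assumptions chosen for illustration.

```python
from datetime import date


def npv(rate: float, cashflows: list) -> float:
    """Textbook NPV with the first cash flow at period 0 (undiscounted)."""
    return sum(cf / (1.0 + rate) ** i for i, cf in enumerate(cashflows))


def xnpv(rate: float, cashflows: list, dates: list) -> float:
    """Textbook XNPV: discount each flow by elapsed days / 365 since the first date."""
    start = dates[0]
    return sum(
        cf / (1.0 + rate) ** ((d - start).days / 365.0)
        for cf, d in zip(cashflows, dates)
    )


def irr(cashflows: list, low: float = -0.99, high: float = 10.0, tol: float = 1e-9) -> float:
    """Find a rate with npv(rate) == 0 by bisection.

    Assumes npv changes sign exactly once between `low` and `high`.
    """
    f_low = npv(low, cashflows)
    for _ in range(200):
        mid = (low + high) / 2.0
        f_mid = npv(mid, cashflows)
        if abs(f_mid) < tol:
            return mid
        if (f_low < 0) == (f_mid < 0):
            low, f_low = mid, f_mid  # root lies in the upper half
        else:
            high = mid               # root lies in the lower half
    return (low + high) / 2.0


flows = [-100.0, 110.0]
print(round(npv(0.10, flows), 6))
print(round(xnpv(0.10, flows, [date(2021, 1, 1), date(2022, 1, 1)]), 6))
print(round(irr(flows), 6))
```

Investing 100 and receiving 110 one period later has an internal rate of return of 10%, so both NPV and XNPV at a 10% rate come out at approximately zero.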

Experimental Feature

  • Added functions searchAny and searchAll which are general purpose tools to search text indexes. #80641 (Elmi Ahmadov).
  • The text index now supports string tokenizer. #81752 (Elmi Ahmadov).
  • Changed the default index granularity value for text indexes to 64. This improves the expected performance for the average test query in internal benchmarks. #82162 (Jimmy Aguilar Mena).
  • The 256-bit bitmap stores the outgoing labels of a state in sorted order, but outgoing states were saved to disk in the order they appear in the hash table. As a result, a label could point to the wrong next state when read from disk. #82783 (Elmi Ahmadov).
  • Currently, the FST tree is saved to disk uncompressed, which can lead to slow performance or higher I/O bandwidth when writing to and reading from disk. #83093 (Elmi Ahmadov).
  • Remove searchAny, searchAll due to a bug. See #82385. #83117 (Alexey Milovidov).
  • The fuzzer recently found an issue in the searchAny and searchAll functions (see #82385). The exception occurs whenever these functions get different text index parameters for the same column, which is only possible when there are two different text indices on the same column. This PR allows a column to be used in only one text index definition, although a wrapper around the column (e.g. lower(column)) is still allowed. #83303 (Elmi Ahmadov).

Performance Improvement

  • Trivial optimization for -If combinator. #78454 (李扬).
  • Introduced an option to offload (de)compression and (de)serialization of blocks into pipeline threads instead of a single thread associated with a network connection. Controlled by the setting enable_parallel_blocks_marshalling. It should speed up distributed queries that transfer significant amounts of data between the initiator and remote nodes. #78694 (Nikita Taranov).
  • Vector search queries using a vector similarity index complete with lower latency due to reduced storage reads and reduced CPU usage. #79103 (Shankar Iyer).
  • Respect merge_tree_min_{rows,bytes}_for_seek in filterPartsByQueryConditionCache to align it with other methods filtering by indexes. #80312 (李扬).
  • Make the pipeline after the TOTALS step multithreaded. #80331 (UnamedRus).
  • Parallel distributed INSERT SELECT is enabled by default in the mode where INSERT SELECT is executed on each shard independently; see the parallel_distributed_insert_select setting. #80425 (Igor Nikonov).
  • Tweak some jemalloc configs to improve performance. #81807 (Antonio Andelic).
  • Fix filter by key for Redis and KeeperMap storages. #81833 (Pervakov Grigorii).
  • Add new setting min_joined_block_size_rows (analogous to min_joined_block_size_bytes; defaults to 65409) to control the minimum block size (in rows) for JOIN input and output blocks (if the join algorithm supports it). Small blocks will be squashed. #81886 (Nikita Taranov).
  • When the aggregation query contains only a single COUNT() function on a NOT NULL column, the aggregation logic is fully inlined during hash table probing. This avoids allocating and maintaining any aggregation state, significantly reducing memory usage and CPU overhead. This partially addresses #81982. #82104 (Amos Bird).
  • Optimize HashJoin performance by removing the additional loop over hash maps in the typical case of only one key column; null_map and join_mask checks are also eliminated when they are always true/false. #82308 (Nikita Taranov).
  • ATTACH PARTITION no longer leads to the dropping of all caches. #82377 (Alexey Milovidov).
  • Optimize the generated plan for correlated subqueries by removing redundant JOIN operations using equivalence classes. If there are equivalent expressions for all correlated columns, CROSS JOIN is not produced if query_plan_correlated_subqueries_use_substitution setting is enabled. #82435 (Dmitry Novik).
  • Read only required columns in correlated subquery when it appears to be an argument of function EXISTS. #82443 (Dmitry Novik).
  • Introduce async logging. #82516 (Raúl Marín).
  • Compress logs and profile events in the native protocol. On clusters with 100+ replicas, uncompressed profile events take 1..10 MB/sec, and the progress bar is sluggish on slow Internet connections. This closes #82533. #82535 (Alexey Milovidov).
  • Try to speed up QueryTreeHash a bit. #82617 (Nikolai Kochetov).
  • Add alignment in the Counter of ProfileEvents to reduce false sharing. #82697 (Jiebin Sun).
  • Parallel distributed INSERT SELECT is enabled by default in the mode where INSERT SELECT is executed on each shard independently; see the parallel_distributed_insert_select setting. #83040 (Igor Nikonov).
  • The optimizations for null_map and JoinMask from #82308 were applied to the case of JOIN with multiple disjuncts. Also, the KnownRowsHolder data structure was optimized. #83041 (Nikita Taranov).
  • Plain std::vector<std::atomic_bool> is used for join flags to avoid calculating a hash on each access to flags. #83043 (Nikita Taranov).
  • Don't pre-allocate memory for result columns beforehand when HashJoin uses lazy output mode. It is suboptimal, especially when the number of matches is low. Moreover, we know the exact amount of matches after joining is done, so we can preallocate more precisely. #83304 (Nikita Taranov).
  • Minimize memory copy in port headers during pipeline construction. Original PR by heymind. #83381 (Raúl Marín).
  • Improve Keeper with rocksdb initial loading. #83390 (Antonio Andelic).
  • Avoid holding the lock while creating storage snapshot data to reduce lock contention with high concurrent load. #83510 (Duc Canh Le).
  • Improved performance of the ProtobufSingle input format by reusing the serializer when no parsing errors occur. #83613 (Eduard Karacharov).
  • Improve the performance of pipeline building. #83631 (Raúl Marín).
  • Optimize MergeTreeReadersChain::getSampleBlock. #83875 (Raúl Marín).
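
The idea behind min_joined_block_size_rows ("small blocks will be squashed") can be sketched in a few lines of Python. This is only an illustration of squashing under stated assumptions, not ClickHouse's code: blocks are modeled as plain lists of rows, and squash_blocks and min_rows are hypothetical names.

```python
from typing import Iterable, Iterator, List


def squash_blocks(blocks: Iterable[List[int]], min_rows: int) -> Iterator[List[int]]:
    """Merge consecutive small blocks until each emitted block has at least min_rows rows."""
    buffer: List[int] = []
    for block in blocks:
        buffer.extend(block)
        if len(buffer) >= min_rows:
            yield buffer
            buffer = []
    if buffer:
        # Flush the remainder, which may be smaller than min_rows.
        yield buffer


small_blocks = [[1], [2, 3], [4, 5, 6, 7], [8]]
print(list(squash_blocks(small_blocks, min_rows=3)))
# prints [[1, 2, 3], [4, 5, 6, 7], [8]]
```

Fewer, larger blocks reduce per-block overhead in downstream processing; the trailing block is emitted as-is so no rows are lost.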

Improvement

  • Introduce two new access types, READ and WRITE, for sources and deprecate all previous access types related to sources. Before: GRANT S3 ON *.* TO user; now: GRANT READ, WRITE ON S3 TO user. This also allows separating READ and WRITE permissions for sources, e.g. GRANT READ ON * TO user, GRANT WRITE ON S3 TO user. The feature is controlled by the setting access_control_improvements.enable_read_write_grants and is disabled by default. #73659 (pufit).
  • Verify the part has consistent checksum.txt file right before committing it. #76625 (Sema Checherinda).
  • Implement methods moveFile and replaceFile in s3_plain_rewritable to support it as a database disk. #79424 (Tuan Pham Anh).
  • Allow backups for PostgreSQL, MySQL & DataLake databases. A backup of such a database would only save the definition and not the data inside of it. #79982 (Nikolay Degterinsky).
  • Support position deletes for Iceberg TableEngine. #80237 (YanghongZhong).
  • Setting allow_experimental_join_condition marked as obsolete. #80566 (Vladimir Cherkasov).
  • Add pressure metrics to ClickHouse async metrics. #80779 (Xander Garbett).
  • Added metrics MarkCacheEvictedBytes, MarkCacheEvictedMarks, MarkCacheEvictedFiles for tracking evictions from the mark cache. (issue #60989). #80799 (Shivji Kumar Jha).
  • Speed up table listing in data catalogs by using asynchronous requests. #81084 (alesapin).
  • Support writing Parquet enum as byte array, as the spec dictates. #81090 (Arthur Passos).
  • An improvement for the DeltaLake table engine: delta-kernel-rs has an ExpressionVisitor API, which this PR implements and applies to the partition column expression transform (replacing the old approach, deprecated within delta-kernel-rs, that our code used before). In the future, ExpressionVisitor will also allow implementing statistics-based pruning and some Delta Lake proprietary features. This change also aims to support partition pruning in the DeltaLakeCluster table engine: the result of a parsed expression (an ActionsDAG) will be serialized and sent from the initiator along with the data path, because the information needed for pruning is only available as metadata during the data file listing, which is done by the initiator only, yet it has to be applied to data on each reading server. #81136 (Kseniia Sumarokova).
  • Try to preserve element names when deriving supertypes for named tuples. #81345 (lgbo).
  • Allow parameters in CREATE USER queries for usernames. #81387 (Diskein).
  • ClickHouse now supports compressed metadata.json files for Iceberg. Fixes #70874. #81451 (alesapin).
  • The system.formats table now contains extended information about formats, such as HTTP content type, the capabilities of schema inference, etc. #81505 (Alexey Milovidov).
  • Count consumed messages manually to avoid depending on previous committed offset in StorageKafka2. #81662 (János Benjamin Antal).
  • Added clickhouse-keeper-utils, a new command-line tool for managing and analyzing ClickHouse Keeper data. The tool supports dumping state from snapshots and changelogs, analyzing changelog files, and extracting specific log ranges. #81677 (Antonio Andelic).
  • The total and per-user network throttlers are never reset, which ensures that the max_network_bandwidth_for_all_users and max_network_bandwidth_for_user limits are never exceeded. #81729 (Sergei Trifonov).
  • Support writing geoparquets as output format. #81784 (Konstantin Vedernikov).
  • Forbid starting a RENAME COLUMN alter mutation if it would rename a column that is currently affected by an incomplete data mutation. #81823 (Mikhail Artemenko).
  • Introduce jitter to the S3 retry mechanism when the s3_slow_all_threads_after_network_error configuration is enabled. #81849 (zoomxi).
  • Try to fix a logical error in the filesystem cache: "Having zero bytes but range is not finished". #81868 (Kseniia Sumarokova).
  • Function addressToSymbol and system.symbols table will use file offsets instead of virtual memory addresses. #81896 (Alexey Milovidov).
  • Use abseil-cpp 20250512.0. #81945 (Konstantin Bogdanov).
  • The Connection header is sent at the end of the headers, when we know whether the connection should be preserved. #81951 (Sema Checherinda).
  • Use google-protobuf v31.1. #81976 (Konstantin Bogdanov).
  • Tune TCP servers queue (64 by default) based on listen_backlog (4096 by default). #82045 (Azat Khuzhin).
  • Add the ability to reload max_local_read_bandwidth_for_server and max_local_write_bandwidth_for_server on the fly without restarting the server. #82083 (Kai Zhu).
  • Add support for clearing all warnings from the system.warnings table using TRUNCATE TABLE system.warnings. #82087 (Vladimir Cherkasov).
  • Fix partition pruning with data lake cluster functions. #82131 (Kseniia Sumarokova).
  • Fix reading partitioned data in the DeltaLakeCluster table function. This PR increases the cluster functions protocol version, which allows sending extra information from the initiator to replicas. This extra information contains the delta-kernel transform expression, which is needed to parse partition columns (and other things in the future, such as generated columns). #82132 (Kseniia Sumarokova).
  • Fix a list of problems that can occur when trying to run integration tests on a local host. #82135 (Oleg Doronin).
  • Database Datalake now throws a more convenient exception. Fixes #81211. #82304 (alesapin).
  • Improve HashJoin::needUsedFlagsForPerRightTableRow, returns false for cross join. #82379 (lgbo).
  • Allow write/read map columns as array of tuples. #82408 (MikhailBurdukov).
  • Allow ALTER UPDATE in JSON and Dynamic columns. #82419 (Pavel Kruglov).
  • List the licenses of rust crates in system.licenses. #82440 (Raúl Marín).
  • Exclude sensitive data from core dumps. Add two allocators: AWS library compatible AwsNodumpMemoryManager and STL compatible JemallocNodumpSTLAllocator. Both are wrappers of the Jemalloc allocator. They use Jemalloc's extent hooks and madvise to mark memory pages as "don't dump". Used for S3 credentials, user credentials, and some query data. #82441 (Miсhael Stetsyuk).
  • Macros like {uuid} can now be used in the keeper_path setting of the S3Queue table engine. #82463 (Nikolay Degterinsky).
  • Keeper improvement: move changelog files between disks in a background thread. Previously, moving a changelog to a different disk would block Keeper globally until the move finished. This led to performance degradation if the move was a long operation (e.g. to an S3 disk). #82485 (Antonio Andelic).
  • Keeper improvement: add new config keeper_server.cleanup_old_and_ignore_new_acl. If enabled, all nodes will have their ACLs cleared while ACL for new requests will be ignored. If the goal is to completely remove ACL from nodes, it's important to leave the config enabled until a new snapshot is created. #82496 (Antonio Andelic).
  • Removed experimental send_metadata logic related to experimental zero-copy replication. It was never used and nobody supports this code. Since there were not even any tests related to it, there is a high chance that it was broken long ago. #82508 (alesapin).
  • Added a new server setting s3queue_disable_streaming which disables streaming in tables with S3Queue table engine. This setting is changeable without server restart. #82515 (Kseniia Sumarokova).
  • Color parentheses in multiple colors for better readability. #82538 (Konstantin Bogdanov).
  • Refactor dynamic resize feature of filesystem cache. Added more logs for introspection. #82556 (Kseniia Sumarokova).
  • SYSTEM RESTART REPLICA may fail due to ZooKeeper connection issues. To avoid forgetting about this table, we now retry until the table is created. #82616 (Nikolay Degterinsky).
  • clickhouse-server without a configuration file will also listen to the PostgreSQL port 9005, like with the default config. #82633 (Alexey Milovidov).
  • Integrate StorageKafka2 to system.kafka_consumers. #82652 (János Benjamin Antal).
  • Estimate complex cnf/dnf, for example, (a < 1 and a > 0) or b = 3, by statistics. #82663 (Han Fei).
  • In ReplicatedMergeTree::executeMetadataAlter, we get the StorageID and, without taking a DDLGuard, try to call IDatabase::alterTable. In between, the table in question could technically have been exchanged with another table, so we would get the wrong definition. To avoid this, we add a separate check that the UUIDs match when we try to call IDatabase::alterTable. #82666 (Nikolay Degterinsky).
  • When attaching a database with a read-only remote disk, manually add table UUIDs into DatabaseCatalog. #82670 (Tuan Pham Anh).
  • Fix the wrong default value for the --reconnect option in clickhouse-benchmark. It was changed by mistake in #79465. #82677 (Alexey Milovidov).
  • Prevent user from using nan and inf with NumericIndexedVector. Fixes #82239 and a little more. #82681 (Raufs Dunamalijevs).
  • After https://github.com/ClickHouse/ClickHouse/pull/73834, the X-ClickHouse-Progress and X-ClickHouse-Summary header formats have been modified to omit zero values. This PR intends to return the previous behaviour for X-ClickHouse-Summary only, because it makes sense. #82727 (Nikita Mikhaylov).
  • Keeper improvement: support specific permissions for world:anyone ACL. #82755 (Antonio Andelic).
  • In previous versions, multiplication of the aggregate function state with IPv4 produced a logical error instead of a proper error code. Closes #82817. #82818 (Alexey Milovidov).
  • Do not allow RENAME COLUMN or DROP COLUMN involving explicitly listed columns to sum in SummingMergeTree. Closes #81836. #82821 (Alexey Milovidov).
  • Improve the precision of conversion from Decimal to Float32. Implement conversion from Decimal to BFloat16. Closes #82660. #82823 (Alexey Milovidov).
  • Fix inconsistent formatting of CREATE DICTIONARY. Closes #82105. #82829 (Alexey Milovidov).
  • Fix inconsistent formatting of TTL when it contains a materialize function. Closes #82828. #82831 (Alexey Milovidov).
  • Fix inconsistent formatting of EXPLAIN AST in a subquery when it contains output options such as INTO OUTFILE. Closes #82826. #82840 (Alexey Milovidov).
  • Fix inconsistent formatting of parenthesized expressions with aliases in the context when no aliases are allowed. Closes #82836. Closes #82837. #82867 (Alexey Milovidov).
  • Scrollbars in the Web UI will look slightly better. #82869 (Alexey Milovidov).
  • clickhouse-server with embedded configuration will allow using the Web UI by providing an HTTP OPTIONS response. #82870 (Alexey Milovidov).
  • Highlight metacharacters in LIKE/REGEXP patterns as you type. We already have it in clickhouse-format and in the echo in clickhouse-client, but now it is done in the command prompt as well. #82871 (Alexey Milovidov).
  • Highlighting in clickhouse-format and in the client's echo will work in the same way as the highlighting in the command line prompt. #82874 (Alexey Milovidov).
  • This PR was reverted. #82884 (Mithun p).
  • Add support for specifying extra Keeper ACL for paths in config. If you want to add extra ACL for a specific path you define it in the config under zookeeper.path_acls. #82898 (Antonio Andelic).
  • Add function to write types into wkb format. #82935 (Konstantin Vedernikov).
  • Now mutations snapshot will be built from the visible parts snapshot. Also mutation counters used in snapshot will be recalculated from the included mutations. #82945 (Mikhail Artemenko).
  • Adds ProfileEvent when Keeper rejects a write due to soft memory limit. #82963 (Xander Garbett).
  • Add columns commit_time, commit_id to system.s3queue_log. #83016 (Kseniia Sumarokova).
  • In some cases, we need to have multiple dimensions to our metrics. For example, counting failed merges or mutations by error codes rather than having a single counter. #83030 (Miсhael Stetsyuk).
  • Consolidate unknown settings warnings in clickhouse client and log them as a summary. #83042 (Bharat Nallan).
  • ClickHouse client now reports the local port when a connection error happens. #83050 (Jianfei Hu).
  • Slightly better error handling in AsynchronousMetrics. If the /sys/block directory exists but is not accessible, the server will start without monitoring the block devices. Closes #79229. #83115 (Alexey Milovidov).
  • Support TimestampTZ in Glue catalog. This closes #81654. #83132 (Konstantin Vedernikov).
  • Shut down SystemLogs after ordinary tables (and before system tables, instead of before ordinary tables). #83134 (Kseniia Sumarokova).
  • Add logs for s3queue shutdown process. #83163 (Kseniia Sumarokova).
  • There was an incorrect dependency check for the INSERT with MVs that have malformed selects and the user might have received an obscure std::exception instead of a meaningful error with a clear explanation. This is now fixed. This fixes: #82889. #83190 (Nikita Mikhaylov).
  • Async logs: Limit the max number of entries that are held in the queue. #83214 (Raúl Marín).
  • This closes #81156. The same "feature" exists in Postgres: https://github.com/ClickHouse/ClickHouse/blob/3470af8f5e8a2b4035f33e769828707430655665/src/Databases/PostgreSQL/DatabasePostgreSQL.cpp#L126. #83298 (Konstantin Vedernikov).
  • Possibility to parse Time and Time64 as MM:SS, M:SS, SS, or S. #83299 (Yarik Briukhovetskyi).
  • When distributed_ddl_output_mode='*_only_active', don't wait for new or recovered replicas whose replication lag is bigger than max_replication_lag_to_enqueue. This should help avoid the DDL task being reported as not finished on some hosts when a new replica becomes active after finishing initialization or recovery but has accumulated a huge replication log while initializing. Also, implement the SYSTEM SYNC DATABASE REPLICA STRICT query, which waits for the replication log to drop below max_replication_lag_to_enqueue. #83302 (Alexander Tokmakov).
  • Do not output too long descriptions of expression actions in exception messages. Closes #83164. #83350 (Alexey Milovidov).
  • Add ability to parse part's prefix and suffix and also check coverage for non constant columns. #83377 (Mikhail Artemenko).
  • The reinterpret() function now supports conversion to Array(T) where T is a fixed-size data type (issue #82621). #83399 (Shankar Iyer).
  • Unify parameter names in ODBC and JDBC when using named collections. #83410 (Andrey Zvonov).
  • When the storage is shutting down, getStatus throws an ErrorCodes::ABORTED exception. Previously, this would fail the select query. Now we catch the ErrorCodes::ABORTED exceptions and intentionally ignore them instead. #83435 (Miсhael Stetsyuk).
  • Introduced setting enable_vector_similarity_index which must be enabled to use the vector similarity index. The existing setting allow_experimental_vector_similarity_index is now obsolete. It still works in case someone needs it. #83459 (Robert Schulze).
  • Add process resource metrics (such as UserTimeMicroseconds, SystemTimeMicroseconds, RealTimeMicroseconds) to part_log profile events for MergeParts entries. #83460 (Vladimir Cherkasov).
  • Enable create_if_not_exists, check_not_exists, remove_recursive feature flags in Keeper by default which enable new types of requests. #83488 (Antonio Andelic).
  • Shut down S3(Azure/etc)Queue streaming before shutting down any tables on server shutdown. #83530 (Kseniia Sumarokova).
  • Enable Date/Date32 as integers in JSON input formats. #83597 (MikhailBurdukov).
  • Added support for the CLICKHOUSE_HOST environment variable to specify the ClickHouse server host, aligning with existing CLICKHOUSE_USER and CLICKHOUSE_PASSWORD environment variables. This allows for easier configuration without modifying client or configuration files directly. #83659 (Doron David).
  • Made exception messages for certain situations for loading and adding projections easier to read. #83728 (Robert Schulze).
  • Introduce a configuration option to skip binary checksum integrity checks for clickhouse-server. Resolves #83637. #83749 (Rafael Roquetto).
  • Backported in #83994: Fix compatibility for cluster_function_process_archive_on_multiple_nodes. #83968 (Kseniia Sumarokova).
  • Backported in #84039: Use information from cgroup (if applicable, i.e. memory_worker_use_cgroup and cgroups are available) to adjust memory tracker (memory_worker_correct_memory_tracker). #83981 (Azat Khuzhin).
  • Backported in #84599: S3Queue ordered mode fix: quit earlier if shutdown was called. #84463 (Kseniia Sumarokova).
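
The S3 retry jitter entry above does not say which jitter scheme is used. As a hedged illustration of the general technique, the Python sketch below computes exponential backoff delays with "full jitter" (a uniformly random delay between zero and the current backoff ceiling). The function backoff_delays and the parameters base_ms and cap_ms are hypothetical and do not correspond to ClickHouse settings.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    base_ms: float = 100.0,
    cap_ms: float = 5000.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one randomized delay (in ms) per retry attempt, using full jitter."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(attempts):
        # Exponential ceiling, limited by the cap.
        ceiling = min(cap_ms, base_ms * 2 ** attempt)
        # Full jitter: pick uniformly in [0, ceiling] so that threads which
        # failed at the same moment do not all retry at the same moment.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


for i, delay in enumerate(backoff_delays(5, rng=random.Random(0))):
    print(f"attempt {i}: sleep {delay:.1f} ms")
```

Without jitter, threads slowed down by the same network error would wake up in lockstep and hit the service together again; randomizing the delay spreads those retries out.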

Bug Fix (user-visible misbehavior in an official stable release)

  • Recalculate the min-max index when TTL reduces rows to ensure the correctness of algorithms relying on it, such as minmax_count_projection. This resolves #77091. #77166 (Amos Bird).
  • For queries with combination of ORDER BY ... LIMIT BY ... LIMIT N, when ORDER BY is executed as a PartialSorting, the counter rows_before_limit_at_least now reflects the number of rows consumed by LIMIT clause instead of number of rows consumed by sorting transform. #78999 (Eduard Karacharov).
  • Fix excessive granule skipping for filtering over token/ngram indexes with regexp which contains alternation and non-literal first alternative. #79373 (Eduard Karacharov).
  • Fix logical error with <=> operator and Join storage, now query returns proper error code. #80165 (Vladimir Cherkasov).
  • Fix a crash in the loop function when used with the remote function family. Ensure the LIMIT clause is respected in loop(remote(...)). #80299 (Julia Kartseva).
  • Fix incorrect behavior of to_utc_timestamp and from_utc_timestamp functions when handling dates before Unix epoch (1970-01-01) and after maximum date (2106-02-07 06:28:15). Now these functions properly clamp values to epoch start and maximum date respectively. #80498 (Surya Kant Ranjan).
  • For some queries executed with parallel replicas, the read-in-order optimization(s) could be applied on the initiator but not on remote nodes. This led to different reading modes being used by the parallel replicas coordinator (on the initiator) and on remote nodes, which is a logical error. #80652 (Igor Nikonov).
  • Fix logical error during materialize projection when column type was changed to Nullable. #80741 (Pavel Kruglov).
  • Fix incorrect TTL recalculation in TTL GROUP BY when updating TTL. #81222 (Evgeniy Ulasik).
  • Fixed Parquet bloom filter incorrectly applying condition like WHERE function(key) IN (...) as if it were WHERE key IN (...). #81255 (Michael Kolupaev).
  • Fixed possible crash in Aggregator in case of exception during merge. #81450 (Nikita Taranov).
  • Fixed InterpreterInsertQuery::extendQueryLogElemImpl to add backquotes to database and table names when needed (e.g., when names contain special characters like -). #81528 (Ilia Shvyrialkin).
  • Fix IN execution with transform_null_in=1 with null in the left argument and non-nullable subquery result. #81584 (Pavel Kruglov).
  • Don't validate experimental/suspicious types in default/materialize expression execution during reading from existing table. #81618 (Pavel Kruglov).
  • Fix "Context has expired" during merges when dict used in TTL expression. #81690 (Azat Khuzhin).
  • This PR might close #80742. #81722 (zoomxi).
  • Fix the issue where required columns are not read during scalar correlated subquery processing. Fixes #81716. #81805 (Dmitry Novik).
  • Someone littered our code with Kusto. Cleaned it up. This closes #81643. #81885 (Alexey Milovidov).
  • In previous versions, the server returned excessive content for requests to /js. This closes #61890. #81895 (Alexey Milovidov).
  • Previously, MongoDB table engine definitions could include a path component in the host:port argument which was silently ignored. The mongodb integration refuses to load such tables. With this fix we allow loading such tables and ignore path component if MongoDB engine has five arguments, using the database name from arguments. Note: The fix is not applied for newly created tables or queries with mongo table function, as well as for dictionary sources and named collections. #81942 (Vladimir Cherkasov).
  • Fixed possible crash in Aggregator in case of exception during merge. #82022 (Nikita Taranov).
  • Fix filter analysis when only a constant alias column is used in the query. Fixes #79448. #82037 (Dmitry Novik).
  • Fix LOGICAL_ERROR and following crash when using the same column in the TTL for GROUP BY and SET. #82054 (Pablo Marcos).
  • Fix S3 table function argument validation in secret masking, preventing possible LOGICAL_ERROR, close #80620. #82056 (Vladimir Cherkasov).
  • Fix data races in Iceberg. #82088 (Azat Khuzhin).
  • Fix DatabaseReplicated::getClusterImpl. If the first element (or elements) of hosts has id == DROPPED_MARK and there are no other elements for the same shard, the first element of shards will be an empty vector, leading to std::out_of_range. #82093 (Miсhael Stetsyuk).
  • Fixing copy-paste error in arraySimilarity, disallowing the use of UInt32 and Int32 weights. Update tests and docs. #82103 (Mikhail f. Shiryaev).
  • Fix the Not found column error for queries with arrayJoin under WHERE condition and IndexSet. #82113 (Nikolai Kochetov).
  • Fix bug in Glue catalog integration. Now ClickHouse can read tables with nested data types where some subcolumns contain decimals, for example: map<string, decimal(9, 2)>. Fixes #81301. #82114 (alesapin).
  • Fix performance degradation in SummingMergeTree that was introduced in 25.5 in https://github.com/ClickHouse/ClickHouse/pull/79051. #82130 (Pavel Kruglov).
  • When settings are passed over the URI, the last value is used. #82137 (Sema Checherinda).
  • Fix "Context has expired" for Iceberg. #82146 (Azat Khuzhin).
  • Fix possible deadlock for remote queries when server is under memory pressure. #82160 (Kirill).
  • Fixes overflow in numericIndexedVectorPointwiseAdd, numericIndexedVectorPointwiseSubtract, numericIndexedVectorPointwiseMultiply, numericIndexedVectorPointwiseDivide functions that happened when we applied them to large numbers. #82165 (Raufs Dunamalijevs).
  • Backported in #84398: Fix rollback of Dynamic column on parsing failure. #82169 (Pavel Kruglov).
  • Fix a bug in table dependencies causing Materialized Views to miss INSERT queries. #82222 (Nikolay Degterinsky).
  • Fix possible data-race between suggestion thread and main client thread. #82233 (Azat Khuzhin).
  • Now ClickHouse can read iceberg tables from Glue catalog after schema evolution. Fixes #81272. #82301 (alesapin).
  • Fix the validation of async metrics settings asynchronous_metrics_update_period_s and asynchronous_heavy_metrics_update_period_s. #82310 (Bharat Nallan).
  • Fix logical error when resolving matcher in query with multiple JOINs, close #81969. #82421 (Vladimir Cherkasov).
  • Add expiration to AWS ECS token so it can be reloaded. #82422 (Konstantin Bogdanov).
  • Fixes a bug for NULL arguments in CASE function. #82436 (Yarik Briukhovetskyi).
  • Fix data races in the client (by not using the global context) and fix session_timezone overrides. Previously, if session_timezone was set to a non-empty value in, e.g., users.xml or client options and to an empty value in the query context, the value from users.xml was used, which is wrong. Now the query context always takes priority over the global context. #82444 (Azat Khuzhin).
  • Fix disabling boundary alignment for cached buffer in external table engines. It was broken in https://github.com/ClickHouse/ClickHouse/pull/81868. #82493 (Kseniia Sumarokova).
  • Fix the crash if key-value storage is joined with a type-casted key. #82497 (Pervakov Grigorii).
  • Fix hiding named collection values in logs/query_log. Closes #82405. #82510 (Kseniia Sumarokova).
  • Fix a possible crash in logging while terminating a session as the user_id might sometimes be empty. #82513 (Bharat Nallan).
  • Fixes cases where parsing of Time could cause msan issues. This fixes: #82477. #82514 (Yarik Briukhovetskyi).
  • Disallow setting threadpool_writer_pool_size to zero to ensure that server operations don't get stuck. #82532 (Bharat Nallan).
  • Fix LOGICAL_ERROR during row policy expression analysis for correlated columns. #82618 (Dmitry Novik).
  • Fix incorrect usage of parent metadata in mergeTreeProjection table function when enable_shared_storage_snapshot_in_query = 1. This is for #82634. #82638 (Amos Bird).
  • Setting use_skip_indexes_if_final_exact_mode implementation (introduced in 25.6) could fail to select a relevant candidate range depending upon MergeTree engine settings / data distribution. That has been resolved now. #82667 (Shankar Iyer).
  • Functions trim{Left,Right,Both} now support input strings of type "FixedString(N)". For example, SELECT trimBoth(toFixedString('abc', 3), 'ac') now works. #82691 (Robert Schulze).
  • In AzureBlobStorage, native copy compares authentication methods. If an exception occurs during this comparison, the code now falls back to read and copy (i.e. non-native copy). #82693 (Smita Kulkarni).
  • Fix deserialization of groupArraySample/groupArrayLast in case of empty elements (deserialization could skip part of the binary if the input was empty, this can lead to corruption during data read and UNKNOWN_PACKET_FROM_SERVER in TCP protocol). This does not affect numbers and date time types. #82763 (Pedro Ferreira).
  • Fix backup of an empty Memory table, which caused the backup restore to fail with a BACKUP_ENTRY_NOT_FOUND error. #82791 (Julia Kartseva).
  • Fix exception safety in union/intersect/except_default_mode rewrite. Closes #82664. #82820 (Alexey Milovidov).
  • Keep track of the number of async tables loading jobs. If there are some running jobs, do not update tail_ptr in TransactionLog::removeOldEntries. #82824 (Tuan Pham Anh).
  • Fix data races in Iceberg. #82841 (Azat Khuzhin).
  • Setting use_skip_indexes_if_final_exact_mode optimization (introduced in 25.6) could fail to select a relevant candidate range depending upon MergeTree engine settings / data distribution. That has been resolved now. #82879 (Shankar Iyer).
  • Set salt for auth data when parsing from AST with type SCRAM_SHA256_PASSWORD. #82888 (Tuan Pham Anh).
  • When using a non-caching Database implementation, the metadata of the corresponding table is deleted after the columns are returned and the reference is invalidated. #82939 (buyval01).
  • Fix filter modification for queries with a JOIN expression with a table with storage Merge. Fixes #82092. #82950 (Dmitry Novik).
  • Fix LOGICAL_ERROR in QueryMetricLog: Mutex cannot be NULL. #82979 (Pablo Marcos).
  • Fixed incorrect output of function formatDateTime when formatter %f is used together with variable-size formatters (e.g. %M). #83020 (Robert Schulze).
  • Fix performance degradation with the enabled analyzer when secondary queries always read all columns from the VIEWs. Fixes #81718. #83036 (Dmitry Novik).
  • Fix misleading error message when restoring a backup on a read-only disk. #83051 (Julia Kartseva).
  • Do not check for cyclic dependencies on create table with no dependencies. It fixes performance degradation of the use cases with creation of thousands of tables that was introduced in https://github.com/ClickHouse/ClickHouse/pull/65405. #83077 (Pavel Kruglov).
  • Fix an issue with implicit reading of negative Time values into the table and make the docs less confusing. #83091 (Yarik Briukhovetskyi).
  • Do not use unrelated parts of a shared dictionary in the lowCardinalityKeys function. #83118 (Alexey Milovidov).
  • After https://github.com/ClickHouse/ClickHouse/pull/79963, the usage of subcolumns in Materialized Views was broken and users might have received the error Not found column X in block. This behaviour is fixed. This fixes: #82784. #83221 (Nikita Mikhaylov).
  • Fix crash in client due to connection left in disconnected state after bad INSERT. #83253 (Azat Khuzhin).
  • Fix crash when calculating the size of a block with empty columns. #83271 (Raúl Marín).
  • Fix possible crash in Variant type in UNION. #83295 (Pavel Kruglov).
  • Fix LOGICAL_ERROR in clickhouse-local for unsupported SYSTEM queries. #83333 (Surya Kant Ranjan).
  • Fix no_sign_request for S3 client. It can be used to explicitly avoid signing S3 requests. It can also be defined for specific endpoints using endpoint-based settings. #83379 (Antonio Andelic).
  • Fixes a crash that may happen for a query with a setting 'max_threads=1' when executed under load with CPU scheduling enabled. #83387 (Fan Ziqi).
  • Fix TOO_DEEP_SUBQUERIES exception when CTE definition references another table expression with the same name. #83413 (Dmitry Novik).
  • Fix incorrect behavior where executing REVOKE S3 ON system.* revoked S3 permissions for *.*. This fixes #83417. #83420 (pufit).
  • Do not share async_read_counters between queries. #83423 (Azat Khuzhin).
  • This PR addresses issue #81401 by disabling parallel replicas when a subquery contains FINAL. #83455 (zoomxi).
  • Resolve minor integer overflow in configuration of setting role_cache_expiration_time_seconds (issue #83374). #83461 (wushap).
  • Fix a bug introduced in https://github.com/ClickHouse/ClickHouse/pull/79963. When inserting into an MV with a definer, the permission check should use the definer's grants. This fixes #79951. #83502 (pufit).
  • Disable bounds-based file pruning for iceberg array element and iceberg map values, including all their nested subfields. #83520 (Daniil Ivanik).
  • Fix possible file cache not initialized errors when it's used as a temporary data storage. #83539 (Bharat Nallan).
  • Keeper fix: update total watch count correctly when ephemeral nodes are deleted on session close. #83583 (Antonio Andelic).
  • Fix incorrect memory around max_untracked_memory. #83607 (Azat Khuzhin).
  • INSERT SELECT with UNION ALL could lead to a null pointer dereference in a corner case. This closes #83618. #83643 (Alexey Milovidov).
  • Backported in #83899: Skip unavailable nodes during INSERT SELECT from s3Cluster() into replicated MergeTree. #83676 (Igor Nikonov).
  • Disallow zero value for max_insert_block_size as it could cause logical error. #83688 (Bharat Nallan).
  • Fix endless loop in estimateCompressionRatio() with block_size_bytes=0. #83704 (Azat Khuzhin).
  • Backported in #84111: Mask Avro schema registry authentication details so that they are not visible to the user or in logs. #83713 (János Benjamin Antal).
  • Fix IndexUncompressedCacheBytes/IndexUncompressedCacheCells/IndexMarkCacheBytes/IndexMarkCacheFiles metrics (previously they were included in the metric without the Cache prefix). #83730 (Azat Khuzhin).
  • Backported in #83919: Fix the issue where, if a MergeTree table is created with add_minmax_index_for_numeric_columns=1 or add_minmax_index_for_string_columns=1, the index is later materialized during an ALTER operation, and it prevents the Replicated database from initializing correctly on a new replica. #83751 (Nikolay Degterinsky).
  • Fix possible abort (due to joining threads from the task) and, hopefully, hangs (in unit tests) during BackgroundSchedulePool shutdown. #83769 (Azat Khuzhin).
  • Introduce backward compatibility setting to allow new analyzer to reference outer alias in WITH clause in the case of name clashes. Fixes #82700. #83797 (Dmitry Novik).
  • Backported in #84060: When restoring from backup, the definer user may not be backed up, which will cause the whole backup to be broken. To fix this, we postpone the permissions check on the target table's creation during restore and only check it during runtime. #83818 (pufit).
  • Fix deadlock on shutdown due to recursive context locking during library bridge cleanup. #83824 (Azat Khuzhin).
  • Backported in #84226: Allow referencing any table in view(...) argument of remote table function with enabled analyzer. Fixes #78717. Fixes #79377. #83844 (Dmitry Novik).
  • Backported in #83952: The onProgress call in JSONEachRowWithProgress is synchronized with finalization. #83879 (Sema Checherinda).
  • Backported in #83966: Fix colorSRGBToOKLCH/colorOKLCHToSRGB for mix of const and non-const args. #83906 (Azat Khuzhin).
  • Backported in #84131: Fix rare bug when MATERIALIZE COLUMN query could lead to unexpected files in checksums.txt and eventually detached data parts. #84007 (alesapin).
  • Backported in #84342: Fix crash with clickhouse client when used in interactive mode with syntax highlighting. #84025 (Bharat Nallan).
  • Backported in #84274: Fixed wrong results when the query condition cache is used in conjunction with recursive CTEs (issue #81506). #84026 (zhongyuankai).
  • Backported in #84462: Fix filter merging into JOIN condition in cases when equality operands have different types or they reference constants. Fixes #83432. #84145 (Dmitry Novik).
  • Backported in #84290: Fix rare ClickHouse crash when a table has a projection, lightweight_mutation_projection_mode = 'rebuild', and the user executes a lightweight delete which deletes ALL rows from any block in the table. #84158 (alesapin).
  • Backported in #84554: Fix deadlock caused by background cancellation checker thread. #84203 (Antonio Andelic).
  • Backported in #84495: Fixed incorrect construction of empty tuples in the array() function. This fixes #84202. #84297 (Amos Bird).
  • Backported in #84507: Fixed a logical error in lightweight updates that update all columns in the table. #84380 (Anton Popov).
  • Backported in #84455: Treat ZOUTOFMEMORY as a hardware error; otherwise it throws a logical error. See https://github.com/clickhouse/clickhouse-core-incidents/issues/877. #84420 (Han Fei).
  • Backported in #84654: Fix out-of-order writes to Keeper changelog. Previously, we could have in-flight writes to changelog, but rollback could cause concurrent change of the destination file. This would lead to inconsistent logs, and possible data loss. #84434 (Antonio Andelic).
  • Backported in #84533: Now, if all TTLs are removed from a table, MergeTree does nothing related to TTL. #84441 (alesapin).
  • Backported in #84636: Parallel distributed INSERT SELECT with LIMIT was allowed, which is incorrect because it leads to data duplication in the target table. #84477 (Igor Nikonov).
  • Backported in #84571: Fix pruning files by virtual column in data lakes. #84520 (Kseniia Sumarokova).
  • Backported in #84634: Change pre-25.5 value of allow_experimental_delta_kernel_rs to false for compatibility. #84587 (Kseniia Sumarokova).
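
To illustrate the clamping described in the to_utc_timestamp / from_utc_timestamp entry above (#80498), here is a minimal Python sketch. It is not ClickHouse's implementation: the helper name clamp_to_supported_range and the two constants are hypothetical, and the bounds are simply the ones stated in that entry (1970-01-01 and 2106-02-07 06:28:15), assumed here to be in UTC.

```python
from datetime import datetime, timezone

# Bounds quoted in the changelog entry, assumed to be UTC for this sketch.
EPOCH_START = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_DATETIME = datetime(2106, 2, 7, 6, 28, 15, tzinfo=timezone.utc)


def clamp_to_supported_range(value: datetime) -> datetime:
    """Clamp a timezone-aware datetime into [EPOCH_START, MAX_DATETIME].

    Hypothetical helper for illustration only; values inside the range
    are returned unchanged.
    """
    return min(max(value, EPOCH_START), MAX_DATETIME)


before_epoch = datetime(1969, 12, 31, 23, 0, 0, tzinfo=timezone.utc)
after_max = datetime(2200, 1, 1, tzinfo=timezone.utc)
in_range = datetime(2025, 7, 1, 12, 0, 0, tzinfo=timezone.utc)

print(clamp_to_supported_range(before_epoch))
print(clamp_to_supported_range(after_max))
print(clamp_to_supported_range(in_range))
```

The upper bound corresponds to the largest second count that fits in an unsigned 32-bit integer (2^32 - 1 seconds after the epoch), which is why the entry names that exact instant.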

Build/Testing/Packaging Improvement

NO CL CATEGORY

NO CL ENTRY

NOT FOR CHANGELOG / INSIGNIFICANT