docs/changelogs/v25.7.1.3997-stable.md
unexpected_quoting_character_strategy to the function that controls what happens when a quoting_character is unexpectedly found when reading a non-quoted key or value. The value can be one of: invalid, accept or promote. Invalid will discard the key and go back to the waiting key state. Accept will treat it as part of the key. Promote will discard the previous character and start parsing as a quoted key. I believe the default behavior (and maybe the only behavior) should be INVALID, but in 2023 the PROMOTE logic was implemented in https://github.com/ClickHouse/ClickHouse/pull/56423. So, to make it kind of backwards compatible, I created this setting. 2. After parsing a quoted value, only parse the next key if a pair delimiter is found. #80657 (Arthur Passos).
countMatches would stop counting at the first empty match even if the pattern accepts it. To overcome this issue, countMatches now continues execution by advancing by a single character when an empty match occurs. Users who would like to retain the old behavior can enable the setting count_matches_stop_at_empty_match. #81676 (Elmi Ahmadov).
max_local_read_bandwidth_for_server and max_local_write_bandwidth_for_server) and remote (max_remote_read_network_bandwidth_for_server and max_remote_write_network_bandwidth_for_server) traffic; instead, they were only throttled by dedicated server settings (max_backup_bandwidth_for_server, max_mutations_bandwidth_for_server and max_merges_bandwidth_for_server). Now, they use both types of throttlers simultaneously. #81753 (Sergei Trifonov).
cluster_function_process_archive_on_multiple_nodes, by default equal to true. If set to true, increases performance of processing archives in cluster functions. Should be set to false for compatibility and to avoid errors during upgrade to 25.7+ if using cluster functions with archives on earlier versions.
#82355 (Kseniia Sumarokova).
SYSTEM RESTART REPLICAS query led to the wakeup of tables in the Lazy database, even without access to that database, and it happened while these tables were being concurrently dropped. Note: Now SYSTEM RESTART REPLICAS will only restart replicas in the databases where you have permission to SHOW TABLES, which is natural. #83321 (Alexey Milovidov).
Float32 and Float64. #67161 (Konstantin Vedernikov).
ToTime. #81217 (Yarik Briukhovetskyi).
max_waiting_queries is now supported. It can be used to limit the size of the query queue. If the limit is reached, all subsequent queries will be terminated with the SERVER_OVERLOADED error. #81250 (Oleg Doronin).
financialInternalRateOfReturnExtended (XIRR), financialInternalRateOfReturn (IRR), financialNetPresentValueExtended (XNPV), financialNetPresentValue (NPV). #81599 (Joanna Hulboj).
system.codecs to introspect the available codecs. (issue #81525). #81600 (Jimmy Aguilar Mena).
polygonIntersectsCartesian and polygonIntersectsSpherical to check if two polygons intersect. #81882 (Paul Lamb).
MergeTree-family tables. Lightweight updates can be used by a new syntax: UPDATE <table> SET col1 = val1, col2 = val2, ... WHERE <condition>. Added implementation of lightweight deletes via lightweight updates. It can be enabled by setting lightweight_delete_mode = 'lightweight_update'. #82004 (Anton Popov).
lag and lead window functions. Closes #9887. #82108 (Dmitry Novik).
_part_granule_offset virtual column in MergeTree-family tables. This column indicates the zero-based index of the granule/mark each row belongs to within its data part. This addresses #79572. #82341 (Amos Bird).
insert queries. #82692 (Konstantin Vedernikov).
Float32 and Float64. #83088 (Konstantin Vedernikov).
colorSRGBToOkLCH and colorOkLCHToSRGB for converting colours between the sRGB and OkLCH colour spaces. #83679 (Fgrtue).
searchAny and searchAll which are general purpose tools to search text indexes. #80641 (Elmi Ahmadov).
string tokenizer.
#81752 (Elmi Ahmadov).
text indexes to 64. This improves the expected performance for the average test query in internal benchmarks. #82162 (Jimmy Aguilar Mena).
searchAny, searchAll due to a bug. See #82385. #83117 (Alexey Milovidov).
searchAny or searchAll functions get different text index parameters for the same column. This scenario is only possible when there are two different text indices for the same column. This PR limits a column to be defined only once for the text index, but a wrapper around the column (e.g. lower(column)) is still allowed. #83303 (Elmi Ahmadov).
enable_parallel_blocks_marshalling. It should speed up distributed queries that transfer significant amounts of data between the initiator and remote nodes. #78694 (Nikita Taranov).
merge_tree_min_{rows,bytes}_for_seek in filterPartsByQueryConditionCache to align it with other methods filtering by indexes. #80312 (李扬).
parallel_distributed_insert_select setting. #80425 (Igor Nikonov).
min_joined_block_size_rows (analogous to min_joined_block_size_bytes; defaults to 65409) to control the minimum block size (in rows) for JOIN input and output blocks (if the join algorithm supports it). Small blocks will be squashed. #81886 (Nikita Taranov).
COUNT() function on a NOT NULL column, the aggregation logic is fully inlined during hash table probing. This avoids allocating and maintaining any aggregation state, significantly reducing memory usage and CPU overhead. This partially addresses #81982. #82104 (Amos Bird).
HashJoin optimised by removing the additional loop over hash maps in the typical case of only one key column; also, null_map and join_mask checks are eliminated when they're always true/false. #82308 (Nikita Taranov).
ATTACH PARTITION no longer leads to the dropping of all caches. #82377 (Alexey Milovidov).
CROSS JOIN is not produced if query_plan_correlated_subqueries_use_substitution setting is enabled. #82435 (Dmitry Novik).
EXISTS. #82443 (Dmitry Novik).
parallel_distributed_insert_select setting.
#83040 (Igor Nikonov).
null_map and JoinMask from #82308 were applied to the case of JOIN with multiple disjuncts. Also, the KnownRowsHolder data structure was optimized. #83041 (Nikita Taranov).
std::vector<std::atomic_bool> is used for join flags to avoid calculating a hash on each access to flags. #83043 (Nikita Taranov).
HashJoin uses lazy output mode. It is suboptimal, especially when the number of matches is low. Moreover, we know the exact amount of matches after joining is done, so we can preallocate more precisely. #83304 (Nikita Taranov).
READ and WRITE for sources and deprecates all previous access types related to sources. Before: GRANT S3 ON *.* TO user; now: GRANT READ, WRITE ON S3 TO user. This also allows separating READ and WRITE permissions for sources, e.g.: GRANT READ ON * TO user, GRANT WRITE ON S3 TO user. The feature is controlled by the setting access_control_improvements.enable_read_write_grants and is disabled by default. #73659 (pufit).
moveFile and replaceFile in s3_plain_rewritable to support it as a database disk. #79424 (Tuan Pham Anh).
allow_experimental_join_condition marked as obsolete. #80566 (Vladimir Cherkasov).
MarkCacheEvictedBytes, MarkCacheEvictedMarks, MarkCacheEvictedFiles for tracking evictions from the mark cache. (issue #60989). #80799 (Shivji Kumar Jha).
DeltaLake table engine: delta-kernel-rs has an ExpressionVisitor API, which is implemented in this PR and applied to the partition column expressions transform (it will replace the old way used before in our code, which is deprecated within delta-kernel-rs). In the future, this ExpressionVisitor will also allow implementing statistics-based pruning and some delta-lake proprietary features.
Additionally, the purpose of this change is to support partition pruning in the DeltaLakeCluster table engine (the result of a parsed expression - ActionsDAG - will be serialized and sent from the initiator along with the data path, because this kind of information, which is needed for pruning, is only available as meta information on data files listing, which is done by the initiator only, but it has to be applied to data on each reading server). #81136 (Kseniia Sumarokova).
CREATE USER queries for usernames. #81387 (Diskein).
metadata.json files for Iceberg. Fixes #70874. #81451 (alesapin).
system.formats table now contains extended information about formats, such as HTTP content type, the capabilities of schema inference, etc. #81505 (Alexey Milovidov).
clickhouse-keeper-utils, a new command-line tool for managing and analyzing ClickHouse Keeper data. The tool supports dumping state from snapshots and changelogs, analyzing changelog files, and extracting specific log ranges. #81677 (Antonio Andelic).
max_network_bandwidth_for_all_users and max_network_bandwidth_for_all_users limits are never exceeded. #81729 (Sergei Trifonov).
RENAME COLUMN alter mutation if it will rename some column that is currently affected by an incomplete data mutation. #81823 (Mikhail Artemenko).
s3_slow_all_threads_after_network_error configuration is enabled. #81849 (zoomxi).
addressToSymbol and system.symbols table will use file offsets instead of virtual memory addresses. #81896 (Alexey Milovidov).
abseil-cpp 20250512.0. #81945 (Konstantin Bogdanov).
google-protobuf v31.1. #81976 (Konstantin Bogdanov).
max_local_read_bandwidth_for_server and max_local_write_bandwidth_for_server on the fly without restarting the server. #82083 (Kai Zhu).
system.warnings table using TRUNCATE TABLE system.warnings. #82087 (Vladimir Cherkasov).
AwsNodumpMemoryManager and STL compatible JemallocNodumpSTLAllocator. Both are wrappers of the Jemalloc allocator. They use Jemalloc's extent hooks and madvise to mark memory pages as "don't dump".
Used for S3 credentials, user credentials, and some query data. #82441 (Miсhael Stetsyuk).
{uuid} can now be used in the keeper_path setting of the S3Queue table engine. #82463 (Nikolay Degterinsky).
keeper_server.cleanup_old_and_ignore_new_acl. If enabled, all nodes will have their ACLs cleared while ACL for new requests will be ignored. If the goal is to completely remove ACL from nodes, it's important to leave the config enabled until a new snapshot is created. #82496 (Antonio Andelic).
send_metadata logic related to experimental zero-copy replication. It wasn't ever used and nobody supports this code. Since there were not even any tests related to it, there is a high chance that it was broken a long time ago. #82508 (alesapin).
s3queue_disable_streaming which disables streaming in tables with S3Queue table engine. This setting is changeable without server restart. #82515 (Kseniia Sumarokova).
SYSTEM RESTART REPLICA may fail due to ZooKeeper connection issues. To avoid forgetting about this table, we now retry until the table is created. #82616 (Nikolay Degterinsky).
clickhouse-server without a configuration file will also listen to the PostgreSQL port 9005, like with the default config. #82633 (Alexey Milovidov).
StorageKafka2 to system.kafka_consumers. #82652 (János Benjamin Antal).
(a < 1 and a > 0) or b = 3, by statistics. #82663 (Han Fei).
ReplicatedMergeTree::executeMetadataAlter, we get the StorageID, and without taking DDLGuard, we try to call IDatabase::alterTable. In between, the table in question could technically have been exchanged with another table, so when we get the definition we would get the wrong one. To avoid this, we add a separate check for UUIDs to match when we try to call IDatabase::alterTable. #82666 (Nikolay Degterinsky).
--reconnect option in clickhouse-benchmark. It was changed by mistake in #79465. #82677 (Alexey Milovidov).
nan and inf with NumericIndexedVector. Fixes #82239 and a little more.
#82681 (Raufs Dunamalijevs).
X-ClickHouse-Progress and X-ClickHouse-Summary header formats have been modified to omit zero values. This PR intends to return the previous behaviour for X-ClickHouse-Summary only, because it makes sense. #82727 (Nikita Mikhaylov).
Decimal to Float32. Implement conversion from Decimal to BFloat16. Closes #82660. #82823 (Alexey Milovidov).
CREATE DICTIONARY. Closes #82105. #82829 (Alexey Milovidov).
materialize function. Closes #82828. #82831 (Alexey Milovidov).
clickhouse-server with embedded configuration will allow using the Web UI by providing an HTTP OPTIONS response. #82870 (Alexey Milovidov).
clickhouse-format and in the echo in clickhouse-client, but now it is done in the command prompt as well. #82871 (Alexey Milovidov).
clickhouse-format and in the client's echo will work in the same way as the highlighting in the command line prompt. #82874 (Alexey Milovidov).
zookeeper.path_acls. #82898 (Antonio Andelic).
commit_time, commit_id to system.s3queue_log. #83016 (Kseniia Sumarokova).
AsynchronousMetrics. If the /sys/block directory exists but is not accessible, the server will start without monitoring the block devices. Closes #79229. #83115 (Alexey Milovidov).
TimestampTZ in Glue catalog. This closes #81654. #83132 (Konstantin Vedernikov).
std::exception instead of a meaningful error with a clear explanation. This is now fixed. This fixes: #82889. #83190 (Nikita Mikhaylov).
distributed_ddl_output_mode='*_only_active', don't wait for new or recovered replicas that have replication lag bigger than max_replication_lag_to_enqueue. This should help to avoid "DDL task is not finished on some hosts" when a new replica becomes active after finishing initialization or recovery, but has accumulated a huge replication log while initializing. Also, implement the SYSTEM SYNC DATABASE REPLICA STRICT query that waits for the replication log to become below max_replication_lag_to_enqueue.
#83302 (Alexander Tokmakov).
reinterpret() function now supports conversion to Array(T) where T is a fixed-size data type (issue #82621). #83399 (Shankar Iyer).
getStatus throws an ErrorCodes::ABORTED exception. Previously, this would fail the select query. Now we catch the ErrorCodes::ABORTED exceptions and intentionally ignore them instead. #83435 (Miсhael Stetsyuk).
enable_vector_similarity_index which must be enabled to use the vector similarity index. The existing setting allow_experimental_vector_similarity_index is now obsolete. It still works in case someone needs it. #83459 (Robert Schulze).
UserTimeMicroseconds, SystemTimeMicroseconds, RealTimeMicroseconds) to part_log profile events for MergeParts entries. #83460 (Vladimir Cherkasov).
create_if_not_exists, check_not_exists, remove_recursive feature flags in Keeper by default which enable new types of requests. #83488 (Antonio Andelic).
CLICKHOUSE_HOST environment variable to specify the ClickHouse server host, aligning with existing CLICKHOUSE_USER and CLICKHOUSE_PASSWORD environment variables. This allows for easier configuration without modifying client or configuration files directly. #83659 (Doron David).
clickhouse-server. Resolves #83637. #83749 (Rafael Roquetto).
memory_worker_use_cgroup and cgroups are available) to adjust memory tracker (memory_worker_correct_memory_tracker). #83981 (Azat Khuzhin).
minmax_count_projection. This resolves #77091. #77166 (Amos Bird).
ORDER BY ... LIMIT BY ... LIMIT N, when ORDER BY is executed as a PartialSorting, the counter rows_before_limit_at_least now reflects the number of rows consumed by LIMIT clause instead of number of rows consumed by sorting transform. #78999 (Eduard Karacharov).
loop function when used with the remote function family. Ensure the LIMIT clause is respected in loop(remote(...)). #80299 (Julia Kartseva).
to_utc_timestamp and from_utc_timestamp functions when handling dates before Unix epoch (1970-01-01) and after maximum date (2106-02-07 06:28:15).
Now these functions properly clamp values to epoch start and maximum date respectively. #80498 (Surya Kant Ranjan).
WHERE function(key) IN (...) as if it were WHERE key IN (...). #81255 (Michael Kolupaev).
Aggregator in case of exception during merge. #81450 (Nikita Taranov).
InterpreterInsertQuery::extendQueryLogElemImpl to add backquotes to database and table names when needed (e.g., when names contain special characters like -). #81528 (Ilia Shvyrialkin).
IN execution with transform_null_in=1 with null in the left argument and non-nullable subquery result. #81584 (Pavel Kruglov).
/js. This closes #61890. #81895 (Alexey Milovidov).
MongoDB table engine definitions could include a path component in the host:port argument which was silently ignored. The mongodb integration refuses to load such tables. With this fix we allow loading such tables and ignore the path component if the MongoDB engine has five arguments, using the database name from arguments. Note: The fix is not applied for newly created tables or queries with the mongo table function, as well as for dictionary sources and named collections. #81942 (Vladimir Cherkasov).
Aggregator in case of exception during merge. #82022 (Nikita Taranov).
LOGICAL_ERROR, close #80620. #82056 (Vladimir Cherkasov).
DatabaseReplicated::getClusterImpl. If the first element (or elements) of hosts has id == DROPPED_MARK and there are no other elements for the same shard, the first element of shards will be an empty vector, leading to std::out_of_range. #82093 (Miсhael Stetsyuk).
Not found column error for queries with arrayJoin under WHERE condition and IndexSet. #82113 (Nikolai Kochetov).
map<string, decimal(9, 2)>. Fixes #81301. #82114 (alesapin).
numericIndexedVectorPointwiseAdd, numericIndexedVectorPointwiseSubtract, numericIndexedVectorPointwiseMultiply, numericIndexedVectorPointwiseDivide functions that happened when we applied them to large numbers.
#82165 (Raufs Dunamalijevs).
asynchronous_metrics_update_period_s and asynchronous_heavy_metrics_update_period_s. #82310 (Bharat Nallan).
NULL arguments in CASE function. #82436 (Yarik Briukhovetskyi).
session_timezone overrides (previously, if session_timezone was set to a non-empty value in, e.g., users.xml/client options and to an empty value in the query context, the value from users.xml was used, which is wrong; now the query context always has priority over the global context). #82444 (Azat Khuzhin).
threadpool_writer_pool_size to zero to ensure that server operations don't get stuck. #82532 (Bharat Nallan).
LOGICAL_ERROR during row policy expression analysis for correlated columns. #82618 (Dmitry Novik).
mergeTreeProjection table function when enable_shared_storage_snapshot_in_query = 1. This is for #82634. #82638 (Amos Bird).
use_skip_indexes_if_final_exact_mode implementation (introduced in 25.6) could fail to select a relevant candidate range depending upon MergeTree engine settings / data distribution. That has been resolved now. #82667 (Shankar Iyer).
trim{Left,Right,Both} now support input strings of type "FixedString(N)". For example, SELECT trimBoth(toFixedString('abc', 3), 'ac') now works. #82691 (Robert Schulze).
groupArraySample/groupArrayLast in case of empty elements (deserialization could skip part of the binary if the input was empty, this can lead to corruption during data read and UNKNOWN_PACKET_FROM_SERVER in TCP protocol). This does not affect numbers and date time types. #82763 (Pedro Ferreira).
Memory table, causing the backup restore to fail with BACKUP_ENTRY_NOT_FOUND error. #82791 (Julia Kartseva).
tail_ptr in TransactionLog::removeOldEntries. #82824 (Tuan Pham Anh).
use_skip_indexes_if_final_exact_mode optimization (introduced in 25.6) could fail to select a relevant candidate range depending upon MergeTree engine settings / data distribution. That has been resolved now. #82879 (Shankar Iyer).
Merge. Fixes #82092.
#82950 (Dmitry Novik).
formatDateTime when formatter %f is used together with variable-size formatters (e.g. %M). #83020 (Robert Schulze).
lowCardinalityKeys function. #83118 (Alexey Milovidov).
Not found column X in block. This behaviour is fixed. This fixes: #82784. #83221 (Nikita Mikhaylov).
no_sign_request for S3 client. It can be used to explicitly avoid signing S3 requests. It can also be defined for specific endpoints using endpoint-based settings. #83379 (Antonio Andelic).
TOO_DEEP_SUBQUERIES exception when CTE definition references another table expression with the same name. #83413 (Dmitry Novik).
REVOKE S3 ON system.* revokes S3 permissions for *.*. This fixes #83417. #83420 (pufit).
role_cache_expiration_time_seconds (issue #83374). #83461 (wushap).
IndexUncompressedCacheBytes/IndexUncompressedCacheCells/IndexMarkCacheBytes/IndexMarkCacheFiles metrics (previously they were included in the metric without the Cache prefix). #83730 (Azat Khuzhin).
add_minmax_index_for_numeric_columns=1 or add_minmax_index_for_string_columns=1, the index is later materialized during an ALTER operation, and it prevents the Replicated database from initializing correctly on a new replica. #83751 (Nikolay Degterinsky).
BackgroundSchedulePool shutdown. #83769 (Azat Khuzhin).
view(...) argument of remote table function with enabled analyzer. Fixes #78717. Fixes #79377. #83844 (Dmitry Novik).
MATERIALIZE COLUMN query could lead to unexpected files in checksums.txt and eventually detached data parts. #84007 (alesapin).
lightweight_mutation_projection_mode = 'rebuild' and the user executes a lightweight delete which deletes ALL rows from any block in the table. #84158 (alesapin).
array() function. This fixes #84202. #84297 (Amos Bird).
zoutofmemory hardware error; otherwise it will throw a logical error. See https://github.com/clickhouse/clickhouse-core-incidents/issues/877. #84420 (Han Fei).
allow_experimental_delta_kernel_rs to false for compatibility. #84587 (Kseniia Sumarokova).
fasttest-only.
#82472 (Yakov Olkhovskiy).
libarchive 3.8.1. #82648 (Konstantin Bogdanov).
libxml2 v2.14.4. #82649 (Konstantin Bogdanov).
expat 2.7.1 inside Poco. #82661 (Konstantin Bogdanov).
Dockerfile.ubuntu for clickhouse-server to fit requirements in Docker Official Library. #83039 (Mikhail f. Shiryaev).
curl clickhouse.com. #83463 (Mikhail f. Shiryaev).
busybox binary and install tools in clickhouse/clickhouse-server and official clickhouse images. #83735 (Mikhail f. Shiryaev).
reportBrokenPart"'. #81909 (Azat Khuzhin).
Aggregator::mergeBlocks"'. #81975 (Nikita Mikhaylov).
system.symbols and addressToSymbol"'. #82372 (Azat Khuzhin).
test_keeper_snapshot_on_exit. #81922 (Antonio Andelic).
test_keeper_invalid_digest. #81925 (Antonio Andelic).
La Casa Del Dolor and small fixes. #81998 (Pedro Ferreira).
trace_log creation, as well as a manually calculated sipHash hidden deep in the internals. This fixes both. #82109 (Mikhail f. Shiryaev).
use_legacy_to_time not important. #82195 (Yarik Briukhovetskyi).
use_legacy_to_time. It should be enabled on old instances. #82271 (Nikita Fomichev).
searchAll and searchAny as experimental. #82359 (Robert Schulze).
La Casa Del Dolor. - Experiment killing running mutations. - Fix when creating a view, make sure columns are available when generating expressions. - Add lag and lead window functions. - Cleaned expression generation to be fairer in BuzzHouse. - Added table.* expressions. #82386 (Pedro Ferreira).
03532_crash_in_aggregation_because_of_lost_exception with distr. #82399 (Nikita Taranov).
clickhouse_functions_text library and make search functions part of libdbms. #82442 (Elmi Ahmadov).
ch dig. #82540 (Alexey Milovidov).
docs/changelogs. #82744 (Mikhail f. Shiryaev).
clickhouse-test. #82812 (Alexey Milovidov).
--test-runs. #82937 (Nikita Fomichev).
/metadata during ALTER COMMENT COLUMN queries. So ignore them when creating a new replica. #82952 (Nikolay Degterinsky).
JoinMask. #83097 (Nikita Taranov).
03315_executable_table_function_threads.
#83098 (Alexey Milovidov).
00764_max_query_size_allocation.sh. #83170 (Alexey Milovidov).
ClickHouse-docker-library could cause issues for the freshly created PRs. #83251 (Mikhail f. Shiryaev).
test_keeper_invalid_digest. #83275 (Antonio Andelic).
01606_merge_from_wide_to_compact. #83325 (Alexey Milovidov).
02483_elapsed_time. #83326 (Alexey Milovidov).
ClickHouseVersion in other code modules. #83382 (Mikhail Artemenko).
03100_lwu_31_merge_memory_usage. #83385 (Anton Popov).
--pre-pull command to the integration tests jobs, it was deleted in #73291. #83528 (Mikhail f. Shiryaev).
03223_analyzer_with_cube_fuzz. #83642 (Alexey Milovidov).
max_threads, affected it. #83681 (Alexey Milovidov).
vector_search_postfilter_multiplier + add more tests. #83689 (Shankar Iyer).
WriteBufferValidUTF8. This closes #83514. #83816 (Alexey Milovidov).
max_database_replicated_create_table_thread_pool_size setting is 0 (automatic pool size). #83834 (Alexander Tokmakov).
03100_lwu_24_renames. #83857 (Anton Popov).
MergeTreeBackgroundExecutor. When we need to restart all tables due to connection loss and we wait for background tasks to finish, tables may even get stuck in readonly mode for an hour. But it looks like we don't need this lock for calling cancel. #84311 (Alexander Tokmakov).
repo field in exported CI logs. #84321 (Mikhail f. Shiryaev).
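
The countMatches entry in this changelog (#81676) describes a change in how empty regex matches are handled: the old behavior stopped at the first empty match, while the new one advances by a single character and keeps scanning. The following is a minimal Python sketch of that idea, not the ClickHouse implementation. The function name count_matches and its stop_at_empty_match parameter are hypothetical stand-ins for countMatches and the count_matches_stop_at_empty_match setting, Python's re module stands in for the server's regex engine, and the sketch assumes that empty matches themselves are not counted.

```python
import re


def count_matches(haystack: str, pattern: str, stop_at_empty_match: bool = False) -> int:
    """Count non-overlapping, non-empty matches of pattern in haystack.

    Hypothetical model of the behavior described in the changelog entry;
    assumes empty matches are skipped rather than counted.
    """
    regex = re.compile(pattern)
    count = 0
    pos = 0
    while pos < len(haystack):
        match = regex.search(haystack, pos)
        if match is None:
            break
        if match.end() == match.start():
            # Empty match: the old behavior gives up here.
            if stop_at_empty_match:
                break
            # New behavior: advance by one character so progress is made.
            pos = match.start() + 1
            continue
        count += 1
        pos = match.end()
    return count


if __name__ == "__main__":
    # 'b*' matches the empty string at the first 'a'.
    print(count_matches("aabb", "b*"))                            # new behavior
    print(count_matches("aabb", "b*", stop_at_empty_match=True))  # old behavior
```

With the pattern b* on the input aabb, the old mode stops at the empty match in front of the first a and reports nothing, while the new mode skips past the two a characters and finds the run of b. Patterns that never match the empty string, such as o+, give the same result in both modes.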