What's new in the v2.21 release series

<ul class="nav nav-tabs-alt nav-tabs-yb"> <li > <a href="../v2.21/" class="nav-link active"> <span>YugabyteDB</span> </a> </li> <li > <a href="../v2.21-anywhere/" class="nav-link"> <span>YugabyteDB Anywhere</span> </a> </li> </ul>

Release announcements

{{<warning title="Changes to supported operating systems">}} YugabyteDB 2.21.0.0 and newer releases do not support v7 Linux versions (CentOS7, Red Hat Enterprise Linux 7, Oracle Enterprise Linux 7.x), Amazon Linux 2, and Ubuntu 18. If you're currently using one of these Linux versions, upgrade to a supported OS version before installing YugabyteDB v2.21. Refer to Operating system support for the complete list of supported operating systems. {{</warning>}}

v2.21.1.0 - June 13, 2024 {#v2.21.1.0}

Build: 2.21.1.0-b271

Downloads

<ul class="nav yb-pills"> <li> <a href="https://software.yugabyte.com/releases/2.21.1.0/yugabyte-2.21.1.0-b271-darwin-x86_64.tar.gz"> <i class="fa-brands fa-apple"></i> <span>macOS</span> </a> </li> <li> <a href="https://software.yugabyte.com/releases/2.21.1.0/yugabyte-2.21.1.0-b271-linux-x86_64.tar.gz"> <i class="fa-brands fa-linux"></i> <span>Linux x86</span> </a> </li> <li> <a href="https://software.yugabyte.com/releases/2.21.1.0/yugabyte-2.21.1.0-b271-el8-aarch64.tar.gz"> <i class="fa-brands fa-linux"></i> <span>Linux ARM</span> </a> </li> </ul>

Docker:

```sh
docker pull yugabytedb/yugabyte:2.21.1.0-b271
```
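
The tarball links in the Downloads list share one URL pattern. The following is a minimal sketch of that pattern, useful when scripting downloads; the function name `yb_tarball_url` is hypothetical, and the pattern is inferred only from the three links for this build, so check it against the actual link before relying on it for other releases.

```sh
# Build a tarball URL following the pattern seen in the download links.
# Arguments: release version, build tag, platform suffix.
yb_tarball_url() {
  version="$1"   # for example 2.21.1.0
  build="$2"     # for example b271
  platform="$3"  # darwin-x86_64, linux-x86_64, or el8-aarch64
  printf 'https://software.yugabyte.com/releases/%s/yugabyte-%s-%s-%s.tar.gz\n' \
    "$version" "$version" "$build" "$platform"
}

# Print the Linux x86 URL for this release.
yb_tarball_url 2.21.1.0 b271 linux-x86_64
```

The printed URL can be passed to a download tool of your choice.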

New features

  • Bitmap scan support. Combine multiple indexes for more efficient scans. {{<tags/feature/tp>}}
  • Active Session History. Get real-time and historical information about active sessions to analyze and troubleshoot performance issues. {{<tags/feature/tp>}}

Change log

<details> <summary>View the detailed changelog</summary>

Improvements

YSQL

  • Enhances logging for DDL transaction conflicts and PG catalog version mismatches by including the DDL command tag and specific log details outside of the log_ysql_catalog_versions gflag. {{<issue 20084>}}
  • Adds a webserver with a Prometheus endpoint on ysql_bench for resilience and scale documentation. {{<issue 19667>}}
  • Changes temporary namespace naming in YB from pg_temp_<backend_id> to pg_temp_<tserver-uuid>_<backend_id>, making the names unique across nodes and preventing temporary tables from being overwritten or deleted. {{<issue 19255>}}
  • Enhances ADD/DROP PK using a new table rewrite approach, preserving PG level metadata during rewrite operations and enabling concurrent DML abortions to manage schema version mismatches. {{<issue 17130>}}
  • Introduces new flags in yb_backup.py: backup_tablespaces, restore_tablespaces, and a redefined use_tablespaces, enhancing backup and restore procedures. {{<issue 20389>}}
  • Refines VACUUM warning messages: removes the beta feature label and makes it clear that garbage collection of dead tuples is automatic. {{<issue 18330>}}
  • Allows BNL (Block Nested Loop) joins on different integer types, such as int2 and int4, promoting flexibility in join-orderings. {{<issue 20715>}}
  • Adds a function to log the memory contexts of a specific backend process, helping debug local memory bloat issues. {{<issue 14025>}}
  • Treats REFRESH MATERIALIZED VIEW as a non-disruptive change, preventing unnecessary transaction terminations. The default option, REFRESH MATERIALIZED VIEW NONCONCURRENTLY, modifies metadata but without making a disruptive alteration. {{<issue 20420>}}
  • Resolves ToString function issue which caused non-const references of std/boost::optional objects to display as pointers. {{<issue 20719>}}
  • Renames EXPLAIN fields: Remote Filter to Storage Filter, Remote SQL to Storage SQL, and Remote Index Filter to Storage Index Filter. {{<issue 14503>}}
  • Redefines the ToString function order to prevent compilation failures when used for collections with std::optional. {{<issue 20887>}}
  • Introduces new yb_backup.py flags including backup_roles, restore_roles, ignore_existing_roles, and use_roles with distinct semantics for enhanced role management during backup and restore operations. {{<issue 20972>}}
  • Reduces stack trace duplication in yb_debug_report_error_stacktrace and refines its debugging functionality. {{<issue 21017>}}
  • Makes the yb_prefer_bnl flag take precedence over a NestLoop hint, ensuring smoother upgrades. {{<issue 21129>}}
  • Enables easier alteration of the ysql_enable_db_catalog_version_mode gflag default value through a new framework. {{<issue 21127>}}
  • Removes unnecessary epoch parameter from LaunchTS and changes epoch back to term in YsqlBackendsManager. {{<issue 21217>}}
  • Introduces a new flag master_ts_ysql_catalog_lease_ms to decrease the lease period to 10 seconds, reducing the waiting time for unresponsive tservers. {{<issue 21249>}}
  • Refines ALTER TYPE functionality to use the table rewrite approach, enhancing database upgrade processes. {{<issue 17130>}}
  • Displays distinct prefix keys explicitly in the explain output, enhancing the clarity of indexing for users. {{<issue 20831>}}
  • Adds auto gflag ysql_yb_enable_ddl_atomicity_infra to control DDL atomicity feature during the upgrade phase. {{<issue 21535>}}
  • Allows YbInitPinnedCacheIfNeeded to only load the shared pinned cache, enhancing concurrent handling of DDLs in various databases. {{<issue 21635>}}
  • Now logs global-impact DDL statements that increment all database catalog versions. {{<issue 21826>}}
  • Adds a new YSQL view for YCQL statement metrics, allowing it to be joined with YCQL wait events in the yb_active_universe_history table. {{<issue 20616>}}
  • Reduces per-backend memory consumption by reinstating TOAST compression for catalog tables. {{<issue 21040>}}
  • Avoids schema version mismatch errors during ALTER TABLE operations in cases where DDL atomicity is enabled. {{<issue 21787>}}
  • Resolves schema version mismatch errors that occur after an ALTER TABLE operation due to DDL transaction verification in non-debug builds. {{<issue 21787>}}
  • Introduces a new YSQL configuration parameter yb_enable_parallel_append to disable the unannounced parallel append feature. {{<issue 21934>}}
  • Adds new columns to localhost:13000/statements for more comprehensive database management, including user and database IDs along with varied block level statistics. {{<issue 21735>}}
  • Enables DDL atomicity feature by default by altering ysql_yb_ddl_rollback_enabled, report_ysql_ddl_txn_status_to_master, and ysql_ddl_transaction_wait_for_ddl_verification flags' defaults. {{<issue 22097>}}
  • Enhances YSQL backfill logging clarity and documentation for PgsqlBackfillSpecPB proto. {{<issue 21154>}}
  • Allows ysql_bench to execute a connection initialization SQL like Hikari CP, useful for setting parameters such as statement_timeout during resilience testing. {{<issue 19741>}}
  • Optimizes the get_tablespace_distance function, enhancing the speed of the yb_is_local_table YSQL function. Reduces query time by caching GeolocationDistance value. {{<issue 20860>}}
  • Updates the description for the ysql_catalog_preload_additional_tables flag to accurately reflect preloading behavior. {{<issue 20791>}}
  • Avoids utilizing unsupported ybgin index scans by adjusting the ybgin cost estimation, favoring sequential scans for better effectiveness when specific conditions are met, mitigating potential misuse of indices. {{<issue 9960>}}
  • Allows hiding non-deterministic "Memory:" fields in EXPLAIN output using the GUC yb_explain_hide_non_deterministic_fields, primarily for pg15. {{<issue 20958>}}
  • Delivers consistent error messages for aborted transactions that align with those from serialization and deadlock errors. {{<issue 21043>}}
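
To illustrate the temporary namespace change in the list above, here is a minimal sketch that composes names in the new pg_temp_<tserver-uuid>_<backend_id> format. The function name `temp_namespace` and the UUID values are made up for illustration; the release note does not describe the actual UUID format.

```sh
# Compose a temporary namespace name in the new format:
# pg_temp_<tserver-uuid>_<backend_id>
temp_namespace() {
  tserver_uuid="$1"  # hypothetical value; real UUIDs will look different
  backend_id="$2"
  printf 'pg_temp_%s_%s\n' "$tserver_uuid" "$backend_id"
}

# Two nodes whose backends happen to share backend ID 7.
temp_namespace 1a2b3c 7
temp_namespace 9f8e7d 7
```

Under the old pg_temp_<backend_id> scheme both backends would have mapped to pg_temp_7, which is the collision the change avoids.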

YCQL

  • Now throws an error when the unsupported GROUP BY clause is used in YCQL. The autoflag ycql_suppress_group_by_error is available to handle compatibility issues. {{<issue 13956>}}

DocDB

  • Marks the unused rpc_queue_limit flag as deprecated in the latest releases. {{<issue 20830>}}
  • Ensures flag validators are defined in the same source file as the flag for reliable initialization order. {{<issue 20915>}}
  • Adds the ability to cancel the ScopeExit action for more efficient resource cleanup during successful function completions. {{<issue 20595>}}
  • Allows the TServer and master memory allocation to automatically adjust based on the available node RAM. {{<issue 20664>}}
  • Adds a log message to notify when a tablet cannot be moved from a blacklisted server due to replication factor constraints. {{<issue 15624>}}
  • Reduces tablet overhead by eliminating unnecessary allocation of the WritableFileWriter in RocksDB's Write-Ahead Log (WAL). {{<issue 7996>}}
  • Introduces a new macro to verify if specific fields in a protobuf message are set, enhancing error tracking. {{<issue 20802>}}
  • Streamlines the process of registering master web server URL paths, making the code less prone to errors. {{<issue 20858>}}
  • Enables all catalog entities to track and abort tasks, ensuring task termination when entities are deleted or master loses leadership. {{<issue 20859>}}
  • Avoids potential deadlock by using xcluster safe time excluding the ddl_queue table when adding new tables. {{<issue 21076>}}
  • Reduces log clutter by making generate_test_certificates.sh only output on failure. {{<issue 20979>}}
  • Renames MultiStepTableTask to MultiStepCatalogEntityTask to ensure broader support for CatalogEntityWithTasks. {{<issue 20982>}}
  • Simplifies use and readability of IsOperationDoneResult by relocating to is_operation_done_result.h for broader usage. {{<issue 21085>}}
  • Allows immediate halt and return of actual error when table creation fails, instead of waiting for a generic "Timed out" error. {{<issue 17132>}}
  • Enables filtering of files based on hybrid time during conflict resolution to enhance performance and allows row number override in pg_single_tserver-test using a test gflag. {{<issue 20666>}}
  • Allows IsOperationDoneResult to be accessed for non-xcluster usage, enhancing functionality. {{<issue 21085>}}
  • Introduces a new limit on Prometheus metric entries to prevent server overwhelm when there are too many metrics. {{<issue 18089>}}
  • Fixes a race condition on xClusterPoller shutdown to prevent unexpected error_map additions. {{<issue 21134>}}
  • Aligns pointer to the left in arcanist_util/pre_commit_hook.sh for consistent styling. {{<issue 21329>}}
  • Shifts async_client_initializer to the server and minimizes client libraries dependency on server_process, enhancing its proper operation within a server. {{<issue 21337>}}
  • Relocates specific flags to rpc library and common_flags.cc from server_process, and removes FsManager usage in client library. {{<issue 21338>}}
  • Allows the combination of SCHECK and STATUS_EC_FORMAT with the addition of SCHECK_EC_FORMAT. {{<issue 21373>}}
  • Increases follower lag check to 1s *kTimeMultiplier in AreNodesSafeToTakeDown tests, minimizing test flakiness. {{<issue 21247>}}

CDC

  • Preserves CDC stream even when all associated tables are dropped, tying its lifecycle to the database. {{<issue 21419>}}
  • Adds a test to verify the safe_time set during the GetChanges call, reducing data loss during network failures. Ensures consistent safe_hybrid_time across multiple GetChanges calls. {{<issue 21240>}}
  • Allows modification of the publication refresh interval using the cdcsdk_publication_list_refresh_interval_secs flag. {{<issue 21796>}}
  • Enables REPLICA IDENTITY syntax for altering table commands in YSQL, allowing control over CDC image information. {{<issue 20143>}}
  • Allows separate creation functions for xcluster and cdcsdk, resolving an issue in stream creation. {{<issue 20536>}}
  • Removes unused includes from CDC files, potentially reducing build times. {{<issue 21235>}}
  • Introduces replica identity in CDC to populate before image records, allowing table-level before image information fetching and retaining in stream metadata. {{<issue 21314>}}
  • Eliminates unnecessary NOTICE messages when setting yb_read_time from walsender, reducing message clutter. {{<issue 22379>}}
  • Fixes CDCSDK flaky test by ensuring the write is persisted before reading to avoid race condition. {{<issue 20491>}}
  • Clears transaction state promptly after a table is deleted, preventing table deletion from getting stuck. {{<issue 22095>}}

yugabyted

  • Allows DROP DATABASE query to accurately check for active connections before succeeding or failing. {{<issue 20581>}}
  • Allows smooth execution of DML queries interspersed with relevant DDL queries by accurately handling unnamed prepared statements. {{<issue 21367>}}
  • Allows setting of YSQL configuration parameters in various scenarios including SET, RESET, RESET ALL, and SET LOCAL using Connection Manager. {{<issue 19989>}}
  • Updates the yugabyted-ui backend to align with changes in the connection manager stats consumed from the :13000/connections endpoint, catering for removal of pool_name and addition of database_name and user_name. {{<issue 20494>}}
  • Allows a smooth restart of the second node in a cluster using the join flag without throwing any errors. {{<issue 20684>}}
  • Runs point-in-time recovery operations for specific databases or keyspaces directly through the new configure point_in_time_recovery sub-command in yugabyted. {{<issue 20493>}}
  • Retains the integrity of user's custom configuration file by associating config flag with start command, and directs updates to a yugabyted generated file within base_dir/conf directory. {{<issue 20881>}}
  • Facilitates faster loading time for UI by incorporating a local cache of master/tserver addresses in the yugabyted-ui api server. {{<issue 21181>}}
  • Allows separate counting of YSQL and YCQL connections when YSQL connection manager is active. {{<issue 21182>}}
  • Enables a predefined set of gflags related to the pg-parity project using the enable_pg_parity flag in the yugabyted start command. {{<issue 21221>}}
  • Enables parsing of the allowed_preview_flags_csv master flag when given using master_flags. {{<issue 21364>}}
  • Changes the flag enable_pg_parity to enable_pg_parity_tech_preview for activating a predefined set of gflags related to the pg-parity project with the yugabyted start command. {{<issue 21221>}}
  • Adds support for prepared statements via the simple query protocol in YSQL Connection Manager, ensuring connection stickiness. {{<issue 19601>}}
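
As a rough sketch of the renamed flag from the list above, the helper below assembles a yugabyted start command with enable_pg_parity_tech_preview. Only the flag name and the start command come from the release note; the `./bin/yugabyted` path, the `YUGABYTED_BIN` and `RUN` variables, and passing the flag as a bare switch are assumptions. The helper prints the command by default so it can be reviewed before anything is started.

```sh
# Prints (or, with RUN=1, executes) a yugabyted start command that turns on
# the predefined pg-parity tech preview gflags.
start_with_pg_parity() {
  # Assumed install layout; override with YUGABYTED_BIN if yours differs.
  yugabyted_bin="${YUGABYTED_BIN:-./bin/yugabyted}"
  if [ "${RUN:-0}" = "1" ]; then
    "$yugabyted_bin" start --enable_pg_parity_tech_preview
  else
    echo "$yugabyted_bin start --enable_pg_parity_tech_preview"
  fi
}

# Dry run: show the command without starting a node.
start_with_pg_parity
```

Setting `RUN=1` before calling the function executes the command instead of printing it.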

Bug fixes

YSQL

  • Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. {{<issue 20396>}}
  • Reduces the likelihood of a CHECK failure when restarting a DDL statement in debug build. {{<issue 20820>}}
  • Corrects the inaccurate detection of constants in distinct prefix computation during distinct index scans, ensuring reliable query results for batch nested loop joins. {{<issue 20827>}}
  • Fixes a memory corruption issue that prevented creation of a valid execution plan for SELECT DISTINCT queries. Disabling distinct pushdown lets these queries run without errors and prevents server connection closures. {{<issue 20893>}}
  • Fixes table rewrite issue on non-colocated tables/matviews in colocated DB, ensuring the new table uses the original table's colocation setting. Includes a workaround for GH issue 20914. {{<issue 20856>}}
  • Reduces excessive storage metric updates during EXPLAIN ANALYZE operation, enhancing performance by incorporating storage_metrics_version in YBCPgExecStats and YbInstrumentation. {{<issue 20917>}}
  • Prevents simultaneous send of read and write operations in the same RPC request that could lead to inconsistent read results, by ensuring that, in case of multiple operations, all buffered ones are flushed first. {{<issue 20864>}}
  • Returns accurate data by checking actual column type before fetching in libpq_utils' template functions. {{<issue 20683>}}
  • Prevents YSQL upgrade failure from versions 2.16 to 2.21 by adding a 2-second delay if there's a breaking DDL statement. {{<issue 20842>}}
  • Corrects the division by zero error occurring with certain queries when the yb_enable_base_scans_cost_model is activated and yb_fetch_size_limit is enforced by setting a fixed size for result width when it equals zero. {{<issue 20892>}}
  • Catches and manages expected errors during concurrent DML & DDL operations on the same table. {{<issue 20953>}}
  • Resolves PgMiniTest.CatalogVersionUpdateIfNeeded test failure in perdb catalog version mode. {{<issue 20985>}}
  • Allows new-version DDL in an invalid per-db catalog version configuration during the trial phase, primarily for reversing unproductive upgrades. {{<issue 20300>}}
  • Ensures successful CREATE INDEX operation during the upgrade to per-database catalog version mode, even before the execution of the YSQL migration script. {{<issue 20300>}}
  • Transaction abort error is now considered an expected error in the TestPgDdlConcurrency.testModifiedTableWrite unit test. {{<issue 21022>}}
  • Allows safer execution of DDL statements during the finalization phase of cluster upgrades, reducing risks of data inconsistencies. {{<issue 21066>}}
  • Allows ModifyTable EXPLAIN statements to run as a single row transactions, decreasing latency. Also enables logging for transaction types when yb_debug_log_docdb_requests is enabled. {{<issue 19604>}}
  • Adjusts heartbeat mechanism to shut down when an "Unknown Session" error occurs, reducing log alerts. This benefits idle connections with expired sessions. {{<issue 21264>}}
  • Reduces PostgreSQL connection startup timeouts in geo-distributed clusters with a new wait_for_ysql_backends_catalog_version_master_tserver_rpc_timeout_ms GFlag, increasing the default timeout value to 60s from 30s. This alteration only impacts one specific RPC - WaitForYsqlBackendsCatalogVersion, not all RPCs, which should diminish time-out incidents. {{<issue 18228>}}
  • Changes the index backfill timeout-related flags to lower the possibility of running into timeout-related failures, especially significant when working with YSQL. {{<issue 10650>}}
  • Corrects the "create index" error by adjusting master's operation mode based on pg_yb_catalog_version table checks, ensuring accurate catalog version table mode. {{<issue 21230>}}
  • Reduces delay during master leader changes and cluster startups by having the master wait out the lease period before responding to WaitForYsqlBackendsCatalogVersion requests. {{<issue 21251>}}
  • Grants CREATE privilege on SCHEMA public to all users, enabling PgCatalogVersionTest.DBCatalogVersionGlobalDDL and PgCatalogVersionTest.DBCatalogVersionDisableGlobalDDL tests to pass in both PG11 and PG15. {{<issue 21326>}}
  • Allows BNL's on outer and inner tables, even if the inner table has "unbatchable" join restrictions that can't accept batches of inputs, enhancing queries with complex join conditions. {{<issue 21366>}}
  • Enabling the yb_enable_base_scans_cost_model flag now triggers PG selectivity estimation and ignores the yb_enable_optimizer_statistics flag. {{<issue 21368>}}
  • Sets LC_ALL environment variable to C.UTF-8 when running pgrep in yb-ctl, preventing failure due to UTF-8 characters in other processes' names. {{<issue 21381>}}
  • Fixes seg faults in parallel index/indexonly queries with attributes exceeding those in the relation. {{<issue 21427>}}
  • Corrects the scanning direction error for GiST index by verifying if the scan relation is a YB relation and applying NoMovement direction only in that case. {{<issue 21435>}}
  • Corrects YbGate cleanup after errors to ensure proper functioning of tests and eliminates potential segmentation fault. Additionally, enhances error logging mechanism. {{<issue 21180>}}

YCQL

  • Solves CQL check-failure issue for No wait state when using yb_enable_ash without altering the default flag value. {{<issue 21136>}}
  • Allows the deletion of the Cassandra role in YCQLsh without it regenerating upon cluster restart, by adding a flag to mark if the role was previously created. {{<issue 21057>}}

DocDB

  • Clears pending_deletes_ on failed delete tasks thus preventing tablets from being incorrectly retained after task failure or completion. This rectifies a race condition and allows the Load Balancer to perform operations on specific tablets and Tablet Servers. {{<issue 13156>}}
  • Allows users to specify Gzip stream compression levels enhancing file fetching speed and RPC performance. {{<issue 20848>}}
  • Ensures Create Table operation fails if Alter Replication encounters an error, enhancing the reliability of replication setup. {{<issue 21732>}}
  • Converts the ysql_skip_row_lock_for_update flag to an auto-flag to resolve compatibility issues during upgrade, preventing incorrect DB record creations that can affect row visibility and integrity. {{<issue 22057>}}
  • Fixes a timeout issue when flushing tablets by handling failed RPC call responses. {{<issue 20948>}}
  • Modifies memory consumption calculations for pending operations to ensure accurate rejection of new writes at bootstrap, preventing loading failures. {{<issue 21254>}}
  • Trims large error messages in AsyncRpc::Failed to prevent hitting memory limit and resulting unavailability. {{<issue 21402>}}
  • Renames and updates the description of the gflag min_segment_size_to_rollover_at_flush for clarity. {{<issue 21691>}}
  • Changes the class of enable_automatic_tablet_splitting gflag from kLocalPersisted (class 2) to kExternal (class 4) to eliminate setup issues with XCluster configurations. {{<issue 22088>}}
  • Allows DML operations on target cluster databases not involved in xCluster replication STANDBY mode. {{<issue 21245>}}
  • Eliminates duplication of the colocation parent table in snapshots created by schedules. {{<issue 20541>}}
  • Enables optional "INCLUDE_NONRUNNING" flag to list all namespaces in yb-admin, aiding in debugging. {{<issue 20331>}}
  • Reduces build failure chances on MacOS by modifying generate_test_certificates.sh to employ third-party openssl instead of system's openssl. {{<issue 20764>}}
  • Allows packing multiple values in Postgres layer for direct insertion into DocDB, reducing insert time and duplication with the use of the ysql_pack_inserted_value gflag. {{<issue 20713>}}
  • Erases errors from the altered universe after merging it back into the original one. {{<issue 20789>}}
  • Fixes test failure by updating the test after adding new tablet_id option to db_options. {{<issue 20975>}}
  • Enables lightweight profiling for identifying and timing slow-performing function call sites using the enable_callsite_profile and enable_callsite_profile_timing flags. {{<issue 21008>}}
  • Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. {{<issue 20912>}}
  • Allows database drop operations to proceed smoothly by ignoring missing streams errors and skipping replication checks for already dropped tables. {{<issue 21070>}}
  • Switches all builds, excluding ASAN, to Clang 17 while updating the default compiler type selection logic. {{<issue 21077>}}
  • Allows ListTabletServers to handle heartbeats older than 24 days by capping the value at the maximum int32 value, avoiding a crash. {{<issue 21096>}}
  • Switches the ASAN build type to Clang 17, resolves its issues, and now supports Clang 18 compiler type. {{<issue 21077>}}
  • Adds a test to detect missed conflicts with index-only scans from concurrent transactions on non-unique indexes. {{<issue 20486>}}
  • Includes the indexed_table_id with the index in table listings, eliminating the need for a second lookup to associate a main table with an index. {{<issue 21159>}}
  • Ensures only missing replicas are marked as over-replicated, avoiding the incorrect removal of tablet replicas. {{<issue 21135>}}
  • Activates the wait_states-itest for kBackfillIndex_WaitForAFreeSlot. {{<issue 21239>}}
  • Allows DML operations on non-replicated databases and blocks DML only on databases in transactional xCluster replication STANDBY mode. Now only databases part of an inbound transactional xCluster replication group in the xCluster safe time map will have DML operations blocked. Also, certain attributes are moved from tserver to TserverXClusterContext. {{<issue 21245>}}
  • Enables logging stack traces during call site profiling for identifying frequent callers of hot spots. {{<issue 21305>}}
  • Early aborts transactions that fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. {{<issue 21328>}}
  • Corrects block cache metrics discrepancy by ensuring Statistics object passes into LRUCache from TableCache for accurate updates. {{<issue 21407>}}
  • Enables submitting multiple tasks to a thread sub-pool and waiting for all tasks to complete without enforcing sequential execution. {{<issue 21344>}}
  • Disables CppCassandraDriverTest.BatchWriteDuringSoftMemoryLimit to prevent test Spark job cancellations. {{<issue 21459>}}
  • Fixes a segmentation fault in yb-master by checking for a null pointer before dereferencing it, addressing an issue in the CDC run on 2.23.0.0-b37-arm. {{<issue 21648>}}
  • Adds a TSAN suppression to manage the apparent race condition in the function boost::regex_match. {{<issue 21585>}}
  • Eliminates potential FATAL errors during reported tabletPB creation by ensuring retrieval of schema version is atomic. {{<issue 21340>}}
  • Enables the session to outlive the callback by holding a shared pointer to it, preventing potential crashes during concurrent DML queries. {{<issue 21103>}}
  • Prevents yb-master crash by ensuring background task isn't deleted before the callback is invoked. {{<issue 21773>}}
  • Corrects the ClientTest.TestCreateTableWithRangePartition by letting the system select the suitable namespace ID. {{<issue 21827>}}
  • Enables callback completion wait in PollTransactionStatusBase during shutdown to prevent unexpected process termination. {{<issue 21773>}}
  • Allows viewing of the rpc bind addresses in the master leader UI, especially beneficial in cases like k8s where the rpc bind address with the pod DNS is more useful than the broadcast address. {{<issue 21959>}}
  • Reduces unnecessary logging during checkpoint operations by lowering INFO level logs to DEBUG_LEVEL, enhancing log readability. {{<issue 21658>}}
  • Prevents fatal errors by skipping ReserveMarker/AsyncAppend if the tablet peer has already been shut down. {{<issue 21769>}}
  • Enhances YSQL operation by refining task shutdown procedures and avoiding unnecessary task aborts. {{<issue 21917>}}
  • Enhances load balancer efficiency by refining validation logic to block tablet replica additions only for those with a pending delete in progress on the same server, avoiding potential slowdowns during mass tablet replica moves. {{<issue 21806>}}
  • Avoids multiple destruction of the same database connection, preventing system crashes due to simultaneous connection failures. {{<issue 21738>}}
  • Stops fatal errors caused by the re-use of remote log anchor session during remote bootstrap from a non-leader peer. This fix ensures shared pointers are accurately tracked for tablet_peer objects using the = operator, preventing unintentional destruction of underlying objects. {{<issue 22007>}}
  • Corrects a bug causing some tablet metrics to display incorrect metric_type attribute. {{<issue 21608>}}
  • Enables the skip_table_tombstone_check for colocated tables to prevent errors. {{<issue 22115>}}
  • Initializes prev_op to UNKNOWN to prevent AlmaLinux 8 fastdebug gcc11 compilation failures. {{<issue 21811>}}
  • Delays min_running_ht initialization until after the successful completion of tablet bootstrap to prevent unexpected behaviors. {{<issue 22099>}}
  • Resolves the issue of pg_locks query failure due to missing host node UUID in distributed transactions. {{<issue 22181>}}
  • Eliminates latency spikes in conflicting workloads by preventing redundant ProbeTransactionDeadlock rpcs. {{<issue 22426>}}
  • Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. {{<issue 8744>}}
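
The 24-day threshold in the ListTabletServers fix above lines up with a signed 32-bit count of milliseconds. That the value is tracked in milliseconds is an assumption here, not something the release note states. A quick arithmetic check:

```sh
# Largest value a signed 32-bit integer can hold.
int32_max=2147483647

# Milliseconds per day and per hour.
ms_per_day=$((1000 * 60 * 60 * 24))
ms_per_hour=$((1000 * 60 * 60))

# Whole days, plus leftover whole hours, that fit in int32_max milliseconds.
days=$((int32_max / ms_per_day))
rem_hours=$(( (int32_max % ms_per_day) / ms_per_hour ))

echo "$days days $rem_hours hours"
```

This prints `24 days 20 hours`, so a millisecond counter stored in an int32 overflows just under 25 days, consistent with the "older than 24 days" wording.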

CDC

  • Fixes an issue with CDC packed rows so that large insert operations produce a single record, providing consistent data regardless of row size. {{<issue 20310>}}
  • Introduces a fix for data loss issue caused by faulty update of cdc_sdk_safe_time during explicit checkpointing, along with tests to ensure validity. {{<issue 15718>}}
  • Fixes a NullPointerException in yb-client by adding a check for null in the partitionKey before calling getTablets. {{<issue 20636>}}
  • Resolves the issue of sending empty batches after a failed attempt to add a column on ALTER TABLE. {{<issue 20871>}}
  • Enables filtering out duplicate DDL operations when ysql_ddl_rollback_enabled flag is set to true. {{<issue 20989>}}
  • Reduces replication mismatches and RPC call failures by triggering RPC to random tablet with active tservers. {{<issue 20717>}}
  • Integrates retry logic over FlushTables calls in test to prevent test run failures due to timing out issues. {{<issue 20778>}}
  • Adds retry logic to avoid race condition in TestModifyPrimaryKeyBeforeImage by ensuring historical_max_op_id is updated before calling GetChanges RPC. {{<issue 20779>}}
  • Fixes a memory leak in the walsender process by deep freeing the cached record batch after streaming to the client. {{<issue 21530>}}
  • Adds more debug logs in the walsender to aid in investigating issues like linked data loss. {{<issue 21465>}}
  • Reduces risk of segmentation fault during tablet split tests by safely handling null tablet peers. {{<issue 21723>}}
  • Adds more debug logs for stress run debugging, skips RollbackToSubTransaction RPC to local tserver if not needed, and enhances debugging of the ListReplicationSlots function. {{<issue 21780>}},{{<issue 21519>}},{{<issue 21652>}}
  • Fixes flaky tests to ensure proper response return when getting consistent changes and stable table addition after stream. {{<issue 22068>}}
  • Removes table level attributes from CDCSDK metrics to avoid tserver crash due to failed DCHECK assertion. {{<issue 22142>}}
  • Fixes the segmentation fault in walsender for dynamic table addition by refreshing stored replica identities and preventing a race condition when creating dynamic tables. {{<issue 22273>}}
  • Solves an issue where CDCSDK incorrectly deduces tablets as not interesting for stream before reaching the configured time limit. {{<issue 22383>}}
  • Assigns the correct "cdc_sdk_safe_time" for child tablets after a tablet split, preventing unintentional barriers or compactions. {{<issue 20429>}}
  • Enhances logging for memory pressure rejections by including blocker memory tracker details and rejected memory requirement. {{<issue 20776>}}
  • Cuts down the number of insert batch and inserts per batch in TestCDCSDKConsistentStreamWithTabletSplit and mends a data race issue. {{<issue 21315>}}
  • Enables support for streaming update operations via Walsender, enhancing PG compatible logical replication support. Now executes schema changes in the logical replication protocol and maintains a record of changes in each table's read_time_ht hybrid time in the PG catalog. Includes handling late ALTER TABLE responses and addressing incomplete cleanup in the case of a stream creation failure. This feature is disabled under test flag ysql_TEST_enable_replication_slot_consumption. {{<issue 20725>}}
  • Prevents failures in decoding change events by refreshing cached_schema_details when executing a new GetChanges request if the client indicates a necessity for the schema. {{<issue 20698>}}

yugabyted

  • Prevents SyntaxWarning and exceptions when an incorrect advertise_address is given by adjusting string literals and adding error checks. {{<issue 22552>}},{{<issue 22210>}},{{<issue 22230>}}
</details>

v2.21.0.1 - May 17, 2024 {#v2.21.0.1}

Build: 2.21.0.1-b1

Downloads

<ul class="nav yb-pills"> <li> <a href="https://software.yugabyte.com/releases/2.21.0.1/yugabyte-2.21.0.1-b1-darwin-x86_64.tar.gz"> <i class="fa-brands fa-apple"></i> <span>macOS</span> </a> </li> <li> <a href="https://software.yugabyte.com/releases/2.21.0.1/yugabyte-2.21.0.1-b1-linux-x86_64.tar.gz"> <i class="fa-brands fa-linux"></i> <span>Linux x86</span> </a> </li> <li> <a href="https://software.yugabyte.com/releases/2.21.0.1/yugabyte-2.21.0.1-b1-el8-aarch64.tar.gz"> <i class="fa-brands fa-linux"></i> <span>Linux ARM</span> </a> </li> </ul>

Docker:

sh
docker pull yugabytedb/yugabyte:2.21.0.1-b1

Bug fix

DocDB

Converts ysql_skip_row_lock_for_update to an AutoFlag to resolve compatibility issues during upgrade, preventing the creation of incorrect DB records that can affect row visibility and integrity.

v2.21.0.0 - March 26, 2024 {#v2.21.0.0}

Download

{{< warning title="Use 2.21.0.1">}} {{< /warning >}}

Highlights

Enhanced Postgres Compatibility Mode {{<tags/feature/tp>}}

We're pleased to announce the tech preview of the new Enhanced Postgres Compatibility Mode in the 2.21.0.0 release. This mode enables you to take advantage of many new improvements in both PostgreSQL compatibility and performance parity, making it even easier to lift and shift your applications from PostgreSQL to YugabyteDB. When this mode is turned on, YugabyteDB uses the Read-Committed isolation mode, the Wait-on-Conflict concurrency mode for predictable P99 latencies, and the new Cost Based Optimizer that takes advantage of the distributed storage layer architecture and includes query pushdowns, LSM indexes, and batched nested loop joins to offer PostgreSQL-like performance.

You can enable the compatibility mode by passing the enable_pg_parity_tech_preview flag to yugabyted when bringing up your cluster.

For example, from your YugabyteDB home directory, run the following command:

sh
./bin/yugabyted start --enable_pg_parity_tech_preview

Note: When enabling the cost models, ensure that packed row for colocated tables is enabled by setting the --ysql_enable_packed_row_for_colocated_table flag to true.
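
As a rough sketch of combining the two settings, the following script starts a node with the compatibility mode enabled and passes the packed row flag to the YB-TServer through yugabyted's --tserver_flags option. It assumes the flag only needs to be set on the YB-TServer. The YB_HOME variable is a hypothetical convenience for this example, not part of yugabyted. If the binary is not found, the script only prints the command.

```sh
#!/bin/sh
# YB_HOME is a hypothetical variable for this sketch; it defaults to the
# current directory, matching the "YugabyteDB home directory" above.
YB_HOME="${YB_HOME:-.}"

# Flag from the note above, in the key=value form taken by --tserver_flags.
TSERVER_FLAGS="ysql_enable_packed_row_for_colocated_table=true"

# Full command line: compatibility mode plus the packed row flag.
CMD="$YB_HOME/bin/yugabyted start --enable_pg_parity_tech_preview --tserver_flags=$TSERVER_FLAGS"

if [ -x "$YB_HOME/bin/yugabyted" ]; then
  $CMD
else
  # Dry run when yugabyted is not installed at the expected path.
  echo "would run: $CMD"
fi
```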

New YugabyteDB Kubernetes Operator

A preliminary version of the completely rewritten YugabyteDB Kubernetes Operator is available in Tech Preview. The new operator automates the deployment, scaling, and management of YugabyteDB clusters in Kubernetes environments. It streamlines database operations, reducing manual effort for developers and operators.

For more information, refer to the YugabyteDB Kubernetes Operator GitHub project.

New features

  • New Kubernetes Operator. Automated deployment and management of clusters via the Kubernetes operator pattern. Includes support for YugabyteDB universes as a Kubernetes custom resource. Backup, upgrade, scale-out, scale-in, and more are possible on this Kubernetes custom resource. {{<tags/feature/tp idea="831">}}

  • YSQL: DDL concurrency. Support for isolating DDLs per database. Specifically, a DDL in one database does not cause catalog cache refreshes, or abort transactions because of a breaking change, in another database. {{<tags/feature/tp>}}

  • YSQL: DDL atomicity. Ensures that YSQL DDLs are fully atomic between the YSQL and DocDB layers; that is, if an error occurs, they are fully rolled back, and if they succeed, they are fully applied. Without this feature, inconsistencies between the two layers are rare but can happen. {{<tags/feature/tp>}}

  • YSQL: Lower latency for large scans with size-based fetching. Adds a static size-based fetch limit that controls how many rows can be returned in one request from DocDB. {{<tags/feature/tp>}}

  • YSQL: ALTER TABLE support. {{<tags/feature/tp>}} Adds support for the following variants of ALTER TABLE ADD COLUMN:

    • with a SERIAL data type
    • with a volatile DEFAULT
    • with a PRIMARY KEY
  • yugabyted

    • Docker-based deployments. Improves the yugabyted Docker user experience for RF-3 deployments and docker container/host restarts. {{<tags/feature/ea>}}

    • Set preferred regions. The preferred region handles all read and write requests from clients. Use the yugabyted configure data_placement command to specify preferred regions for clusters. {{<tags/feature/ea>}}

    • Backup and restore support in yugabyted. yugabyted now supports backup and restore of databases and keyspaces. You can also upload backups to public clouds, including AWS and GCP. {{<tags/feature/tp>}}
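
The ALTER TABLE ADD COLUMN variants listed above could be exercised with statements like the following. This is a sketch only: the table and column names (demo_items, seq, created_at, id) and the file name are hypothetical, the statements assume a new, empty table with no primary key, and a tech preview feature may need to be enabled explicitly before they succeed. The script writes the statements to a file and runs them with ysqlsh only if it is found under the hypothetical YB_HOME directory.

```sh
#!/bin/sh
# YB_HOME and SQL_FILE are hypothetical names used only in this sketch.
YB_HOME="${YB_HOME:-.}"
SQL_FILE="add_column_variants.sql"

cat > "$SQL_FILE" <<'EOF'
CREATE TABLE demo_items (name text);
-- Variant 1: column with a SERIAL data type.
ALTER TABLE demo_items ADD COLUMN seq SERIAL;
-- Variant 2: column with a volatile DEFAULT (evaluated per row).
ALTER TABLE demo_items ADD COLUMN created_at timestamptz DEFAULT clock_timestamp();
-- Variant 3: column that is also the PRIMARY KEY.
ALTER TABLE demo_items ADD COLUMN id int PRIMARY KEY;
EOF

if [ -x "$YB_HOME/bin/ysqlsh" ]; then
  "$YB_HOME/bin/ysqlsh" -f "$SQL_FILE"
else
  echo "ysqlsh not found; statements written to $SQL_FILE"
fi
```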

Change log

<details> <summary>View the detailed changelog</summary>

Improvements

YSQL
  • Offers consistent, specific deadlock error reporting regardless of when a transaction realizes its aborted state, through in-memory storage of recently deadlocked transaction information. {{<issue 18384>}},{{<issue 14114>}}
  • Introduces a new model for estimating DocDB seek and next operations, enhancing the accuracy of cost calculations for index lookups, especially when various types of index filters are applied. {{<issue 19354>}}
  • Modifies the BNL costing model to charge for unmatched outer tuples in semi/anti/inner unique joins, enhancing the accuracy of join ordering for efficient query execution. {{<issue 19054>}}
  • Introduces a new flag index_scan_prefer_sequential_scan_for_boundary_condition that potentially enhances speed in range-sharded databases by utilizing sequential scan over Local Skip scan under specified conditions. {{<issue 16178>}}
  • Allows testing of seek and next estimations through added Java tests, guarding against potential regressions. {{<issue 19082>}}
  • Corrects the computation of semi/anti join factors for inner unique joins, addressing a bug in the costing code that incorrectly estimated the fraction of outer join tuples having a match. This adjustment improves the accuracy of join clause selectivity computations. Additionally, fixes a bug in final_cost_nestloop where outer_matched_rows was incorrectly set to 0, improving query cost estimation. {{<issue 19021>}}
  • Reintroduces the use of Local Skip scan for index scanning with primary key filters in range sharded databases, reversing a previous change due to identified correctness issues. {{<issue 16178>}}
  • Alters the YSQLDump to generate CREATE INDEX NONCONCURRENTLY instead of CREATE INDEX, preventing automated index back-filling in the backup-restore, thereby accelerating the process. {{<issue 19457>}}
  • Mitigates CVE-2023-39417 by incorporating an upstream Postgres commit from REL_11_STABLE, which prevents the substitution of extension schemas or owners matching ["$']. {{<issue 14419>}}
  • Offers quick regression tests for CBO using the cbo_stat_dump and cbo_stat_load tools, enhancing developer productivity and performance feedback by rapidly validating CBO changes through the TAQO framework. {{<issue 19657>}}
  • Ensures Row Level Security (RLS) policy remains intact during table rewrite by accurately copying both relrowsecurity and relforcerowsecurity fields. {{<issue 19815>}}
  • Sets the tuple count to 1000 for all tables appearing empty or unanalyzed when yb_enable_optimizer_statistics is true, improving Cost-Based Optimizer's query plan selection. {{<issue 16825>}}
  • Imports upstream postgres commit from REL_11_STABLE as a preventive measure for future support of DEPENDS ON EXTENSION for objects like FUNCTION, PROCEDURE, etc, mitigating potential risks like CVE-2020-1720 and CVE-2023-39417. {{<issue 14419>}}
  • Introduces sorting abilities to BNL nodes, matching their sorting properties to that of other joins, with a GUC flag yb_bnl_optimize_first_batch controlling it, enhancing performance especially in presence of small LIMIT clauses. {{<issue 19589>}}
  • Enables tracking and aggregating of table mutation counts at the cluster level by sending the counts to an auto-analyze service, easing automatic triggering of ANALYZE when mutation thresholds are exceeded. {{<issue 15670>}}
  • Ensures response cache invalidation when temporary tables are discarded without altering the catalog version, avoiding discrepancies while utilizing the advantages of session-bound modifications. {{<issue 19178>}}
  • Includes MyDatabaseId in the T-server cache key to resolve stale shared relation issues as a result of different databases sharing T-server cache entries. {{<issue 19363>}}
  • Streamlines YSQL DDL functionality by replacing the IsTransactionalDdlStatement function with the YbGetDdlMode function, offering more cohesiveness through enums instead of booleans for significant DDL modes while enabling easier addition of new modes. {{<issue 19178>}}
  • Enables the upgrade to OpenSSL 3.0+ by importing the upstream PostgreSQL commit Disable OpenSSL EVP digest padding in pgcrypto. {{<issue 19733>}}
  • Enables importing of the upstream PG commit, preparing the platform for OpenSSL 3.0+ upgrades. {{<issue 19734>}}
  • Blocks the use of advisory locks in YSQL and responds to the external client with an error message when they are requested. {{<issue 18954>}}
  • Imports the pgcrypto: Check for error return of px_cipher_decrypt upstream PG commit essential for upgrading OpenSSL to 3.0+. {{<issue 19732>}}
  • Adjusts the webserver's Out Of Memory (OOM) score through the yb_webserver_oom_score_adj flag (default 900) to prevent unnecessary shutdowns while allowing quick termination if it starts consuming excessive memory. {{<issue 20028>}}
  • Sets yb_bnl_batch_size to 1024 and yb_prefer_bnl to true by default, ensuring that BNLs replace nested loop joins without altering non-NL join plans. {{<issue 19273>}}
  • Replaces remaining unnecessary scans of the pg_inherits table with cache lookups, reducing wasteful calls to the YB-Master and optimizing DDL operations. Fixes a structuring bug in the INHERITSRELID cache for better future compatibility. {{<issue 10478>}}
  • Enables READ COMMITTED isolation by default in debug builds, eliminates setting a transaction to READ ONLY via pg_hint_plan, and updates certain tests to instead run explicitly in REPEATABLE READ. {{<issue 18462>}}
  • Introduces a new flag, ysql_use_relcache_file, to control the use of relcache init file, helping regulate Postgres backend memory usage, and modify unpredicted system table preloading, reducing overall memory usage. {{<issue 19226>}}
  • Introduces asynchronous support for ALTER INDEX SET TABLESPACE, ALTER INDEX ALTER COLUMN SET STATISTICS, CREATE MATERIALIZED VIEW with TABLESPACE, and ALTER MATERIALIZED VIEW SET TABLESPACE enhancing database flexibility, with a traceable warning for beta features that can be muted by adjusting the ysql_beta_feature_tablespace_alteration flag to true. {{<issue 6639>}}
  • Changes the default unit for yb_fetch_size_limit from kilobytes to bytes, allowing the size limit to be set to non-integer kilobyte values, enhancing query performance during upgrades. {{<issue 18522>}}
  • Enables Postgres' parallel query feature and implements parallel scan of YB tables in YBSeqScan, IndexScan, and IndexOnlyScan nodes, resulting in potentially faster query results. {{<issue 18095>}}
  • Replaces outdated PGConn Fetch* functions with more robust versions for improved database testing, now supporting additional BasePGType and OptionalPGType elements. {{<issue 19906>}}
  • Prevents creation of index with TABLESPACE on a temporary table, averting client hangups and displaying an error message: ERROR: cannot set tablespace for temporary index instead. {{<issue 19368>}}
  • Offers more context to the wait states in tserver layer by adding Active Session History (ASH) metadata to Perform RPCs, providing insights for PGPROC and ASH collectors. Updates yb_enable_ash GFlag and assures upgrade/downgrade safety. {{<issue 19135>}}
  • Reduces contention and potential deadlock risk during the execution of pg_stat_activity request by introducing a transaction cache at the t-server, which stores the active sessions and their transaction mapping. This allows the request to access the cache under a shared lock, alleviating the need for an exclusive lock. {{<issue 18711>}}
  • Resolves the record type not registered error that appeared when retrieving fieldnames for batched index condition expressions in YB Batched Nested Loop through bypassing fieldname resolution for indecipherable batched expressions. {{<issue 19094>}}
  • Trims unnecessary master RPC calls during connection initialization by removing YB_YQL_PREFETCHER_NO_CACHE enum value and introducing YBCStartSysTablePrefetchingNoCache function. {{<issue 19304>}}
  • Enables the PgIndexBackfillTest.NoAbortTxn C++ test for explicit flag setting, increasing its resilience against any default changes in YSQL backend manager flags. {{<issue 19351>}}
  • Strengthens PgIndexBackfillTest.NoAbortTxn and other tests to endure potential YSQL backends manager flags' default value alterations, thereby boosting resilience. {{<issue 19351>}}
  • Enables unified server functionality following process termination by resorting to restarting the postmaster for a crashed or killed Postgres backend, contributing to simplicity and fewer bugs. {{<issue 19180>}}
  • Resolves an issue with RowCompareExpression bindings that previously led to incorrect results and occasional crashes in YbBindScanKeys by accounting for unique PgGate request conditions. {{<issue 19384>}}
  • Reduces unnecessary error logs related to tablespace during initdb by checking the FLAGS_create_initial_sys_catalog_snapshot before initiating the tablespace refresh task. {{<issue 19386>}}
  • Eliminates unnecessary error logs during initdb bootstrap process by checking for the existence of pg_yb_tablegroup catalog only in non-bootstrap mode. {{<issue 19387>}}
  • Enhances read committed isolation by enabling each statement to pick a read time on docdb when possible, ensuring more efficient operations and adding a test for this functionality. {{<issue 19397>}}
  • Removes the TransactionCache class shifting session's transactions' information closer to the session in the SessionInfo structure, averting a potential deadlock scenario by ensuring smoother test execution when per-database catalog version mode is activated. {{<issue 18711>}}
  • Corrects the handling of RowCompareExpression bindings in YbBindScanKeys to prevent inaccurate results and potential system crashes. {{<issue 19384>}}
  • Launches the yb_auh extension, building the foundation for the Active Universe History project with a circular buffer for wait events storage and a background worker for local tserver and PG backends polling. New Gflags are introduced: enable_yb_auh, yb_auh.circular_buffer_size, yb_auh.sampling_interval, and yb_auh.sample_size. Default settings are disabled, 16 MB, 1000 ms, and 500, respectively. {{<issue 19127>}}
  • Adds pg_hint_plan syntax and functionality to control batched nested loop joins, allows setting hints YbBatchedNL(t1 t2) and NoYbBatchedNL, and modifies yb_prefer_bnl handling. Also, it removes BNL's dependency on enable_nestloop and adjusts cost model. {{<issue 19494>}}
  • Enables the modification of is_single_row_txn for finer control over non-transactional writes required by COPY, index backfill, or when yb_disable_transactional_writes is set, preventing issues during non-bufferable operations for single row transactions. {{<issue 4906>}}
  • Introduces a new PG function yb_active_session_history_internal and a corresponding view yb_active_session_history for easier querying, which require the Gflag TEST_yb_enable_ash to be enabled; errors will occur otherwise. {{<issue 19128>}}
  • Enables fetching of ASH samples from all PG processes, excluding prepared transactions, background workers, and backends without set ASH metadata, using a newly-created Postgres backend. {{<issue 19129>}}
  • Introduces a NOTICE for potentially unsafe ALTER TABLE operations (such as altering primary key, altering type), ensuring users are aware of the risks. To suppress this notice, adjust the ysql_suppress_unsafe_alter_notice gflag to true. {{<issue 19360>}}
  • Adds a new column with both a NOT NULL constraint and a non-volatile DEFAULT value without needing a table scan, leading to faster YSQL Alter Table operations. No table scan is needed as all existing rows will use the non-volatile DEFAULT value in their new column, reducing constraint violation checks time. {{<issue 19355>}}
  • Simplifies the code in the pg_dml_read file by replacing the DocKeyBuilder helper class with a function and switches from using an arena array to boost::small_vector. {{<issue 19685>}}
  • Enables an alternative table rewrite approach that only drops and recreates associated DocDB tables and indexes, using the relfilenode field to map a PostgreSQL table OID to the respective DocDB table, resulting in a more efficient way to perform operations such as ALTER TYPE and ADD/DROP primary key. {{<issue 4034>}}
  • Allows ordered index scans with IN conditions on a lower column, ensuring accurate result order for YB LSM indexes, and generalizes the fix to all such indexes. {{<issue 19576>}}
  • Enables PgClientServiceImpl to periodically clear its own reserved_oids_map_, enhancing database cleaning and eliminating reliance on TabletServer for scheduling. {{<issue 19916>}}
  • Optimizes scans not requiring certain row order by allowing parallel scans of multiple partitions and secondary index scans, potentially altering the output row order in some queries without the ORDER BY clause. {{<issue 13737>}}
  • Replaces deprecated FetchValue with FetchRow, simplifying changes and fixing indentation issues in ‘pg_mini-’ without modifying formatting in other areas. {{<issue 19918>}}
  • Renames the term Active Universe History to Active Session History for enhanced comprehension. {{<issue 19948>}}
  • Introduces yb_silence_advisory_locks_not_supported_error as a temporary solution for users to avoid disruption when using advisory locks without actual lock acquisition. {{<issue 19974>}}
  • Marks the ysql_enable_read_request_caching GFlag as non-runtime since Postgres flags, except PG_FLAGs, cannot be dynamically updated, enhancing cache configuration consistency. {{<issue 19983>}}
  • Adds a configuration option for altering default key sorting from HASH to ASC in YSQL, facilitating smoother PostgreSQL migrations and efficiently using indexes with ASC sorting, especially for inequality and ORDER BY clause queries. {{<issue 19937>}}
  • Reworks the wait event format in YSQL and ASH to match the Postgres format, enhancing compatibility and simplifying association of wait events. {{<issue 19130>}}
  • Enables the start and end of wait events in the PGGate layer through a callback, introducing a new Flusher class, which returns a FlushFuture object providing an updated wait event and flush request duration. {{<issue 19137>}},{{<issue 20022>}}
  • Enables the pushdown of aggregates where the split is AGGSPLIT_INITIAL_SERIAL, thereby effectively forwarding phase 1 results from YB scan to a higher level, labeled as "Noop Aggregate". {{<issue 19839>}}
  • Enables ALTER TABLE rewrite commands, adding support for ALTER TABLE ADD COLUMN operations and modernizing REINDEX implementation for end-user indexes. {{<issue 19563>}}
  • Enables ignoring already existing tablespaces during YSQL DB backup-restore process with the newly added flag ignore_existing_tablespaces in the yb_backup.py script. {{<issue 20334>}}
  • Adjusts preload settings to allow users to specify additional tables in the ysql_catalog_preload_additional_table_list without forcing preloading of default tables. {{<issue 20290>}}
  • Adds Storage Row statistics to the EXPLAIN (ANALYZE,DIST) output, enabling users to distinguish between work done by the storage layer and the query layer and understand the selectivity of remote filters and index conditions. {{<issue 12676>}}
  • Reworks TID expectations in index scans for more clarity and convenience by sidelining the use of TID t_self or t_ybctid and ensuring the setting of either yb_agg_slot, xs_hitup, or xs_itup. {{<issue 20373>}}
  • Refactors IndexScanDesc yb_agg_slot to prevent setting during non-pushdown cases and eliminates return value from ybFetchNext for unnecessary instances, preventing future misuse. {{<issue 20371>}}
  • Replaces existing retry attempt flags ysql_max_read_restart_attempts and ysql_max_write_restart_attempts with a unified GUC variable yb_max_query_layer_retries to control retries in all isolation levels including Read Committed, with default reset to 60 retries. Defaults for retry_backoff_multiplier and retry_min_backoff adjusted to 1.2 and 10ms respectively. {{<issue 20359>}}
  • Centralizes all code for creating internal PostgreSQL connections, simplifying usage in ysql_upgrade, ysql index backfill, WaitForYsqlBackendsCatalogVersion and ddl replication. Now utilizes the detailed error message from PGConn::Connect. {{<issue 20655>}}
  • Revamps the ToString function to create unique responses for optional types (std/boost::optional), enhancing log readability and data relevance. {{<issue 20719>}}
  • Adds a new GUC yb_explain_hide_non_deterministic_fields to remove non-deterministic fields from EXPLAIN ANALYZE's output, reducing flakiness between runs in pg_regress tests. {{<issue 19492>}}
  • Corrects formatting errors in the pg_stat_get_activity function, aligns variable names, adds the yb_ prefix to txn_rpc_timestamp, and applies column indexing based on the PG_STAT_GET_ACTIVITY_COLS macro. {{<issue 20281>}}
  • Relocates Unknown Session Unit Test to pg_libpq, renaming it from PgBackendsTestSessionExpire to PgBackendsSessionExpireTest for convention conformity, enhancing testing protocol. {{<issue 20545>}}
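
To get a feel for the retry defaults mentioned above (retry_min_backoff of 10ms and retry_backoff_multiplier of 1.2), the following sketch prints the wait before each of the first few retries. It assumes a purely geometric schedule in which each wait is the previous one times the multiplier; the real retry logic may cap or randomize the waits, which is not modeled here. The backoff_schedule function is a hypothetical helper, not part of YugabyteDB.

```sh
#!/bin/sh
# backoff_schedule MIN_MS MULTIPLIER RETRIES
# Prints one line per retry with the assumed wait in milliseconds.
backoff_schedule() {
  awk -v min_ms="$1" -v mult="$2" -v retries="$3" 'BEGIN {
    delay = min_ms
    for (i = 1; i <= retries; i++) {
      printf "retry %d: %.2f ms\n", i, delay
      delay *= mult   # geometric growth (assumption of this sketch)
    }
  }'
}

# Defaults from the changelog entry above, for the first five retries.
backoff_schedule 10 1.2 5
```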
YCQL
  • Introduces an UpdateMapRemoveKey API, enabling the removal of specific keys from a Map, leaving all other keys unaffected. {{<issue 19829>}}
DocDB
  • Introduces yb_read_time GUC variable, usable by superusers to query the database at a specific point in time in the past, specifically aiding backup and restore scenarios. This variable helps generate a database schema of a specific past point using ysql_dump. Make sure it's not set before a DDL operation or during it. Default value is 0, meaning the data is read in real-time, while setting a Unix timestamp (in microseconds) allows reading data as of that time. {{<issue 19114>}}
  • Accelerates rollback and downgrade processes by introducing capability to demote AutoFlags, offering enhanced control over rollback version and emergency repair functionality with new yb-admin commands. {{<issue 13686>}}
  • Enables tracking of active WriteQuery objects and outstanding transaction status RPC requests at the tablet level for easier debugging. {{<issue 18940>}}
  • Introduces an /xcluster UI page for yb-tserver to track real-time statuses of xCluster source streams and target pollers with a capability to reset data following a restart. Also features sorting and a search box for easier navigation. {{<issue 19203>}}
  • Introduces a read-time flag in ysql_dump, offering a way to dump the database schema as of a specific point in time, improving backup restoration capabilities. {{<issue 19258>}}
  • Enhances timeout handling for YCQL index scans to avert overruns, resulting in less log spew, ensuring index tablet scans do not timeout prematurely at the YCQLProxy/YBClient side, and eliminating unnecessary repeated master leader requests. {{<issue 19221>}}
  • Reduces chances of transaction deadlocks and improves fairness in read committed isolation by modifying the order of transactions resumption across all tablets based on xactStartTimestamp. {{<issue 18055>}}
  • Switches the data transfer rate on the tserver UI from MiBps to KiBps for enhanced precision, considering the typical tablet data transfer range. {{<issue 19203>}}
  • Reduces tablet shutdown issues and delayed database operations by addressing a bug causing unnecessary blockage in clearing the ResumedWaiterRunner queue during WaitQueue shutdown. {{<issue 19272>}}
  • Offers redesigned server level aggregation for metrics, thus introducing more metrics for enhanced debugging. Removes several unused URL parameters and makes the new output compatible with YugabyteDB Anywhere and YugabyteDB Aeon, preventing double-counting issues in charts. Drops unused Json and Prometheus callbacks from MetricEntity for a cleaner design. {{<issue 18078>}}
  • Replaces glog includes with yb/util, introducing yb VLOG macros for clearer differentiation between INFO and VERBOSE logs, while addressing issues of duplicate includes. {{<issue 15273>}}
  • Adjusts the verbose level for VLOG macros to help differentiate between INFO and VERBOSE logs, fostering ease in debugging and analysis with better log filtration. {{<issue 15273>}}
  • Aligns retryable request timeouts with respective YCQL and YSQL client write timeouts, thus reducing unnecessary log replay during YCQL tablet bootstrap. {{<issue 18736>}}
  • Eliminates duplicate includes from specific files, providing clearer differentiation between INFO and VERBOSE logs for enhanced user debugging experience. {{<issue 15273>}}
  • Enables a retry mechanism for acquiring shared in-memory locks from the wait-queue during waiter resumption to respect client/statement timeout, reducing request failures and associated latency in contentious workloads. {{<issue 19032>}},{{<issue 19859>}}
  • Accelerates TServer initialization by handling deleted and tombstoned tablets asynchronously on startup, so the RPC port starts sooner. Introduces a new flag num_open_tablets_metadata_simultaneously to set the number of threads for opening tablets' metadata during startup, reducing startup time. The change also takes steps towards deleting the superblock in DeletedTablet. {{<issue 15088>}}
  • Introduces automatic recovery of index tables affected by a bug, effectively preventing performance degradation and disk size leak by ensuring that tombstones are properly filtered out by compactions once index backfilling is complete. {{<issue 19731>}}
  • Adds a 10s delay between an AutoFlag config update and its application, ensuring all tservers have the new config before any AutoFlags switch and begin producing new data. Guarantees process continuity by temporarily holding back new configs if the process restarts during apply time. {{<issue 19932>}}
  • Parallelizes the RPCs made during the DoGetLockStatus process in pg_client_service.cc to expedite fetching locks, enhancing database performance. {{<issue 18034>}}
  • Introduces support for upgrade and rollback of universes with xCluster links, checking AutoFlag compatibility during configuration changes. Includes error handling and broadcasting of AutoFlag config changes. The aim of these changes is to ensure that the target universe has the superset of specific AutoFlags. {{<issue 19518>}}
  • Enables logging of all instances of tablet metadata creation/updating, providing additional insights in case of tablet server startup crashes due to multiple meta records for the same tablet. {{<issue 20042>}}
  • Introduces a new get_auto_flags_config yb-admin command to retrieve the current AutoFlags configuration, aiding in debugging xCluster replication failures. {{<issue 20046>}}
  • Enhances pg_locks by including results from Single Shard transactions that previously went untracked, enabling users to query these transactions. During upgrades or downgrades to version 2024.1 and above, pg_locks queries may fail due to nodes lacking the newly implemented GetOldSingleShardWaiters service method. {{<issue 18195>}}
  • Expands load balancer metrics by incorporating tablets_in_wrong_placement, blacklisted_leaders, and tablet_load_variance, enhancing the tracking of load balancer progress. {{<issue 20118>}}
  • Adds new regular expression filters to the Prometheus metric endpoint by creating a distinct API for YugabyteDB Anywhere, offering server-level aggregation for tablet and table metrics. Users should add version=v2 to the URL for enabling this feature, granting control over metric output filters and determining the scope of metric aggregation effectively. {{<issue 19943>}}
  • Limits the number of rows returned per transaction per tablet in pg_locks to avoid potential memory issues during batch inserts, and includes additional fields to indicate partial lock info. {{<issue 20765>}}
  • Introduces a new GUC yb_locks_txn_locks_per_tablet to limit the number of rows returned by pg_locks, preventing the system from running out of memory during large transactions. {{<issue 19934>}}
  • Allows for the check of zero bytes at the end of SST data files, and enables an error report with the number of zeros once the flag rocksdb_check_sst_file_tail_for_zeros is set to a positive value. {{<issue 19691>}}
  • Boosts the bootstrap process by reading entries from the offset of the last flushed operation id instead of the segment's beginning, significantly reducing unnecessary reading. For colocated tables, it enforces the replaying of at least two segments when the lazy_flush_superblock is enabled. {{<issue 18312>}}
  • Prevents tservers from communicating with master leaders in different universe clusters averting possible data loss, by introducing a new universe_uuid field and an autoflag master_enable_universe_uuid_heartbeat_check to manage the tserver heartbeat checks. {{<issue 17904>}}
  • Rejects ConfigChange requests for system catalog while another server is transitioning, preventing potential data loss from mistaken quorum formation by new peers. {{<issue 18335>}}
  • Enables tracing of UpdateConsensus API by activating the collect_update_consensus_traces flag, offering visibility into remote follower traces and adding trace messages to local logs. The feature ensures upgrade/rollback safety and impacts the leader and follower only if both incorporate the change. {{<issue 19417>}}
  • Introduces the rocksdb_max_sst_write_retries flag to set the number of retry attempts if corruption is detected when writing SST file, affecting both flushes and compactions. {{<issue 19730>}}
  • Safeguards the master_join_existing_universe flag to prevent unnecessary initial sys catalog snapshot restoration. {{<issue 19357>}}
  • Adds a retry mechanism on block checksum mismatches and enhances error logging for better identification of transient read errors. {{<issue 20102>}}
  • Refines error messages on block checksum failure by including a retry scheme and logging on success or failure, offering better error tracking. {{<issue 20102>}}
  • Adds a URL parameter, show_help, to the scrape endpoint, enabling control over display of help and metadata information, overriding the export_help_and_type_in_prometheus_metrics GFlag. {{<issue 19176>}}
  • Renames AsyncClientInitialiser to AsyncClientInitializer for consistency in naming conventions. {{<issue 19920>}}
  • Introduces flags tablet_replicas_per_gib_limit, tablet_replicas_per_core_limit, and tablet_overhead_size_percentage to customize tablet replication based on cluster resources, enhancing user control over system load balance. {{<issue 16177>}}
  • Introduces a new script, analyze_test_results.py, to reconcile discrepancies between Spark-based test runner and JUnit-compatible XML test reports, offering more accurate and reliable test results. {{<issue 18594>}}
  • Allows for YSQL parallel scans by breaking table tablets keyspaces into ranges of similar data size for efficient scanning time. {{<issue 19341>}}
  • Reduces unwanted logging in LogAfterLoad when a single 0 version is loaded, thus minimizing unnecessary log generation especially when managing many YSQL databases. {{<issue 18489>}}
  • Introduces AreNodesSafeToTakeDown API that ensures safe node removals during cluster upgrades or maintenance operations by checking tablet health and follower lag, facilitating seamless and risk-free updates. {{<issue 17562>}}
  • Adds a show-changes command to the sys-catalog-tool to search and provide details of all updated entries marked as ADD, CHANGE, or REMOVE. Run this command before the update to validate the expected changes in the SysCatalog JSON file. The command interacts only with the file; it does not read from or write to the SysCatalog. {{<issue 18800>}}
  • Enhances the TCMalloc heap snapshot functionality with additional columns for estimated bytes and samples count from a call stack, allowing direct comparison with the total system memory and accurate proxy for memory usage. {{<issue 19071>}}
  • Tracks and batches updates for rocksdb and tablet-level event stats metrics, distinguishing between counter and gauge metrics, and exposing them in EXPLAIN (ANALYZE, DIST, DEBUG) and tracing. {{<issue 16785>}}
  • Adopts the trace outside the block for ensuring correct execution of per-session tracing with standalone traces, and fixes callbacks to adopt the appropriate trace. {{<issue 19099>}}
  • Modifies the use of scan choices to be more effective in scenarios where only the lower bound is specified, improving scan performance. {{<issue 19117>}}
  • Allows tracking of per-RPC wait-states using WaitStateInfo for incoming RPC updates, ensuring safe upgrades and functioning ASH without interfering with existing functionalities. {{<issue 19138>}}
  • Optimizes PgWire response serialization for large query results, enhancing overall read performance. {{<issue 19213>}}
  • Reduces high load issues by renaming blocking synchronous YBSession flush functions to TEST_* and replacing them with non-blocking asynchronous versions (FlushAsync). {{<issue 12165>}}
  • Reduces the safe time lag in the xCluster by sending the apply safe time more frequently when there are no active transactions. {{<issue 19274>}}
  • Elevates the timeout in TSAN mode for the PgSharedMemTest.TimeOut test, averting potential table creation timeouts. {{<issue 19313>}}
  • Adds a new retrying master-to-master task, allowing for the API AreNodesSafeToTakeDown to check if it's safe to remove or upgrade certain nodes without disrupting overall cluster health. {{<issue 17562>}}
  • Replaces EnableVerboseLoggingForModule with google::SetVLOGLevel for a less complex procedure in setting the module log level, eliminating the updating of the vmodule gflag. {{<issue 19344>}}
  • Renames cdc to xcluster, moves ValidateTableSchema to xrepl_catalog_manager and renames it to ValidateTableSchemaForXCluster. Revises allow_ycql_transactional_xcluster to be a TEST flag, enhances XClusterManager's ability to handle XCluster related control logic, and launches dedicated XClusterConfig class. {{<issue 19353>}}
  • Reduces macOS 13.6 linker warnings by updating the compiler to avoid duplicate RPATHs, enables failure on duplicate RPATHs through YB_FAIL_ON_DUPLICATE_RPATH, and cleans build system. {{<issue 19378>}}
  • Enables thread safety analysis for members passed by reference by setting the -Wthread-safety-reference compiler flag, and fixes all resulting build errors for increased stability. {{<issue 19365>}}
  • Enables the TEST_SYNC_POINT macro in release builds, reducing its impact in production by checking FLAGS_TEST_enable_sync_points before making expensive SyncPoint calls. {{<issue 19379>}}
  • Introduces XClusterManager to handle all XCluster related control logic in the yb-master, creates a dedicated class XClusterConfig for changes to XClusterConfigInfo, and makes allow_ycql_transactional_xcluster a TEST flag. {{<issue 19353>}}
  • Adds a skip_indexes command line option to create_snapshot and create_keyspace_snapshot, allowing users to exclude indexes when creating backups in YCQL. {{<issue 14142>}}
  • Enables a fallback to RPC when request or response exceeds the scope of allocated shared memory, ensuring continued functionality in larger data scenarios. {{<issue 19430>}}
  • Enhances thread safety analysis by enabling the -Wthread-safety-precise compiler flag, which increases scrutiny on mutex field assignments, and adds the ability to override the compiler type for third-party archive selection using YB_COMPILER_TYPE_FOR_THIRDPARTY environment variable. {{<issue 19462>}}
  • Simplifies xCluster code by allocating related tests to a separate file, introducing XClusterManager for better control logic, and establishing a dedicated XClusterConfig class for changes to XClusterConfigInfo. {{<issue 19353>}}
  • Removes a disabled test, enhancing master start in shell mode with either an empty master_addresses or a set master_join_existing_universe flag. {{<issue 19528>}}
  • Saves memory and disk space by introducing a JoinStringsLimitCount utility, which limits reporting and logging to the first 20 elements of large arrays such as tablet IDs. {{<issue 19527>}}
  • Filters out tservers in the read cluster when determining whether to add new tablet replicas to the cluster, providing the dual ability to manage CPU usage when maintaining idle tablets and ensure robust front-end work operations. This process includes configuration adjustments to tablet_replicas_per_gib_limit, tablet_replicas_per_core_limit, and tablet_overhead_size_percentage flags. {{<issue 16177>}}
  • Renames test file "xcluster_ysq_colocated" to "xcluster_ysql_colocated" for enhanced clarity and correction of a previous error. {{<issue 19531>}}
  • Allows GLog traces longer than the 30k limit by splitting the output into lines of less than 30k each, and introduces a new GFlag trace_max_dump_size to limit the size of printed traces. {{<issue 19532>}}
  • Adds a metric for running tablet peers per tserver for easy calculation of tablet peers to cores, and tablet peers to memory ratios on YBM clusters. {{<issue 9647>}}
  • Renames CDCTabletMetrics to XClusterTabletMetrics and several related files, refines metrics retrieval and setting, and enhances handling of race conditions for smoother data management. {{<issue 20079>}}
  • Switches tablet_replicas_per_core_limit and tablet_replicas_per_gib_limit to runtime flags, for setting and adjusting resource-based tablet limits on-the-go. {{<issue 16177>}}
  • Enables aggregation of retryable requests mem-tracker metric at table-level for Prometheus by assigning the entity to the mem-tracker after the Tablet opens with the tablet metric entity. {{<issue 19301>}}
  • Implements a wait period after the addition of new transaction status tablets, enhancing the stability of XClusterYSqlTestConsistentTransactionsTest.UnevenTxnStatusTablets. {{<issue 19302>}}
  • Upgrades OpenSSL to version 3.0.8, disabling Linuxbrew builds and enabling glog to use the stack unwinding function based on backtrace. {{<issue 19736>}}
  • Facilitates the use of remote_build.py tool by interpreting arguments for yb_build.sh even when they couldn't be correctly parsed as remote_build.py arguments. {{<issue 19696>}}
  • Introduces the trace_max_dump_size GFlag (default 25000) to limit the size of printed traces, and works around GLog's character limit when printing long traces. {{<issue 19532>}},{{<issue 19769>}}
  • Relocates XClusterConfigInfo and XClusterSafeTimeInfo from catalog_entity_info.h to xcluster_catalog_entity.h, and from catalog_loaders.h to xcluster_catalog_entity.h, respectively. Also, establishes a SingletonMetadataCowWrapper for singleton catalog entities, creates an XClusterManager interface, and transfers xcluster_safe_time_info_ and its functions from CatalogManager to XClusterManager. {{<issue 19713>}}
  • Facilitates a more rapid server initialization by deleting the superblock within the DeleteTablet process when the delete_type is TABLET_DATA_DELETED, reducing the number of DELETED tablet superblocks at server startup. {{<issue 19840>}}
  • Introduces a continuation marker for better traceability when a trace segment is split into multiple LOG(INFO) outputs; also adds a new GFlag trace_max_dump_size to limit the size of traces printed. {{<issue 19532>}},{{<issue 19808>}}
  • Generates an enhanced error message displaying the version info when the yb process incorrectly starts on an older version after AutoFlags have been enabled, aiding in easier problem identification. {{<issue 16181>}}
  • Renames producer_id to replication_group_id in older proto messages, standardizing the replication group identity for enhanced consistency and rollback safety. {{<issue 19825>}}
  • Centralizes common helper functions for YCQL xcluster tests into XClusterYcqlTestBase for streamlined testing procedures. {{<issue 19830>}}
  • Balances tablet load more evenly across all drives, preventing bottlenecks during remote-bootstrapping by evenly distributing tablets and utilizing available disk bandwidth. {{<issue 19846>}}
  • Introduces additional debug logs for troubleshooting SELECT statement errors that could arise from processing non-provisional records or writing provisional records without a hybrid timestamp. {{<issue 19876>}}
  • Cleans up allocated shared memory objects on TServer startup if the TServer process didn't shut down gracefully. {{<issue 19988>}}
  • Enhances the demote_single_auto_flag yb-admin command by returning specific error messages for invalid process_name, AutoFlag name, or non-promoted AutoFlag, making identifications easier. {{<issue 20004>}}
  • Enables monitoring of master leader heartbeat delays through a new RPC in the MasterAdmin, ensuring undesired lags can be readily detected and mitigated. {{<issue 18788>}}
  • Avoids indefinite mutex lock and TServer thread blockage by correctly handling crashes during request transmission via shared memory. {{<issue 20050>}}
  • Eliminates usage of UNKNOWN flags in tools, marking them as NON_RUNTIME since dynamic update of these flags is not supported. {{<issue 20123>}}
  • Renames the misleading cdc xCluster metric entity to xcluster, ensuring an accurate representation without affecting dependencies as services like YugabyteDB Anywhere rely on the unchanged metric name. {{<issue 20131>}}
  • Establishes a flag to manage indexing backfills, offering control over whether non-deferred indexes should be batched during the backfill operation. {{<issue 20213>}}
  • Delivers automatic recovery for index tables affected by a bug previously found and addressed, preventing any future performance issues triggered by incorrectly set property values. {{<issue 20247>}}
  • Changes Successfully read [n]ops from disk. logs to verbose logging, lowering the frequency of identical log outputs and boosting performance. {{<issue 20287>}}
  • Allows configuration of the yb_build.sh script via .git/yb_buildrc and ~/.yb_buildrc bash scripts, to specify implicit arguments or alternative defaults before parsing command line arguments. {{<issue 20291>}}
  • Converts UNKNOWN flags to either RUNTIME or NON_RUNTIME in DocDB for optimal flag management. {{<issue 16979>}}
  • Marks the Tserver flag num_concurrent_backfills_allowed as RUNTIME instead of UNKNOWN for better manageability. {{<issue 20348>}}
  • Upgrades unit test key/certificate pairs from 1024-bit RSA keys to 2048-bit, meeting FIPS 140-2 requirements, and integrates their generation into the build process. {{<issue 20370>}}
  • Marks the force_global_transactions, ycql_use_local_transaction_tables, and auto_promote_nonlocal_transactions_to_global gflags as runtime, enabling them to be changed directly as required for each new transaction. {{<issue 20479>}}
  • Organizes AutoFlags management across dedicated MasterAutoFlagsManager, TserverAutoFlagsManager and subset AutoFlagsManagerBase, offering neat code architecture and resolving a bug in Master::InitAutoFlags. {{<issue 19958>}}
  • Renames cdc::ProducerTabletInfo to cdc::TabletStreamInfo and removes ReplicationGroupId from it, relocates ReplicationGroupId from cdc to xcluster namespace, and introduces xcluster::ProducerTabletInfo to optimize naming consistency. {{<issue 20452>}}
  • Enables the use of the OpenSSL FIPS module by setting the new openssl_require_fips = true gflag, ensuring FIPS standard compliance for database cluster creation. {{<issue 20524>}}
  • Adds Prometheus metrics for server hard and soft memory limits, enabling better tracking of memory use in TServer or master and creation of dashboard charts for universes using non-default values. {{<issue 20578>}}
  • Introduces a helper function that checks if a CowObject has a write lock, offering special functionality in retail mode and debug mode for enhanced thread safety. {{<issue 20599>}}
  • Eliminates the issue of accessing erased objects in the ClusterLoadBalancer::RunLoadBalancerWithOptions, enhancing the runtime performance. {{<issue 20673>}}
  • Streamlines bloom filter key calculation by avoiding duplicate calculations. This results in approximately 4.5% tserver time improvement, and overall 1.5% performance boost. {{<issue 20720>}}
  • Limits the number of tablets per node, and hastens reaching the desired number of tablets by lowering the values of FLAGS_tablet_split_low_phase_shard_count_per_node to 1 and FLAGS_tablet_split_low_phase_size_threshold_bytes to 128_MB. {{<issue 20579>}}
  • Introduces new auto flags to stave off backward compatibility issues related to version 2.20, ensuring the stable existence of previously promoted AutoFlags during process startup time. {{<issue 13474>}}
  • Adds verbose logs for frequent global and per-table state changes within a load balancer run for easier debugging. {{<issue 20289>}}
  • Splits XClusterManager into two separate managers, XClusterSourceManager and XClusterTargetManager, each handling different objects, to enhance code readability and component isolation. {{<issue 20737>}}
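
The JoinStringsLimitCount utility mentioned in the list above caps how many elements of a large array (such as tablet IDs) end up in a log line. The following Python sketch only illustrates that idea; it is not the actual C++ implementation. The function name `join_strings_limit_count`, the separator, and the "... and N more" suffix are assumptions made for illustration. Only the limit of 20 elements comes from the release note.

```python
from typing import Iterable


def join_strings_limit_count(items: Iterable[str], limit: int = 20, sep: str = ", ") -> str:
    """Join at most `limit` items and summarize the remainder as a count."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    all_items = list(items)
    parts = all_items[:limit]
    omitted = len(all_items) - len(parts)
    if omitted > 0:
        # Hypothetical suffix format; the real utility may report this differently.
        parts.append(f"... and {omitted} more")
    return sep.join(parts)


# Example: 25 made-up tablet IDs are reduced to the first 20 plus a summary.
tablet_ids = [f"tablet-{i}" for i in range(25)]
print(join_strings_limit_count(tablet_ids))
```

Capping the output this way keeps log lines bounded regardless of how many tablets a node hosts, which is where the memory and disk savings would come from.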
CDC
  • Integrates CDCSDK stream creation for a namespace into YugabyteDB master, introducing support for retrieving a CDC stream via cdcsdk_ysql_replication_slot_name. Invalidates deprecated logic in cdc_service, focusing on YSQL strategies instead. Promotes explicit parameter requirements for request validation when namespace_id is populated. Addresses a race condition and initial checkpoint discrepancy in CreateCDCStream. This alteration modifies the sys-catalog entry and requires clients to check the autoflag yb_enable_replication_commands. {{<issue 19211>}}
  • Enables CRUD syntax for Publications in YSQL as part of a YSQL API for CDC via the PG logical replication mechanism, allowing users to specify tables for streaming through CDC. However, CDC does not support certain features, which may limit table selection and result in errors. The change is irreversible due to the introduction of the yb_enable_replication_commands autoflag. {{<issue 18930>}},{{<issue 18933>}},{{<issue 18931>}}
  • Allows maxAttempts for RPCs in AsyncClient to be adjustable, decreasing the risk of Too many attempts exceptions occurring in a short period. {{<issue 12751>}}
  • Enables deletion of CDCSDK streams through replication slot names, advancing the support for SQL syntax for CDC via the PG logical replication model. However, this feature isn't rollback safe and is disabled during upgrades, requiring a subsequent check of the autoflag yb_enable_replication_commands. {{<issue 19212>}}
  • Introduces support for creating, viewing, and dropping replication slots in YSQL. Adds two interfaces for support, functions pg_create_logical_replication_slot and pg_drop_replication_slot, and Walsender commands CREATE_REPLICATION_SLOT and DROP_REPLICATION_SLOT. Inserts view pg_replication_slots for viewing replication slots. Fixes two issues concerning cleanup of held locks and skipping cache refresh. {{<issue 19211>}},{{<issue 19212>}},{{<issue 19509>}}
  • Prevents Object already exists error during consecutive CreateCDCStream and DeleteCDCStream calls by effectively handling the stream delete state, and supports creating a CDCSDK stream for a namespace via SQL syntax. {{<issue 19211>}},{{<issue 19212>}}
  • Automatically forwards CreateCDCStream requests to yb-master for atomic creation of CDCSDK streams, enhancing consistent snapshot capability. This is covered by the ysql_yb_enable_replication_commands flag and temporarily bypasses the requirement for a replication slot name. {{<issue 18890>}}
  • Improves recognition of replication commands, paving the way for replication slot support. Also adds the ability to create a CDCSDK stream for a namespace via SQL syntax and fixes specific race conditions. {{<issue 19211>}},{{<issue 19212>}}
  • Defines replication slots as active or inactive in YugabyteDB, considering a slot active if it's consumed within the set duration defined by the ysql_cdc_active_replication_slot_window_ms Tserver GFlag. This change allows better visibility into slot activities and prevents dropping of active slots. It also addresses a bug in the WaitForGetChangesToFetchRecords function used in testing. {{<issue 19211>}},{{<issue 19212>}}
  • Supports the creation of CDCSDK stream for a namespace, with the ability to fetch it using cdcsdk_ysql_replication_slot_name. Simultaneously, addresses a race condition problem during the CreateCDCStream operation and ensures proper initial checkpoint setting in cdc_state_table. Introduces limits on replication slots (CDC stream) utilizing a GFlag and reports the status when the slots limit is reached. This change also accommodates the detection of replication commands in yb_is_dml_command. {{<issue 19211>}},{{<issue 19212>}}
  • Enables reading of Decimal and VarInt datatypes in CDC for CQL. {{<issue 19726>}}
  • Reinstates support for identifying replication commands after a previous rollback. Allows users to create a CDCSDK stream for a namespace and to retrieve a CDC stream using cdcsdk_ysql_replication_slot_name. Addresses a race condition issue between CreateCDCStream and the CatalogManager's background cleanup task and fixes a problem related to the initial checkpoint of tables in the cdc_state_table for CDCSDK. Also reintroduces the ability to determine whether a replication slot (CDC stream) is active or inactive. {{<issue 19211>}},{{<issue 19212>}}
  • Limits the number of replication slots in YSQL with max_replication_slots GFlag, introducing an error code for when the limit is reached, and enhances CDC stream creation. {{<issue 19211>}}
  • Displays the replication commands conducted by walsenders in the pg_stat_activity section. The new implementation supports the creation of a CDCSDK stream for a namespace via cdcsdk_ysql_replication_slot_name, enables the detection of replication commands without errors, and introduces the limitation of the number of CDCSDK streams by the max_replication_slots GUC. {{<issue 19211>}},{{<issue 19212>}}
  • Expands the range of SQL commands that can be issued to a walsender, increases support for creating CDCSDK stream for a namespace, and guards against a potential race condition between CreateCDCStream and the CatalogManager background cleanup task. {{<issue 19211>}},{{<issue 19212>}}
  • Avoids erroneous deletions from the cdc_state table caused by a race condition during tablet splits by reversing the call order in the CleanUpCDCStreamsMetadata method. {{<issue 19746>}}
  • Detects replication commands in yb_is_dml_command, supports creating logical replication slots through SQL using CREATE_REPLICATION_SLOT and pg_create_logical_replication_slot. The change includes support for CDCSDK stream creation, imposes limit on replication slots/streams, and resolves a race condition related to CreateCDCStream. {{<issue 19211>}},{{<issue 19212>}}
  • Changes yb_enable_replication_commands from an autoflag to a TEST flag, making it safer and more flexible to enable the replication slots feature by default. Supports YSQL commands for replication slots when the flag is true, and disallows them when the flag is false. It also rectifies a race condition between CreateCDCStream and the CatalogManager background cleanup task. The revision further supports the creation of CDCSDK stream for a namespace, aiding in the long-term goal of supporting SQL syntax for CDC. {{<issue 19211>}},{{<issue 18890>}}
  • Ensures cleanup of entries from cdcsdk_replication_slots_to_stream_map_ when corresponding entries are deleted from cdc_stream_map_, avoiding potential inconsistencies. {{<issue 19211>}}
  • Introduces a new yb-admin CLI command and master RPC to enable backfilling of a replication slot name to existing CDCSDK streams, providing manageable streams via YSQL Publication/Replication slot interface. {{<issue 19261>}}
  • Logs a NOTICE for each unsupported table when creating a publication using the FOR ALL TABLES case in CDC, improving user visibility on skipped tables. {{<issue 19291>}}
  • Enriches the CDCStreamInfo java class with a new cdcsdk_replication_slot_name field and an accessor method for better support of Publication/Replication slot. {{<issue 19811>}}
  • Optimizes the CreateCDCStream by eliminating unnecessary sleep statements, preventing a race condition, and ensuring correct initial checkpoint settings for the CDCSDK. Also, this code change introduces support for SQL syntax for CDC using the Postgres logical replication model, allows detecting replication commands without errors and defines whether a replication slot (CDC stream) is active or not. {{<issue 19211>}}
  • Transforms yb_enable_replication_commands into a runtime PG preview flag, correcting a bug that caused publication commands to always be enabled regardless of flag value. {{<issue 18930>}}
  • Introduces a GFlag to toggle automatic tablet splitting for tables within a CDCSDK stream, enhancing user control over replication processes. {{<issue 19482>}}
  • Expands support for two new record types: PG_DEFAULT and PG_NOTHING based on Postgres replica identity types while maintaining backwards compatibility by renaming ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes to PG_FULL and PG_CHANGE_OLD_NEW respectively. A failsafe cdc_enable_postgres_replica_identity autoflag is added. {{<issue 19260>}}
  • Addresses a test failure in TestCreateCDCStreamForNamespaceLimitReached by specifically adding the record type CHANGE to the stream request. Enables support for two new record types PG_DEFAULT and PG_NOTHING, while retaining the ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes. Adjusts settings using the newly added autoflag cdc_enable_postgres_replica_identity. {{<issue 19260>}}
  • Introduces support for two new record types, DEFAULT and NOTHING, based on Postgres replica identity types, and renames ALL and MODIFIED_COLUMNS_OLD_AND_NEW_IMAGES modes to PG_FULL and PG_CHANGE_OLD_NEW respectively for backward compatibility. It introduces an autoflag cdc_enable_postgres_replica_identity during CDC stream creation and adjusts the failing test TestCreateCDCStreamForNamespaceLimitReached by specifying the record type CHANGE. {{<issue 19260>}}
  • Enhances CDCSDK to report tablet splits promptly upon detection, controls data duplication by cross-referencing hash_key bounds, and optimizes the retrieval of child tablets via tablet_peer. {{<issue 18479>}}
  • Refines the GetCheckpointResponse to indicate snapshot_key presence only when present, enhancing accuracy of bootstrapping and streaming processes. {{<issue 19292>}}
  • Introduces the UpdateMapUpsertKeyValue API that lets you update specific keys without needing to re-add all keys, allowing for more efficient updates. {{<issue 19577>}}
  • Enhances the CDC State Table's key update efficiency by selectively updating or removing keys as needed, without having to replace the entire map column. {{<issue 19577>}}
  • Reactivates the cdcsdk_stream-test for TSAN mode, previously disabled, enhancing overall testing capabilities. {{<issue 19752>}}
  • Helps ensure failed CDCSDK stream creation processes are rolled back effectively, reducing problems caused by incomplete creations through a ScopeExit mechanism. Manual clean-up may be required in certain failure scenarios until DDL atomicity for alter table statements is implemented. {{<issue 18934>}}
  • Enables the tests in cdcsdk_snapshot-test to run in TSAN mode, augmenting their utility and coverage. {{<issue 19752>}}
  • Rectifies the intermittent failure issue in TestReleaseResourcesOnUnpolledSplitTablets by ensuring that UpdatePeersAndMetrics thread refreshes the cached CDC stream metadata if in the initialized state. {{<issue 18934>}}
  • Alters the default checkpoint type to EXPLICIT during stream creation, ensuring no upgrade or rollback issues due to alterations in the default proto field value. {{<issue 18748>}}
  • Allows yb-client to apply retries for retryable error codes, preventing the unnecessary resetting of attempts and deadlines when a CDCErrorException is encountered. {{<issue 19648>}}
  • Releases retention barriers on tables that are not of interest in a CDC Consistent Snapshot stream, after a timeout defined by the new GFlag "cdcsdk_tablet_not_of_interest_timeout_secs". This enhances user control over snapshot consumption. {{<issue 20146>}}
  • Refactors tests to use ASSERT_EQ assertions, not ASSERT_GE, for checking consumed record count, utilizing GetChangeRecordCount method for more accurate record handling and tablet splitting. {{<issue 20261>}}
  • Switches the default consistent snapshot option to USE_SNAPSHOT when creating a new stream, and converts the Consistent Snapshot feature to a preview feature guarded by the RUNTIME_PREVIEW flag yb_enable_cdc_consistent_snapshot_streams. {{<issue 20367>}}
  • Changes the default value of the gflag "cdcsdk_tablet_not_of_interest_timeout_secs" to 4 hours, enhancing the CDC Consistent Snapshot feature, which remains guarded by the PREVIEW flag "yb_enable_cdc_consistent_snapshot_streams". {{<issue 20378>}}
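
To make the active/inactive distinction for replication slots concrete: per the entry above, a slot counts as active if it was consumed within the window set by ysql_cdc_active_replication_slot_window_ms, and active slots cannot be dropped. The Python sketch below models that rule under stated assumptions and is not YugabyteDB code. The function names, the millisecond timestamps, the inclusive boundary, the treatment of never-consumed slots as inactive, and the example window of 60000 ms are all hypothetical.

```python
from typing import Optional


def is_slot_active(last_consumed_ms: Optional[int], now_ms: int, window_ms: int) -> bool:
    """A slot is active if it was consumed within the last `window_ms` milliseconds."""
    if last_consumed_ms is None:
        # Assumption: a slot that was never consumed is treated as inactive.
        return False
    return now_ms - last_consumed_ms <= window_ms


def can_drop_slot(last_consumed_ms: Optional[int], now_ms: int, window_ms: int) -> bool:
    """Only inactive slots may be dropped."""
    return not is_slot_active(last_consumed_ms, now_ms, window_ms)


# Hypothetical window value, chosen only for this example.
WINDOW_MS = 60_000
now = 1_000_000

print(is_slot_active(990_000, now, WINDOW_MS))   # consumed 10 s ago
print(can_drop_slot(900_000, now, WINDOW_MS))    # consumed 100 s ago
```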
yugabyted
  • Integrates client-to-server encryption support for YSQL Connection Manager, securing the connection between the client application, YSQL Connection Manager, and pg_backend through enabling SSL connectivity. Uses the existing use_client_to_server_encryption and certs_for_client_dir flags to enable and configure this feature, while not supporting certification files set via ysql_pg_conf_csv and cert-based authentication. Ensures upgrade and rollback safety without the need for an auto flag or node communication. {{<issue 19108>}}
  • Publishes YSQL Connection Manager metrics on <tserver_ip_address>:13000/prometheus-metrics, enhancing data monitoring and diagnostics. {{<issue 19109>}}
  • Alters the format of YSQL connection manager's prometheus metrics on the prometheus-metrics endpoint to include the database as a metric label. {{<issue 19484>}}
  • Enables faster and more secure unix socket connections between Ysql Connection Manager and pg backend on the same machine, replacing the previous TCP/IP connections. Introduces a new flag ysql_conn_mgr_use_unix_conn to configure this feature. {{<issue 19483>}}
  • Enables the use of the YSQL Connection Manager feature as an alternative in the yb-pgsql java test framework by setting the YB_ENABLE_YSQL_CONN_MGR_IN_TESTS environment variable to true. {{<issue 19703>}}
  • Allows for the creation of separate pools for each user/database combination in the Ysql Connection Manager, eradicating the need to set the user context at the beginning of each transaction. Updates to stats/metrics format also enhance database pool tracking. {{<issue 19722>}}
  • Enables restriction of encryption to the logical connection only in YSQL Connection Manager by setting use_client_to_server_encryption. Physical connections, between the YSQL connection manager and Postgres process on the same machine, are not encrypted, enhancing internal performance without sacrificing secure external communications. {{<issue 19108>}}
  • Introduces the GUC variable ysql_conn_mgr_sticky_object_count for easier and faster tracking of connection stickiness in YSQL Connection Manager tests, eliminating the need to modify pool sizes. {{<issue 20067>}}
  • Introduces the GUC variable yb_use_tserver_key_auth for authenticating clients using yb-tserver-key. Removes the "postgres only" requirement for yb-tserver-key authentication and sets ysql_conn_mgr_use_unix_conn as true by default. Requires no HBA changes. {{<issue 19996>}}
  • Integrates a database migration visualization tool in the yugabyted UI, including a new dashboard for monitoring migration progress and complexity, facilitating smoother transition from other databases. {{<issue 18782>}}
  • Corrects the CPU usage Sankey diagram to accurately report used and available values, enhancing reliability of performance metrics on the performance page. {{<issue 19991>}}
  • Enables a new user interface feature in yugabyted for connection management metrics, displaying metrics on active and total logical/physical connections, and providing a clickable banner to navigate to dedicated connections visuals. {{<issue 18805>}}
  • Rectifies confusion with the yugabyted-UI; password authentication no longer incorrectly shows as enabled for an insecure cluster unless the encryption-at-rest is activated. {{<issue 19295>}}
  • Rectifies the misalignment in the display of status messages for specific scenarios in yugabyted. {{<issue 19334>}}
  • Corrects the display of the total number of CPUs on the overview page and ensures live queries show all statuses, not just idle. {{<issue 19414>}}
  • Offers the ability to set preferred regions using yugabyted CLI for lower latencies, by expanding the functionality of the constraint_value flag, offering a way to assign preference orders to Availability Zones (AZ). {{<issue 19415>}}
  • Corrects join flag bugs, ensuring a smooth start command even if a node's join IP is not an active master, and enables error handling when the placement_uuid from the join IP can't be obtained. Now supports hostnames and handles edge cases for addresses provided through the CLI. {{<issue 19316>}},{{<issue 19314>}}
  • Adjusts the yugabyted start command to interpret 0.0.0.0 as 127.0.0.1 in the advertise_address, aligning with the IP use in master, tserver, and yugabyted-UI. {{<issue 18580>}}
  • Adds prerequisite checks to confirm that default ports are open before yugabyted starts. Depending on which ports are blocked, yugabyted either fails to launch or starts with warnings about impaired functionality. {{<issue 19504>}}
  • Integrates ysql connection manager stats into the tserver metrics snapshotter, which can be enabled via the metrics_snapshotter_tserver_metrics_whitelist gflag, offering visibility into total logical and physical connections. {{<issue 18805>}}
  • Allows the metrics whitelist to include the ysql_conn_mgr flag only if the connection manager is enabled, keeping connection manager metrics consistent with the yugabyted UI. {{<issue 18805>}}
  • Enables Yugabyted UI to display Alert messages from all nodes by directing API calls through the yugabyted API server. {{<issue 19972>}}
  • Resolves an issue where the UI failed to launch when advertise_address=0.0.0.0 by ensuring 127.0.0.1 is used instead, and adds a connection check for address uniqueness and timeout for tserver API calls. {{<issue 18580>}}
  • Enables the starting of two different local RF-1 instances on Mac by adding a check for empty join flag during the second node's initiation. {{<issue 20018>}}
  • Removes the deprecated gflag use_initial_sys_catalog_snapshot, replaced by enable_ysql that is now true by default, eliminating repetitive warning messages on starting yugabyted nodes. {{<issue 20056>}}
  • Adapts yugabyted-ui to efficiently support Kubernetes (k8s) deployments, ensuring correct function for nodes with only masters. Adds a new bind_address flag for customizing the API server's bind address. {{<issue 20301>}}
  • Rectifies the malfunction in yugabyted-ui when yugabyted utilizes custom ysql_port and ycql_port values by introducing a new flag for YCQL port number. {{<issue 20406>}}
  • Updates the yugabyted-ui backend to align with changes in the connection manager stats consumed from the :13000/connections endpoint, catering for removal of pool_name and addition of database_name and user_name. {{<issue 20494>}}
  • Adds yugabyted-ui support to the K8s OSS Yugabyte helm chart, including new values to control UI and metrics snapshotter activation for enhanced metrics visualisation in the K8s environment. {{<issue 20344>}}
  • Retains the integrity of user's custom configuration file by associating config flag with start command, and directs updates to a yugabyted generated file within base_dir/conf directory. {{<issue 20881>}}
  • Allows a smooth restart of the second node in a cluster using the join flag without throwing any errors. {{<issue 20684>}}
  • Enables a predefined set of gflags related to the pg-parity project using the enable_pg_parity flag in the yugabyted start command. {{<issue 21221>}}
  • Changes the flag enable_pg_parity to enable_pg_parity_tech_preview for activating a predefined set of gflags related to the pg-parity project with the yugabyted start command. {{<issue 21221>}}
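The advertise_address handling described in the list above (a wildcard 0.0.0.0 treated as 127.0.0.1) can be pictured with a small shell sketch. The function name `normalize_advertise_address` is hypothetical and is not part of yugabyted; it only illustrates the mapping, and assumes every other address is passed through unchanged.

```sh
#!/bin/sh
# Hypothetical helper, not yugabyted code: map the wildcard address to
# loopback and pass every other address through unchanged.
normalize_advertise_address() {
  case "$1" in
    0.0.0.0) printf '%s\n' "127.0.0.1" ;;
    *)       printf '%s\n' "$1" ;;
  esac
}

# Example calls: one wildcard address and one regular address.
normalize_advertise_address "0.0.0.0"
normalize_advertise_address "10.0.0.5"
```

Normalizing the address once, before it is handed to each component, is one way to keep the master, tserver, and UI in agreement about which IP is in use.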
Other improvements
  • Introduces a strict deletion check for orphaned tablets to prevent erroneous data loss when the master issues DeleteTablets to tservers, with the feature guard master_enable_deletion_check_for_orphaned_tablets=true, ensuring upgrade and downgrade safety. {{<issue 18332>}}
  • Simplifies reading of remotely fetched traces by introducing proper nesting levels and splitting multi-line trace entries into different lines. {{<issue 19758>}}
  • Enables monitoring of inbound calls for read and write RPCs without any performance impact, by maintaining and updating WaitStateInfo during execution and annotating waits during I/O and lock/condition waiting. {{<issue 19143>}}
  • Switches release packaging to use native libraries on lowest common version (centos7 for linux-x86) instead of linuxbrew libraries, introducing changes to the default calculation for linuxbrew builds in the 2.21 release. {{<issue 19219>}}
  • Redefines release packaging to use native library build instead of linuxbrew, boosting compatibility with later OS versions. Changes the default setting for linuxbrew builds to false. Fixes shellcheck errors in compiler wrapper. {{<issue 19219>}}
  • Redesigns build options parsing in Jenkins for better compatibility, switching from YB_BUILD_OPTS evaluation to YB_* environment variables, and fixes shellcheck errors in the compiler wrapper. {{<issue 19219>}}
  • Corrects the Jenkins build error that occurred when YB_BUILD_OPTS was not set, ensuring smooth build operations even in its absence. The change switches the packaging method to use native library build instead of Linuxbrew, offering better compatibility with later OS versions. {{<issue 19219>}}
  • Ensures consistency at the time of stream creation in the CDC Consistent Snapshot feature by selecting a single common read point across all tablets within the input database. Additionally, guards changes with the TEST flag yb_enable_cdc_consistent_snapshot_streams, set to false by default. Also includes alteration to create stream workflow on the Master side and introduces retention barriers on Regular db, WAL, and IntentsDB. {{<issue 19678>}}
  • Allows you to preserve information sources during stream creation until snapshot records and related changes are consumed by maintaining retention barriers on WAL/Intents/RegularDB. Also, ensures data consistency during failover scenarios by performing preparations as part of the Apply of Raft operation. Includes support for colocated tables during snapshot stream creation, with a filter to exclude WAL records with commit_time lower than or equal to the snapshot_time. Currently, changes are hidden behind the TEST flag, which will later be an autoflag. {{<issue 19679>}}
  • Extends MiniCluster with YB Controller servers and introduces graceful shutdown feature, ensuring a smoother testing experience. {{<issue 19849>}}
  • Extends MiniYBCluster to include YB Controller servers and allows for their graceful shutdown. {{<issue 19849>}}
  • Introduces snapshot and streaming consumption changes as well as support for colocated tables in the context of consistent snapshot stream, allowing exhaustive and mutually exclusive snapshot and change records. {{<issue 19680>}}
  • Enhances the yb-admin CLI to support the creation of consistent snapshot streams, increasing control over snapshot options like NOEXPORT_SNAPSHOT and USE_SNAPSHOT. {{<issue 19682>}}
  • Introduces retention_barrier_no_revision_interval_secs gflag to avoid race conditions in setting retention barriers during stream creation, increasing the consistency of snapshot streams. {{<issue 20145>}}
  • Introduces a generic task that runs tasks after all tablets are created on new tables and fixes issues that could leave the table in the RUNNING state or schedule tasks before updating the data on disk. {{<issue 20577>}}

Bug fixes

YSQL
  • Allows for ALTER TYPE to run on temporary tables without blocking PG table rewrite, preventing data corruption and enabling smoother transaction handling. {{<issue 18909>}}
  • Introduces a per-database PG new OID allocator, ensuring OID uniqueness within the database and enhancing horizontal scalability in multi-node and multi-tenancy environments. This new mechanism mitigates OID collisions and allows OID consistency in backup-restore scenarios across clusters. A new GFlag ysql_enable_pg_per_database_oid_allocator is provided to return to old OID allocator behavior if necessary. {{<issue 16130>}}
  • Restarts the postmaster when a process is killed during its own initialization or cleanup to prevent potential mishandling of shared memory items. {{<issue 19945>}}
  • Resolves a bug that incorrectly type-checks bound tuple IN conditions involving binary columns like UUID for releases 2.17.1 and higher, improving database consistency. {{<issue 19753>}}
  • Adjusts the default values of yb_local_throughput_cost, yb_local_latency_cost, and yb_docdb_remote_filter_overhead_cycles, enhancing performance across most TAQO workloads. {{<issue 20032>}}
  • Ensures consistent wait start times in pg_locks by tracking the RPC request start time for the waiter instead of the time-out in the wait-queue, providing a more accurate reflection of real progress. {{<issue 18603>}},{{<issue 20120>}}
  • Converts the "Unknown session" error into a FATAL error, allowing drivers to instantly finish a non-responsive connection, enhancing client connection management. {{<issue 16445>}}
  • Corrects a backup failure issue by ensuring the function yb_catalog_version is introduced, especially in 2.4.x or 2.6.x clusters where it was previously missed due to a YSQL upgrade code bug. {{<issue 18507>}}
  • Ensures the Linux PDEATH_SIG mechanism signals child processes of their parent process's exit, by correctly configuring all PG backends immediately after their fork from the postmaster process. {{<issue 20396>}}
  • Enhances distinct iteration to avoid missing live rows after detecting a deleted row, by making AdvanceToNextRow aware of whether a fetched row is deleted, thereby ensuring no rows are missed during distinct queries on tables with deleted tuples. {{<issue 19911>}}
  • Enables cleanup after killed backends, fixing an issue where killing a background worker uses up a Proc struct, therefore preventing the webserver from failing after 8 attempts. {{<issue 20154>}}
  • Releases memory to the operating system after processing each endpoint call, effectively managing large amounts of data produced by long and unique queries and preventing unnecessary accumulation of memory. {{<issue 20040>}}
  • Eliminates segmentation fault in webserver SIGHUP handler at cleanup by ensuring MyLatch usage in all instances in order to manage process life cycle. {{<issue 20309>}}
  • Adds a regression test for nested correlated subqueries to guard against reintroducing a previously fixed issue and ensures correct query results, with plans to backport it to relevant branches. {{<issue 20316>}}
  • Corrects the lookup function in BNL (Block Nested Loop) to ensure matching outer tuples are found accurately when the join condition contains more than just hashable equality filters. {{<issue 20531>}}
  • Marks BNL plannodes that sort results as unable to project, addressing a regression in sorted BNL's performance and ensuring the accuracy of sorting when a target list changes due to merged overhead projection operators. {{<issue 20660>}}
  • Extends early termination of index scans for conditions with the form index_column OP NULL to additional btree operators >/>=/</<=, ensuring such conditions no longer send unnecessary data to DocDB. {{<issue 20642>}}
  • Corrects an error in the aggregate scans' pushdown eligibility criteria to prevent wrong results from being returned when PG recheck is not expected, but YB preliminary check is required to filter additional rows. {{<issue 20709>}}
  • Corrects the inaccurate detection of constants in distinct prefix computation during distinct index scans, ensuring reliable query results for batch nested loop joins. {{<issue 20827>}}
  • Fixes a memory corruption issue that prevented creation of a valid execution plan for SELECT DISTINCT queries. Disabling distinct pushdown for the affected queries lets them run without errors and prevents server connection closures. {{<issue 20893>}}
  • Eliminates unnecessary computation of range bounds in Index-Only Scan precheck condition, preventing crashes for certain queries and improving performance. {{<issue 21004>}}
  • Reduces the likelihood of incorrect behavior in conflicts between single-shard INSERT operations by ensuring read times are chosen after conflict resolution, enhancing data consistency. {{<issue 19407>}}
  • Reduces the time spent on preparing read requests in queries with a large number of operands in the IN operator by avoiding O(n^2) complexity in list traversal when generating ybctids. {{<issue 19329>}}
  • Refines parameter computation for Nested Loop joins in YSQL, removing the need to manually track relations that can't be batched parameters, thus mitigating bugs and simplifying logic. {{<issue 19642>}},{{<issue 19946>}}
  • Includes additional tests that capture and demonstrably rectify previously recurring errors from Batched Nested Loop Left Join due to incorrectly parameterized batched expressions in multiple loop scenarios. {{<issue 19642>}},{{<issue 19946>}},{{<issue 20495>}}
  • Corrects the incrementation timing of pg_stat_user_indexes idx_scan column for LSM index for accurate stat generation, ensuring it no longer increments too early. {{<issue 17495>}}
  • Reduces spinlock deadlock detection time by 75% for prompt handling of potential freezes and restarts Postmaster when a process holding a spinlock is killed, ensuring successful initiation of new connections. {{<issue 18272>}},{{<issue 18265>}}
  • Prevents potential postmaster crashes during cleanup of killed connections by using the killed process's ProcStruct to wait on an unavailable LWLock. {{<issue 18000>}}
  • Overhauls the handling of DDL statements, preventing them from restarting in READ COMMITTED mode, better managing DDL transactions, and ensuring more immediate clean-up of DDL transactions. {{<issue 18761>}}
  • Rectifies the issue of filters not binding to the request by amending the erroneous duplication-check of the bindings on the first column of the row element, enhancing query performance. {{<issue 19308>}}
  • Resolves an issue by safely dropping all foreign key constraints in one pass, preventing errors when altering a column referenced by a foreign key in partitioned tables. {{<issue 19063>}}
  • Fixes null constraint violations in ALTER TYPE operations and failures on tables with a range key, ensuring accurate operation and error reduction. {{<issue 18911>}},{{<issue 19382>}}
  • Restores previous conditions after test PgRegressIndex yb_index_scan fails due to a commit reversion. {{<issue 19477>}}
  • Eliminates unnecessary file creation for views on temporary tables by checking if storage is actually needed. {{<issue 19522>}}
  • Moves estimated seeks and nexts in the EXPLAIN plan from VERBOSE to DEBUG flag, enhancing Sequential Scan nodes to include these estimates. {{<issue 19938>}}
  • Corrects DDL Atomicity by cleaning up failed CREATE TABLE operations, allowing for multiple sub-commands in ALTER TABLE ALTER COLUMN TYPE, adequately looking up Materialized views in PG schema, and addressing order field-dependency in DocDB columns. {{<issue 19605>}}
  • Rectifies the serialization mismatch in YBBatchedNestLoop, reducing errors when Parallel Query is enabled. {{<issue 19612>}}
  • Corrects an error that prevents the ALTER TABLE SET TABLESPACE command from executing successfully when the cluster has a placement_uuid set, by properly filling in the placement_uuid during validation. {{<issue 14984>}}
  • Allows transfer of parameter values to and from background workers in Parallel Query by correcting the finalize_plan function, improving Nested Correlated Subquery results. {{<issue 19694>}}
  • Enables running the postprocess script on alternate expected files in pg_regress, effectively fixing mismatches previously noticed due to its absence. {{<issue 19737>}}
  • Reduces maintenance time by switching to a less complex implementation of SideBySideDiff.java, thereby eliminating errors from SideBySideDiff.sanityCheckLinesMatch. {{<issue 19690>}}
  • Prevents PostgreSQL backend crashes induced by assert errors in the YbPgInheritsCache as it now correctly cleans up unreleased references, improving transaction reliability. {{<issue 19807>}}
  • Safeguards against potential bugs by ensuring that yb_transaction_priority_lower_bound and yb_transaction_priority_upper_bound are disregarded in read committed isolation, irrespective of the enable_wait_queue status. {{<issue 19921>}}
  • Adjusts the shared relcache init file invalidation to ensure correct refresh of the rel cache after executing DDL statements, ensuring consistency with Postgres results. {{<issue 19955>}}
  • Streamlines the creation of a publication for all tables in per-database catalog version mode by making updates to pg_yb_catalog_version that bypass CheckCmdReplicaIdentity function, eliminating DDL errors. {{<issue 19965>}}
  • Eliminates unnecessary catalog version incrementation on no-op GRANT DDL statements to enhance optimization by rectifying a previously missed case. {{<issue 19981>}}
  • Allows successful dropping of table groups when DDL Atomicity is enabled by verifying if the tables within the group are marked for deletion, instead of ensuring the group is empty. {{<issue 20002>}}
  • Revises YbSeqScan to send ysql_catalog_version in user-initiated system table requests, ensuring system table scans use an up-to-date catalog and reducing chances of TestPgRegressIndex failure. {{<issue 20017>}}
  • Rectifies the assertion failure issue in the per-database catalog version mode. The fix updates the conditions for treating DDL statements, eliminating previous failures caused by treating some DDL statements as non-DDL statements. {{<issue 19975>}}
  • Increases the delay when restarting the test cluster in tsan build to prevent occasional failures in unit test PgOidCollisionTest.TablespaceOidCollision/0. {{<issue 20008>}}
  • Corrects the method for deriving element_typeid to prevent crashes when running aggregations with join by ensuring it's derived from the RHS of the index condition, not the LHS. {{<issue 20003>}}
  • Resolves a bug ensuring ddl_transaction_state gets properly reset even if YbIncrementMasterCatalogVersionTableEntry throws an exception, preventing non-global DDL statements from being incorrectly handled as global ones. {{<issue 20038>}}
  • Prevents a possible system crash in YSQL backends manager by ensuring essential checks are in place before using the job database object. {{<issue 20060>}}
  • Enforces stricter locking mechanisms during concurrent updates on different columns of the same row, to maintain data consistency and prevent a "write-skew anomaly within a row". Adds a new gflag ysql_skip_row_lock_for_update to toggle the new row-level locking behavior. {{<issue 15196>}}
  • Ensures removal of both shared and per-database relation cache initialization files during postmaster startup to prevent the reusing of outdated files. {{<issue 20125>}}
  • Disables CheckCmdReplicaIdentity for tables when yb_non_ddl_txn_for_sys_tables_allowed is set to true, preventing YSQL upgrades from failing during update/delete operations on system tables. {{<issue 20085>}},{{<issue 20143>}}
  • Eliminates the possibility of a segfault during the LWLock process when the postmaster cleans up a killed process, by using KilledProcToCleanup instead of MyProc. {{<issue 20166>}}
  • Restores PostgreSQL 11 code to its original format, facilitating an easier merge with PG15. {{<issue 20176>}}
  • Enhances visibility and debugging capabilities by introducing two boolean flags, which log every endpoint access and print basic tcmalloc stats after path handler and garbage collection. Now yb_pg_metrics handles the SIGHUP signal to update flags values. Also adds :13000/memz and :13000/webserver-heap-prof to expose memory usage with a new runtime variable to control tcmalloc sampling. {{<issue 20157>}}
  • Introduces the pg_stat_statements.yb_qtext_size_limit flag, controlling the maximum file size read into memory, limiting potentially large or corrupt qtext files impacting system memory usage. {{<issue 20211>}}
  • Exposes webserver memory usage through the new :13000/memz and :13000/webserver-heap-prof endpoints, which print tcmalloc stats and display current or peak allocations, respectively. {{<issue 20157>}}
  • Rectifies an issue with corrupted state manipulation, caused by processes being killed during writing, by restarting the postmaster whenever a backend is unexpectedly killed in a critical section. This helps avoid infinite loops and CPU overuse, thereby enhancing database stability. {{<issue 20255>}}
  • Caps retrieval of beentry from localhost:13000/rpcz to 1000 iterations, preventing indefinite waits and ensuring safety even in cases of inconsistent states. {{<issue 20274>}}
  • Blocks new-version DDL statements in an invalid per-database catalog version configuration to avoid possible stale read/write RPCs and provide accurate results during cluster upgrades. {{<issue 20300>}}
  • Moves the Active Session History (ASH) code from extension to core Postgres, eliminating the chance of partial feature activation and ensuring control solely through the TEST_yb_enable_ash gflag, enhancing the user's control over the ASH functionality. {{<issue 20180>}}
  • Enables rollback from PostgreSQL 15 upgrade to preserve PostgreSQL 11 data directory, therefore preventing a loss of stored data such as statistics. {{<issue 20319>}}
  • Renames the debug field in ExplainState to yb_debug and repositions it to the bottom of the struct for clarity purposes. {{<issue 20366>}}
  • Reduces memory consumption during secondary index scans by introducing a separate arena for batch operations, lowering the risk of a node running out of memory. {{<issue 20275>}}
  • Prevents background worker crashes caused by assertion failures in Active Session History (ASH) when MyProcPort is not established. {{<issue 20338>}}
  • Adds an extra null check to avoid runtime errors when ASH is enabled by default and prevents the execution of ASH code while running initdb, fixing the PcustomeriniAsh test failure. {{<issue 20362>}}
  • Reduces likelihood of Restart read required error during Cross-DB Concurrent DDLs with per-database catalog version enabled by initiating the function YbInitPinnedCacheIfNeeded before starting the DDL transaction. Also, improper usage of yb_non_ddl_txn_for_sys_tables_allowed with a DDL statement has been rectified. {{<issue 20303>}}
  • Increases the schema version of the default partition whenever you create a new partition, preventing erroneous data insertion into the default partition due to cache refresh issues. {{<issue 17942>}}
  • Enhances test environment on Mac by fixing clean-up issues, and introduces a rollback ability for stashed PG11 data during PG15 upgrade. {{<issue 20319>}}
  • Adds PgClient session id to ASH metadata to support aggregations for tserver wait events based on client session id, controlled by TEST_yb_enable_ash. Safe to upgrade/downgrade. {{<issue 20242>}}
  • Revamps the initialization of YbPgInheritsCache's hash table to use binary comparison with the HASH_BLOBS flag, ensuring correct hash lookups, and stops running long Java partitioning tests on TSAN to prevent timeouts and test failures. {{<issue 20436>}}
  • Rectifies the mismatched sizes of various ASH fields, ensuring upgrade and downgrade safety, while providing new functionality without disturbing the existing one. Note that if you downgrade, ASH will become unavailable and it is guarded by TEST_yb_enable_ash. {{<issue 20454>}}
  • Mitigates MISMATCHED_SCHEMA error in cross DB concurrent DDLs with per-database catalog version turned on, by ensuring backends only apply messages sent by themselves. {{<issue 20340>}}
  • Eliminates tsan warnings in the MetricWatcher helper class by using MetricEntity class, preventing potential test failures. {{<issue 20580>}}
  • Rectifies potential flakiness in TestYbAsh testEmptyCircularBuffer by ensuring buffer remains empty during idle cluster and excluding certain query samples. {{<issue 20629>}}
  • Refines the Batch Nested Loop (BNL) first batch building logic to return correct query results when the provisional first batch size equals the outer table's size. {{<issue 20707>}}
  • Corrects the division by zero error occurring with certain queries when the yb_enable_base_scans_cost_model is activated and yb_fetch_size_limit is enforced by setting a fixed size for result width when it equals zero. {{<issue 20892>}}
  • Reduces PostgreSQL connection startup timeouts in geo-distributed clusters with a new wait_for_ysql_backends_catalog_version_master_tserver_rpc_timeout_ms GFlag, increasing the default timeout value to 60s from 30s. This alteration only impacts one specific RPC - WaitForYsqlBackendsCatalogVersion, not all RPCs, which should diminish time-out incidents. {{<issue 18228>}}
  • Renames two columns in the yb_active_session_history view for clarity: yql_endpoint_tserver_uuid becomes top_level_node_id, and session_id becomes ysql_session_id. {{<issue 20920>}}
  • Fixes YSQL upgrade failure from 2.16 to 2.21 by adding a 2-second delay before moving to the next connection if the previous script included a breaking DDL statement. {{<issue 20842>}}
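Several fixes in the list above bound a wait rather than letting it run indefinitely, for example the 1000-iteration cap on reading an entry for :13000/rpcz. As a generic illustration of that pattern only, and not the actual YSQL webserver code, a bounded retry loop might look like the following sketch. The function name `retry_bounded` and the command passed to it are hypothetical.

```sh
#!/bin/sh
# Hypothetical sketch of a bounded retry: run a command until it succeeds,
# but give up after a fixed number of attempts instead of looping forever.
retry_bounded() {
  max=$1
  shift
  i=0
  while [ "$i" -lt "$max" ]; do
    if "$@"; then
      return 0    # the command succeeded within the attempt budget
    fi
    i=$((i + 1))
  done
  return 1        # attempt budget exhausted; the caller decides what to do
}

# Example call: "true" stands in for a read that succeeds on the first try.
retry_bounded 1000 true && echo "entry read"
```

Returning a failure status after the cap, rather than blocking, lets the caller skip or report the entry even when the underlying state never becomes consistent.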
YCQL
  • Solves a concurrency issue in the TestCQLServiceWithCassAuth.TestReadSystemTableAuthenticated unit test by adjusting the CQLServer's shared_pointer reset method. {{<issue 17779>}}
DocDB
  • Resolves potential WriteQuery leak issue in CQL workloads, ensuring proper execution and destruction of queries, while preventing possible tablet shutdown blockages during conflict resolution failure. {{<issue 19919>}}
  • Enhances error reporting of cross-cluster pollers, addressing persistence of stale or missed errors and simplifies the corresponding code. Now, instead of storing verbose detailed status, only error codes are stored for efficient memory usage. {{<issue 19455>}}
  • Refines meta cache updates to avoid overwriting child tablets and consequently causing stale data, ensuring more accurate partition map refreshes. {{<issue 18732>}}
  • Streamlines transaction processing by updating TabletState only for tablets engaged in writes and ignoring old statuses during transaction promotion, reducing failure errors and boosting consistency. {{<issue 18081>}},{{<issue 19535>}}
  • Resolves an inconsistency problem where indexes grow in size even after delete operations, causing slower query performance. The fix involves intelligent handling of backfill done events on the tablet server side. Note, it only works for newly created indexes and will not auto-recover from current buggy states. {{<issue 19544>}}
  • Enables wait-on-conflict by default in release builds across all isolation levels. {{<issue 19837>}}
  • Addresses potential deadlock during tablet shutdown when wait-queues are enabled by refactoring the Wait-Queue shutdown path to execute thread_pool_token_->Shutdown as part of WaitQueue::Impl::CompleteShutdown instead of StartShutdown. {{<issue 19867>}}
  • Includes a script to ensure no index tables retain delete markers post-backfill, addressing a bug causing indexes to expand in size following row deletion, which slowed queries. The bug affected both YCQL and YSQL APIs for new indexes created with versions 2.14.x/2.16.x/2.18.x and led to increasing storage needs due to accumulated delete markers. This script negates these issues and boosts index performance. {{<issue 19544>}}
  • Sets kMinAutoFlagsConfigVersion to 1, providing accurate configuration version comparison and reducing potential confusion. {{<issue 19985>}}
  • Reduces the occurrence of Transaction Metadata Missing errors by accurately reporting deadlocked transactions that may result from multiple aborts in a deadlocked cycle. {{<issue 20016>}}
  • Enables single shard waiters to progress after a blocking subtransaction rolls back, by applying the same logic used for distributed transactions. {{<issue 20113>}}
  • Handles backfill responses getting interleaved across different operations more gracefully to prevent crashes caused by slow masters or network delays. {{<issue 20510>}}
  • Reintroduces bloom filter use during multi-row insert, improving conflict resolution and rectifying missing conflict issues, and also addresses the problem reported in GH 20648. {{<issue 20398>}},{{<issue 20648>}}
  • Reschedules the resumption of contentious waiters on the same underlying Scheduler::Impl::strand_, which is used for executing incoming rpc calls, instead of reactor threads, thus preventing a fatal issue. {{<issue 20651>}}
  • Reduces log warnings in normal situations by downgrading repeated waiter resumption alerts to VLOG(1), benefiting from the direct signaling of transaction resolution. {{<issue 19573>}}
  • Disables the wait-on-conflict feature in 2.21.0 by default to fix a launch-blocking bug linked to multiple requests per session to a single tablet. {{<issue 20978>}}
  • Reflects the actual columns locked in conflict resolution instead of the shared in-memory locks in pg_locks, providing more accurate output for waiting transactions. {{<issue 18399>}}
  • Deactivates the packed row feature for colocated tables, averting potential write failure issues identified in 20638 during specific kinds of compactions. {{<issue 21047>}}
  • Prevents segfaults in pg_locks queries when wait-queues are disabled by explicitly checking that server_->tablet_manager->waiting_txn_registry exists before using it. {{<issue 20772>}}
  • Fixes a race condition on kv_store_.colocation_to_table to prevent undefined behavior and re-enables packed row feature for colocated tables, enhancing data writing and compaction processes. {{<issue 20638>}}
  • Modifies the DocDB system by shifting the acquirement of submit_token_ of the WriteQuery to the post-conflict resolution phase to prevent DDL requests from being blocked, thus optimizing both reads and writes for continued performance and enhanced data consistency. {{<issue 20730>}}
  • Corrects transaction queue behavior allowing multiple waiters for a single transaction per tablet, thereby resolving conflicts and enhancing transaction handling capability. {{<issue 18394>}}
  • Restores the wait-on-conflict feature in the 2.21.0 branch that was previously disabled due to a bug, now resolved. {{<issue 20978>}}
  • Filters out external intents beyond producer tablet range to address disparity in tablet partitions, ensuring each consumer tablet only receives relevant intents. This resolves the issue of potential hidden batch records due to erroneous starting of write_ids from zero. {{<issue 19728>}}
  • Resolves the issue where transactions continue and commit despite supposed immediate abort after promotion, due to a timing gap between sending UpdateTransactionStatusLocation RPCs and reception of the first PROMOTED heartbeat. This update delays the sending of UpdateTransactionStatusLocation RPCs until the first PROMOTED heartbeat is acknowledged. {{<issue 17319>}}
  • Refines the leaderless tablet detection logic to prevent incorrect reporting of tablets having recently undergone leader changes as leaderless, improving data consistency. {{<issue 20124>}}
  • Prevents the deletion of active snapshots during a database backup, even if their corresponding tables are dropped, enhancing the reliability of backup operations. {{<issue 17616>}}
  • Adjusts calculation of replication lag metrics for split tablet children by incorporating parent tablet's last sent/committed record time, promoting greater accuracy in metric results. {{<issue 17025>}}
  • Addresses the bug where large transactions partially apply to regular RocksDB during tablet server restarts, thus ensuring consistent transaction data after restarts. {{<issue 19359>}}
  • Allows setting all columns of a row to NULL, resulting in deletion instead of creating a row consisting of NULLs, rectifying an issue during compaction. {{<issue 18157>}}
  • Corrects an issue where an invalid filter key negatively affected the performance of backwards scans, by improperly passing all SST files through the bloom filter. This update will be applied to versions 2.20 and 2.18. {{<issue 19440>}}
  • Resolves issues of data validation failure and unreachable nodes by properly setting child checkpoints in cdc_state during tablet splits, curbing log amplification. {{<issue 18540>}}
  • Allows tracing of outgoing calls only if the current RPC is being traced, reducing excessive memory consumption and logging. {{<issue 19497>}}
  • Introduces retry logic to synchronize metadata and checkpoint creation during remote bootstrap initialization, reducing inconsistency risks associated with schema packing. {{<issue 19546>}}
  • Stops Garbage Collection (GC) of schema packings that the xCluster config references to avoid data loss during replication, taking into account network partitions and schema changes. {{<issue 17229>}}
  • Removes a regression that could crash the TServer when replaying alter schema during local bootstrap by adding ANNOTATE_UNPROTECTED_WRITE to CqlPackedRowTest.RemoteBootstrap. {{<issue 19546>}}
  • Corrects Master's tablet_overhead mem_tracker issue, ensuring it displays accurate memory consumption, addressing discrepancy in MemTracker metric names between TServer and Master. {{<issue 19904>}}
  • Resolves a race condition in MasterChangeConfigTest.TestBlockRemoveServerWhenConfigHasTransitioningServer by ensuring the launched async thread operates on a copy of ExternalMaster* instead of the mutating current_masters vector. {{<issue 19927>}}
  • Corrects intermittent index creation failure for empty YCQL tables by evaluating the result of is_running rather than checking index state directly, ensuring accurate retain_delete_markers and reducing potential performance issues. {{<issue 19933>}}
  • Addresses a PITR restore issue by terminating all active transactions, ensuring inserted or updated data doesn't get omitted, and giving a clear signal about the non-application of such transactions. {{<issue 14290>}}
  • Adds retries around the leader step down in the PgNamespaceTest.CreateNamespaceFromTemplateLeaderFailover test to allow the target leader time to properly catch up, preventing previous failures. {{<issue 14316>}}
  • Disables the packed row feature for colocated tables so that the underlying issue in 21218 cannot be hit while it is being debugged. {{<issue 21218>}}
  • Prevents system crashes caused by the CallHome class calling a pure virtual function due to a timing issue during system shutdown. {{<issue 18254>}}
  • Corrects an xCluster Consumer shutdown issue encountered during testing by implementing a temporary mitigation that waits for the Flush with a timeout. {{<issue 19402>}}
  • Amends RaftGroupMetadata::CreateSubtabletMetadata to update the log prefix, preventing the use of parent tablet ID in child tablet's metadata logging. {{<issue 19375>}}
  • Resolves crashes in sys-catalog-tool linked with TabletBootstrap failing due to uninitialized transaction_participant_context, enhancing stability. {{<issue 19412>}}
  • Corrects a previously non-retryable PGSQL operation, preventing errors from being returned back to PG layer during a parent tablet shutdown scenario. {{<issue 19033>}}
  • Enables transaction promotion in TestPgWaitQueuesRegress for an enhanced testing process. {{<issue 19575>}}
  • Restores the original behavior of not counting tablets on dead tservers towards the replica count, ensuring accurate representation of under-replicated tablets. {{<issue 17867>}}
  • Ensures the correct in-memory state for the master coming out of shell mode by fetching the universe key from other masters, enabling proper decryption of the universe key registry. {{<issue 19513>}}
  • Corrects a lock order inversion in the transaction loader to prevent potential deadlock scenarios. {{<issue 19508>}}
  • Adds tests for handling indexes in colocated databases in transactional and non-transactional xCluster environments, enhancing database reliability and consistency. Also simplifies WaitForReplicationDrain test helper for easier usage. {{<issue 18427>}},{{<issue 16758>}}
  • Rectifies the issue causing the XClusterYsqlIndexTest.FailedCreateIndex test to fail by altering the over-aggressive DCHECK to an efficient SCHECK to allow for transient ALTER operations. {{<issue 18967>}}
  • Rectifies the use-after-free issue in RefinedStream::Connected failure path by ensuring a status return rather than causing memory writes to a freed space. {{<issue 19727>}}
  • Introduces macros that simplify the creation of comma-separated expression lists to a stream, reducing repetition. {{<issue 19761>}}
  • Redefines the structure of thirdparty_archives.yml by eliminating redundant fields, implementing sensible default values, and introducing blank lines for improved readability between distinct third-party archive build sections. {{<issue 19883>}}
  • Increases the visibility of Remote Bootstrap (RBS) sessions by adding a dedicated tserver page that lists all ongoing RBS sessions, including the remote log anchor sessions. Additionally, extends the Last status field on the tserver's tablets page to display the source a peer is or has been bootstrapping from. {{<issue 19568>}}
  • Resolves a maybe-uninitialized compilation error in almalinux8 release gcc11, enhancing the reliability of the code by addressing both identified issues. {{<issue 19987>}}
  • Rectifies the TestYSQLDumpAsOfTime compilation issue by replacing <int64_t> with <PGUint64>. {{<issue 19992>}}
  • Eliminates the extra verbosity in MiniCluster logs by removing entries with hk!!. {{<issue 20007>}}
  • Resolves an issue where the webserver may start prematurely and fail, by ensuring cds::Initialize is called before executing any function on cds::threading::Manager, minimizing race conditions. {{<issue 20119>}}
  • Introduces an asynchronous interface for PgClient shared memory exchange, allowing for multiple requests and parallel query processing. {{<issue 20151>}}
  • Displays the errno when unable to open version_metadata.json or auto_flags.json files, providing clarity on the nature of the IO error. {{<issue 20250>}}
  • Deprecates the enable_process_lifetime_heap_sampling flag, simplifying tcmalloc sampling control to a single setting, profiler_sample_freq_bytes; a value <= 0 disables sampling. {{<issue 20236>}}
  • Prevents application crashes caused by an interrupted interprocess semaphore which previously threw an exception. {{<issue 20325>}}
  • Allows early termination of old single statement read-committed transactions facing kConflict errors to enhance system throughput. {{<issue 20329>}}
  • Eliminates premature shutdowns during transaction status resolution by ensuring the rpcs_.Shutdown only occurs after all status resolvers of the participant have ended, avoiding any in-progress status resolver rpc(s). {{<issue 19823>}}
  • Reduces potential "request is too old" errors during YSQL DDLs by setting the SysCatalog tablet's retryable request retain duration to the maximum of YSQL and YQL client timeout. {{<issue 20330>}}
  • Fixes ./yb_build.sh help to correctly display the help command instead of an error message due to a mismatched function name. {{<issue 20390>}}
  • Removes non-trivially destructible static initializations from the code, eliminating complexities that could lead to difficult to identify bugs. {{<issue 20407>}}
  • Replaces the deprecated exec_program command with execute_process in CMake, eliminating the potential CMP0153 warning for developers. {{<issue 20481>}}
  • Allows bulk load time reduction by packing all values when inserting a row with multiple values into the PostgreSQL layer. Set the preview flag -ysql_pack_inserted_value to enable this feature; note that it currently uses v1 encoding. {{<issue 20713>}}
  • Stores the first error from a failed setup replication to ensure more accurate feedback to the user, instead of a final generic error message like "Universe is being deleted". {{<issue 20689>}}
  • Changes the path in yb_build.sh to locate generate_test_truststore.sh in $YB_BUILD_SUPPORT_DIR, solving build failures on GitHub Actions. {{<issue 20747>}}
  • Reduces TPCC NewOrder latency by replacing the ThreadPoolToken with a Strand within a dedicated rpc::ThreadPool in PeerMessageQueue's NotifyObservers functions, enhancing speed and efficiency. {{<issue 20912>}}
  • Aborts transactions early when they fail during the promotion process, enhancing throughput in geo-partitioned workloads and offering stability in geo-partitioned tests. {{<issue 21328>}}
  • Eliminates a race condition that can occur when simultaneous calls to SendAbortToOldStatusTabletIfNeeded try to send the abort RPC, thus preventing avoidable FATALs for failed geo promotions. {{<issue 17113>}}
  • Changes the initial remote log anchor request to be at the follower's last logged operation id index, reducing the probability of falling back to bootstrapping from the leader and improving the success rate of remote bootstraps. {{<issue 19536>}}
  • Prevents concurrent heap profiles from running and problematic resetting of sampling frequency, allowing only one heap profile to run at a time. {{<issue 19841>}}
  • Resolves use-after-move errors detected by clang-tidy's bugprone-use-after-move-check for increased code stability. {{<issue 20435>}}
  • Resolves issues in the under-replicated endpoint algorithm, ensuring correct counting of replicas only when the block's minimum number of replicas has not been fulfilled yet, hence offering accurate replica tally for placement blocks. {{<issue 20657>}}
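As an illustration of the under-replicated tablet entries above (issues 17867 and 20657), the following is a minimal Python sketch of counting only replicas hosted on live tservers. It is not the Master's actual algorithm; the function name, the input shapes, and the tserver IDs are all hypothetical, and placement blocks are ignored for brevity.

```python
def under_replicated_tablets(replicas_by_tablet, live_tservers, replication_factor):
    """Return the IDs of tablets with fewer live replicas than required.

    replicas_by_tablet: hypothetical mapping of tablet ID -> list of tserver IDs.
    live_tservers: hypothetical set of tserver IDs currently considered alive.
    """
    live = set(live_tservers)
    result = []
    for tablet_id, tservers in replicas_by_tablet.items():
        # Replicas on dead tservers do not count towards the replica count.
        live_replicas = sum(1 for ts in tservers if ts in live)
        if live_replicas < replication_factor:
            result.append(tablet_id)
    return sorted(result)


replicas = {
    "t1": ["ts-a", "ts-b", "ts-c"],
    "t2": ["ts-a", "ts-b", "ts-d"],  # ts-d is dead in this example
}
print(under_replicated_tablets(replicas, {"ts-a", "ts-b", "ts-c"}, 3))
```

In this example, only `t2` is reported, because one of its three replicas sits on a tserver that is not in the live set.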
CDC
  • Introduces an additional test case ensuring that only tablets belonging to a dropped table get deleted from the cdc_state table. {{<issue 19196>}}
  • Eliminates deadlock during the deletion of namespace-level CDC streams, enabling the successful execution of the ysqlsh drop database command even when the database has multiple tables. {{<issue 19879>}}
  • Resolves an issue preventing newly created tables from being added to the stream metadata and CDC state table after an existing table is dropped, by considering streams in DELETING_METADATA state as well as ACTIVE state during dynamic table addition. {{<issue 20428>}}
  • Removes only non-active tablets from cdc_state in CleanUpCDCStreamsMetadata, including retaining parent split tablets, to preserve essential data during stream cleaning. {{<issue 19348>}}
  • Fixes the issue of WAL garbage collection for tables added after stream creation by enabling WAL retention for each such tablet, reducing connector failure. {{<issue 19385>}}
  • Reinstates the creation of CDC streams with old record types to ensure backwards compatibility and prevent CDC error 9 when the ALL mode is utilized. {{<issue 19929>}}
  • Fixes the decoding of NUMERIC values in CDC records to prevent precision loss by ensuring that the decoded string is not converted to scientific notation if its length is more than 20 characters. Additionally, the fix involves using the string representation with no limit on length and employing the Postgres numeric_out method for decoding, which is identical to the decoding of numerics in a PG query. {{<issue 20414>}}
  • Rectifies an error on the CDCService side, where Merger tried to set tablet safetime to a lower value. Now, for non-consistent snapshot streams, the commit_time_threshold adjusts correctly to the safe_hybrid_time value as per the request, instead of always being set to zero. {{<issue 20356>}}
  • Rectifies consistent snapshot stream creation by ensuring tablets complete their tasks and snapshot safe opids populate in the cdc_state table for proper initialization. {{<issue 20477>}}
  • Allows continuation of tablet fetching, even if certain tables face errors, by logging a warning instead of sending unnecessary errors to the client. {{<issue 19434>}}
  • Rectifies pg_replication_slots view failure prior to any cdc/xCluster stream creation by refining the logic to read the cdc_state_table only when a cdc stream exists. {{<issue 20073>}}
  • Updates the CDCSDK stream metadata with consistent snapshot-related details and ensures its persistence in the sys_catalog, enhancing the stability and accuracy of data. {{<issue 20202>}}
  • Corrects the AsyncYBClient method to pass the explicit_cdc_sdk_opid instead of a null value, ensuring proper snapshot checkpointing and enhancing snapshot resume functionality in EXPLICIT mode. {{<issue 19394>}}
  • Alleviates a regression in the connector snapshot resume capability by adjusting the key population in GetChangesRequest, ensuring the key is populated only when it is not null. {{<issue 19394>}}
  • Removes potential crash in DEBUG mode by ensuring each entry returned from the cdc_state_table iteration in pg_replication_slots view is checked with RETURN_NOT_OK before usage. {{<issue 19894>}}
  • Increases the value of FLAGS_update_min_cdc_indices_interval_secs from 2 to 5, ensuring the CDC state table tablet has enough time to wait for a new leader and correctly update the log. {{<issue 18156>}}
  • Corrects the calculation of the cdcsdk_sent_lag metric to prevent disproportionate growth, by updating the last_sent_record_time with each SafePoint record, reducing inconsistency between transactions. {{<issue 15415>}}
  • Eliminates errors in streaming changes from child tablets in CDCSDK by accurately determining the slowest consumer and preventing unnecessary Garbage Collection of intents. {{<issue 20284>}}
  • Allows propagation of RPC deadline from clients to YB-Master for CreateCDCStream, reducing unnecessary retries and correctly timing out requests. {{<issue 20583>}}
  • Resolves memory leak errors in the asan environment caused by not freeing YBCStatus from YBCPgExecCreateReplicationSlot in case of AlreadyPresent or LimitReached errors. {{<issue 20279>}}
  • Resolves CDCLog and CDCService test failures by setting FLAGS_cdcsdk_retention_barrier_no_revision_interval_secs to 0, ensuring upgrade and rollback safety. {{<issue 20353>}}
  • Rectifies timing issues in the CDCSDKConsistentSnapshotTest.TestRetentionBarrierSettingRace, enhancing stability for TSAN builds via application of WaitFor with an adequate timeout. {{<issue 20455>}}
  • Prevents write pausing on a tablet for an AlterSchema procedure that is solely setting retention barriers during consistent snapshot stream creation. {{<issue 20620>}}
  • Triggers a thorough cleanup when stream creation fails, avoiding resource misuse and resolving issues caused by late ALTER TABLE responses. {{<issue 20725>}}
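The NUMERIC decoding entry above (issue 20414) concerns precision loss when a long numeric value ends up in scientific notation. The sketch below shows the same class of problem in Python as an analogy only: it does not use the CDC code path or the Postgres numeric_out method, and the helper names are hypothetical.

```python
from decimal import Decimal


def decode_numeric_lossy(text: str) -> str:
    """Round-trip through a binary float: long values lose digits."""
    return repr(float(text))


def decode_numeric_exact(text: str) -> str:
    """Keep the value as an arbitrary-precision decimal and print it
    in plain (non-scientific) notation with no length limit."""
    return format(Decimal(text), "f")


raw = "12345678901234567890.123456789"  # longer than 20 characters
print(decode_numeric_lossy(raw))   # scientific notation, digits dropped
print(decode_numeric_exact(raw))   # identical to the input string
```

The exact path returns the input unchanged, while the float path switches to exponent form and cannot represent the fractional digits.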
yugabyted
  • Revises auth failure handling in YSQL Connection Manager to give accurate error messages, prevent broken control connections, and improve error packet handling. {{<issue 17289>}},{{<issue 19781>}},{{<issue 19800>}}
  • Adjusts the Ysql Conn Mgr Stats setting to align with Ysql Conn Mgr's status, keeping the setting FALSE even when the Postgres process is created without a tablet server. {{<issue 19998>}}
  • Resolves the hanging issue in Odyssey when incoming packet size exceeds a limit, by ensuring COPY_DATA and QUERY message types are fully received before processing. {{<issue 19245>}},{{<issue 19284>}}
  • Maintains sticky object count bi-directionally when creating new sub transactions or returning to parent transactions, aligning count with actual usage. {{<issue 20071>}}
  • Allows usage of SET LOCAL query to set temporary session parameters for specific transactions, with values reverting after transaction completion. {{<issue 19556>}}
  • Introduces a JSON endpoint at /api/v1/mem-trackers, enhancing data reliability by avoiding parsing of the HTML page at the /mem-trackers server endpoint for memory usage data. {{<issue 18057>}}
  • Modifies yugabyted UI apiserver to acquire memory usage data from the new JSON endpoint /api/v1/mem-trackers instead of parsing HTML from /mem-trackers, ensuring more reliability. {{<issue 18057>}}
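To show why a JSON endpoint is easier to consume than scraped HTML (issue 18057), here is a minimal Python sketch that reads memory tracker data from /api/v1/mem-trackers. The endpoint path comes from the entries above; the response shape (`id`, `current_consumption_bytes`, `children`) is an assumption made for illustration and may not match the real payload, and the base URL is supplied by the caller.

```python
import json
from urllib.request import urlopen


def fetch_mem_trackers(base_url: str) -> str:
    """Fetch the raw JSON body from the server's /api/v1/mem-trackers endpoint."""
    with urlopen(f"{base_url}/api/v1/mem-trackers", timeout=5) as response:
        return response.read().decode("utf-8")


def flatten_consumption(node, out=None):
    """Walk a tracker tree (hypothetical field names) into {tracker_id: bytes}."""
    if out is None:
        out = {}
    out[node["id"]] = node["current_consumption_bytes"]
    for child in node.get("children", []):
        flatten_consumption(child, out)
    return out


# Hypothetical sample payload standing in for a real server response.
SAMPLE = """
{"id": "root", "current_consumption_bytes": 3072, "children": [
  {"id": "tablets_overhead", "current_consumption_bytes": 1024, "children": []},
  {"id": "server", "current_consumption_bytes": 2048}
]}
"""
usage = flatten_consumption(json.loads(SAMPLE))
print(usage)
```

Against a running server, `flatten_consumption(json.loads(fetch_mem_trackers(base_url)))` would replace the sample payload, provided the field names are adjusted to the actual response.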
Other fixes
  • Ensures the tserver start and tserver stop scripts successfully terminate all running PG processes, regardless of PID length, enhancing process management. {{<issue 19817>}}
  • Updates the condition for HT lease reporting to ensure accurate leaderless tablet detection in RF-1 setup, preventing false alarms. {{<issue 20919>}}
  • Increases the max_stack_depth from 900kB to 950kB for proper execution and lessens the excessive logging triggered by inherits cache in yb_pg_errors.sql. {{<issue 19443>}}
  • Reduces disruptions by throttling the master process log messages related to "tablet server has a pending delete" into 20-second intervals. {{<issue 19331>}}
  • Prevents segmentation faults in the stats collector after a Postmaster reset, ensuring the stats collector's operations are uninterrupted even when a query is terminated. {{<issue 19572>}}

Other

  • Streamlines code base by eliminating over 900 unnecessary includes, splitting oversized .proto files, enhancing the protoc-gen-yrpc to produce forward headers for protobuf, and upgrading precompiled headers. Also restructures MasterService, divides it into smaller services improving build times, and moves encryption-related classes. Updates now allow less system entropy drain via revised UUID generation. {{<issue 10584>}}
  • Validates the use of two arguments for disable_tablet_splitting, addressing a previous condition where only one was required, thereby enhancing backup process reliability. {{<issue 8744>}}
  • Enables passing of username and password to the connect command akin to ysqlsh, permitting direct connection to the desired database/keyspace. {{<issue 14869>}}
  • Introduces documentation for GFlags pertinent to the bootstrap from closest peer feature in the tserver flags page. {{<issue 18061>}}
  • Corrects a nonfunctional link in the RBS GFlags description and adds documentation for the bootstrap from closest peer feature. {{<issue 18061>}}
  • Reduces network requests when running ./yb_build.sh offline for a smoother rebuild process and adds helpful error messages for easier debugging. {{<issue 19476>}}
  • Rectifies the issue where yugabyted crashes if the yugabyted-ui binary doesn't exist. The cluster now starts with the UI disabled, similar to setting ui=false, and alerts the user with a warning. {{<issue 16098>}}
  • Resolves the odyssey build failure on Ubuntu 23.04 when compiling using ./yb_build.sh release gcc13 by addressing -Werror=address issue. {{<issue 19959>}}
  • Adjusts previously hardcoded ports such as master_rpc_port, tserver_webserver_port, and master_webserver_port to dynamically accommodate custom configurations, solving connectivity issues in multi-region/zone cluster setups. {{<issue 15334>}}
  • Ensures better visibility into local calls by tracking them and allowing the DumpRunningRpcs API to fetch them; if rolled back, this functionality becomes unavailable. {{<issue 19697>}}
  • Transitions primary build and packaging from CentOS 7 to AlmaLinux 8, discontinuing support for Linux operating systems with glibc less than 2.28 for future integrations, while preserving it for versions 2.20 and earlier. {{<issue 20173>}}
</details>