docs/en/08-operation/04-maintenance.md
This section introduces advanced cluster maintenance methods provided in TDengine Enterprise, which can make the TDengine cluster run more robustly and efficiently over the long term.
For how to manage cluster nodes, please refer to Node Management
TDengine is designed for various writing scenarios, and many of these scenarios can lead to data storage amplification or data file holes in TDengine's storage. This not only affects the efficiency of data storage but also impacts query performance. To address these issues, TDengine Enterprise offers a data reorganization feature, namely the DATA COMPACT function, which reorganizes stored data files, removes file holes and invalid data, improves data organization, and thus enhances storage and query efficiency. The data reorganization feature was first released in version 3.0.3.0 and has since undergone several iterations of optimization. It is recommended to use the latest version.
COMPACT DATABASE db_name [start with 'XXXX'] [end with 'YYYY'] [META_ONLY] [FORCE];
COMPACT [db_name.]VGROUPS IN (vgroup_id1, vgroup_id2, ...) [start with 'XXXX'] [end with 'YYYY'] [META_ONLY] [FORCE];
SHOW COMPACTS;
SHOW COMPACT compact_id;
KILL COMPACT compact_id;
SCAN DATABASE db_name [start with 'XXXX'] [end with 'YYYY'];
SCAN [db_name.]VGROUPS IN (vgroup_id1, vgroup_id2, ...) [start with 'XXXX'] [end with 'YYYY'];
SHOW SCANS;
SHOW SCAN <scan_id>;
KILL SCAN <scan_id>;
When one or more nodes in a multi-replica cluster restart due to upgrades or other reasons, it may lead to an imbalance in the load among the various dnodes in the cluster. In extreme cases, all vgroup leaders may be located on the same dnode. To solve this problem, you can use the following commands, which were first released in version 3.0.4.0. It is recommended to use the latest version as much as possible.
balance vgroup leader; # Rebalance all vgroup leaders
balance vgroup leader on <vgroup_id>; # Rebalance a vgroup leader
balance vgroup leader database <database_name>; # Rebalance all vgroup leaders within a database
Attempts to evenly distribute one or all vgroup leaders across their respective replica nodes. This command forces a re-election of the vgroup, changing the vgroup's leader during the election process, thereby eventually achieving an even distribution of leaders.
Vgroup elections are inherently random, so the even distribution produced by the re-election is also probabilistic and not perfectly even. The side effect of this command is that it affects queries and writing; during the vgroup re-election, from the start of the election to the election of a new leader, this vgroup cannot be written to or queried. The election process generally completes within seconds. All vgroups will be re-elected one by one sequentially.
If the data on a data node (dnode) in the cluster is completely lost or damaged, such as due to disk damage or directory deletion, you can use the restore dnode command to recover some or all logical nodes on that data node. This feature relies on data replication from other replicas in the cluster, so it only works when the number of dnodes in the cluster is three or more and the number of replicas is three.
restore dnode <dnode_id>; # Restore mnode, all vnodes, and qnode on dnode
restore mnode on dnode <dnode_id>; # Restore mnode on dnode
restore vnode on dnode <dnode_id>; # Restore all vnodes on dnode
restore qnode on dnode <dnode_id>; # Restore qnode on dnode
When a vgroup is overloaded with CPU or Disk resource usage due to too many subtables, after adding a dnode, you can split the vgroup into two virtual groups using the split vgroup command. After the split, the newly created two vgroups will undertake the read and write services originally provided by one vgroup. This command was first released in version 3.0.6.0, and it is recommended to use the latest version whenever possible.
split vgroup <vgroup_id>
Starting from version 3.1.1.0, TDengine Enterprise supports online hot updates of the very important dnode configuration parameter supportVnodes. This parameter, originally configured in the taos.cfg file, represents the maximum number of vnodes that the dnode can support. When a database is created, new vnodes are allocated, and when a database is deleted, its vnodes are destroyed.
However, online updates of supportVnodes do not persist, and after a system restart, the maximum number of vnodes allowed will still be determined by the supportVnodes configured in taos.cfg.
If the supportVnodes set through online updates or configuration files is less than the current actual number of vnodes on the dnode, the existing vnodes will not be affected. However, whether a new database can be successfully created will still depend on the actual effective supportVnodes parameter.
Dual replicas are a special high-availability configuration for databases. This section provides special instructions for their use and maintenance. This feature was first released in version 3.3.0.0, and it is recommended to use the latest version whenever possible.
Use the following SQL commands to view the status of each Vgroup in a dual-replica database:
show arbgroups;
select * from information_schema.ins_arbgroups;
db_name | vgroup_id | v1_dnode | v2_dnode | is_sync | assigned_dnode | assigned_token |
=================================================================================================================================
db | 2 | 2 | 3 | 0 | NULL | NULL |
db | 3 | 1 | 2 | 0 | 1 | d1#g3#1714119404630#663 |
db | 4 | 1 | 3 | 1 | NULL | NULL |
is_sync has the following two values:
AssignedLeader role, and the Vgroup will not be able to provide service.AssignedLeader role, and the Vgroup can continue to provide service.assigned_dnode:
assigned_token:
The main value of dual replicas lies in saving storage costs while maintaining a certain level of high availability and reliability. In practice, the recommended configuration is:
supportVnodes parameter to 0Assuming there is an existing single replica cluster with N nodes (N>=1), and you want to upgrade it to a dual replica cluster, ensure that N>=3 after the upgrade, and configure the supportVnodes parameter of a newly added node to 0. After completing the cluster upgrade, use the command alter database replica 2 to change the replica count for a specific database.