Klustron Logical Backup and Restore
1 Background
Users can perform a logical backup of a Klustron cluster, covering either the entire cluster or specific data such as a database, schema, or table. A logical backup lets users later restore the cluster, or just the affected objects, to a chosen point in time.
Common scenarios for using logical backups include:
a. Before performing a risky operation, users can back up a database or table. In the event of a failure, they can quickly restore the damaged database or table without affecting other databases or tables, and without taking the cluster offline.
b. Backing up different parts of the cluster's data at different frequencies, such as performing a full backup of high-value or frequently updated data every 12 hours and backing up infrequently changing tables once a week.
c. Exporting the Klustron cluster and importing it into other database systems.
2 Implementation Principles
2.1 Table logical backup
Task processing flow: When cluster_mgr receives a logical backup task, it verifies and parses the parameters. Because Klustron separates computing and storage, cluster_mgr must first determine, via a computing node, the type of each table to be backed up; for a partitioned table it also determines which storage nodes the table's partitions map to. It then sends the parsed task to the node_mgr nodes for specific processing.
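The routing step can be sketched roughly as follows. The metadata shapes and function names here are illustrative assumptions, not cluster_mgr's actual internals:

```python
# Hypothetical sketch of cluster_mgr's backup-task routing: group the tables
# to back up by the storage shard(s) that hold their data. The metadata
# layout and names are assumptions for illustration only.
from collections import defaultdict

def plan_backup(tables, table_meta):
    """tables:     list of "db_$$_schema.table" identifiers from the request.
    table_meta: dict mapping identifier -> {"type": "single"|"partitioned",
                "shards": [shard ids]} as reported by a computing node."""
    plan = defaultdict(list)
    for t in tables:
        # A partitioned table maps to several shards; a plain table to one.
        for shard in table_meta[t]["shards"]:
            plan[shard].append(t)
    return dict(plan)

meta = {
    "postgres_$$_public.transfer_account": {"type": "partitioned", "shards": [1, 2]},
    "postgres_$$_public.test2": {"type": "single", "shards": [1]},
}
plan = plan_backup(list(meta), meta)
# Each shard's node_mgr then receives only the tables it stores.
```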
Specific processing flow:
- Use the kunlun_pgdump tool to dump the table structure through a computing node. kunlun_pgdump extends pg_dump to handle Klustron's extended table metadata.
- Based on the backup table information, connect to the corresponding storage node and dump the table data with the mydumper tool.
- node_mgr packages the table structure and table data on each node and pushes the archive to the cold backup machine.
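The two dump steps could be driven as below. The command-line flags are assumptions modeled on pg_dump and mydumper conventions; kunlun_pgdump's real options may differ:

```python
# Illustrative sketch of node_mgr's dump steps. Flag names are assumptions
# based on pg_dump and mydumper conventions, not verified kunlun_pgdump options.

def structure_dump_cmd(pg_host, pg_port, db, table, out_file):
    # Dump only the table definition through the computing node (pg_dump-style).
    return ["kunlun_pgdump", "-h", pg_host, "-p", str(pg_port),
            "--schema-only", "-t", table, "-f", out_file, db]

def data_dump_cmd(my_host, my_port, db, table, out_dir):
    # Dump the table's rows from the storage node (mydumper-style flags).
    return ["mydumper", "-h", my_host, "-P", str(my_port),
            "-B", db, "-T", f"{db}.{table}", "-o", out_dir]

s_cmd = structure_dump_cmd("comp1", 5432, "postgres", "public.test2", "/tmp/test2.sql")
d_cmd = data_dump_cmd("shard1", 3306, "postgres_$$_public", "test2", "/tmp/dump")
# node_mgr would run these (e.g. via subprocess), then tar and push the output.
```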
2.2 Table logical restore
Task processing flow: When cluster_mgr receives a logical restore task, it verifies and parses the parameters. It then checks whether logical backup data for the table exists, based on the restore time and the cold backup parameters, and sends the parsed task to the node_mgr nodes for specific processing.
Specific processing flow:
- node_mgr logs in to the cold backup machine and looks up the full logical backup closest to the restore time. Choosing the closest package minimizes the amount of backup binlog that must be replayed for incremental recovery, which speeds up the restore.
- Based on the full backup's timestamp and the restore timestamp, find the associated backup binlog on each storage node. Because Klustron supports table migration, check whether a table migration occurred between those two timestamps; if so, use the migration information to locate the backup binlog produced after the migration.
- node_mgr pulls the selected logical backup package to its own machine, restores the table structure through the target computing node with kunlun_pgrestore, and restores the table data through the target computing node's MySQL port with kunlun_myloader.
- Finally, the backed-up binlog is replayed: binlog2sync parses the binlog and applies the incremental changes through the target computing node's MySQL port.
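The backup-selection step amounts to choosing the newest full backup taken at or before the restore time, then replaying binlog over the remaining window. A minimal sketch (the backup paths shown are made up for illustration):

```python
# Pick the latest full logical backup taken at or before the requested
# restore time; binlog is then replayed from that backup's timestamp up to
# the restore time. Paths and structures here are illustrative only.
from datetime import datetime

def pick_full_backup(backups, restore_time):
    """backups: list of (timestamp, path); returns the newest one <= restore_time."""
    candidates = [b for b in backups if b[0] <= restore_time]
    if not candidates:
        raise ValueError("no full backup precedes the restore time")
    return max(candidates, key=lambda b: b[0])

backups = [
    (datetime(2023, 1, 9, 1, 30), "hdfs://cold-backup/backup_20230109.tgz"),
    (datetime(2023, 1, 8, 1, 30), "hdfs://cold-backup/backup_20230108.tgz"),
]
base = pick_full_backup(backups, datetime(2023, 1, 9, 6, 15))
# Incremental recovery then covers the window (base timestamp, restore time].
```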
3 Configuration and Usage
3.1 Initiate a logical backup of the required databases and tables through xpanel, as shown in the figure below.
Both the table structure and the table data are logically backed up.
3.1.1 Select the cluster where the tables to be backed up are located and configure the backup settings.
Click "Save" to initiate the logical backup of the table.
3.2 Initiate a logical restore of the backed-up databases and tables through xpanel, as shown in the figure below.
3.2.1 Select the cluster where the backed-up tables are located and configure the restore settings.
After completing the steps, click "Save" to logically restore the backed-up tables.
3.3 Perform table logical backup through cluster_mgr API
curl -d '
{
    "version": "1.0",
    "job_id": "",
    "job_type": "logical_backup",
    "timestamp": "1435749309",
    "user_name": "kunlun_test",
    "paras": {
        "cluster_id": "1",
        "backup": [{
            "db_table": "postgres_$$_public.transfer_account",
            "backup_time": "01:00:00-02:00:00"
        }, {
            "db_table": "postgres_$$_public.test2",
            "backup_time": "01:00:00-02:00:00"
        }]
    }
}
' -X POST http://127.0.0.1:58000/HttpService/Emit
This is an asynchronous interface: it immediately returns a job_id. Query the task's execution status with the returned job_id:
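Because the interface is asynchronous, clients typically poll get_status until the job completes. The endpoint and request body below follow the curl examples in this section, while the completion states checked ("done", "failed") are assumptions about cluster_mgr's reply format:

```python
# Poll cluster_mgr until a job finishes. The endpoint and request body match
# the curl examples in this section; the status values ("done", "failed")
# are assumptions about cluster_mgr's reply, not confirmed behavior.
import json
import time
import urllib.request

CLUSTER_MGR = "http://127.0.0.1:58000/HttpService/Emit"

def get_status(job_id, fetch=None):
    body = json.dumps({
        "version": "1.0",
        "job_id": job_id,
        "job_type": "get_status",
        "timestamp": str(int(time.time())),
        "user_name": "kunlun_test",
        "paras": {},
    }).encode()
    if fetch is None:  # real HTTP call to cluster_mgr
        req = urllib.request.Request(CLUSTER_MGR, data=body,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch(body)  # injectable transport, useful for testing offline

def wait_for_job(job_id, fetch=None, interval=5, retries=60):
    for _ in range(retries):
        reply = get_status(job_id, fetch)
        if reply.get("status") in ("done", "failed"):
            return reply
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {retries} polls")
```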
curl -d '
{
    "version": "1.0",
    "job_id": "10",
    "job_type": "get_status",
    "timestamp": "1435749309",
    "user_name": "kunlun_test",
    "paras": {
    }
}
' -X POST http://127.0.0.1:58000/HttpService/Emit
The resulting logical backup data can be viewed on HDFS.
3.4 Perform table logical restore through cluster_mgr API
curl -d '
{
    "version": "1.0",
    "job_id": "",
    "job_type": "logical_restore",
    "timestamp": "1435749309",
    "user_name": "kunlun_test",
    "paras": {
        "src_cluster_id": "1",
        "dst_cluster_id": "2",
        "db_table": "postgres_$$_public.transfer_account",
        "restore_time": "2023-01-09 06:15:00"
    }
}
' -X POST http://127.0.0.1:58000/HttpService/Emit
This is an asynchronous interface: it immediately returns a job_id. Query the task's execution status with the returned job_id:
curl -d '
{
    "version": "1.0",
    "job_id": "10",
    "job_type": "get_status",
    "timestamp": "1435749309",
    "user_name": "kunlun_test",
    "paras": {
    }
}
' -X POST http://127.0.0.1:58000/HttpService/Emit
After a successful restore, the table structure and data are present in the target cluster.