5. How to synchronize MySQL data to Klustron
5. How to synchronize MySQL data to Klustron
This article describes how to synchronize the full amount of data from a running MySQL instance to the Klustron cluster, and then continuously update the streaming data to the Klustron cluster.
01 Modify mysql instance permissions
The mysql instance must have spuer or BACKUP_ADMIN authority, otherwise the metadata file exported by mydumper has no Pos value and Log value
mysql -u root -proot -P 8898 -h 172.0.0.132 GRANT Super, BACKUP_ADMIN ON *.* TO 'root'@'%' WITH GRANT OPTION; flush privileges
02 Use mydumper to fully export MySQL data
There is this tool in the bin/util/ of the kunlun-node_mgr package in the base directory of the Klustron cluster. Use mydumper to export the mysql data to the specified location
./mydumper -h 172.0.0.136 -u root -p root -P 8898 -B sysbench -o /nvme2/compare/sysbench/mydumper

-B database to export
-o The location where the export file is stored
03 ddl2kunlun-linux
Download wget http://downloads.klustron.com/kunlun-utils/1.1.1/ddl2kunlun-linux
This tool is used to convert the mysql table definition into a table definition that can be used by Klustron
Example of use
./ddl2kunlun-linux -host="172.0.0.132" -port="8898" -user="root" -password="root" \ -sourceType="mysql" -database="tpcc" -table="tablename" > a.sql
- -host/-port/-user/-password the information of the database to be exported
- -sourceType The database type of the exported database, default mysql
- -database/-table The database and table name to be exported
- You can view the help documentation through ./ddl2kunlun-linux --help
- Then use psql or mysql to import the exported table definition into Klustron to generate a table
Then use the command line to generate the corresponding table in Klustron PGPASSWORD=abc psql -h 172.0.0.113 -p 47001 -U abc -d postgres < a.sql
You can also use a for loop to complete the whole process
echo `show databases;` > sysbench.sql for i in `mysql -h 172.0.0.132 -uroot -proot -P8898 sysbench < sysbench.sql | grep -v Tables_in_` do ./ddl2kunlun-linux -host="172.0.0.132" -port="8898" -user="root" -password="root" -sourceType="mysql" -database="sysbench" -table="$i" > a.sql echo create table $i psql -h 172.0.0.132 -p 35001 -U abc -d postgres < a.sql done
04 Process the sql file generated by mydumper
If you are using the Klustron mysql protocol, you don't need to do this step
In the Klustron pg protocol, "" double quotes will be recognized as column
The date has a minimum value, 1970-01-01
cd /nvme2/compare/sysbench/mydumper for i in `ls . | grep -v schema | grep sql` do table=`echo $i | awk -F. '{print $2}'` db=`echo $i | awk -F. '{print $1}'` sed -i "s/\`$table\`/${db}.$table/" $i sed -i 's/0000-00-00/1970-01-01/' $i sed -i "s/\"/'/g" $i done
05 Import the processed mydumper data into Klustron
It is mainly to import
sysbench.customer.00000.sql
the sql file with this file name, and other files such as the file name contains
metadata
and
schema
do not need to be imported into Klustron
PGPASSWORD=abc psql -h 172.0.0.132 -p 30001 -U abc -d postgres -f sysbench.customer.00000.sql
You can refer to the following script to quickly import
cd /nvme2/compare/sysbench/mydumper for i in `ls . | grep -v schema | grep sql` do PGPASSWORD=abc psql -h 172.0.0.132 -p 30001 -U abc -d postgres -f $i done
06 binlog2sync
Download wget http://downloads.klustron.com/kunlun-utils/1.1.1/binlog_sync
This tool is to directly convert the binlog of the upstream mysql database into sql for the downstream Klustron cluster
Modify the parameters of the upstream database before use:
set global binlog_row_metadata = FULL;
Detailed options:
--remote_host
,--remote_port
,--remote_user
,--remote_password
ip, port, user, pwd of the upstream database;--remote_binlog_file
Using mydumper will generate a metadata file in the specified or current directory, whichLog
corresponds to this option;--binlog_position
Using mydumper will generate ametadata
file in the specified or current directory, whichPos
corresponds to this option; --db_host
,--db_port
,--db_use
r,--db_password
the ip, port, user, pwd of the computing node of the mysql protocol in the downstream Klustron;--job_id
Just fill in any number;--db_type
Whether the writing node uses the mysql protocol or the postgres protocol, and Klustron uses postgres;--stop_never
When the value is 1, the program will not stop when syncing to the latest binlog location. When the value is 0, otherwise when synchronizing to the latest binlog position, the program will stop.
example
./binlog_sync --remote_host=172.0.0.136 --remote_port=8898 --remote_user=root --remote_password=root --remote_binlog_file=binlog.000002 \ --binlog_position=120404375 --db_host=172.0.0.132 --db_port=30002 --db_user=abc --db_password=abc --reserve_event_dir=./binlog_event \ --db_type="postgres" --work_mode="stream" --stop_never_server_id=100 --stop_never=1 > log 2>&1 &