
5. How to synchronize MySQL data to Klustron

This article describes how to synchronize the full data set from a running MySQL instance to a Klustron cluster, and then continuously stream incremental updates to the cluster.

01 Modify MySQL instance permissions

  • The MySQL account must have the SUPER or BACKUP_ADMIN privilege; otherwise the metadata file exported by mydumper contains no Pos or Log values

    mysql -u root -proot -P 8898 -h 172.0.0.132
    GRANT SUPER, BACKUP_ADMIN ON *.* TO 'root'@'%' WITH GRANT OPTION;
    FLUSH PRIVILEGES;
    

02 Use mydumper to fully export MySQL data

  • This tool ships in bin/util/ of the kunlun-node_mgr package under the Klustron cluster base directory. Use mydumper to export the MySQL data to a specified location

    ./mydumper -h 172.0.0.136 -u root -p root -P 8898 -B sysbench -o /nvme2/compare/sysbench/mydumper
    

    ![img](如何同步 MySQL 数据到 KunlunBase/1.png)

  • -B The database to export

  • -o The directory where the exported files are stored
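Section 01 noted that the export records Pos and Log values; they land in the metadata file that mydumper writes to the output directory. A minimal sketch of extracting them with awk, assuming the usual mydumper metadata layout (the sample contents below are fabricated):

```shell
# Work in a scratch directory with a sample mydumper "metadata" file.
# The layout below is the common mydumper format (an assumption, not
# taken from this article): mydumper records SHOW MASTER STATUS here.
cd "$(mktemp -d)"
cat > metadata <<'EOF'
Started dump at: 2023-01-01 00:00:00
SHOW MASTER STATUS:
  Log: binlog.000002
  Pos: 120404375
Finished dump at: 2023-01-01 00:05:00
EOF

# Pull out the binlog file name and position; section 06 feeds these
# to binlog_sync as --remote_binlog_file and --binlog_position.
LOG=$(awk -F': ' '/Log:/ {print $2}' metadata)
POS=$(awk -F': ' '/Pos:/ {print $2}' metadata)
echo "Log=$LOG Pos=$POS"
```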

03 ddl2kunlun-linux

  • Download: wget http://downloads.klustron.com/kunlun-utils/1.1.1/ddl2kunlun-linux

  • This tool converts a MySQL table definition into a table definition that Klustron can use

  • Example of use

    ./ddl2kunlun-linux -host="172.0.0.132" -port="8898" -user="root" -password="root" \
    -sourceType="mysql" -database="tpcc" -table="tablename" > a.sql
    
    • -host/-port/-user/-password The connection information of the database to be exported
    • -sourceType The type of the source database; defaults to mysql
    • -database/-table The database and table to be exported
    • You can view the help documentation with ./ddl2kunlun-linux --help
    • Then use psql or mysql to import the exported table definition into Klustron to create the table
  • Then create the corresponding table in Klustron from the command line:

    PGPASSWORD=abc psql -h 172.0.0.113 -p 47001 -U abc -d postgres < a.sql

  • You can also use a for loop to complete the whole process

    echo "show tables;" > sysbench.sql
    for i in `mysql -h 172.0.0.132 -uroot -proot -P8898 sysbench < sysbench.sql | grep -v Tables_in_`
    do
      ./ddl2kunlun-linux -host="172.0.0.132" -port="8898" -user="root" -password="root" -sourceType="mysql" -database="sysbench" -table="$i" > a.sql
      echo "create table $i"
      psql -h 172.0.0.132 -p 35001 -U abc -d postgres < a.sql
    done
    

04 Process the SQL files generated by mydumper

  • If you are using the Klustron MySQL protocol, you can skip this step

  • Under the Klustron PostgreSQL protocol, double quotes ("") are interpreted as quoting column identifiers, so string values quoted with double quotes must be rewritten to use single quotes

    • Dates also need a valid minimum value; MySQL zero dates (0000-00-00) are replaced with 1970-01-01

      cd /nvme2/compare/sysbench/mydumper
      for i in `ls . | grep -v schema | grep sql`
      do
      table=`echo $i | awk -F. '{print $2}'`
      db=`echo $i | awk -F. '{print $1}'`
      sed -i "s/\`$table\`/${db}.$table/" $i
      sed -i 's/0000-00-00/1970-01-01/' $i
      sed -i "s/\"/'/g" $i
      done
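To see what the three substitutions do, the same sed edits can be applied to a fabricated one-line dump file (the file name and INSERT line below are made-up examples in the mydumper naming style, not taken from a real export):

```shell
cd "$(mktemp -d)"
# Fabricated mydumper-style data file: <db>.<table>.00000.sql
cat > sysbench.customer.00000.sql <<'EOF'
INSERT INTO `customer` VALUES (1,"alice","0000-00-00");
EOF

table=customer
db=sysbench
sed -i "s/\`$table\`/${db}.$table/" sysbench.customer.00000.sql  # qualify the table name
sed -i 's/0000-00-00/1970-01-01/' sysbench.customer.00000.sql    # replace the invalid zero date
sed -i "s/\"/'/g" sysbench.customer.00000.sql                    # double quotes -> single quotes
cat sysbench.customer.00000.sql
```

After the three passes the line reads `INSERT INTO sysbench.customer VALUES (1,'alice','1970-01-01');`, which the Klustron PostgreSQL protocol accepts.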
      

05 Import the processed mydumper data into Klustron

  • Mainly the SQL data files with names like sysbench.customer.00000.sql need to be imported; files whose names contain metadata or schema do not need to be imported into Klustron

    PGPASSWORD=abc psql -h 172.0.0.132 -p 30001 -U abc -d postgres -f sysbench.customer.00000.sql
    
  • You can refer to the following script to import them quickly

    cd /nvme2/compare/sysbench/mydumper
    for i in `ls . | grep -v schema | grep sql`
    do
      PGPASSWORD=abc psql -h 172.0.0.132 -p 30001 -U abc -d postgres -f $i
    done
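The `ls . | grep -v schema | grep sql` filter is what keeps only the data files; a quick check against a fabricated mydumper output directory (the file names are illustrative):

```shell
cd "$(mktemp -d)"
# Typical mydumper output: a data file, per-table and per-database
# schema files, and the metadata file.
touch sysbench.customer.00000.sql sysbench.customer-schema.sql \
      sysbench-schema-create.sql metadata
# Only the plain data file survives: "schema" files are dropped by
# grep -v, and "metadata" lacks the sql suffix.
ls . | grep -v schema | grep sql
```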
    

06 binlog_sync

  • Download: wget http://downloads.klustron.com/kunlun-utils/1.1.1/binlog_sync

  • This tool converts the binlog of the upstream MySQL database directly into SQL statements for the downstream Klustron cluster

  • Before use, set the following parameter on the upstream database:

    set global binlog_row_metadata = FULL;
    
  • Detailed options:

    • --remote_host, --remote_port, --remote_user, --remote_password The IP, port, user, and password of the upstream database;
    • --remote_binlog_file Corresponds to the Log value in the metadata file that mydumper generates in the specified (or current) directory;
    • --binlog_position Corresponds to the Pos value in that same metadata file; ![img](如何同步 MySQL 数据到 KunlunBase/2.png)
    • --db_host, --db_port, --db_user, --db_password The IP, port, user, and password of the computing node of the downstream Klustron cluster;
    • --job_id Any number will do;
    • --db_type Whether the downstream node is written over the mysql or the postgres protocol; for Klustron, use postgres;
    • --stop_never When set to 1, the program does not stop when it reaches the latest binlog position; when set to 0, it stops once it has synchronized to the latest binlog position.
  • Example:

    ./binlog_sync --remote_host=172.0.0.136 --remote_port=8898 --remote_user=root --remote_password=root --remote_binlog_file=binlog.000002 \
    --binlog_position=120404375 --db_host=172.0.0.132 --db_port=30002 --db_user=abc --db_password=abc --reserve_event_dir=./binlog_event \
    --db_type="postgres" --work_mode="stream" --stop_never_server_id=100 --stop_never=1 > log 2>&1 &
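The --remote_binlog_file and --binlog_position values in the example come straight from the mydumper metadata file; a sketch that assembles the invocation from it (the usual mydumper metadata layout is assumed, and the command is only printed, never executed):

```shell
cd "$(mktemp -d)"
# Sample metadata in the assumed mydumper format.
cat > metadata <<'EOF'
SHOW MASTER STATUS:
  Log: binlog.000002
  Pos: 120404375
EOF

LOG=$(awk -F': ' '/Log:/ {print $2}' metadata)
POS=$(awk -F': ' '/Pos:/ {print $2}' metadata)
# Print (do not run) the binlog_sync invocation with the extracted values.
CMD="./binlog_sync --remote_binlog_file=$LOG --binlog_position=$POS --stop_never=1"
echo "$CMD"
```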
    
