
Klustron Global MVCC Testing

Note:

Unless otherwise specified, the version numbers mentioned in this document can be replaced with any released version number. For all released versions, please refer to: http://doc.klustron.com/zh/Release_notes.html

Objective:

As a distributed database that fully supports strong consistency scenarios in finance, securities, etc., data read consistency is an essential feature of Klustron. Global MVCC (Multi-Version Concurrency Control) is a global consistency mechanism designed to address read consistency issues in a distributed environment. It achieves global data read consistency by setting a global data version number for distributed transactions to obtain the current transaction snapshot.

In this document, we construct a bank account scenario to verify the different results obtained when querying an account with Global MVCC enabled and disabled. This demonstrates the importance of enabling Global MVCC in scenarios requiring financial-level strong consistency.

01 Why Global MVCC is Needed

Let's first look at the read consistency issue in distributed transactions. Consider the following timeline:

Without Global MVCC:

  • Time A1: Client1 initiates a transaction, inserting two records into table t1: (1, 'Beijing') and (2, 'Shenzhen'). Assume these two records are stored in shard1 and shard2 respectively.
  • Time A2: Client1 commits the transaction.
  • Time A3: Client2 initiates a query on table t1: SELECT * FROM t1;. At time A3, Client1's transaction has been committed in shard1 but not yet in shard2.

At this point, Client2 can see (1, 'Beijing') but not (2, 'Shenzhen'), thus seeing only part of the transaction's results.

To solve this problem, Klustron implements Global MVCC. The main principle is to establish a global snapshot to obtain the visible data for the current transaction.
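
The all-or-nothing visibility rule can be sketched in a toy model (illustrative only, not Klustron's actual implementation): each row carries the global commit timestamp of the transaction that wrote it, and a reader with a global snapshot only sees rows whose timestamp is at or before the snapshot.

```python
# Toy model of the anomaly above: two shards commit independently, so a
# reader without a global snapshot can see half a transaction.
# (Illustrative sketch only; not Klustron's actual implementation.)

class Shard:
    def __init__(self):
        self.rows = {}  # key -> (value, global commit timestamp)

    def commit(self, key, value, commit_ts):
        self.rows[key] = (value, commit_ts)

def read_local(shards):
    """Read whatever each shard has committed locally (no global snapshot)."""
    out = {}
    for s in shards:
        for k, (v, _) in s.rows.items():
            out[k] = v
    return out

def read_snapshot(shards, snapshot_ts):
    """Only rows whose global commit timestamp <= snapshot_ts are visible."""
    out = {}
    for s in shards:
        for k, (v, ts) in s.rows.items():
            if ts <= snapshot_ts:
                out[k] = v
    return out

shard1, shard2 = Shard(), Shard()

# Client1's transaction gets global commit timestamp 10, but at
# time A3 only shard1 has applied it.
shard1.commit(1, 'Beijing', 10)

# Without Global MVCC: Client2 sees a partial transaction.
print(read_local([shard1, shard2]))                    # {1: 'Beijing'}

# With Global MVCC: Client2's snapshot was taken before timestamp 10,
# so it sees none of the transaction -- all or nothing.
print(read_snapshot([shard1, shard2], snapshot_ts=9))  # {}

# Once shard2 has also committed, a snapshot at timestamp 10 sees both rows.
shard2.commit(2, 'Shenzhen', 10)
print(read_snapshot([shard1, shard2], snapshot_ts=10))
```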

02 Enabling Global MVCC

In Klustron, Global MVCC must be enabled as an option at cluster creation time; it is disabled by default.

03 Testing Cases

3.1 Environment Preparation

Log in to the compute node via the PG client, create a user and a database.

psql -h 10.37.129.6 -p 47001 postgres
create user kunlun_test with password 'kunlun';
create database test_db with owner kunlun_test encoding utf8 template template0;
\q
psql -h 10.37.129.6 -p 47001 -U kunlun_test test_db

Query the IDs of the two shards using the following statement:

select * from pg_shard;

The IDs of the two shards are 1 and 2 respectively.

Create the account table bank_accounts, with both partitions using the InnoDB storage engine by default:

create table bank_accounts
(
   id         INT NOT NULL AUTO_INCREMENT,
   balance    DECIMAL(18,2) NOT NULL,
   primary key(id)
) partition by range(id);
create table bank_accounts_p0 partition of bank_accounts 
for values from (1) to (501) with (shard=1);
create table bank_accounts_p1 partition of bank_accounts
for values from (501) to (1001) with (shard=2);
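
The range bounds above route ids 1-500 to bank_accounts_p0 on shard 1 and ids 501-1000 to bank_accounts_p1 on shard 2 (range partitions include the lower bound and exclude the upper). A small sketch of that routing, using a hypothetical helper rather than any Klustron API:

```python
# Sketch of how the range partitioning above routes a row by id.
# route() is a hypothetical helper for illustration, not a Klustron API.

def route(id_):
    if 1 <= id_ < 501:        # for values from (1) to (501)
        return ('bank_accounts_p0', 1)
    if 501 <= id_ < 1001:     # for values from (501) to (1001)
        return ('bank_accounts_p1', 2)
    raise ValueError(f"id {id_} falls outside every partition bound")

print(route(500))   # ('bank_accounts_p0', 1)
print(route(501))   # ('bank_accounts_p1', 2)
```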

Insert data for 1000 accounts:

create or replace procedure generate_account_data()
AS $$
DECLARE
  v_balance numeric(18,2);
  i integer := 1;
BEGIN
    while i <= 1000 loop
        v_balance := ROUND((1000 + RANDOM() * 9000)::numeric, 2);
        INSERT INTO bank_accounts VALUES (i, v_balance);
        commit;
        i := i + 1;
    end loop;
END; $$
LANGUAGE plpgsql;
call generate_account_data();
analyze bank_accounts;

After importing the data, it is evenly distributed across the two shards.

Create the Python program paccupdate.py as follows:

import psycopg2
import time
import random
from multiprocessing import Pool

def db_work(num):
	print("----Starting Task %d----" % (num)) 
	conn = psycopg2.connect(database='test_db',user='kunlun_test',
                 password='kunlun',host='10.37.129.6',port='47001')
	update_sql1 = ''' update bank_accounts set balance=balance-100 where id=%s'''
	update_sql2 = ''' update bank_accounts set balance=balance+100 where id=%s'''
	cursor = conn.cursor()

	try:
		for i in range(1000):
			id = random.randint((num-1)*50+1,(num-1)*50+50)
			print(f"Decreasing ID: {id}")
			cursor.execute(update_sql1, [id])
			id = random.randint(1000-num*50+1,1000-num*50+50)
			print(f"Increasing ID: {id}")
			cursor.execute(update_sql2, [id])
			conn.commit()
			time.sleep(1)
	finally:
		cursor.close()
		conn.close()
		print("----Task %d Completed----" % (num))

def main():
	po = Pool(4)
	for i in range(1,5):
		po.apply_async(db_work,(i,))
	print("----Start----")
	po.close()
	po.join()
	print("----End----")

if __name__ == "__main__":
	main()

The program works as follows:

  1. Four worker processes are created.
  2. In each loop iteration, every process executes one transfer transaction: it randomly picks an account in shard 1 and decreases its balance by 100, then randomly picks an account in shard 2 and increases its balance by 100.
  3. Each process executes 1000 transfer transactions, sleeping for 1 second between iterations.
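
The ID arithmetic in db_work guarantees that every transfer crosses shards, which is exactly the case Global MVCC must protect. A quick self-contained check of those ranges:

```python
# Check of the ID ranges used by paccupdate.py: for every worker num
# (1..4), the debited account falls in shard 1 (ids 1-500) and the
# credited account in shard 2 (ids 501-1000), so each transfer is a
# cross-shard distributed transaction.

def ranges(num):
    dec = range((num - 1) * 50 + 1, (num - 1) * 50 + 50 + 1)
    inc = range(1000 - num * 50 + 1, 1000 - num * 50 + 50 + 1)
    return dec, inc

for num in range(1, 5):
    dec, inc = ranges(num)
    assert all(1 <= i <= 500 for i in dec)     # debit side: shard 1
    assert all(501 <= i <= 1000 for i in inc)  # credit side: shard 2
    print(num, (dec.start, dec.stop - 1), (inc.start, inc.stop - 1))
```

The printed ranges are also pairwise disjoint, so the four workers never contend for the same account.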

3.2 Testing Global MVCC - Both Shards Using InnoDB Storage Engine

Query the total balance of all accounts; in this run it is 5505457.37.

Run the paccupdate.py program, starting 4 worker processes to perform transfer operations on user accounts concurrently.

At the same time, run another program qsum.py to query the total balance of the accounts in real-time:

import psycopg2
import time
from decimal import Decimal

conn = psycopg2.connect(database='test_db', user='kunlun_test',
                 password='kunlun', host='10.37.129.6', port='47001')
select_sql = '''select sum(balance) from bank_accounts;'''
cursor = conn.cursor()

try:
	for i in range(1000):
		cursor.execute(select_sql)
		res = cursor.fetchall()
		for row in res:
			balance = row[0].quantize(Decimal('0.01'))
			print(f"Total account balance is {balance}")
		time.sleep(1)
finally:
	cursor.close()
	conn.close()

While paccupdate.py is running, query the total balance of all accounts. With Global MVCC enabled, the total remains unchanged throughout the run: every query sees either all or none of each transfer transaction.
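
The invariant qsum.py observes can be reproduced in a self-contained simulation (assuming, for illustration, a uniform starting balance of 1000.00 per account): balanced transfers never change the total, so any deviation a reader observes means it saw a half-applied transfer.

```python
# Simulate the transfer workload: 1000 cross-shard transfers of 100
# between random accounts. The grand total is invariant, which is the
# property the concurrent sum query verifies under Global MVCC.
# Uniform starting balances are an assumption for this sketch only.

import random

balances = {i: 1000.0 for i in range(1, 1001)}
total = sum(balances.values())

for _ in range(1000):
    src = random.randint(1, 500)      # debit an account on shard 1
    dst = random.randint(501, 1000)   # credit an account on shard 2
    balances[src] -= 100
    balances[dst] += 100

assert sum(balances.values()) == total  # conservation of money
```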

3.3 Testing Global MVCC - Both Shards Using RocksDB Storage Engine

Create a new account table bank_accounts_rocksdb with both shards using the RocksDB storage engine:

create table bank_accounts_rocksdb
(
   id         INT NOT NULL AUTO_INCREMENT,
   balance    DECIMAL(18,2) NOT NULL,
   primary key(id)
) partition by range(id);
create table bank_accounts_rs_p0 partition of bank_accounts_rocksdb
for values from (1) to (501) with (shard=1,engine=rocksdb);
create table bank_accounts_rs_p1 partition of bank_accounts_rocksdb
for values from (501) to (1001) with (shard=2,engine=rocksdb);

insert into bank_accounts_rocksdb select * from bank_accounts;
analyze bank_accounts_rocksdb;

Follow the steps in 3.2, replacing the target table in the Python program with bank_accounts_rocksdb. Run four processes to simulate transfer operations.

At the same time, in another session, continuously query the total balance of the accounts: copy qsum.py to qsumrdb.py and change the table in the sum query to bank_accounts_rocksdb.

When the storage engine of the shard for the account table is changed to RocksDB, Global MVCC ensures that the total balance of all accounts remains unchanged during multi-user transfers.

3.4 Testing Global MVCC - One Shard Using InnoDB, the Other Using RocksDB

Create the table bank_accounts_mixed, with one shard using the InnoDB storage engine and the other using RocksDB:

create table bank_accounts_mixed
(
   id         INT NOT NULL AUTO_INCREMENT,
   balance    DECIMAL(18,2) NOT NULL,
   primary key(id)
) partition by range(id);
create table bank_accounts_mixed_p0 partition of bank_accounts_mixed
for values from (1) to (501) with (shard=1);
create table bank_accounts_mixed_p1 partition of bank_accounts_mixed
for values from (501) to (1001) with (shard=2,engine=rocksdb);

insert into bank_accounts_mixed select * from bank_accounts;
analyze bank_accounts_mixed;

Follow the steps in 3.2, replacing the target table in the Python program with bank_accounts_mixed. Run four processes to simulate transfer operations.

At the same time, in another session, continuously query the total balance of the accounts: copy qsum.py to qsummix.py and change the table in the sum query to bank_accounts_mixed.

Similarly, when the storage engines of the two shards for the account table are set to InnoDB and RocksDB respectively, Global MVCC ensures that the total balance of all accounts remains unchanged during multi-user transfers.
