Global MVCC Mechanism
Introduction
Klustron, as a distributed database capable of fully supporting strong consistency scenarios such as finance and securities, considers data read consistency an indispensable feature. Global MVCC is a global consistency mechanism designed to solve the problem of read consistency in distributed environments. It achieves global data read consistency by setting a global data version number for distributed transactions, which allows the current transaction to capture a snapshot.
In this Tech Talk, we will delve into the Global MVCC feature and its technical implementation, aiming to enhance understanding of this key feature while offering an in-depth look at Klustron.
Key Takeaway: Klustron's Global MVCC feature creates a snapshot of the current transaction by setting a global data version number, thereby achieving global data read consistency.
To make it easier to communicate within our team and with the wider community, we have launched the Klustron BBS forum, and we invite everyone to join us there. (Link: https://forum.klustron.com/)
The forum is currently in beta and may be unstable at times. We encourage you to share information there and ask for your understanding should any issues occur.
01 Klustron Introduction
First, let's briefly introduce Klustron, the distributed database product from Zetuo Technology. The following figure illustrates the overall architecture of Klustron. As depicted, Klustron is a distributed database product with separate storage and computation.
As a comprehensive distributed database solution, Klustron is equipped with several robust features:
Scalable compute and storage abilities
- Data partitioning: hash, range, list
  - Support for any number and type of partition columns
- Data distribution
  - Including auto, random, mirror, and table grouping
- Automatic, flexible, non-disruptive scaling that is non-intrusive to the business and transparent to end users
Financial-grade high reliability
- Automatically handles software, hardware, and network failures, as well as complete data center outages
  - Ensures data integrity and continuous service
  - Aims for RTO < 30 seconds & RPO=0
- Automatic detection of primary node failures with primary/standby switching
HTAP: Harmonious OLTP & OLAP without interference
- OLTP-focused: Equivalent to using MySQL or PostgreSQL for application software
- OLAP as a secondary focus: High performance through multi-level parallel queries
- Flexible computing with multi-language stored procedures: ML, privacy computing
Ecosystem compatibility
- Supports PostgreSQL and MySQL connection protocols and SQL syntax
- Compatible with common MySQL DDL syntax
- Supports JDBC, ODBC, and common programming language connectors for PostgreSQL and MySQL clients
Comprehensive multi-level security
- Encrypted storage and transmission
- Multi-level access control mechanisms
02 WHY
Why is Global MVCC Necessary?
Let's explore the issue of read consistency in distributed transactions. Illustrated below is a typical scenario:
Ongoing distributed transaction GT1
- Writes to multiple shards (shard1 GT1.t1 & shard2 GT1.t2)
- Employs a two-phase commit process
Active SELECT query (GT2)
- Reads the update GT1.t1 in shard1
- Fails to read the update GT1.t2 in shard2
This situation leads to inconsistency, where only partial data of the transaction is readable.
To address this issue, Klustron has implemented Global MVCC. The principle behind Global MVCC is to establish a global snapshot that captures the visible data for the current transaction, thereby ensuring read consistency across the distributed environment.
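The scenario above can be reproduced with a small in-memory model. This is only an illustrative sketch, not Klustron code: the `Shard` class, its methods, and the account values are hypothetical, and each shard simply keeps a list of committed versions tagged with a global version number. GT2 takes its global snapshot before GT1 commits, and GT1's second commit phase has reached shard1 but not yet shard2.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy shard (hypothetical): each key keeps (gvno, value) pairs, oldest first."""
    versions: dict = field(default_factory=dict)

    def commit(self, key, value, gvno):
        self.versions.setdefault(key, []).append((gvno, value))

    def read_latest(self, key):
        # Local-only view: whatever this shard committed last.
        return self.versions[key][-1][1]

    def read_at(self, key, snapshot):
        # Global view: newest version whose global version number is <= snapshot.
        visible = [value for gvno, value in self.versions[key] if gvno <= snapshot]
        return visible[-1]


def run_scenario():
    shard1, shard2 = Shard(), Shard()
    shard1.commit("t1", 100, gvno=1)
    shard2.commit("t2", 100, gvno=1)

    snapshot = 1  # GT2 takes its global snapshot before GT1 commits

    # GT1 moves 10 from t1 to t2; phase two has reached shard1 only.
    shard1.commit("t1", 90, gvno=2)

    local_view = (shard1.read_latest("t1"), shard2.read_latest("t2"))
    global_view = (shard1.read_at("t1", snapshot), shard2.read_at("t2", snapshot))
    return local_view, global_view
```

In this model the local-only read returns one updated value and one old value, so the total no longer adds up, while the snapshot read returns the two old values together.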
03 Principles and Implementation of Global MVCC
The implementation of Global MVCC requires modifications at various levels, including the upper compute nodes and metadata clusters, as well as the lower storage nodes.
Starting with the upper compute nodes, as shown in the following figure, it is essential to assign a global version number to all distributed transactions. Then, this global version number is used to establish a global snapshot.
At the lower storage nodes, modifications are made to the MySQL InnoDB storage engine to support the global snapshot, as depicted in the subsequent figure. Key changes were made to InnoDB's transaction visibility judgment process.
Global Visibility Judgment Algorithm: Specifically for XA Transaction Updates
Begin with local visibility assessment
Local visibility does not necessarily imply global visibility
- Transactions with a version number less than local_xmin are definitely visible
Local invisibility doesn't always mean global invisibility
- Transactions not started at the time of snapshot acquisition are definitely not visible
Global version number comparison
What if it's globally invisible?
- Use undo log to generate an older version of the row
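The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual InnoDB modification: `ReadView`, `RowVersion`, `local_xmax`, and both functions are hypothetical names, the `gvno <= global_snapshot` comparison is an assumed rule, and a transaction whose global version number is not yet set is simply treated as invisible here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadView:
    local_xmin: int       # trx ids below this are definitely visible
    local_xmax: int       # trx ids at or above this had not started at snapshot time
    global_snapshot: int  # global version number acquired for this read


@dataclass
class RowVersion:
    value: str
    trx_id: int
    gvno: Optional[int]                   # global version number of the writer, if set
    older: Optional["RowVersion"] = None  # previous version, as rebuilt from the undo log


def is_globally_visible(trx_id, gvno, view):
    if trx_id < view.local_xmin:
        return True   # definitely visible
    if trx_id >= view.local_xmax:
        return False  # not started when the snapshot was acquired
    if gvno is None:
        return False  # assumption: no global version number yet, treat as invisible
    # Assumed comparison rule: visible if committed at or before the snapshot.
    return gvno <= view.global_snapshot


def read_row(newest, view):
    # Walk back through older versions until one is globally visible.
    version = newest
    while version is not None:
        if is_globally_visible(version.trx_id, version.gvno, view):
            return version.value
        version = version.older
    return None
```

For example, a row last written by a transaction with global version number 7 is skipped by a reader whose global snapshot is 5, and the reader falls back to the older version.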
After these modifications, let's compare the processes before and after the changes, as shown in the next figure. With the global version number and global transaction snapshot, consistency issues in transactions can be avoided.
Finally, let's analyze the performance cost of Global MVCC. Some of its key steps consume time and resources, so enabling it degrades performance somewhat. Based on our tests and analysis, the loss is between 5% and 10%, which is within an acceptable range.
Comprehensive Analysis:
Compute Nodes:
No new time overhead added
Assigning GVNO (Global Version Number): An integer is issued with the XA COMMIT statement and stored in the tgvc_cache's tgvc
- This is considered negligible
Acquiring Global Snapshot: Involves network transmission overhead
- Occurs once for each SELECT statement (RC) or per transaction (RR)
- The current value is obtained from the metadata cluster sequence with
select currval('global_mvcc_seq')
Allocating Global Snapshot: An integer is sent with each SELECT statement
- This is also considered negligible
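To make the difference between the two isolation levels concrete, here is a minimal sketch of when the snapshot is fetched. The classes are hypothetical stand-ins, and `currval()` merely imitates the `select currval('global_mvcc_seq')` round trip to the metadata cluster by counting calls.

```python
class FakeMetadataCluster:
    """Hypothetical stand-in for the metadata cluster; counts round trips."""

    def __init__(self, start=100):
        self.current = start
        self.round_trips = 0

    def currval(self):
        # Imitates: select currval('global_mvcc_seq')
        self.round_trips += 1
        return self.current


class Transaction:
    def __init__(self, meta, isolation):
        if isolation not in ("RC", "RR"):
            raise ValueError("isolation must be 'RC' or 'RR'")
        self.meta = meta
        self.isolation = isolation
        self._snapshot = None

    def snapshot_for_select(self):
        # RC: fetch a fresh snapshot for every SELECT statement.
        # RR: fetch once on first use, then reuse it for the whole transaction.
        if self.isolation == "RC" or self._snapshot is None:
            self._snapshot = self.meta.currval()
        return self._snapshot
```

Under RC each SELECT pays one round trip and may see a newer snapshot. Under RR only the first SELECT pays it, and later statements reuse the same value.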
Storage Nodes:
Management of tgvc (Transaction Global Version Control): Negligible
Global MVCC Visibility Judgment Logic: Involves integer comparisons
- A read may wait briefly for a global version number to be set: typically < 20ms
Covering Index Searches: These rely on the max_trx_id in the page header, which identifies the last transaction that updated the page
Previously: If all rows on the page were visible to the readview (rv.m_up_limit_id > max_trx_id), the index row was returned directly.
Now: The above condition must still be met, and in addition, if max_trx_id > local_xmin, a table lookup is required.
- This results in a slightly increased proportion of table lookups
Purge: Under Global MVCC, undo logs are retained until global_xmin has advanced far enough that they are no longer needed
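The index search rule in the Previously/Now comparison above can be written as a small predicate. This is a literal transcription of the two conditions as stated, offered as a sketch only: the function name and the `global_mvcc` flag are hypothetical, and `local_xmin` is treated as an input separate from `m_up_limit_id`.

```python
def can_return_index_row(max_trx_id, m_up_limit_id, local_xmin, global_mvcc=True):
    """Return True if the index row can be returned without a table lookup."""
    # Original InnoDB-style check: every row on the page is visible to the readview.
    locally_all_visible = m_up_limit_id > max_trx_id
    if not global_mvcc:
        return locally_all_visible
    # With Global MVCC: additionally, a page touched by a transaction
    # newer than local_xmin forces a table lookup.
    return locally_all_visible and not (max_trx_id > local_xmin)
```

A page with `max_trx_id=90`, read with `m_up_limit_id=100` and `local_xmin=80`, passed the old check but now needs a table lookup, which is where the slightly higher lookup ratio comes from.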
04 Q&A
Q1: In what scenarios should Global MVCC be enabled?
A1: Global MVCC is particularly beneficial in scenarios where data consistency is a high priority, such as in finance and securities. Since enabling this feature can lead to a performance decrease of approximately 5% - 10%, it's important to weigh the need for this level of consistency against the potential impact on performance. Deciding whether to enable Global MVCC should be based on the specific requirements of your application scenario.
Q2: How can one try out Klustron?
A2: Those interested in Klustron can download a trial version from our official website and deploy it according to the installation guide. Additionally, we offer Klustron's serverless services on Amazon's marketplace and Alibaba Cloud, which are also available for trial.
We invite everyone to download and install the Klustron database cluster, which is free to use (no registration code required).
Download the complete Klustron software package here:
http://downloads.klustron.com/
For purchases, please reach out to us at sales_vip@klustron.com. For further inquiries, our assistant is available on WeChat for support.