Principles and Defects of the Classic Two-Phase Commit Algorithm
Background
In the classic two-phase commit algorithm, node failures during transaction commit can cause errors that result in data loss or incorrect results for users. These errors include:
1. Some branches of a distributed transaction are committed while the others are rolled back.
2. The client receives a response indicating that the transaction was committed successfully, but all branches of the distributed transaction are rolled back.
3. The client receives a response indicating that the transaction was rolled back, but some or all branches of the distributed transaction have been committed.
4. During recovery from a storage node failure, a transaction branch on that storage node cannot be restored correctly.
The storage node itself is responsible for handling errors of type #4, while the distributed transaction processing mechanism handles the first three types. The author will discuss this further in the next article.
The author previously gave a technical presentation on handling the #4 error at FOSDEM 2021: https://fosdem.org/2021/schedule/event/mysql_xa/. The video is available at: Klustron Distributed Database MySQL XA Disaster Recovery Technology for Transaction Processing (https://b23.tv/h7zzmR). More detailed articles will follow.
Principles of the Classic Two-Phase Commit Algorithm
The two-phase commit algorithm divides transaction commit into two phases: prepare and commit.
In the first phase, the global transaction manager (GTM) sends a prepare command to all resource managers (RMs). Each RM prepares its local branch of the distributed transaction by flushing its WAL to durable storage, so that even if the RM crashes, the prepared branch can still be committed (or rolled back) after recovery.
Once prepared, a transaction branch enters the prepared state, in which it can be either committed or rolled back.
If the GTM receives successful responses from all RMs, it sends a commit command to each participating RM; each RM then commits its prepared transaction branch, completing the two-phase commit. If any RM fails to prepare, the GTM instead instructs every RM to roll back its branch.
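The flow above can be sketched as follows. This is a minimal illustration of the classic protocol, not Klustron code; `ResourceManager`, `two_phase_commit`, and all method names are hypothetical stand-ins.

```python
class ResourceManager:
    """A hypothetical RM holding one branch of a distributed transaction."""

    def __init__(self, name):
        self.name = name
        self.state = "active"

    def prepare(self):
        # Phase 1: flush the WAL so the branch survives a crash, then vote.
        self.state = "prepared"
        return True

    def commit(self):
        # Phase 2: commit the prepared branch.
        assert self.state == "prepared"
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(rms):
    # Phase 1: the GTM sends prepare to every RM and collects votes.
    if all(rm.prepare() for rm in rms):
        # Phase 2: every RM voted yes, so send commit to each one.
        for rm in rms:
            rm.commit()
        return "committed"
    # Any failed vote forces a global rollback.
    for rm in rms:
        rm.rollback()
    return "rolled_back"


rms = [ResourceManager("shard1"), ResourceManager("shard2")]
print(two_phase_commit(rms))  # committed
```

The key property is that once every RM has durably prepared, the commit decision can be carried out even if an RM crashes and restarts in between.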
Defects of the Classic Two-Phase Commit Algorithm
If a failure occurs during the two-phase commit, for example the GTM or an RM crashes, the commit process may be interrupted and unable to complete correctly. Prepared transaction branches are then left in doubt: a branch cannot unilaterally commit (the GTM may have decided to abort) or roll back (the GTM may have decided to commit), and it may hold locks and resources indefinitely.
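The classic blocking defect can be shown with a small self-contained simulation (hypothetical names, not Klustron code): the GTM crashes after every branch has prepared but before any decision is sent, leaving all branches stuck in the prepared state.

```python
class Branch:
    """A hypothetical transaction branch on one storage node."""

    def __init__(self, name):
        self.name = name
        self.state = "active"

    def prepare(self):
        # WAL flushed; the branch now holds its locks until a decision arrives.
        self.state = "prepared"


def crash_after_prepare(branches):
    for b in branches:
        b.prepare()
    # Simulated GTM crash: phase 1 finished, no commit/rollback ever sent.
    raise RuntimeError("GTM crashed after phase 1")


branches = [Branch("shard1"), Branch("shard2")]
try:
    crash_after_prepare(branches)
except RuntimeError:
    pass

# Each branch is now in doubt: it cannot decide its own fate without
# knowing what the GTM intended, so it stays prepared indefinitely.
print([b.state for b in branches])  # ['prepared', 'prepared']
```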
How to optimize and avoid its defects?
To avoid such failures, Klustron's distributed transaction processing mechanism builds on the classic two-phase commit algorithm and enhances its resilience and error handling.
This ensures that node failures, network problems, timeouts, and similar faults in a Klustron cluster never cause data inconsistency or data loss.
A Klustron cluster consists of the following components:
- Multiple independent and functionally identical computing nodes, which perform distributed transaction processing and distributed query processing (discussed in the next article).
- Multiple storage clusters that store user data shards; each storage cluster uses a high-availability mechanism to ensure that data is not lost when nodes crash.
- A metadata cluster, with the same structure as a storage cluster, that stores key metadata for the whole cluster, including the commit log mentioned in this article.
- A cluster_mgr module that maintains the cluster's operational status and handles prepared-state transaction branches left over from node failures.
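The recovery idea behind these components can be sketched as follows: cluster_mgr consults the commit log stored in the metadata cluster to decide the fate of prepared branches left behind by a failure. All names here are illustrative assumptions, not Klustron's actual interfaces, and the rule that a missing commit record implies rollback is one common design choice, not necessarily the exact one Klustron uses.

```python
# Commit log in the metadata cluster: global transaction id -> final decision.
# In a real system this would be a durable, replicated table.
commit_log = {
    "gtrx-1001": "commit",
    "gtrx-1002": "rollback",
}


def resolve_prepared_branch(gtrx_id):
    """Decide a leftover prepared branch from the logged global decision."""
    decision = commit_log.get(gtrx_id)
    if decision == "commit":
        # The decision was durably logged before phase 2 began, so the
        # branch must be driven to commit.
        return "commit branch"
    # No commit record (or an explicit rollback) means the transaction never
    # reached its commit point, so rolling the branch back is safe.
    return "rollback branch"


print(resolve_prepared_branch("gtrx-1001"))  # commit branch
print(resolve_prepared_branch("gtrx-9999"))  # rollback branch
```

Because the decision is recorded in the highly available metadata cluster before any commit command is sent, cluster_mgr can always resolve in-doubt branches the same way the crashed GTM would have.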
With these modules working in concert, the defects of the classic two-phase commit algorithm are effectively avoided.
Conclusion
In summary, we have optimized the classic two-phase commit process to solve these problems and achieve strong disaster recovery capability. We will discuss the principles and details of these optimizations in the next article; comments and discussion are welcome.