Klustron system architecture
Klustron system architecture
01 Preface
Klustron Kunlun Distributed Database Cluster (hereinafter referred to as Klustron) is a distributed relational database management system, oriented to TB and PB level massive data processing, with high throughput and low latency to process massive data and high concurrent read and write requests.
It provides robust transaction ACID guarantee, efficient and easy-to-use distributed query processing, high scalability, high availability and transparent sub-database and sub-table data processing functions, business layer and end-user unaware horizontal expansion capabilities, is a typical NewSQL distributed database system.
Application software developers can use Kunlun database in the same way as single-node relational databases, and can get all the advantages of the NewSQL database mentioned above, without having to consider storage details such as data partitioning methods.
In this way, application developers can quickly develop robust, reliable, highly available and highly scalable information systems to process PB-level massive data. All the challenges and difficulties of massive data management are solved by the Kunlun system, which greatly reduces the time cost, capital cost and technical difficulty of developing distributed systems, and comprehensively improves its product quality, greatly speeding up the process of application development and change. Go-live progress.
02 Architecture
[Architecture diagram: A Klustron cluster consists of compute nodes, storage shards, and a metadata cluster]
A Klustron cluster consists of three types of components: one or more computing nodes, one or more storage shards, a metadata cluster, and cluster management tools.
2.1 Compute Node
Klustron's computing node (Klustron-server) contains metadata for each data table and other database objects (indexes, views, materialized views, sequences, stored procedures/functions, users/roles and privileges, etc.), but does not store user data ( Therefore, it can be increased or decreased as needed at any time), and user data is stored in the storage shard.
Users can increase or decrease computing nodes according to the workload. Each computing node is equal and independent to each other, without dependencies, and can handle user connections and read and write requests.
The computing node supports the PostgreSQL connection protocol and the MySQL connection protocol to receive and verify the user's connection request. After the verification is passed, it receives and processes the query sent by the connection and returns the result to the client.
When executing a SQL, the computing node parses the statement, then performs distributed query optimization on it, and then completes the distributed query execution by interacting with the back-end storage shard. The interactive method is to generate SQL statements for the relevant back-end storage shards according to the needs of SQL statements and the distribution information of data table partitions in the back-end shards.
Computing nodes will concurrently send SQL statements to the storage cluster. If the executed SQL statement is SELECT or INSERT/DELETE/UPDATE...RETURNING instead of simple INSERT/DELETE/UPDATE, then the computing node will also receive the storage cluster after sending the statement The returned partial results are combined and processed with the results returned by all back-end storage shards to form the final query results and returned to the client.
The computing node Klustron-server of Klustron is developed based on PostgreSQL. The current version 1.0/1.1 is based on PostgreSQL-11.5 and will be upgraded to a newer version of PostgreSQL in the future. In order to realize various core functions of Klustron, such as automatic DDL synchronization and replication, distributed transaction processing, distributed query processing, global deadlock processing, parallel query processing and other advanced functions, we have added and modified PostgreSQL source code A large amount of code, the self-developed code of each module and component of Klustron exceeds 1 million lines.
Our code is well modularized so that we can continue to follow PostgreSQL version updates in the future.
2.2 Storage cluster (shard)
Klustron-storage is the storage node of Klustron, which is the MySQL branch we developed based on the deep optimization of percona-server-8.0. Users must use Klustron-storage software to build Klustron storage clusters and metadata clusters, because the key functions required by Klustron clusters only exist in Klustron-storage, and it also includes all disaster recovery defects of the community version of MySQL-8.0 XA transaction processing repair; Finally, Klustron-storage has a significant performance optimization in XA transaction processing compared with the community version of MySQL.
Each storage shard stores a part of user tables or table partitions, and the user data subsets of each shard have no intersection; each storage shard is a binlog replication cluster, through the standard MySQL row based binlog replication and Klustron's unique Fullsync strong synchronization mechanism To achieve financial-level data consistency and ensure RPO = 0; at the same time, Klustron's self-developed high-availability mechanism Fullsync HA can monitor the running status of the master node of each shard and automatically complete the master node election and master-standby switchover when a master node failure is found. Make sure the RTO is less than 30 seconds.
The master node of a shard accepts read and write requests from computing nodes, executes the requests and returns the results to the computing nodes; when the standby machine read function is enabled, the standby node of the shard can receive and process read-only requests from computing nodes.
Users can increase or decrease storage shards according to the increase or decrease of data volume, and the data will be evenly distributed to all shards automatically, so as to achieve automatic and transparent high scalability.
2.3 Metadata Cluster
The metadata cluster is also a binlog replication cluster composed of Klustron-storage instances, and is also based on Fullsync and Fullsync HA technology to achieve strong consistency and high availability. A metadata cluster stores the metadata of a Klustron cluster and can be shared by multiple database clusters. The metadata cluster stores the topology of these Klustron clusters, node connection information, DDL logs, commit logs, and other cluster management logs.
2.4 cluster_mgr cluster
The cluster_mgr cluster of Klustron is responsible for maintaining the correct cluster and node status, realizing functions such as cluster management, cluster logical backup and recovery , cluster physical backup and recovery , and horizontal elastic scaling . All these functions are provided to users in the form of API, which is convenient for users to call in scripts or various software. At the same time, the XPanel DBA GUI tool software that comes with Klustron realizes its functions by calling the cluster_mgr API .
The cluster_mgr high-availability cluster is based on raft technology, which ensures that after the master node goes down, the cluster standby node can immediately discover and elect a new master node to provide cluster management. DBA can manage the cluster_mgr cluster and complete the advanced cluster_mgr cluster management functions.
A cluster_mgr cluster is bound to a metadata cluster and works closely with it to provide services for one or more Klustron clusters. Binding here means that a cluster_mgr cluster uses a fixed metadata cluster, and a metadata cluster can only be used by the same cluster_mgr cluster, and the two are one-to-one.
2.5 Auxiliary tools
1、XPanel
Klustron provides XPanel GUI tool software, allowing DBAs to easily complete all database operation and maintenance management tasks by clicking the mouse.
2. Prometheus and Grafana monitor the running status of cluster nodes
Klustron automatically installs prometheus and Grafana during bootstrap so that DBA can use Grafana to monitor the real-time running status of each node of the Klustron cluster.
3. ElasticSearch + Kibana collects and retrieves cluster node logs
If the user installs ElasticSearch and sets its access mode during bootstrap, related modules of Klustron will also automatically cooperate with related modules of ES, so that ES can collect the running logs of all nodes of the Klustron cluster, which is convenient for users to retrieve and view these nodes through Kibana log.
4. Klustron has a complete set of data import and export tools
These tools allow users to easily complete static data migration with full data backups of MySQL and PostgreSQL database instances, or complete hot data migration and replication with their full data backups plus streaming incremental data updates. At the same time, Klustron can cooperate with common third-party data migration tools to complete static full data migration and dynamic hot data migration from common database systems (such as Oracle, SQL Server, TiDB, etc.).