
Klustron Core Capabilities

The previous chapter introduced Klustron's architecture; this chapter introduces Klustron's core capabilities.

01 High availability

High availability technology ensures that data can still be written to a storage cluster after its primary node fails. Each standby node continuously receives the primary node's data changes through primary-standby replication; when the primary fails, a new primary is elected and the remaining standbys continue replicating data updates from it.

A Klustron storage cluster (storage shard) is a replication cluster of 2N+1 Klustron-storage nodes. Data changes on the primary node are shipped to the standby nodes through binlog replication, and each standby replays the change events, so primary and standby data converge. Klustron's unique fullsync technology guarantees that for any transaction committed or prepared on the primary node, success is returned to the corresponding computing node only after the specified number of standbys have received the transaction's binlog and flushed it to their relay logs. By default, for a shard with 2N+1 nodes, every committed transaction must be replicated to at least N standbys. This ensures that whenever the primary node fails, a standby holding all committed and prepared transactions can always be elected as the new primary to take over the write load.
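This commit rule is essentially a quorum check. Below is a minimal conceptual sketch of that rule in Python, not Klustron's actual implementation; `replicate_and_flush` is a hypothetical standby method standing in for binlog shipping plus relay-log flush:

```python
# Conceptual sketch of the fullsync commit rule for a 2N+1 node shard;
# `replicate_and_flush` is a hypothetical method, not a real API.
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def fullsync_commit(binlog_event, standbys, required_acks):
    """Return success only after `required_acks` standbys have received
    the binlog event and flushed it to their relay logs."""
    pool = ThreadPoolExecutor(max_workers=len(standbys))
    pending = {pool.submit(s.replicate_and_flush, binlog_event)
               for s in standbys}
    acks = 0
    while pending and acks < required_acks:
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        acks += sum(1 for f in done if f.result())  # count successful flushes
    pool.shutdown(wait=False)  # remaining standbys catch up asynchronously
    return acks >= required_acks

# For a 5-node shard (N = 2), success needs 2 standby acks, so any
# elected successor already holds every committed/prepared transaction.
```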

After the primary node fails, the computing nodes automatically switch to the storage cluster's new primary node (auto failover), with no manual intervention required.

For details, see Klustron Fullsync Technology Introduction and Klustron Fullsync HA Technology Introduction.

02 Indestructible OLTP transaction processing capability

Klustron's transaction processing is crash safe and fault tolerant, ensuring that no node or network failure in the cluster stops the cluster's services or loses user data. Specifically, a shard with 2*N+1 nodes tolerates N simultaneous node failures without losing write availability or committed user data. A computing node failure loses no user data or system metadata and causes no data inconsistency; its only effect is that the application's client connections are dropped, rolling back the transactions in progress. As long as one computing node is running, the system can still serve reads and writes normally.

Klustron uses distributed transaction two-phase commit to guarantee the ACID properties of distributed transactions initiated by clients even across node and network failures: the data updates of a committed transaction are never lost, damaged, or left inconsistent.
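The coordinator side of two-phase commit can be sketched as follows. The `prepare`/`commit`/`rollback` shard handles are hypothetical stand-ins, and the real implementation also durably logs the commit decision for crash recovery:

```python
# Minimal two-phase commit coordinator sketch (illustrative only).

def two_phase_commit(txn_id, shards):
    # Phase 1: ask every participating shard to prepare, i.e. make the
    # transaction durable without committing it yet.
    prepared = []
    for shard in shards:
        if shard.prepare(txn_id):
            prepared.append(shard)
        else:
            # Any failed prepare aborts the whole global transaction.
            for p in prepared:
                p.rollback(txn_id)
            return False
    # Phase 2: every shard prepared, so the decision is commit. After
    # this point a crash is handled by re-sending commit on recovery.
    for shard in shards:
        shard.commit(txn_id)
    return True
```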

A Klustron cluster can contain any number of computing nodes, and computing nodes can be added or removed at any time as needed. When a client connects to any computing node and executes a DDL statement, the statement is automatically executed on all other computing nodes in the cluster, and every computing node executes all DDL statements in the same order, so all computing nodes always hold exactly the same metadata. Each such DDL transaction involves the computing nodes, the metadata cluster, and the storage clusters, and Klustron's fault-tolerance technology ensures that the failure of a computing node or any other component or node in the cluster does not break DDL execution or its replicated replay.
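One way to picture the same-order guarantee is that every computing node replays a single, totally ordered DDL log kept in the metadata cluster. The sketch below assumes a hypothetical log/catalog interface; the real mechanism is internal to Klustron:

```python
# Sketch: each compute node applies DDL from one totally ordered log,
# so all nodes converge on identical metadata. The log and catalog
# interfaces used here are hypothetical.

def replay_ddl_log(metadata_log, local_catalog):
    pos = local_catalog.last_applied_position()
    while True:
        entry = metadata_log.next_entry(pos)  # blocks until new DDL arrives
        local_catalog.execute(entry.ddl_sql)  # apply strictly in log order
        local_catalog.set_last_applied_position(entry.position)
        pos = entry.position
```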

03 Horizontal scalability

As user data volume, read/write access load, and the number of concurrent connections grow, the load on the database system keeps increasing. A single-node database (such as community MySQL or PostgreSQL, or a traditional commercial database) can only cope by moving to more powerful hardware, but hardware technology puts a hard limit on the total CPU and memory of one server: as of 2022, an x86 server with 128 cores and 1TB of memory is already top-of-the-line. Application load can far exceed what such a server can process, so database performance degrades as load grows until it no longer meets the business's minimum requirements. For single-node databases this is an insurmountable obstacle. Even single-writer database services in public clouds hit the same ceiling: their write capacity is bounded by the hardware resources of a single database server and cannot be expanded further.

Klustron scales horizontally and smoothly: every computing node and storage cluster can serve reads and writes, so data read/write capacity and transaction throughput grow continuously. A DBA only needs to add more servers to the Klustron cluster; Klustron automatically expands onto the new servers without downtime and transparently to users, letting them share the storage and read/write load equally with the original servers, so the cluster as a whole can carry a larger access load and store more data.

3.1 Flexible sharding methods

The basic requirement for horizontal scaling is that very large user tables be defined as partitioned tables. We believe that professional application architects, programmers, and DBAs understand the data distribution and access patterns of every table in their business systems and can design a reasonable partitioning scheme for each table accordingly, so Klustron gives users full control over how tables are partitioned (sharded).

Klustron currently supports hash, range, and list sharding. These three partitioning methods work exactly like the PostgreSQL table partitioning scheme in both function and usage, so they are not repeated here.
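Because the syntax matches PostgreSQL declarative partitioning, a hash-partitioned table can be created through any computing node with an ordinary PostgreSQL client. The connection parameters and table definition below are illustrative:

```python
# Create a hash-partitioned table via a Klustron compute node, which
# speaks the PostgreSQL protocol. Host/user/table names are examples.
import psycopg2

conn = psycopg2.connect(host="compute-node-1", port=5432,
                        dbname="postgres", user="abc", password="abc")
conn.autocommit = True
cur = conn.cursor()

# Standard PostgreSQL declarative partitioning syntax.
cur.execute("""
    CREATE TABLE orders (
        order_id   bigint,
        created_at date,
        amount     numeric
    ) PARTITION BY HASH (order_id)
""")
# The partition count is chosen per table; four partitions here.
for r in range(4):
    cur.execute(f"""
        CREATE TABLE orders_p{r} PARTITION OF orders
        FOR VALUES WITH (MODULUS 4, REMAINDER {r})
    """)
```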

Mirror tables are supported from version 1.1 and are used to speed up read-only queries. A table is a good candidate for a Mirror table when it is small (for example, under 1GB), rarely updated (for example, a few hundred updates per day), and frequently joined with other tables. A Mirror table is stored in every storage shard, and each update to it modifies the copy on every shard within one global transaction. When storage clusters are added, Klustron automatically copies all Mirror tables to them. Click to learn more about Mirror.

Designing a table's partition scheme mainly involves:

  • Choosing the partitioning method
    • Single table: no partitioning
    • Mirror table: one copy per shard
    • hash/range/list partitioning
      • Selecting the sharding columns: any number of columns
      • Setting the method-specific partition parameters

Users choose the number of partitions based on each table's size: there is no global fixed partition count to estimate, no requirement that all tables use the same number of partitions, and each table's partition count can differ. For range and list partitioning, users can add partitions to each table as needed, as the sketch below shows. A well-designed partitioning scheme optimizes the system's data read/write and transaction processing performance.
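For instance, a range-partitioned table can grow one partition at a time as new data arrives, again with standard PostgreSQL syntax (reusing the cursor from the previous sketch; table names are illustrative):

```python
# A range-partitioned table that gains one partition per month.
cur.execute("""
    CREATE TABLE sales (
        sale_id bigint,
        sold_on date,
        amount  numeric
    ) PARTITION BY RANGE (sold_on)
""")
cur.execute("""
    CREATE TABLE sales_2024_01 PARTITION OF sales
        FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
""")
# Later, just before February data starts arriving:
cur.execute("""
    CREATE TABLE sales_2024_02 PARTITION OF sales
        FOR VALUES FROM ('2024-02-01') TO ('2024-03-01')
""")
```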

The computing nodes automatically pick a suitable storage cluster for each single table or table partition so that all storage clusters carry a balanced load; users can also explicitly specify the storage cluster for a table when creating it.

3.2 Multi-point reads and writes

Users can add or remove computing nodes and/or storage clusters (storage shards) as computing and storage needs change, either through the XPanel GUI or by calling the cluster_mgr API.

Each computing node accepts user connections, processes the SQL requests received on those connections, and executes data reads and writes; each storage cluster stores a portion of the total user data and serves read and write requests for it.
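Since every computing node accepts ordinary PostgreSQL connections, an application can spread its sessions across computing nodes with any standard client. A minimal sketch (hostnames and credentials are illustrative, and the `orders` table comes from the earlier example):

```python
# Two sessions on two different compute nodes of the same cluster;
# both can read and write the same tables.
import psycopg2

conn_a = psycopg2.connect(host="compute-node-1", dbname="postgres",
                          user="abc", password="abc")
conn_b = psycopg2.connect(host="compute-node-2", dbname="postgres",
                          user="abc", password="abc")

with conn_a, conn_a.cursor() as cur:
    cur.execute("INSERT INTO orders (order_id, created_at, amount) "
                "VALUES (1, '2024-01-15', 99.50)")

with conn_b, conn_b.cursor() as cur:
    cur.execute("SELECT count(*) FROM orders")
    print(cur.fetchone()[0])  # the committed row is visible on any node
```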

Klustron also supports read-write separation: a computing node can read data from a storage shard's standby nodes, which is suitable for read-only queries. Especially in OLAP scenarios, this avoids the interference with OLTP performance that reading from the primary node would cause.

04 OLAP data analysis capability

Klustron's OLAP capability complements and extends the columnar data warehouses commonly used in the industry for BI and data mining. Klustron inherits PostgreSQL's complete data analysis functionality, and on top of it we developed global multi-level parallel query processing deep in the kernel, so Klustron can apply a large number of database server resources to OLAP data analysis. In the post-Moore era, this makes full use of abundant computing resources for high-performance data processing and computation.

Klustron's application scenarios differ significantly from those of traditional columnar data warehouses, mainly in two ways:

1. No ETL needed, analyze the latest data directly

Klustron requires no ETL: it analyzes the OLTP business systems' latest data directly, so analysis results reflect the current state of the user's business. With read-write splitting, OLAP workloads do not impact OLTP workloads.

2. Continuously aggregate multi-source data

Users can continuously stream data changes from the databases of multiple other business systems into one Klustron cluster, where the latest business data of all those systems can be analyzed together to discover patterns and insights. Over such a long-running aggregation process, node and network failures are unavoidable; because Klustron supports transaction processing, node failures do not leave the aggregated data inconsistent, and under certain conditions streaming can resume without data loss.

Klustron's global multi-level parallel query processing technology lets a single SQL query statement execute with three levels of parallelism simultaneously:

  • Compute Node Layer Parallelism
  • Parallelism Between Compute Nodes and Storage Nodes
  • Parallelism at the Storage Node Layer

Klustron's OLAP performance is improving continuously; here is the Klustron 1.1 TPC-H and TPC-DS performance report.

END