2. Best Practices for Peer-to-Peer Deployment
01 Understanding Peer-to-Peer Deployment
Klustron is a distributed database that separates computing from storage, supports massive data processing, and removes the complexity that traditional databases incur when data is manually split across multiple databases and tables.
The core components of Klustron, shown in the figure below, are the computing engine and the storage engine.
The computing engine is mainly responsible for data processing and calculation and is CPU-intensive, while the storage engine is IO-intensive. In a production deployment, the computing engine and storage engine can be deployed on the same server to use its resources efficiently, while high reliability is achieved through component redundancy across nodes.
A typical peer-to-peer deployment structure is shown in the figure below:
A Klustron cluster places no limit on the number of servers. The peer-to-peer deployment architecture is characterized by computing nodes and storage nodes coexisting on the same physical server: each server runs both a computing node and a storage node.
For multi-copy shards, the master and slave copies of a shard must not be located on the same server; they should be distributed evenly across different servers.
1.1 Klustron-server (computing node)
The computing node runs a stateless computing and query engine that interacts with the storage nodes to execute SQL tasks, using an asynchronous communication mode. A computing node can interact not only with the storage node on its own machine, but also with storage nodes on other servers over the network to exchange and process data.
Client applications can connect to any computing node to perform data processing tasks of the Klustron cluster.
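Because any computing node can serve a client, an application can spread its connections across them. The following is a minimal sketch of client-side round-robin selection; the host names and port are hypothetical placeholders, and the actual connection call depends on the client driver you use, which is not shown here.

```python
from itertools import cycle

# Hypothetical computing-node endpoints (one per server); replace with your own.
COMPUTE_NODES = [
    ("server1.example.internal", 47001),
    ("server2.example.internal", 47001),
    ("server3.example.internal", 47001),
]


def endpoint_picker(nodes):
    """Return a function that hands out endpoints in round-robin order."""
    if not nodes:
        raise ValueError("at least one computing node is required")
    ring = cycle(nodes)
    return lambda: next(ring)


pick = endpoint_picker(COMPUTE_NODES)
# Four consecutive picks wrap around to the first node again.
first_four = [pick() for _ in range(4)]
```

A load balancer in front of the computing nodes would achieve the same effect without any client-side logic; the sketch only illustrates that no computing node is special.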
1.2 Klustron-storage (storage node)
A storage node hosts copies of multiple storage shards. The master and slave copies of the shards are distributed evenly across the available servers, so the storage node on each server consists of the master copy of one shard and the slave copies of other shards.
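One way to picture this layout is a rotating assignment: shard i places its master on server i and its slaves on the servers that follow in ring order. The sketch below is only an illustration of that idea, not Klustron's own placement logic; the function name, server names, and shard names are hypothetical.

```python
def plan_placement(servers, num_shards, replicas_per_shard):
    """Assign shard copies to servers in a rotating layout.

    Shard i puts its master on servers[i] and each slave on the next
    server in ring order, so no two copies of a shard share a server.
    """
    if num_shards > len(servers):
        raise ValueError("more shards than servers: a server would host two masters")
    if replicas_per_shard > len(servers):
        raise ValueError("more copies than servers: two copies would share a server")

    layout = {server: [] for server in servers}
    for shard in range(num_shards):
        for replica in range(replicas_per_shard):
            server = servers[(shard + replica) % len(servers)]
            role = "master" if replica == 0 else "slave"
            layout[server].append((f"shard{shard + 1}", role))
    return layout


# Three servers, three shards, three copies per shard.
layout = plan_placement(["server1", "server2", "server3"], 3, 3)
for server, copies in layout.items():
    print(server, copies)
```

With three servers, three shards, and three copies per shard, every server ends up with one master and two slaves, which matches the description above.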
Benefits of peer-to-peer deployment:
- Efficient use of server hardware resources
- Components on different servers back each other up, improving system reliability
- Easy and flexible scaling
02 Peer-to-Peer Deployment Guide
2.1 Resource preparation: In the peer-to-peer deployment scheme, all servers should have the same hardware configuration and the same operating system and version. A network bandwidth of at least 1 Gbps between servers is recommended.
2.2 Each server runs a computing node.
2.3 The number of shards in the cluster should not exceed the number of servers, and each server can run the master node of only one shard of the cluster. The master and slave replicas of a shard should use the same parameter configuration. The parameter configuration of different shards can be customized according to business requirements.
2.4 The number of copies of each shard can be defined according to business needs. The Klustron cluster has no limit on the number of copies, but considering reliability, it is recommended that each shard have at least 3 copies, and the number of copies should not exceed the number of available machines.
2.5 When configuring the cluster, make sure that the master and slave replicas of a shard do not run on the same server.
2.6 Management components and other components:
- Metadata cluster: A high-availability deployment with one master and two slaves is recommended. It uses few resources, so you can choose which servers host it according to the current network conditions.
- Cluster manager: A high-availability deployment with one master and two slaves is recommended. It uses few resources, so you can choose which servers host it according to the current network conditions.
- Node Manager: One deployed per server.
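The placement rules in 2.3 to 2.5 can be checked mechanically before installation. The sketch below assumes a hypothetical plan format (a dict mapping each shard name to the list of servers hosting its copies, master first); it is not a Klustron configuration format or tool.

```python
from collections import Counter


def check_plan(servers, shards):
    """Return a list of rule violations for a peer-to-peer deployment plan.

    servers: list of server names.
    shards:  dict of shard name -> list of server names hosting its copies;
             the first entry hosts the master, the rest host slaves.
    """
    problems = []
    if len(shards) > len(servers):
        problems.append("more shards than servers (2.3)")

    # 2.3: each server may host the master of at most one shard.
    masters = Counter(copies[0] for copies in shards.values() if copies)
    for server, count in masters.items():
        if count > 1:
            problems.append(f"{server} hosts {count} shard masters (2.3)")

    for name, copies in shards.items():
        if len(copies) < 3:
            problems.append(f"{name} has fewer than 3 copies (2.4)")
        if len(copies) > len(servers):
            problems.append(f"{name} has more copies than servers (2.4)")
        if len(set(copies)) != len(copies):
            problems.append(f"{name} places two copies on one server (2.5)")
    return problems


SERVERS = ["server1", "server2", "server3"]
good_plan = {
    "shard1": ["server1", "server2", "server3"],
    "shard2": ["server2", "server3", "server1"],
    "shard3": ["server3", "server1", "server2"],
}
bad_plan = {
    "shard1": ["server1", "server1", "server2"],  # two copies on server1
    "shard2": ["server1", "server2", "server3"],  # second master on server1
}
good_issues = check_plan(SERVERS, good_plan)
bad_issues = check_plan(SERVERS, bad_plan)
```

Returning a list of messages rather than raising on the first violation lets a plan be reviewed in one pass. The three-copy threshold reflects the recommendation in 2.4 and could be relaxed to a warning.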
03 Installation and deployment process
For the detailed installation process for peer-to-peer deployment, refer to the Klustron Quick Start Installation Guide. Klustron 1.0 will support peer-to-peer installation through a web UI, making installation easier.
Refer to the table below for server resource planning for peer-to-peer deployment (take three nodes as an example)
04 Peer-to-peer deployment server configuration requirements
As a distributed database, the Klustron cluster has relatively low server configuration requirements. The server configuration of the entry-level Klustron cluster is as follows:
Klustron entry-level machine configuration:
- 3 servers: Amazon m5.4xlarge (CPU: 8 cores / 16 threads; memory: 64 GB; storage: gp3 general-purpose SSD volume, 3000 IOPS, 125 MB/s throughput; network bandwidth between nodes: 10 Gbps).
- Database software: Klustron 0.9.1.
- Deployment architecture: peer-to-peer deployment with 3 shards, each shard having 3 copies.