Zetuo Kunlun Klustron Typical Configuration Guide

01 Key Considerations for Cluster Configuration and Usage

  • Resource Preparation: In the cluster deployment plan, all servers should have the same hardware configuration and the same operating system and version. For production deployments, 10-Gigabit network bandwidth between servers is recommended, and the storage devices should ideally be PCIe NVMe SSDs of 512GB or more. It is best to install two such devices on each server: one for binlog files and storage engine redo log files, and the other for data files. These two paths can be set during Storage Node installation. The CPU should have at least 4 cores (32 cores recommended); memory should be at least 16GB (128GB recommended).

  • Running Compute Nodes: Each server should run one Compute Node. For OLTP loads, prefer the Compute Node on the same server as a storage shard's primary node; use the other Compute Nodes for OLAP (reporting, analysis) loads, and for OLTP loads when the preferred Compute Node's resources are tight.

  • Storage Shard Configuration: The number of Storage Shards in the cluster should not exceed the number of servers, and each server should run only one primary node of a shard in the cluster. The configuration parameters for the primary and replica nodes of each shard should be consistent. The parameters of different shards can be customized according to business needs. Multiple nodes of the same shard should not be deployed on the same server.

  • Shard Replica Configuration: The number of replicas for each shard can be defined according to business needs. Klustron cluster does not limit the number of replicas, but for reliability, it is recommended to have at least 3 replicas per shard, which means 1 primary node and 2 replica nodes. The number of shard replicas should not exceed the number of available machines.

  • Mixed Deployment: In actual production deployments, a Compute Node and several Storage Nodes can be deployed on the same server to save costs and use resources efficiently. In Klustron's mixed and peer-to-peer deployment architectures, Compute Nodes and Storage Nodes coexist on the same physical server: each server runs both a Compute Node and a Storage Node. This is feasible for users with limited hardware resources and low load, but the following principles must be followed:

    • Multiple nodes of the same shard should not be on the same server but should be staggered across multiple servers.
    • For peer-to-peer deployment, the Klustron database automatically assigns shard nodes to servers during cluster creation without manual intervention.
    • For mixed deployment, refer to the deployment diagram in this document and use XPanel for deployment. The general principle is: deploy one Compute Node per server; deploy the primary node of each shard on one server, and this server should not deploy other Storage Nodes; deploy the R replica nodes of each shard staggered across other servers, ensuring that no two nodes of the same shard exist on the same server. Therefore, given 2 * N servers, deploy N shards, with the primary nodes of these N shards on N servers, each server deploying one primary node; the other N servers deploy the R * N replica nodes of these N shards, each server deploying R replica nodes of different shards. Here, R is the number of replicas per shard, and it must be ensured that 2 <= R <= N.
  • High Load and Budget Deployment: For users with high read/write access loads and sufficient budget, it is recommended to deploy one Compute Node and one Storage Node (primary or replica node) per server to achieve optimal performance. If there are N shards and R replicas per shard, then the cluster will need (R + 1) * N servers in total.

  • Metadata Cluster Deployment: The metadata cluster can be deployed on any subset of the servers, or on all of them, as long as it spans at least 3 servers. The Cluster_mgr cluster should likewise be deployed on an odd number (2n + 1) of servers.
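The placement rules above can be sketched programmatically. The following is a minimal illustration only, not part of Klustron or XPanel tooling: it places N shard primaries on the first N of 2 * N servers and staggers the R * N replicas over the other N servers, as the mixed-deployment principle requires. The server names and the round-robin replica offset are assumptions made for the sketch.

```python
# Sketch of the mixed-deployment placement rule (hypothetical helper,
# for illustration only -- not a Klustron or XPanel API).

def plan_mixed_deployment(n_shards: int, replicas_per_shard: int) -> dict:
    """Place N shard primaries on the first N of 2*N servers (one each),
    and stagger the R*N replicas over the other N servers so that no two
    nodes of the same shard share a server. Requires 2 <= R <= N."""
    n, r = n_shards, replicas_per_shard
    if not 2 <= r <= n:
        raise ValueError("mixed deployment requires 2 <= R <= N")
    servers = {f"server-{i}": [] for i in range(2 * n)}
    for shard in range(n):  # primaries: one per server, no other storage nodes
        servers[f"server-{shard}"].append(f"M.{shard + 1}")
    for shard in range(n):  # replicas: R per replica server, shards staggered
        for rep in range(r):
            # Offset by shard so a server never gets the same shard twice.
            host = n + (shard + rep) % n
            servers[f"server-{host}"].append(f"R.{shard + 1}.{rep + 1}")
    return servers

def dedicated_servers(n_shards: int, replicas_per_shard: int) -> int:
    """High-load layout from the guide: one Compute Node plus one Storage
    Node per server needs (R + 1) * N servers for N shards of R replicas."""
    return (replicas_per_shard + 1) * n_shards
```

For example, `plan_mixed_deployment(3, 2)` spreads three shards over six servers with two replica nodes of different shards per replica server, and `dedicated_servers(3, 2)` returns 9, matching the (R + 1) * N rule for the high-load layout.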

02 Operating Systems and CPU Architectures

2.1 Systems and Architectures

| Operating System Version | CPU Architecture (x86_64/AMD64/ARM64) |
| --- | --- |
| CentOS 7.5 and above | x86_64, ARM64 |
| CentOS 8 | x86_64, ARM64 |
| UnionTech Server OS V20 (1050a) | AMD64, ARM64 |
| UnionTech Server OS V20 (1050e) | AMD64, ARM64 |
| openSUSE 15 | x86_64 |
| Ubuntu 20.04 | x86_64 |
| Kylin v10 | x86_64, ARM64 |

TODO: Add ARM chips and corresponding OS combinations

2.2 Cluster Server Configurations

The peer-to-peer and hybrid deployment modes mentioned below require XPanel to complete cluster installation.

Scenario 1: Small Data Volume (<1TB), Hybrid Deployment on 3 Servers

| Quantity | CPU (cores) | Memory | Disk | Network | System and Architecture |
| --- | --- | --- | --- | --- | --- |
| 3 | 16 | 128GB | SSD, 1TB+, adjusted according to actual data volume | 10Gb NIC | See 2.1 |

Klustron Version: 1.3.1

Cluster Configuration:

| Component | Number Deployed | Deployment Architecture | Deployment Details |
| --- | --- | --- | --- |
| Compute Node | 3 | One compute node per server | Can be co-deployed with storage nodes |
| Storage Node | 3 | One storage node per server, one serving as the primary | Can be co-deployed with compute nodes |
| Shard | 1 | One primary and two replicas per shard | The shard's primary and two replicas are placed on different servers |

In the diagram below, CN1/CN2/CN3 refer to Compute Node 1/2/3; M.1 refers to the primary node of shard 1; R.1.1 refers to the first replica of shard 1, and R.1.2 refers to the second replica of shard 1. The following diagrams follow a similar notation.

[Image: 3-server simple deployment with 1 shard]

Scenario 2: Medium Data Volume (1-5TB), Low Write Environment, Hybrid Deployment on 3 Servers

| Quantity | CPU (cores) | Memory | Disk | Network | System and Architecture |
| --- | --- | --- | --- | --- | --- |
| 3 | 32 | 256GB | SSD, 1.5 - 3TB, adjusted according to actual data volume | 10Gb NIC | See 2.1 |

Klustron Version: 1.3.1

Cluster Configuration:

| Component | Number Deployed | Deployment Architecture | Deployment Details |
| --- | --- | --- | --- |
| Compute Node | 3 | One compute node per server | Can be co-deployed with storage nodes |
| Storage Node | 9 | Three storage nodes per server, peer-to-peer deployment | Can be co-deployed with compute nodes |
| Shard | 3 | One primary and two replicas per shard, peer-to-peer deployment | Each shard's primary and two replicas are placed on different servers |

Scenario 3: Medium Data Volume (1-5TB), High Write Environment, Hybrid Deployment on 4 Servers

| Quantity | CPU (cores) | Memory | Disk | Network | System and Architecture |
| --- | --- | --- | --- | --- | --- |
| 4 | 32 | 256GB | SSD, 1.5 - 4TB, adjusted according to actual data volume | 10Gb NIC | See 2.1 |

Klustron Version: 1.3.1

Cluster Configuration:

| Component | Number Deployed | Deployment Architecture | Deployment Details |
| --- | --- | --- | --- |
| Compute Node | 4 | One compute node per server | Can be co-deployed with storage nodes |
| Storage Node | 6 | 2 servers deploy one shard primary node each; the other 2 servers deploy two different shard replicas each | Can be co-deployed with compute nodes |
| Shard | 2 | One primary and two replicas per shard | Each shard's primary and two replicas are placed on different servers |

[Image: 4-server mixed deployment]

Scenario 4: Medium Data Volume (5-30TB), Low Write Environment, Hybrid Deployment on 6 Servers

| Quantity | CPU (cores) | Memory | Disk | Network | System and Architecture |
| --- | --- | --- | --- | --- | --- |
| 6 | 32 | 512GB | SSD, 3TB - 6TB, adjusted according to actual data volume | 10Gb NIC | See 2.1 |

Klustron Version: 1.3.1

Cluster Configuration:

| Component | Number Deployed | Deployment Architecture | Deployment Details |
| --- | --- | --- | --- |
| Compute Node | 6 | One compute node per server | Can be co-deployed with storage nodes |
| Storage Node | 18 | Three storage nodes per server, peer-to-peer deployment | Can be co-deployed with compute nodes |
| Shard | 6 | One primary and two replicas per shard, peer-to-peer deployment | Each shard's primary and two replicas are placed on different servers |

Scenario 5: Medium Data Volume (5-30TB), High Write Environment, Hybrid Deployment on 6 Servers

| Quantity | CPU (cores) | Memory | Disk | Network | System and Architecture |
| --- | --- | --- | --- | --- | --- |
| 6 | 32 | 1TB | SSD, 4TB - 15TB, adjusted according to actual data volume | 10Gb NIC | See 2.1 |

Klustron Version: 1.3.1

Cluster Configuration:

| Component | Number Deployed | Deployment Architecture | Deployment Details |
| --- | --- | --- | --- |
| Compute Node | 6 | One compute node per server | Can be co-deployed with storage nodes |
| Storage Node | 9 | 3 servers deploy one shard primary node each; the other 3 servers deploy two different shard replicas each | Can be co-deployed with compute nodes |
| Shard | 3 | One primary and two replicas per shard | Each shard's primary and two replicas are placed on different servers |

[Image: 6-server mixed deployment]

Scenario 6: Large Data Volume (30-600TB), Hybrid Deployment on 6-30 Servers. Larger volumes can be scaled proportionally, with no upper limit.

| Quantity | CPU (cores) | Memory | Disk | Network | System and Architecture |
| --- | --- | --- | --- | --- | --- |
| 6 - 30 | 64 | 1TB | SSD, 6 - 30TB+, adjusted as needed | 10Gb NIC | See 2.1 |

Klustron Version: 1.3.1

Cluster Configuration:

| Component | Number Deployed | Deployment Architecture | Deployment Details |
| --- | --- | --- | --- |
| Compute Node | 6 - 30 | One compute node per server | Can be co-deployed with storage nodes |
| Storage Node | 18 - 90 | Three storage nodes per server, peer-to-peer deployment | Can be co-deployed with compute nodes |
| Shard | 6 - 30 | One primary and two replicas per shard, peer-to-peer deployment | Each shard's primary and two replicas are placed on different servers |
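As a quick reference, the six scenarios above can be condensed into a small lookup. This is a sketch for illustration only: the function name is invented, and the thresholds and counts are copied directly from the scenario headings and tables; adjust them to your actual workload.

```python
# Sketch: map a data volume and write profile to the scenarios above.
# Illustrative helper only (not a Klustron API). Returns a tuple of
# (scenario, servers, shards, storage_nodes) taken from the tables.

def pick_scenario(data_tb: float, high_write: bool = False):
    if data_tb < 1:
        return ("Scenario 1", 3, 1, 3)
    if data_tb <= 5:
        if high_write:
            return ("Scenario 3", 4, 2, 6)
        return ("Scenario 2", 3, 3, 9)
    if data_tb <= 30:
        if high_write:
            return ("Scenario 5", 6, 3, 9)
        return ("Scenario 4", 6, 6, 18)
    # 30TB+: Scenario 6, scaled proportionally across 6-30 servers.
    return ("Scenario 6", None, None, None)
```

For example, a 3TB, write-heavy workload maps to Scenario 3 (4 servers, 2 shards, 6 storage nodes), while a 10TB, read-mostly workload maps to Scenario 4.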

END