Skip to main content

Klustron Sequence Functionality and Implementation Mechanism

KlustronAbout 3 min

Klustron Sequence Functionality and Implementation Mechanism

Klustron's computing nodes are derived from PostgreSQL, and therefore inherit the Sequence functionality of PostgreSQL. This article introduces the usage, use cases, and implementation of Klustron's Sequence.

Usage and Use Cases

Compared to MySQL's autoincrement function, Klustron's Sequence is more powerful and flexible in several aspects:

1. Klustron's Sequence has a many-to-many relationship with tables, while MySQL's autoincrement has a one-to-one relationship with tables.

Specifically, each table in Klustron can have any number of sequence columns that use the same or different sequences to generate sequence values, and each sequence can be used by any number of columns in any number of tables to generate ID values. In contrast, each table in MySQL can have at most one autoincrement column that can only be used by that table.

2. The initial value, maximum value, step size, range, and other properties of the sequence can be adjusted at any time, and the sequence will continue to generate new sequence values according to the new properties.

3. The sequence does not depend on indexes, and the sequence value will not wrap around after the table is cleared.

4. Multiple computing nodes in the Klustron database cluster can directly or indirectly use the same sequence to generate globally unique sequence values.

Let's take an example. First, create a table t1, where the primary key column serial is of type sequence, indicating that it uses an implicitly created sequence to generate field values, so you don't need to specify a field value when inserting data.

create table t1(a serial primary key, b int);

![](KunlunBase Sequence 功能及其实现机制/1.png)

Then, create a sequence called seq_b, which will be used to generate field values. When creating the sequence, the properties can be optionally specified, and if not specified, the default values will be used.

create sequence seq_b;

Execute this statement to insert 9 rows, explicitly using seq_b to generate field values:

insert into t1(b) values(nextval('seq_b'));

The data in t1 can be seen as follows:

select*from t1;

![](KunlunBase Sequence 功能及其实现机制/2.png)

Then create table t2, where both the b and c columns use seq_b to generate default field values, and its primary key column also uses implicit sequence to generate field values:

create table t2(a serial primary key, b int default nextval('seq_b'), c int default nextval('seq_b'));

Since all fields in t2 have default values, use the following statement to insert 3 rows into t2.

insert into t2 default values;

View the data in t2, you can see that the b and c fields in each row are generated by seq_b in turn, and the sequence values start from 9 after the last time seq_b generated a sequence value.

select*from t2;

![](KunlunBase Sequence 功能及其实现机制/3.png)

Finally, you can also use the statement select nextval('seq_b'); to directly generate sequence values.

Modifying sequence metadata and others

You can use the ALTER SEQUENCE statement to modify the properties of a sequence, or use the ALTER TABLE ... ALTER COLUMN ... SET seqoptions statement to modify the properties of an implicit sequence in a column.

You can also use the above ALTER TABLE statement to restart a sequence, and use the lastval() function to get the last value returned by a sequence.

![](KunlunBase Sequence 功能及其实现机制/4.png)

Implementation of Sequence

Klustron's implementation of sequence inherits the sequence mechanism of PostgreSQL.

To enable sequence data with disaster recovery capabilities and to allow for any number of computing nodes to use them simultaneously, the numerical data related to the distribution of sequence values is stored in the MySQL.sequences table on storage nodes, with each row corresponding to a sequence.

The specific storage cluster for a sequence's metadata is dynamically assigned by the computing node during sequence creation.

Other metadata for the sequence is stored on the computing node, and the following statement can be used to view the sequence's metadata on the computing node:

select t2.relname, t2.oid, seqstart, seqincrement, seqmax, seqmin, seqcache, seqcycle from pg_sequence t1, pg_class t2 where t1.seqrelid = t2.oid;

![](KunlunBase Sequence 功能及其实现机制/5.png)

It can be seen that the implicit sequence for the primary key columns of t1 and t2 are t1_a_seq and t2_a_seq, respectively, and the explicitly created seq_b are stored in storage clusters 1, 2, and 1, respectively.

At the same time, it can be seen that the basic metadata for the sequence is also stored in the pg_class metadata table, while its specific properties are stored in the pg_sequence table.

Connecting to shards numbered 1 and 2 and viewing the numerical metadata for these 3 sequences in the MySQL.sequence tables in these two storage clusters reveals the following information:

![](KunlunBase Sequence 功能及其实现机制/6.png)

![](KunlunBase Sequence 功能及其实现机制/7.png)

When a sequence is used for the first time or when the reserved numerical range is exhausted, a computing node CN will use its cluster_log_applier process to reserve (curval, cur_val + max(10, seqcache)) for this sequence on the storage cluster. CN then uses this reserved range to distribute sequence values for the sequence until it is exhausted again.

Thus, even if multiple computing nodes use the same sequence to distribute sequence values, high performance can be maintained and all sequence values distributed by all computing nodes remain unique.

Conclusion

To put it simply, if both read and write operations on the database are performed on the same database server, the performance of the business system will be reduced.

To improve the performance of the business system and optimize the user experience, you can use master-slave replication to reduce the workload on the main database.

Moreover, if the main database crashes, the business system can be quickly switched to the slave database to avoid data loss.

END