Summary and advantages of Klustron
The Value Klustron Brings to DBAs and Application Software Development Teams
Klustron is a distributed HTAP database system. It is designed for application software, web systems, and SaaS cloud services in every industry that must store, manage, and use massive volumes of relational data while sustaining high-concurrency, high-load transaction processing and read/write services. Such systems face the major challenges listed below, and Klustron addresses them to create value for application developers, service providers, and end users.
- The limited computing and storage resources of a single server cannot keep pace with the continuous growth of data volume and access load;
- Software and hardware faults can cause server node failures and network failures, which make it very difficult to keep data read/write services reliable and continuously available over the long term, and to keep data durable and consistent;
- Sustaining stable high throughput and low latency under elastic, fluctuating read/write loads is technically very hard, yet it is essential to a smooth end-user experience;
- Compatibility with the existing application ecosystem is highly valuable, in particular compatibility with existing applications built on MySQL and PostgreSQL, two world-class databases.
To cope with the first challenge, many MySQL users have in recent years adopted sharding middleware (which splits data across databases and tables) or implemented table-splitting logic in the application itself. These makeshift methods have the serious defects described below and cannot cope with the second, third, and fourth challenges. With them, users essentially have to implement data management, and even transaction processing and fault tolerance, inside the business system on a case-by-case basis, which is an impossible task for most application development teams. The reliability, stability, and maintainability of the application suffer, development becomes much harder, schedules become unpredictable, and both the risk of project delay and the labor cost rise sharply. Automatic elastic scaling is also out of reach, because the data-splitting logic is tightly bound to the application software.
Klustron meets all of the above challenges completely and reliably. It fully encapsulates the functions of the database system, so application developers only need to focus on implementing business logic. Regardless of the volume of data to be stored and managed or the online access load, users (DBAs, application developers, and architects) can hand data management over to Klustron entirely. The DBA only needs to add or remove database server hardware as needed, and Klustron automatically scales elastically to carry the changing load. This greatly improves programmer productivity, reduces the workload and technical difficulty of application development, helps ensure the quality, stability, and reliability of application systems, and shortens development cycles and lowers costs.
For application developers, using Klustron is the same as using MySQL or PostgreSQL. Klustron supports JDBC, ODBC, Hibernate, MyBatis, and the client connection libraries of all common programming languages. Software written in these languages can connect to Klustron and correctly execute all standard SQL statements, as well as the DML statements specific to MySQL and PostgreSQL, so applications that originally used MySQL or PostgreSQL can use Klustron without modification. Klustron also supports importing full and incremental data from all common relational databases, which makes it convenient to migrate to Klustron, or from Klustron to another database, at any time.
Let us now analyze in detail the problems of application-layer sharding and of sharding middleware.
01 The "headache" of sharding middleware
Data row routing middleware such as mysql_proxy, mysql_router, and mycat has the following problems:
1.1 They do not support full distributed query processing
Whenever a valid SQL statement sent by the user program involves more advanced SQL features, such as multi-table joins, subqueries, CTEs, window functions, or aggregation, such middleware usually cannot handle it. As a result, data analysis tools and algorithms, including low-code tools, object-relational mapping frameworks (such as Hibernate), and machine learning algorithms, cannot work with this middleware.
With application-layer sharding, the application programmer must write business code that queries the data shards in each storage cluster holding the target data and then assembles the final result. In effect, this is query processing and execution for one specific SQL statement, implemented at the application layer. Whenever that "query" needs to change, a large amount of application code must change with it, so the maintenance cost of the application is very high.
With a distributed database, the same work would consist of sending a SQL statement and receiving the result. Without one, query processing has to be implemented separately for every specific query, and the workload is naturally huge. Moreover, such code may need to be modified repeatedly as business requirements change and iterate, which is far more work and far more complicated than simply modifying a SQL statement.
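The application-layer work described above can be sketched as follows. This is a minimal illustration under stated assumptions, not any particular product's code: two in-memory SQLite databases stand in for two shards, and the `orders` table, its columns, and the `total_per_customer` helper are hypothetical names chosen for the example.

```python
import sqlite3


def make_shard(rows):
    """Create one in-memory SQLite database that stands in for a shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


# Rows of the same logical table are spread over two shards.
shards = [
    make_shard([(1, "alice", 30), (3, "bob", 20)]),
    make_shard([(2, "alice", 50), (4, "carol", 10)]),
]


def total_per_customer(shards):
    """Hand-written 'distributed' GROUP BY: query each shard, merge in the app."""
    totals = {}
    for conn in shards:
        # Each shard can only compute a partial aggregate over its own rows.
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        )
        for customer, partial in cursor:
            totals[customer] = totals.get(customer, 0) + partial
    # The final ordering must also be redone in application code.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


result = total_per_customer(shards)
print(result)
```

On a distributed database the same result would come from one statement, `SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY SUM(amount) DESC, customer`, and a change to the query would mean editing that statement rather than the merge code.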
1.2 They do not support distributed transaction processing for reliable disaster recovery
Many application programmers are unaware of the business risks of skipping two-phase commit for distributed transactions. A few are aware of the risk but cannot solve it, so they muddle along.
A small number of programmers recognize the problem and can solve it, but only case by case. For example, to implement a reliable funds transfer, they must design a set of techniques that give the transfer scenario disaster recovery capability at the business layer. Every other scenario requires designing and implementing another set of algorithms.
This sharply raises the technical threshold and workload of application development, introduces major risks and uncertainty in product reliability and stability, and brings a high risk of project delay and high development costs. Some middleware does use MySQL's XA feature for two-phase commit, but it cannot reliably preserve ACID guarantees such as the consistency of user data when nodes go down, connections drop, or operations time out.
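To make the two-phase commit mentioned above concrete, here is a minimal in-memory sketch. The `Participant` class and the `two_phase_transfer` function are hypothetical names used only for illustration. This is not Klustron's implementation or MySQL XA, and it deliberately leaves out what makes the protocol hard in practice: durable logging of the prepare and commit decisions, and recovery when the coordinator or a participant fails between the two phases.

```python
class Participant:
    """A toy shard holding account balances in memory (hypothetical)."""

    def __init__(self, name, balances, fail_on_prepare=False):
        self.name = name
        self.balances = dict(balances)
        self.fail_on_prepare = fail_on_prepare  # simulates an unreachable node
        self.pending = None  # changes staged by prepare(), not yet visible

    def prepare(self, changes):
        # Phase 1: validate and stage the changes, then vote yes or no.
        if self.fail_on_prepare:
            return False
        for account, delta in changes.items():
            if self.balances.get(account, 0) + delta < 0:
                return False  # would overdraw the account: vote no
        self.pending = dict(changes)
        return True

    def commit(self):
        # Phase 2, commit branch: make the staged changes visible.
        for account, delta in self.pending.items():
            self.balances[account] = self.balances.get(account, 0) + delta
        self.pending = None

    def rollback(self):
        # Phase 2, abort branch: discard the staged changes.
        self.pending = None


def two_phase_transfer(work):
    """work is a list of (participant, changes) pairs forming one transaction."""
    prepared = []
    for participant, changes in work:
        if participant.prepare(changes):
            prepared.append(participant)
        else:
            # Any "no" vote aborts the whole transaction on every shard.
            for p in prepared:
                p.rollback()
            return False
    for p in prepared:
        p.commit()
    return True


# Successful transfer of 40 from alice (shard A) to bob (shard B).
shard_a = Participant("shard_a", {"alice": 100})
shard_b = Participant("shard_b", {"bob": 0})
ok = two_phase_transfer([(shard_a, {"alice": -40}), (shard_b, {"bob": 40})])

# Same transfer when shard B cannot prepare: nothing is applied anywhere.
shard_c = Participant("shard_c", {"alice": 100})
shard_d = Participant("shard_d", {"bob": 0}, fail_on_prepare=True)
aborted = two_phase_transfer([(shard_c, {"alice": -40}), (shard_d, {"bob": 40})])
```

Without the prepare phase, the debit on the first shard could become visible while the credit on the second shard is lost, which is the inconsistency the text warns about.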
A problem common to all of the above is that application developers must know which storage cluster holds each of their tables in order to implement data management and queries correctly. This binds data management to business logic even more tightly and violates the original purpose of a database system: to encapsulate data management completely, so that application developers need not consider any details of data storage and management.
1.3 They cannot achieve automatic horizontal elastic expansion
Expansion must be performed manually by the DBA, and the service must be suspended for a period of time (for example, several hours). Such a suspension seriously affects business continuity and user experience.
02 Application-layer sharding in the Stone Age
The industry also uses a more primitive way to cope with excessive data volume and access load: application-layer data partitioning. It is more primitive because, in addition to all the headaches described above, it has the serious problems listed below, to the point that the products and services of the companies using it can be said to be still in the Stone Age. Surprisingly, such Stone Age companies are not rare.
Issues unique to application layer sharding include:
2.1 Hard-coded table split logic
With this approach, similar logic must be implemented for every table, so the development burden and complexity are high. If several applications or web services use the same set of tables, which is common, the splitting rules for each table must also be kept identical across all of those programs, and the development workload and complexity grow much faster than linearly.
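The risk of keeping splitting rules identical across several programs can be illustrated with a small sketch. Everything here is hypothetical: four Python dicts stand in for four physical shards, and `billing_route` and `reporting_route` represent routing rules compiled into two separate services that have drifted apart.

```python
# Four in-memory dicts stand in for four physical shards (hypothetical setup).
shards = [dict() for _ in range(4)]


def billing_route(user_id):
    # Routing rule hard-coded in the billing service.
    return user_id % 4


def reporting_route(user_id):
    # Routing rule hard-coded in the reporting service. It was changed
    # without updating the billing service, so the two rules now disagree.
    return (user_id // 10) % 4


def billing_write(user_id, row):
    shards[billing_route(user_id)][user_id] = row


def reporting_read(user_id):
    return shards[reporting_route(user_id)].get(user_id)


billing_write(23, {"balance": 250})  # stored on shard 23 % 4 == 3
found = reporting_read(23)           # looks on shard (23 // 10) % 4 == 2
print(found)
```

The row exists, but the reporting service cannot see it because it looks on a different shard. No error is raised, which is why this kind of divergence is hard to detect.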
Even the smarter approach of driving the splitting from configuration files, as the middleware described above does, still has problems: the table-splitting logic must still be implemented, which greatly increases the business development workload. In the end you have built a mediocre piece of middleware. It is mediocre because only your company or team uses it, and it may fit only your specific business scenarios. It also binds data management and application logic even more tightly, which is very poor system design.
2.2 Horizontal expansion is even harder than with middleware
When the database and table splitting logic is hard-coded, elastic expansion is almost impossible: expanding capacity requires modifying business code to implement new data partitioning rules, which is a nightmare for developers and DBAs.
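A short sketch shows why changing a hard-coded partitioning rule is so disruptive. It assumes the common `key % shard_count` rule, which the text does not name explicitly, and the function names are hypothetical. The sketch counts how many keys land on a different shard when one shard is added.

```python
def shard_for(key, shard_count):
    # A hard-coded routing rule of the kind often embedded in business code.
    return key % shard_count


def keys_to_move(keys, old_count, new_count):
    """Return the keys whose shard changes when the shard count changes."""
    return [
        k for k in keys
        if shard_for(k, old_count) != shard_for(k, new_count)
    ]


keys = range(1000)
moved = keys_to_move(keys, old_count=4, new_count=5)
moved_fraction = len(moved) / len(keys)
print(f"{len(moved)} of {len(keys)} keys change shard ({moved_fraction:.0%})")
```

Under this rule, growing from 4 to 5 shards relocates 800 of the 1000 keys, so nearly all existing data must be copied while the application code is being changed to the new rule.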
We therefore decided to develop a true distributed database product to rescue both the users struggling with such middleware and the users still in the "Stone Age", so that they can join the current technological era and experience what modern technology offers.
From then on, they no longer need to rack their brains designing and implementing distributed data management and query programs. They simply send SQL statements to start and commit distributed transactions, run distributed queries, and receive the results directly.
In this way, data management is once again truly separated from the application software and abstracted out of the application logic. The lesson is:
Use an independent database system for data management and keep application development separate from general data management logic. This maximizes software reuse and simplifies application development, greatly improves developer productivity and the reliability of business logic and products, lowers the technical threshold of the user's business system, reduces the company's development costs, and keeps the launch schedule of its online business systems controllable and predictable.
03 Technical Features of Klustron
At its core, Klustron is a high-performance NewSQL distributed database for OLTP. Our core goal is to solve the problems users face in storing, managing, and using massive data.
To meet the new, wide-ranging requirements of massive data management and use, Klustron provides key features such as horizontal elastic scaling, automatic data partitioning, distributed transaction processing, distributed query processing, disaster recovery, high availability, and strong consistency. These let application developers in every industry focus on application logic without implementing any data management functions. The result is higher developer productivity, more reliable business logic and products, a lower technical threshold for the user's business systems, lower development costs, and controllable, predictable launch schedules for online business systems.
By now it should be clear why Klustron is worth choosing: it solves these problems at the root. Data management is handed over to Klustron entirely, and the user's business system needs no knowledge of how data is stored and managed, which greatly reduces the difficulty and cost of business system development and improves the reliability and stability of the business system.
The rest of this chapter covers the following topics, which introduce the architecture, basic concepts, and technical advantages of Klustron and prepare you for using it.
1. Klustron system architecture
2. Core competencies of Klustron
3. Technical advantages of Klustron