跳至主要內容

Klustron(原KunlunBase) 不支持的 MySQL 语法和功能汇总

Klustron大约 10 分钟

Klustron(原KunlunBase) 不支持的 MySQL 语法和功能汇总

一、Klustron-Server(Klustron的计算节点)不支持 MySQL 私有的 DDL 语法

从Klustron-1.2开始,已经支持MySQL常见的DDL语法,包括CREATE/DROP/ALTER DATABASE/TABLE/INDEX等,不过Klustron仍然不支持下列MySQL DDL语法。

  1. 不支持 user defined vars(用户自定义变量)以及依赖于此的功能 比如在 text 类型的 MySQL 连接中使用 prepared statement 的execute 语句。不过,可以在text协议的MySQL连接中,对prepared statement 使用PostgreSQL的EXECUTE 命令来绑定参数和执行。

  2. 不支持 MySQL 存储过程、外键和触发器; Klustron支持PostgreSQL的存储过程和触发器及其语法。

  3. 不支持 MySQL 的用户和权限管理语法,不支持 binlog replication 相关的功能及其 SQL 语法。 Klustron支持PostgreSQL的用户、角色和权限管理语法

  4. 不支持 MySQL 特有的 DDL 功能及其语法,例如 不支持 MySQL 的 CREATE EVENT语法

  5. MySQL 的系统函数大多数已经定义在计算节点中

    从Klustron-1.2.2开始,几乎所有MySQL系统函数都已经在Klustron中定义并可以使用。如果用户需要增加任何函数包括MySQL系统函数,那么可以使用 PostgreSQL 支持的任何系统过程语言(例如 PL/SQL, Python,perl,java 等)实现需要的 MySQL 系统函数,然后在集群任何一个 Klustron-server 节点中执行这个 create function 语句创建这个存储过程。然后集群其他计算节点也会很快复制并执行这个语句从而也拥有了这个系统函数。拥有 MySQL 系统函数的好处是让 Klustron-server 可以识别这些系统函数为 Klustron-storage 支持的函数,从而可以实现更多的 SQL 计算下推。

  6. MySQL的performance_schema, sys, mysql 等系统数据库,在Klustron中不存在。 MySQL的information_schema 数据库下部分view和table在klustron中不存在,从Klustron-1.3版本开始,MySQL的information_schema 里面几乎全部视图和表在Klustron中已经存在。

二、Klustron-Server 对 MySQL 功能的支持的特殊情况汇总

  1. 关键词占用

    PG 特有的关键词作为标识符(表名,列名等)使用,可能会导致 Klustron-server 的 sql 解析出错;把 MySQL 特有的关键词作为标识符(表名,列名等)使用,可能会导致 Klustron-storage 中 sql 解析出错。

    举例:create table t1(key int); -- 这个语句会导致 Klustron-storage 报错因而无法执行。因为 key 在 MySQL 中是关键字,在postgreSQL中不是。

  2. SQL MODES

    虽然没有 sql_mode 这个变量,但是 Klustron-server 的行为相当于 MySQL 的 sql_mode 被设置为

    ONLY_FULL_GROUP_BY | STRICT_ALL_TABLES | NO_ZERO_IN_DATE | NO_ZERO_DATE
    ERROR_FOR_DIVISION_BY_ZERO | NO_ENGINE_SUBSTITUTION | IGNORE_SPACE
    

    因此,Klustron-server 不接受非法的日期、时间和时间戳值。

    Klustron-server 没有 MySQL 的 IGNORE_SPACE 的限制,也就是说可以在函数名和括号之间有空格,并且(这是 MySQL 做不到的)一部分关键字(保留字)可以做函数名。 对于 ANSI_QUOTES,情况略微复杂:

    in MySQL connections to Klustron, one can turn on/off mysql_ansi_quotes to produce exactly the same effect as if ANSI_QUOTES set/unset to sql_mode in MySQL.
    in pg connections it's as if always set, so only quote string constants with '' because "" are used to quote symbol names.
    
    
  3. 语句执行出错后自动回滚事务

    如果一个事务中某个语句执行出错,那么这个事务会被 Klustron-server 在其内部自动回滚,Klustron-server 收到该事务后续语句后全部忽略,直到收到 commit/rollback 语句后,Klustron-server 会清除和回滚这个事务。这是 PostgreSQL 的行为相同。

    从klustron-1.2 开始,在enable_stmt_subxact=true 时,可以达到MySQL中相同的效果 --- SQL语句执行出错并不回滚事务,而是返回错误给客户端,由客户端决定要回滚这个事务,还是继续执行并且最终提交这个事务。

  4. set 语句简化 MySQL 的 set session/global var=value 这个语句,这个 value 可以是任意表达式,但是 Klustron-server 中 value 只能是常量(形如数字 1,2,3,和 字符串 'abc', 'def' 等)。

  5. autocommit模式和事务隐式启动和提交

在事务未显式启动(执行begin语句)的情况下执行DML,在autocommit 变量为true(默认为true) 时会执行一个autocommit事务。这与MySQL完全相同。 在autocommit=false时,会隐式启动一个事务,直到这个事务被显式提交(执行commit语句)或者隐式提交(执行DDL语句、BEGIN语句 或者 set autocommit=on)。

  1. 支持常见的MySQL DDL语法

从klustron-1.2开始,计算节点支持常见的MySQL DDL语法,包括CREATE/alter/drop database/table/index 语法。 对于CREATE TABLE,不仅支持MySQL 对 create table 语句的私有扩展,比如表分区语法,而且支持 engine=xxx,以及一些其他常用的表选项 等。

三、Klustron 与 MySQL 语法的差异

  1. There must be a space around prepared statement parameter place holder (?), it should not be ajacent to any other operator letter

e.g. =?, +?, ?+, are wrong for Klustron-server, but OK for MySQL.

This requirement has been removed starting from klustron-1.2, spaces are not needed around the parameter place holder(?) .

  1. Data types tinyint and tinyint unsigned is 2-bytes, so treat it as a alias of MYSQL_TYPE_SHORT and use a short/unsigned short var to hold in/out paramter value.

This causes no real impact to users at all, except taking a little bit more storage space.

  1. Type modifiers deprecated from MySQL-8.0.17 and later are not supported in Klustron

They are: numeric display width, float/double(M,D), and float/double unsigned

  1. Default const value must match type of its column

For example, create table t1 (id1 int NOT NULL default '0') is NOT OK, should be create table t1 (id1 int NOT NULL default 0)

  1. Unresonable and irrational type conversions are not supported

Type conversion between a date/time type and a numeric type are not supported; and type conversions between any pair of date/time types except between datetime and timestamp, are not supported.

  1. CONVERT() to convert string encoding isn't supported.

  2. strlen(blob): blob values are longer than actual length

This is trivial and is due to an implementation detail --- a blob value is hex encoded and length() is pushed down to storage node which returns the encoded length

  1. SET time_zone = '+00:00'; not supported: do SET TimeZone = 'GMT';

All available time zone names can be found via sql: select name from pg_timezone_names;

  1. Timestamp constant must end with time zone

    e.g. 2011-02-02 15:31:06+00 is a valid Klustron timestamp constant, but 2011-02-02 15:31:06 isn't.

  2. Column alias names should not be single-quoted('), can be back quoted(```) or double quoted(")

    e.g. select abcdefg as 'a' from t1 is invalid, do select abcdefg as a from t1 or select abcdefg as `a` from t1 or select abcdefg as "a" from t1

  3. Representation of rows of all default values

    given table t1 created as create table t1(a int default 0, b int default 0), MySQL syntax insert into t1 values(), () should be written as insert into t1 values(default, default), (default, default)

  4. Fulltext match operator isn't same as MySQL

    MATCH str AGAINST pattern expression should be written to str ~ pattern

  5. CREATE TABLE ... SELECT not supported

    It should be split and written as CREATE TABLE ... (LIKE ...) ; INSERT INTO ...SELECT*FROM ... From Klustron-1.3, this statement is supported.

  6. group_concat() not exist for now

  7. Don't drop objects which are dependent upon:

    e.g. given view v1 depends on table t1, don't drop table t1,v1 in one stmt, or drop v1 before t1 in 2 stmts, this is OK in MySQL but not in Klustron, dependence check will reject this. Instead, simply do drop table t1 cascade , it will drop v1 and anything dependent on t1 (e.g. sequences).

  8. Table level collation and charset setting at end of a CREATE TABLE statement, and ENGINE=XX setting

    starting from klustron-1.2, the ENGINE=XXX clause is supported, along with a few other table options, for more see newly supported DDL grammars.

  9. Index creation syntax in CREATE TABLE statements

    Can't create non-unique index inside CREATE TABLE statements with KEY() clause; and also, UNIQUE KEY() isn't OK, but UNIQUE() is OK.

    e.g. create table(a int primary key, b int, c int, KEY(b), UNIQUE KEY(c)) isn't OK, and should be written as: create table t1(a int primary key, b int, c int, UNIQUE (c)); create index on t1(b);

  10. Conflicting value reference syntax in INSERT INTO t1 VALUES ... ON CONFLICT UPDATE NOT same as syntax of MySQL

    In INSERT INTO t1 VALUES ... ON CONFLICT UPDATE col=valexpr , here in the valexpr value expression, we refer to the current field in the row with t1.col, and refer to the proposed new field value with EXCLUDED.col.

    e.g.

    INSERT INTO t1 VALUES(1,2) ON CONFLICT UPDATE col1=t1.col1 +3, col2=t1.col2+5;
    INSERT INTO t1 VALUES(1,2) ON CONFLICT UPDATE col1=EXCLUDED.col1 + 3, col2=EXCLUDED.col2+5;
    

    This is different from MySQL. For MySQL, when refering to current field value in row, simply col is OK, no need for t1.col, and to refer to the new(proposed) value, latest MySQL approach is to define a row alias and use it, rather than using a fixed row alias EXCLUDED. So equivalent MySQL syntax is:

     INSERT INTO t1 VALUES(1,2) ON CONFLICT UPDATE col1=col1 +3, col2=col2+5;
    INSERT INTO t1 VALUES(1,2) AS newrow ON CONFLICT UPDATE col1=newrow.col1 + 3, col2=newrow.col2+5;
    

    and the VALUES() approach in older versions of MySQL for this purpose is deprecated.

  11. Klustron enforces STRICT_ALL_TABLES, so it doesn't allow varchar value longer than its max declared length.

  12. All symbol names are stored and compared as lower-case in Klustron-server, whether they are quoted or not.

    • For original PostgreSQL these DDLs are valid(although insane):
    CREATE TABLE t1("A" int, a int);
    CREATE TABLE "T1"("Aa" int, "aA" int);
    

    But for Klustron, this will be rejected by Klustron-server, and it was rejected before by Klustron-storage anyway, so there is no backward compatibility issues for MySQL compatibility in Klustron.

    Further, Klustron-storage has been set lower_case_table_names=1 so we never allowed two tables with names different only in lettercase, but original PostgreSQL allows a table named "T1" at creation and a table t1 to co-exist.

    • returned column name in result metadata will be lower case even if it was upper case in CREATE TABLE stmt.

    This might slightly affect application code which were using MySQL, and has no impacts to those using PostgreSQL before. For example, some application code which were using MySQL may refer to result rows' fields using column names like this:

    print(row["A"], row["Aa"])

    this code now will fail with Klustron --- the code must be modified as:

    print(row["a"], row["aa"])

  13. The 'unsigned' type modifier is NOT effective CREATE TABLE statements like this: CREATE TABLE t1(a int unsigned) parses and executes successfully, but users can actually insert rows with negative values into table t1. That is, the 'unsigned' type modifier isn't an effective constraint in Klustron, they are only parsed and then ignored.

  14. Fetching results from execution of prepared statements using cursor type CURSOR_TYPE_READ_ONLY behaves differently

    This is a difference between klustron-storage and community MySQL, so this won't affect Klustron users at all, you can safely ignore it.

    If CURSOR_TYPE_READ_ONLY is specified as an option for a prepared statement, a new type of server side cursor(Sliced_cursor) is created in klustron-storage to execute the query incrementally and send result rows by N(specified in PREFETCH_ROWS option) rows a time. No rows are cached at server side temp table, bringing better performance given large set of result rows. Client side can NOT fetch across transaction boundary otherwise the COM_STMT_FETCH would return inconsistent results because the FETCH'es in following transactions would use a different read view than the original one. Actually Klustron-storage will close all server side READ_ONLY cursors at end of a transaction.

    This restriction is newly introduced, and may cause errors if violated. community MySQL given CURSOR_TYPE_READ_ONLY works the same way as CURSOR_TYPE_SCROLLABLE or CURSOR_TYPE_FOR_UPDATE described below.

    if working with Klustron-storage directly without Klustron-server, applications doing FETCH'es to the same prepared statement across transaction boundary using a cursor of type CURSOR_TYPE_READ_VIEW may work wrong and need to be fixed.

    Between two FETCH'es of the same prepared statement, any other SQL stmts or mysql_stmt_prepare(), mysql_stmt_execute(), mysql_stmt_fetch() with other prepared statements can be executed. this is the same as community MySQL.

  15. In MySQL connections transform CREATE/DROP/SHOW DATABASE to CREATE/DROP/SHOW SCHEMA by default

    Klustron-server doesn't allow switching to another database like MySQL, but MySQL users assumes and may rely on this capability.

    So in MySQL connections, by default transform all CREATE/DROP DATABASE and SHOW DATABASES statements to CREATE/DROP SCHEMA and SHOW SCHEMAS statements to support that. If transform_create_db_to_schema is off, the CREATE/DROP DATABASE will be executed as is and the current connection can't access the created database.

    When migrating and loading data from a MySQL db instance, choose an existing database of Klustron to hold all the MySQL instance's 'databases' and create them as schemas now, then you can switch databases as before using USE statements or mysql_select_db().

    MySQL clients connect to a real Klustron database(not a schema), at initially its at 'public' schema, and then the client can use 'USE' statement or mysql_select_db() to switch between its original MySQL databases(now schemas).

    If a connection breaks and the client reconnects, it will reconnect with its cached 'current database', so at server side executing a USE or COM_INIT_DB, if we can find it as a schema name, we switch to that schema; if we can't find the specified name as a schema name, we look for it as a db name. And if it's currently connected db, we switch to its 'public' schema(which may be different from before the disconnection, this is a difference from MySQL); if it's another valid db or an non-existent db, we reject the request.

    Similarly in MySQL connections, by default a DROP DATABASE statement is transformed to DROP SCHEMA ... CASCADE, unless transform_create_db_to_schema is false.

    Users must be clear they are dropping a database or schema really, to drop a real database in a MySQL connection, set transform_create_db_to_schema to false before issuing the DROP DATABASE statement.

四、Klustron-Server 支持的 MySQL 私有语法汇总

总的来说Klustron支持几乎所有的MySQL私有DML语法,详见 Klustron 对 MySQL 私有 DML 语法的支持

五、Klustron 的 MySQL 连接协议以及支持的 MySQL 语法和功能列表

Klustron & MySQL 连接协议简介

END