From ac75e90e052ad70b40729cb7b4bdc0c691fc6b52 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Adam=20Zieli=C5=84ski?=
Date: Mon, 18 Nov 2024 12:18:59 +0100
Subject: [PATCH] Exhaustive MySQL Parser (#157)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Context

This PR ships an exhaustive MySQL **lexer** and **parser** that produce a MySQL query AST. This is the first step to significantly improve MySQL compatibility and expand WordPress plugin support on SQLite. It's a simpler, more stable, and more maintainable approach than the current token processing. It will also dramatically improve the WordPress Playground experience – database integration is the single largest source of issues.

This PR is part of the [Advanced MySQL support project](https://github.com/WordPress/sqlite-database-integration/issues/162).

See the [MySQL parser proposal](https://github.com/WordPress/sqlite-database-integration/issues/106#issuecomment-2276082489) for additional context.

## This PR ships

1. A **MySQL lexer**, adapted from the AI-generated one by @adamziel. It's over 3x smaller and close to 2x faster.
2. A **MySQL grammar** written in ANTLR v4 format, adapted from the [MySQL Workbench grammar](https://github.com/mysql/mysql-workbench/blob/8.0.38/library/parsers/grammars/MySQLParser.g4) by adding and fixing some cases and reordering some rules.
3. A **script to factor, convert, and compress the grammar** to a PHP array.
4. A **dynamic recursive parser** implemented by @adamziel.
5. A **script to extract tests** from the MySQL repository.
6. A **test suite of almost 70k queries**.
7. WIP **SQLite driver** by @adamziel, a demo and foundation for the next phase.

At the moment, all the new files are omitted from the plugin build, so they have no effect on production whatsoever.

## Running tests

The lexer & parser test suite is not yet integrated into the CI and existing test commands.
To run the tests, use:

```bash
php tests/parser/run-lexer-tests.php
php tests/parser/run-parser-tests.php
```

The first command lexes, and the second lexes and parses, all ~70k queries.

## Implementation

### Parser

A simple recursive parser to transform `(token stream, grammar) => parse tree`. In this PR, we use MySQL tokens and the MySQL grammar, but the same parser could also support XML, IMAP, and many other grammars (as long as they have some specific properties).

The `parse_recursive()` method is just 100 lines of code (excluding comments). All of the parsing rules are provided by the grammar.

### run-mysql-driver.php

A quick and dirty implementation of what a `MySQL parse tree ➔ SQLite` database driver could look like. It easily supports `WITH` and `UNION` queries that would be really difficult to implement in the current SQLite integration plugin.

The tree transformation is an order of magnitude easier to read, expand, and maintain than the current implementation. I stand by this, even though the temporary `ParseTreeTools`/`SQLiteTokenFactory` API included in this PR seems annoying, and I'd like to ship something better than that. Here's a glimpse:

```php
function translateQuery($subtree, $rule_name=null) {
    if(is_token($subtree)) {
        $token = $subtree;
        switch ($token->type) {
            case MySQLLexer::EOF:
                return new SQLiteExpression([]);
            case MySQLLexer::IDENTIFIER:
                return SQLiteTokenFactory::identifier(
                    SQLiteTokenFactory::identifierValue($token)
                );
            default:
                return SQLiteTokenFactory::raw($token->text);
        }
    }

    switch($rule_name) {
        case 'indexHintList':
            // SQLite doesn't support index hints. Let's
            // skip them.
            return null;
        case 'fromClause':
            // Skip `FROM DUAL`. We only care about a singular
            // FROM DUAL statement, as FROM mytable, DUAL is a syntax
            // error.
            if(
                ParseTreeTools::hasChildren($subtree, MySQLLexer::DUAL_SYMBOL) &&
                !ParseTreeTools::hasChildren($subtree, 'tableReferenceList')
            ) {
                return null;
            }
            // Do not fall through into 'functionCall'; other FROM
            // clauses are not handled in this excerpt.
            break;
        case 'functionCall':
            $name = $subtree[0]['pureIdentifier'][0]['IDENTIFIER'][0]->text;
            return translateFunctionCall($name, $subtree[0]['udfExprList']);
    }
}
```

## Technical details

### MySQL Grammar

We use the [MySQL Workbench grammar](https://github.com/mysql/mysql-workbench/blob/8.0/library/parsers/grammars/MySQLParser.g4), manually adapted, modified, and fixed, and converted from ANTLR4 format to a PHP array.

The grammar conversion pipeline is implemented in `convert-grammar.php` and goes like this:

1. Parse the MySQLParser.g4 grammar into a PHP tree.
2. Flatten the grammar so that any nested rules become top-level and are referenced by generated names. This factors compound rules into separate rules, e.g. `query ::= SELECT (ALL | DISTINCT)` becomes `query ::= SELECT %select_fragment0` and `%select_fragment0 ::= ALL | DISTINCT`.
3. Expand `*`, `+`, `?` modifiers into separate, right-recursive rules. For example, `columns ::= column (',' column)*` becomes `columns ::= column columns_rr` and `columns_rr ::= ',' column columns_rr | ε`.
6. Compress and export the grammar as a PHP array. It replaces all string names with integers and ships an int->string map to reduce the file size.

The `mysql-grammar.php` file is ~70 KB, which is small enough. The parser can handle about 1000 complex SELECT queries per second on a MacBook Pro. It only took a few easy optimizations to go from 50 queries/second to 1000 queries/second. There are many further optimization opportunities once we need more speed. We could factor the grammar in different ways, explore other types of lookahead tables, or memoize the matching results per token. However, I don't think we need to do that in the short term. If we spend enough time factoring the grammar, we could potentially switch to an LALR(1) parser and cut most of the time spent dealing with ambiguities.
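
To make the right-recursive expansion and the `(token stream, grammar) => parse tree` idea more concrete, here is a minimal sketch. It is not the code from this PR: the array layout of `$grammar`, the token shape, and the helper names `parse_rule()` and `column_names()` are all assumptions made for illustration, and the real `mysql-grammar.php` uses integer IDs rather than strings. The grammar below is the `columns` example written out by hand in its expanded form.

```php
<?php
// Hypothetical, simplified grammar format (an assumption, not the format of
// mysql-grammar.php): rule name => list of branches, branch => list of symbols.
// A symbol that is a key of $grammar is a rule; anything else is a token type.
// An empty branch stands for ε.
$grammar = [
    'columns'    => [ [ 'column', 'columns_rr' ] ],
    'columns_rr' => [ [ ',', 'column', 'columns_rr' ], [] ],
    'column'     => [ [ 'IDENTIFIER' ] ],
];

/**
 * Try to match $rule starting at token index $pos.
 * Returns [ parse tree, next token index ] on success, or null on failure.
 * Branches are tried in order and the first one that matches wins.
 */
function parse_rule( array $grammar, array $tokens, string $rule, int $pos ) {
    foreach ( $grammar[ $rule ] as $branch ) {
        $children = [];
        $cursor   = $pos;
        $matched  = true;
        foreach ( $branch as $symbol ) {
            if ( isset( $grammar[ $symbol ] ) ) {
                // Non-terminal: recurse into the referenced rule.
                $result = parse_rule( $grammar, $tokens, $symbol, $cursor );
                if ( null === $result ) {
                    $matched = false;
                    break;
                }
                list( $subtree, $cursor ) = $result;
                $children[] = $subtree;
            } elseif ( isset( $tokens[ $cursor ] ) && $tokens[ $cursor ]['type'] === $symbol ) {
                // Terminal: consume one token of the expected type.
                $children[] = $tokens[ $cursor ];
                ++$cursor;
            } else {
                $matched = false;
                break;
            }
        }
        if ( $matched ) {
            return [ [ 'rule' => $rule, 'children' => $children ], $cursor ];
        }
    }
    return null;
}

/** Collect identifier texts from the nested, right-recursive parse tree. */
function column_names( array $node ): array {
    $names = [];
    foreach ( $node['children'] as $child ) {
        if ( isset( $child['rule'] ) ) {
            $names = array_merge( $names, column_names( $child ) );
        } elseif ( 'IDENTIFIER' === $child['type'] ) {
            $names[] = $child['text'];
        }
    }
    return $names;
}

// Token stream for: id, name, email
$tokens = [
    [ 'type' => 'IDENTIFIER', 'text' => 'id' ],
    [ 'type' => ',', 'text' => ',' ],
    [ 'type' => 'IDENTIFIER', 'text' => 'name' ],
    [ 'type' => ',', 'text' => ',' ],
    [ 'type' => 'IDENTIFIER', 'text' => 'email' ],
];

list( $tree, $end ) = parse_rule( $grammar, $tokens, 'columns', 0 );
echo implode( ', ', column_names( $tree ) ), "\n"; // prints "id, name, email"
```

Because `columns_rr` refers to itself, the list can be of any length, and the ε branch ends the recursion once no further `,` follows. A production parser would also need lookahead or backtracking for ambiguous branches, which this sketch deliberately leaves out.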

## Known issues

There are some small issues and incomplete edge cases. Here are the ones I'm currently aware of:

1. A very special case in the lexer is not handled: while identifiers can't consist solely of numbers, this is possible in the identifier part after a `.` (e.g., `1ea10.1` is a table name & column name). This is not handled yet, and it may be worth checking if all cases in the identifier part after a `.` are handled correctly.
2. Another very special case in the lexer: while the lexer does support version comments, such as `/*!80038 ... */`, and nested comments within them, a nested comment within a non-matched version is not supported (e.g., `SELECT 1 /*!99999 /* */ */`). Additionally, we currently support only 5-digit version specifiers (`80038`), but 6 digits should probably work as well (`080038`).
3. Version specifiers are not propagated to the PHP grammar yet, and versions are not applied in the grammar yet (only in the lexer). It would be better to bring this in together with version-specific test cases.
4. Some rules in the grammar may not have version specifiers, or they may be incorrect.
7. The `_utf8` underscore charset should be version-dependent (only on MySQL 5), and maybe some others are too. We can check this by running `SHOW CHARACTER SET` on different MySQL versions.
8. The PHPized grammar now contains array indexes of the main rules, while previously they were not listed. It seems there are numeric gaps. It might be a regression introduced when manually parsing the grammar. I suppose it's an easy fix.
9. Some components need better test coverage (although the E2E 70k query test suite is pretty good for now).
10. The tests are not run on CI yet.
11. I'm not sure if the new code fully satisfies the plugin's PHP version requirement. We need to check that, e.g., that no PHP 7.1 features are used. I'm not fully sure, but I think there's no lint for the PHP version in the repo, so we could add it.

This list is mainly for me, in order not to forget these.
I will later port it into a tracking issue with a checklist.

## Updates

Since the thread here is pretty long, here are quick links to the work-in-progress updates:

- [First update with a MySQL query test suite.](https://github.com/WordPress/sqlite-database-integration/pull/157#issuecomment-2383247665)
- [Quick update, focusing on lexer.](https://github.com/WordPress/sqlite-database-integration/pull/157#issuecomment-2394474341)
- [Custom grammar conversion script, preserving version, fixes, and more.](https://github.com/WordPress/sqlite-database-integration/pull/157#issuecomment-2419879660)
- [Wrap up](https://github.com/WordPress/sqlite-database-integration/pull/157#issuecomment-2449822524).

## Next steps

These could be implemented either in follow-up PRs or as updates to this PR – whichever is more convenient:

* Bring in a comprehensive MySQL queries test suite, similar to [WHATWG URL test data](https://github.com/web-platform-tests/wpt/blob/master/url/resources/urltestdata.json) for parsing URLs. First, just ensure the parser either returns null or any parse tree where appropriate. Then, once we have more advanced tree processing, actually assert the parser outputs the expected query structures.
* Create a `MySQLOnSQLite` database driver to enable running MySQL queries on SQLite. Read [this comment](https://github.com/WordPress/sqlite-database-integration/issues/106#issuecomment-2277425966) for more context. Use any method that's convenient for generating SQLite queries. Feel free to restructure and redo any APIs proposed in this PR. Be inspired by the idea that we may build a `MySQLOnPostgres` driver one day, but don't actually build any abstractions upfront. Make the driver generic so it can be used without WordPress. Perhaps it could implement a PDO driver interface?
* Port MySQL features already supported by the SQLite database integration plugin to the new `MySQLOnSQLite` driver. For example, the `SQL_CALC_FOUND_ROWS` option or the `INTERVAL` syntax.
* Run the SQLite database integration plugin test suite on the new `MySQLOnSQLite` driver and ensure it passes.
* Rewire this plugin to use the new `MySQLOnSQLite` driver instead of the current plumbing.

---------

Co-authored-by: Jan Jakes
---
 .gitattributes | 6 +-
 composer.json | 1 +
 grammar-tools/MySQLParser.g4 | 5709 +
 grammar-tools/convert-grammar.php | 297 +
 phpunit.xml.dist | 4 +
 tests/bootstrap.php | 6 +
 tests/mysql/WP_MySQL_Lexer_Tests.php | 269 +
 .../WP_MySQL_Server_Suite_Lexer_Tests.php | 34 +
 .../WP_MySQL_Server_Suite_Parser_Tests.php | 103 +
 .../mysql/data/mysql-server-tests-queries.csv | 114950 +++++++++++++++
 tests/tools/.gitignore | 1 +
 tests/tools/dump-ast.php | 38 +
 tests/tools/mysql-download-tests.sh | 48 +
 tests/tools/mysql-extract-queries.php | 370 +
 tests/tools/run-lexer-benchmark.php | 38 +
 tests/tools/run-parser-benchmark.php | 81 +
 wip/SQLiteDriver.php | 533 +
 wip/run-mysql-driver.php | 656 +
 wp-includes/mysql/class-wp-mysql-lexer.php | 2975 +
 wp-includes/mysql/class-wp-mysql-parser.php | 4 +
 wp-includes/mysql/class-wp-mysql-token.php | 35 +
 wp-includes/mysql/mysql-grammar.php | 4 +
 .../parser/class-wp-parser-grammar.php | 139 +
 wp-includes/parser/class-wp-parser-node.php | 184 +
 wp-includes/parser/class-wp-parser.php | 124 +
 25 files changed, 126608 insertions(+), 1 deletion(-)
 create mode 100644 grammar-tools/MySQLParser.g4
 create mode 100644 grammar-tools/convert-grammar.php
 create mode 100644 tests/mysql/WP_MySQL_Lexer_Tests.php
 create mode 100644 tests/mysql/WP_MySQL_Server_Suite_Lexer_Tests.php
 create mode 100644 tests/mysql/WP_MySQL_Server_Suite_Parser_Tests.php
 create mode 100644 tests/mysql/data/mysql-server-tests-queries.csv
 create mode 100644 tests/tools/.gitignore
 create mode 100644 tests/tools/dump-ast.php
 create mode 100755 tests/tools/mysql-download-tests.sh
 create mode 100644 tests/tools/mysql-extract-queries.php
 create mode 100644 tests/tools/run-lexer-benchmark.php
 create mode 100644 tests/tools/run-parser-benchmark.php
 create
mode 100644 wip/SQLiteDriver.php create mode 100644 wip/run-mysql-driver.php create mode 100644 wp-includes/mysql/class-wp-mysql-lexer.php create mode 100644 wp-includes/mysql/class-wp-mysql-parser.php create mode 100644 wp-includes/mysql/class-wp-mysql-token.php create mode 100644 wp-includes/mysql/mysql-grammar.php create mode 100644 wp-includes/parser/class-wp-parser-grammar.php create mode 100644 wp-includes/parser/class-wp-parser-node.php create mode 100644 wp-includes/parser/class-wp-parser.php diff --git a/.gitattributes b/.gitattributes index ff14de0d..e1ddbbb2 100644 --- a/.gitattributes +++ b/.gitattributes @@ -5,5 +5,9 @@ composer.json export-ignore phpcs.xml.dist export-ignore phpunit.xml.dist export-ignore -tests/*.php export-ignore +/grammar-tools export-ignore +/tests export-ignore +/wip export-ignore +/wp-includes/mysql export-ignore +/wp-includes/parser export-ignore wp-includes/sqlite/class-wp-sqlite-crosscheck-db.php export-ignore diff --git a/composer.json b/composer.json index fb6b2899..dce2c54b 100644 --- a/composer.json +++ b/composer.json @@ -11,6 +11,7 @@ "php": ">=7.0" }, "require-dev": { + "ext-mbstring": "*", "dealerdirect/phpcodesniffer-composer-installer": "^0.7.0", "squizlabs/php_codesniffer": "^3.7", "wp-coding-standards/wpcs": "^3.1", diff --git a/grammar-tools/MySQLParser.g4 b/grammar-tools/MySQLParser.g4 new file mode 100644 index 00000000..b8fc8312 --- /dev/null +++ b/grammar-tools/MySQLParser.g4 @@ -0,0 +1,5709 @@ +/* + * Grammar from: https://github.com/mysql/mysql-workbench/blob/8.0.38/library/parsers/grammars/MySQLParser.g4 + * + * The grammar was manually fixed and factored. The original grammar is kept below in its entirety. + * The adjusted rules were kept in place, but commented out and redefined below with manual fixes. + */ + +parser grammar MySQLParser; + +/* + * Copyright (c) 2012, 2020, Oracle and/or its affiliates. All rights reserved. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2.0, + * as published by the Free Software Foundation. + * + * This program is designed to work with certain software (including + * but not limited to OpenSSL) that is licensed under separate terms, as + * designated in a particular file or component or in included license + * documentation. The authors of MySQL hereby grant you an additional + * permission to link the program and your derivative works with the + * separately licensed software that they have either included with + * the program or referenced in the documentation. + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License, version 2.0, for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* + * Merged in all changes up to mysql-trunk git revision [6d4f66a] (16. January 2020). + * + * MySQL grammar for ANTLR 4.5+ with language features from MySQL 5.6.0 up to MySQL 8.0. + * The server version in the generated parser can be switched at runtime, making it so possible + * to switch the supported feature set dynamically. + * + * The coverage of the MySQL language should be 100%, but there might still be bugs or omissions. + * + * To use this grammar you will need a few support classes (which should be close to where you found this grammar). + * These classes implement the target specific action code, so we don't clutter the grammar with that + * and make it simpler to adjust it for other targets. See the demo/test project for further details. + * + * Written by Mike Lischke. 
Direct all bug reports, omissions etc. to mike.lischke@oracle.com. + */ + +//---------------------------------------------------------------------------------------------------------------------- + +// $antlr-format alignTrailingComments on, columnLimit 130, minEmptyLines 1, maxEmptyLinesToKeep 1, reflowComments off +// $antlr-format useTab off, allowShortRulesOnASingleLine off, allowShortBlocksOnASingleLine on, alignSemicolons ownLine + +options { + superClass = MySQLBaseRecognizer; + tokenVocab = MySQLLexer; + exportMacro = PARSERS_PUBLIC_TYPE; +} + +//---------------------------------------------------------------------------------------------------------------------- + +@header {/* + * Copyright (c) 2018, 2020, Oracle and/or its affiliates. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2.0, + * as published by the Free Software Foundation. + * + * This program is designed to work with certain software (including + * but not limited to OpenSSL) that is licensed under separate terms, as + * designated in a particular file or component or in included license + * documentation. The authors of MySQL hereby grant you an additional + * permission to link the program and your derivative works with the + * separately licensed software that they have either included with + * the program or referenced in the documentation. + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License, version 2.0, for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +} + +@postinclude { +#include "MySQLBaseRecognizer.h" +} + +//---------------------------------------------------------------------------------------------------------------------- + +query: + EOF + | (simpleStatement | beginWork) (SEMICOLON_SYMBOL EOF? | EOF) +; + +simpleStatement: + // DDL + alterStatement + | createStatement + | dropStatement + | renameTableStatement + | truncateTableStatement + | {serverVersion >= 80000}? importStatement + + // DML + | callStatement + | deleteStatement + | doStatement + | handlerStatement + | insertStatement + | loadStatement + | replaceStatement + | selectStatement + | updateStatement + | transactionOrLockingStatement + | replicationStatement + | preparedStatement + + // Data Directory + | {serverVersion >= 80000}? cloneStatement + + // Database administration + | accountManagementStatement + | tableAdministrationStatement + | installUninstallStatment + | setStatement // SET PASSWORD is handled in accountManagementStatement. + | showStatement + | {serverVersion >= 80000}? resourceGroupManagement + | otherAdministrativeStatement + + // MySQL utilitity statements + | utilityStatement + | {serverVersion >= 50604}? getDiagnostics + | signalStatement + | resignalStatement +; + +//----------------- DDL statements ------------------------------------------------------------------------------------- + +alterStatement: + ALTER_SYMBOL ( + alterTable + | alterDatabase + | PROCEDURE_SYMBOL procedureRef routineAlterOptions? + | FUNCTION_SYMBOL functionRef routineAlterOptions? + | alterView + | alterEvent + | alterTablespace + | {serverVersion >= 80014}? alterUndoTablespace + | alterLogfileGroup + | alterServer + // ALTER USER is part of the user management rule. 
+ | alterInstance /* @FIX: Add support for "ALTER INSTANCE ..." statement. */ + ) +; + +/* + * @FIX: + * Add support for "ALTER INSTANCE ..." statement. + */ +alterInstance: + {serverVersion >= 50711}? INSTANCE_SYMBOL ( + ROTATE_SYMBOL (INNODB_SYMBOL | {serverVersion >= 80016}? BINLOG_SYMBOL) MASTER_SYMBOL KEY_SYMBOL + | {serverVersion >= 80016}? RELOAD_SYMBOL TLS_SYMBOL (NO_SYMBOL ROLLBACK_SYMBOL ON_SYMBOL ERROR_SYMBOL)? + | {serverVersion >= 80021}? RELOAD_SYMBOL TLS_SYMBOL FOR_SYMBOL CHANNEL_SYMBOL identifier (NO_SYMBOL ROLLBACK_SYMBOL ON_SYMBOL ERROR_SYMBOL)? + | {serverVersion >= 80021}? (ENABLE_SYMBOL | DISABLE_SYMBOL) INNODB_SYMBOL REDO_LOG_SYMBOL + | {serverVersion >= 80024}? RELOAD_SYMBOL KEYRING_SYMBOL + ) +; + +alterDatabase: + /* @FIX: Make "schemaRef" optional. */ + DATABASE_SYMBOL schemaRef? ( + createDatabaseOption+ + | {serverVersion < 80000}? UPGRADE_SYMBOL DATA_SYMBOL DIRECTORY_SYMBOL NAME_SYMBOL + ) +; + +alterEvent: + definerClause? EVENT_SYMBOL eventRef (ON_SYMBOL SCHEDULE_SYMBOL schedule)? ( + ON_SYMBOL COMPLETION_SYMBOL NOT_SYMBOL? PRESERVE_SYMBOL + )? (RENAME_SYMBOL TO_SYMBOL identifier)? ( + ENABLE_SYMBOL + | DISABLE_SYMBOL (ON_SYMBOL SLAVE_SYMBOL)? + )? (COMMENT_SYMBOL textLiteral)? (DO_SYMBOL compoundStatement)? +; + +alterLogfileGroup: + LOGFILE_SYMBOL GROUP_SYMBOL logfileGroupRef ADD_SYMBOL UNDOFILE_SYMBOL textLiteral alterLogfileGroupOptions? +; + +alterLogfileGroupOptions: + alterLogfileGroupOption (COMMA_SYMBOL? alterLogfileGroupOption)* +; + +alterLogfileGroupOption: + tsOptionInitialSize + | tsOptionEngine + | tsOptionWait +; + +alterServer: + SERVER_SYMBOL serverRef serverOptions +; + +alterTable: + onlineOption? ({serverVersion < 50700}? IGNORE_SYMBOL)? TABLE_SYMBOL tableRef alterTableActions? +; + +/*alterTableActions: + alterCommandList (partitionClause | removePartitioning)? + | partitionClause + | removePartitioning + | (alterCommandsModifierList COMMA_SYMBOL)? 
standaloneAlterCommands +;*/ + +/* + * @FIX: + * Fix "alterTableActions" to solve conflicts between "alterCommandsModifierList" and "alterCommandList". + */ +alterTableActions: + (alterCommandsModifierList COMMA_SYMBOL)? standaloneAlterCommands + | alterCommandList (partitionClause | removePartitioning)? + | partitionClause + | removePartitioning +; + +/*alterCommandList: + alterCommandsModifierList + | (alterCommandsModifierList COMMA_SYMBOL)? alterList +;*/ + +/* + * @FIX: + * Fix "alterCommandList" to solve conflicts between "alterCommandsModifierList" prefixes. + */ +alterCommandList: + alterCommandsModifierList (COMMA_SYMBOL alterList)? + | alterList +; + +alterCommandsModifierList: + alterCommandsModifier (COMMA_SYMBOL alterCommandsModifier)* +; + +standaloneAlterCommands: + DISCARD_SYMBOL TABLESPACE_SYMBOL + | IMPORT_SYMBOL TABLESPACE_SYMBOL + | alterPartition + | {serverVersion >= 80014}? (SECONDARY_LOAD_SYMBOL | SECONDARY_UNLOAD_SYMBOL) +; + +alterPartition: + ADD_SYMBOL PARTITION_SYMBOL noWriteToBinLog? ( + partitionDefinitions + | PARTITIONS_SYMBOL real_ulong_number + ) + | DROP_SYMBOL PARTITION_SYMBOL identifierList + | REBUILD_SYMBOL PARTITION_SYMBOL noWriteToBinLog? allOrPartitionNameList + + // yes, twice "no write to bin log". + | OPTIMIZE_SYMBOL PARTITION_SYMBOL noWriteToBinLog? allOrPartitionNameList noWriteToBinLog? + | ANALYZE_SYMBOL PARTITION_SYMBOL noWriteToBinLog? allOrPartitionNameList + | CHECK_SYMBOL PARTITION_SYMBOL allOrPartitionNameList checkOption* + | REPAIR_SYMBOL PARTITION_SYMBOL noWriteToBinLog? allOrPartitionNameList repairType* + | COALESCE_SYMBOL PARTITION_SYMBOL noWriteToBinLog? real_ulong_number + | TRUNCATE_SYMBOL PARTITION_SYMBOL allOrPartitionNameList + | REORGANIZE_SYMBOL PARTITION_SYMBOL noWriteToBinLog? ( + identifierList INTO_SYMBOL partitionDefinitions + )? + | EXCHANGE_SYMBOL PARTITION_SYMBOL identifier WITH_SYMBOL TABLE_SYMBOL tableRef withValidation? + | {serverVersion >= 50704}? 
DISCARD_SYMBOL PARTITION_SYMBOL allOrPartitionNameList TABLESPACE_SYMBOL + | {serverVersion >= 50704}? IMPORT_SYMBOL PARTITION_SYMBOL allOrPartitionNameList TABLESPACE_SYMBOL +; + +alterList: + (alterListItem | createTableOptionsSpaceSeparated) ( + COMMA_SYMBOL ( + alterListItem + | alterCommandsModifier + | createTableOptionsSpaceSeparated + ) + )* +; + +alterCommandsModifier: + alterAlgorithmOption + | alterLockOption + | withValidation +; + +alterListItem: + ADD_SYMBOL COLUMN_SYMBOL? ( + identifier fieldDefinition checkOrReferences? place? + | OPEN_PAR_SYMBOL tableElementList CLOSE_PAR_SYMBOL + ) + | ADD_SYMBOL tableConstraintDef + | CHANGE_SYMBOL COLUMN_SYMBOL? columnInternalRef identifier fieldDefinition place? + | MODIFY_SYMBOL COLUMN_SYMBOL? columnInternalRef fieldDefinition place? + | DROP_SYMBOL ( + COLUMN_SYMBOL? columnInternalRef restrict? + | FOREIGN_SYMBOL KEY_SYMBOL ( + // This part is no longer optional starting with 5.7. + {serverVersion >= 50700}? columnInternalRef + | {serverVersion < 50700}? columnInternalRef? + ) + | PRIMARY_SYMBOL KEY_SYMBOL + | keyOrIndex indexRef + | {serverVersion >= 80017}? CHECK_SYMBOL identifier + | {serverVersion >= 80019}? CONSTRAINT_SYMBOL identifier + ) + | DISABLE_SYMBOL KEYS_SYMBOL + | ENABLE_SYMBOL KEYS_SYMBOL + | ALTER_SYMBOL COLUMN_SYMBOL? columnInternalRef ( + SET_SYMBOL DEFAULT_SYMBOL ( + {serverVersion >= 80014}? exprWithParentheses + | signedLiteral + ) + | DROP_SYMBOL DEFAULT_SYMBOL + | {serverVersion >= 80023}? SET_SYMBOL visibility /* @FIX: Add missing SET VISIBLE/INVISIBLE clause. */ + ) + | {serverVersion >= 80000}? ALTER_SYMBOL INDEX_SYMBOL indexRef visibility + | {serverVersion >= 80017}? ALTER_SYMBOL CHECK_SYMBOL identifier constraintEnforcement + | {serverVersion >= 80019}? ALTER_SYMBOL CONSTRAINT_SYMBOL identifier constraintEnforcement + | {serverVersion >= 80000}? RENAME_SYMBOL COLUMN_SYMBOL columnInternalRef TO_SYMBOL identifier + | RENAME_SYMBOL (TO_SYMBOL | AS_SYMBOL)? 
tableName + | {serverVersion >= 50700}? RENAME_SYMBOL keyOrIndex indexRef TO_SYMBOL indexName + | CONVERT_SYMBOL TO_SYMBOL charset ( + {serverVersion >= 80014}? DEFAULT_SYMBOL + | charsetName + ) collate? + | FORCE_SYMBOL + | ORDER_SYMBOL BY_SYMBOL alterOrderList + | {serverVersion >= 50708 && serverVersion < 80000}? UPGRADE_SYMBOL PARTITIONING_SYMBOL +; + +place: + AFTER_SYMBOL identifier + | FIRST_SYMBOL +; + +restrict: + RESTRICT_SYMBOL + | CASCADE_SYMBOL +; + +/*alterOrderList: + identifier direction? (COMMA_SYMBOL identifier direction?)* +;*/ + +/* + * @FIX: + * Fix ALTER TABLE with ORDER to use 'qualifiedIdentifier' instead of just 'identifier'. + * This is necessary to support "t.id" in a query like "ALTER TABLE t ORDER BY t.id". + */ +alterOrderList: + qualifiedIdentifier direction? (COMMA_SYMBOL qualifiedIdentifier direction?)* +; + +alterAlgorithmOption: + ALGORITHM_SYMBOL EQUAL_OPERATOR? (DEFAULT_SYMBOL | identifier) +; + +alterLockOption: + LOCK_SYMBOL EQUAL_OPERATOR? (DEFAULT_SYMBOL | identifier) +; + +indexLockAndAlgorithm: + alterAlgorithmOption alterLockOption? + | alterLockOption alterAlgorithmOption? +; + +withValidation: + {serverVersion >= 50706}? (WITH_SYMBOL | WITHOUT_SYMBOL) VALIDATION_SYMBOL +; + +removePartitioning: + REMOVE_SYMBOL PARTITIONING_SYMBOL +; + +allOrPartitionNameList: + ALL_SYMBOL + | identifierList +; + +alterTablespace: + TABLESPACE_SYMBOL tablespaceRef ( + (ADD_SYMBOL | DROP_SYMBOL) DATAFILE_SYMBOL textLiteral alterTablespaceOptions? + | {serverVersion < 80000}? ( + | CHANGE_SYMBOL DATAFILE_SYMBOL textLiteral ( + changeTablespaceOption (COMMA_SYMBOL? changeTablespaceOption)* + )? + | (READ_ONLY_SYMBOL | READ_WRITE_SYMBOL) + | NOT_SYMBOL ACCESSIBLE_SYMBOL + ) + | RENAME_SYMBOL TO_SYMBOL identifier + | {serverVersion >= 80014}? alterTablespaceOptions + ) +; + +alterUndoTablespace: + UNDO_SYMBOL TABLESPACE_SYMBOL tablespaceRef SET_SYMBOL ( + ACTIVE_SYMBOL + | INACTIVE_SYMBOL + ) undoTableSpaceOptions? 
+; + +undoTableSpaceOptions: + undoTableSpaceOption (COMMA_SYMBOL? undoTableSpaceOption)* +; + +undoTableSpaceOption: + tsOptionEngine +; + +alterTablespaceOptions: + alterTablespaceOption (COMMA_SYMBOL? alterTablespaceOption)* +; + +alterTablespaceOption: + INITIAL_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber + | tsOptionAutoextendSize + | tsOptionMaxSize + | tsOptionEngine + | {serverVersion >= 80021}? tsOptionEngineAttribute /* @FIX: Add missing "ENGINE_ATTRIBUTE" option. */ + | tsOptionWait + | tsOptionEncryption +; + +changeTablespaceOption: + INITIAL_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber + | tsOptionAutoextendSize + | tsOptionMaxSize +; + +alterView: + viewAlgorithm? definerClause? viewSuid? VIEW_SYMBOL viewRef viewTail +; + +// This is not the full view_tail from sql_yacc.yy as we have either a view name or a view reference, +// depending on whether we come from createView or alterView. Everything until this difference is duplicated in those rules. +viewTail: + columnInternalRefList? AS_SYMBOL viewSelect +; + +viewSelect: + queryExpressionOrParens viewCheckOption? +; + +viewCheckOption: + WITH_SYMBOL (CASCADED_SYMBOL | LOCAL_SYMBOL)? CHECK_SYMBOL OPTION_SYMBOL +; + +//---------------------------------------------------------------------------------------------------------------------- + +createStatement: + CREATE_SYMBOL ( + createDatabase + | createTable + | createFunction + | createProcedure + | createUdf + | createLogfileGroup + | createView + | createTrigger + | createIndex + | createServer + | createTablespace + | createEvent + | {serverVersion >= 80000}? createRole + | {serverVersion >= 80011}? createSpatialReference + | {serverVersion >= 80014}? createUndoTablespace + ) +; + +createDatabase: + DATABASE_SYMBOL ifNotExists? schemaName createDatabaseOption* +; + +createDatabaseOption: + defaultCharset + | defaultCollation + | {serverVersion >= 80016}? defaultEncryption +; + +/*createTable: + TEMPORARY_SYMBOL? TABLE_SYMBOL ifNotExists? 
tableName ( + (OPEN_PAR_SYMBOL tableElementList CLOSE_PAR_SYMBOL)? createTableOptions? partitionClause? duplicateAsQueryExpression? + | LIKE_SYMBOL tableRef + | OPEN_PAR_SYMBOL LIKE_SYMBOL tableRef CLOSE_PAR_SYMBOL + ) +;*/ + +/* + * @FIX: + * Fix "createTable" to solve support "LIKE tableRef" and "LIKE (tableRef)". + * They need to come before "tableElementList" to avoid misinterpreting "LIKE". + */ +createTable: + TEMPORARY_SYMBOL? TABLE_SYMBOL ifNotExists? tableName ( + LIKE_SYMBOL tableRef + | OPEN_PAR_SYMBOL LIKE_SYMBOL tableRef CLOSE_PAR_SYMBOL + | (OPEN_PAR_SYMBOL tableElementList CLOSE_PAR_SYMBOL)? createTableOptions? partitionClause? duplicateAsQueryExpression? + ) +; + +tableElementList: + tableElement (COMMA_SYMBOL tableElement)* +; + +tableElement: + columnDefinition + | tableConstraintDef +; + +duplicateAsQueryExpression: (REPLACE_SYMBOL | IGNORE_SYMBOL)? AS_SYMBOL? queryExpressionOrParens +; + +queryExpressionOrParens: + queryExpression + | queryExpressionParens +; + +createRoutine: // Rule for external use only. + CREATE_SYMBOL (createProcedure | createFunction | createUdf) SEMICOLON_SYMBOL? EOF +; + +/* + * @FIX: + * Add missing "ifNotExists?". + */ +createProcedure: + definerClause? PROCEDURE_SYMBOL ({serverVersion >= 80029}? ifNotExists?) procedureName OPEN_PAR_SYMBOL ( + procedureParameter (COMMA_SYMBOL procedureParameter)* + )? CLOSE_PAR_SYMBOL routineCreateOption* compoundStatement +; + +/* + * @FIX: + * Add missing "ifNotExists?". + */ +createFunction: + definerClause? FUNCTION_SYMBOL ({serverVersion >= 80029}? ifNotExists?) functionName OPEN_PAR_SYMBOL ( + functionParameter (COMMA_SYMBOL functionParameter)* + )? CLOSE_PAR_SYMBOL RETURNS_SYMBOL typeWithOptCollate routineCreateOption* compoundStatement +; + +createUdf: + AGGREGATE_SYMBOL? FUNCTION_SYMBOL udfName RETURNS_SYMBOL type = ( + STRING_SYMBOL + | INT_SYMBOL + | REAL_SYMBOL + | DECIMAL_SYMBOL + ) SONAME_SYMBOL textLiteral +; + +routineCreateOption: + routineOption + | NOT_SYMBOL? 
DETERMINISTIC_SYMBOL +; + +routineAlterOptions: + routineCreateOption+ +; + +routineOption: + option = COMMENT_SYMBOL textLiteral + | option = LANGUAGE_SYMBOL SQL_SYMBOL + | option = NO_SYMBOL SQL_SYMBOL + | option = CONTAINS_SYMBOL SQL_SYMBOL + | option = READS_SYMBOL SQL_SYMBOL DATA_SYMBOL + | option = MODIFIES_SYMBOL SQL_SYMBOL DATA_SYMBOL + | option = SQL_SYMBOL SECURITY_SYMBOL security = ( + DEFINER_SYMBOL + | INVOKER_SYMBOL + ) +; + +createIndex: + onlineOption? ( + UNIQUE_SYMBOL? type = INDEX_SYMBOL ( + {serverVersion >= 80014}? indexName indexTypeClause? + | indexNameAndType? + ) createIndexTarget indexOption* + | type = FULLTEXT_SYMBOL INDEX_SYMBOL indexName createIndexTarget fulltextIndexOption* + | type = SPATIAL_SYMBOL INDEX_SYMBOL indexName createIndexTarget spatialIndexOption* + ) indexLockAndAlgorithm? +; + +/* + The syntax for defining an index is: + + ... INDEX [index_name] [USING|TYPE] ... + + The problem is that whereas USING is a reserved word, TYPE is not. We can + still handle it if an index name is supplied, i.e.: + + ... INDEX type TYPE ... + + here the index's name is unmbiguously 'type', but for this: + + ... INDEX TYPE ... + + it's impossible to know what this actually mean - is 'type' the name or the + type? For this reason we accept the TYPE syntax only if a name is supplied. +*/ +/*indexNameAndType: + indexName (USING_SYMBOL indexType)? + | indexName TYPE_SYMBOL indexType +;*/ + +/* + * @FIX: + * Fix "indexNameAndType" to solve conflicts between "indexName USING_SYMBOL" + * and "indexName TYPE_SYMBOL" prefix by moving them to a single branch. + */ +indexNameAndType: + indexName? ((USING_SYMBOL | TYPE_SYMBOL) indexType)? +; + +createIndexTarget: + ON_SYMBOL tableRef keyListVariants +; + +createLogfileGroup: + LOGFILE_SYMBOL GROUP_SYMBOL logfileGroupName ADD_SYMBOL ( + UNDOFILE_SYMBOL + | REDOFILE_SYMBOL // No longer used from 8.0 onwards. Taken out by lexer. + ) textLiteral logfileGroupOptions? 
+; + +logfileGroupOptions: + logfileGroupOption (COMMA_SYMBOL? logfileGroupOption)* +; + +logfileGroupOption: + tsOptionInitialSize + | tsOptionUndoRedoBufferSize + | tsOptionNodegroup + | tsOptionEngine + | tsOptionWait + | tsOptionComment +; + +createServer: + SERVER_SYMBOL serverName FOREIGN_SYMBOL DATA_SYMBOL WRAPPER_SYMBOL textOrIdentifier serverOptions +; + +serverOptions: + OPTIONS_SYMBOL OPEN_PAR_SYMBOL serverOption (COMMA_SYMBOL serverOption)* CLOSE_PAR_SYMBOL +; + +// Options for CREATE/ALTER SERVER, used for the federated storage engine. +serverOption: + option = HOST_SYMBOL textLiteral + | option = DATABASE_SYMBOL textLiteral + | option = USER_SYMBOL textLiteral + | option = PASSWORD_SYMBOL textLiteral + | option = SOCKET_SYMBOL textLiteral + | option = OWNER_SYMBOL textLiteral + | option = PORT_SYMBOL ulong_number +; + +createTablespace: + TABLESPACE_SYMBOL tablespaceName tsDataFileName ( + USE_SYMBOL LOGFILE_SYMBOL GROUP_SYMBOL logfileGroupRef + )? tablespaceOptions? +; + +createUndoTablespace: + UNDO_SYMBOL TABLESPACE_SYMBOL tablespaceName ADD_SYMBOL tsDataFile undoTableSpaceOptions? +; + +tsDataFileName: + {serverVersion >= 80014}? (ADD_SYMBOL tsDataFile)? + | ADD_SYMBOL tsDataFile +; + +tsDataFile: + DATAFILE_SYMBOL textLiteral +; + +tablespaceOptions: + tablespaceOption (COMMA_SYMBOL? tablespaceOption)* +; + +tablespaceOption: + tsOptionInitialSize + | tsOptionAutoextendSize + | tsOptionMaxSize + | tsOptionExtentSize + | tsOptionNodegroup + | tsOptionEngine + | {serverVersion >= 80021}? tsOptionEngineAttribute /* @FIX: Add missing "ENGINE_ATTRIBUTE" option. */ + | tsOptionWait + | tsOptionComment + | {serverVersion >= 50707}? tsOptionFileblockSize + | {serverVersion >= 80014}? tsOptionEncryption +; + +tsOptionInitialSize: + INITIAL_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber +; + +tsOptionUndoRedoBufferSize: + (UNDO_BUFFER_SIZE_SYMBOL | REDO_BUFFER_SIZE_SYMBOL) EQUAL_OPERATOR? 
sizeNumber +; + +tsOptionAutoextendSize: + AUTOEXTEND_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber +; + +tsOptionMaxSize: + MAX_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber +; + +tsOptionExtentSize: + EXTENT_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber +; + +tsOptionNodegroup: + NODEGROUP_SYMBOL EQUAL_OPERATOR? real_ulong_number +; + +tsOptionEngine: + STORAGE_SYMBOL? ENGINE_SYMBOL EQUAL_OPERATOR? engineRef +; + +/* + * @FIX: + * Add missing "ENGINE_ATTRIBUTE" option. + */ +tsOptionEngineAttribute: + ENGINE_ATTRIBUTE_SYMBOL EQUAL_OPERATOR? textStringLiteral +; + +tsOptionWait: (WAIT_SYMBOL | NO_WAIT_SYMBOL) +; + +tsOptionComment: + COMMENT_SYMBOL EQUAL_OPERATOR? textLiteral +; + +tsOptionFileblockSize: + FILE_BLOCK_SIZE_SYMBOL EQUAL_OPERATOR? sizeNumber +; + +tsOptionEncryption: + ENCRYPTION_SYMBOL EQUAL_OPERATOR? textStringLiteral +; + +createView: + viewReplaceOrAlgorithm? definerClause? viewSuid? VIEW_SYMBOL viewName viewTail +; + +viewReplaceOrAlgorithm: + OR_SYMBOL REPLACE_SYMBOL viewAlgorithm? + | viewAlgorithm +; + +viewAlgorithm: + ALGORITHM_SYMBOL EQUAL_OPERATOR algorithm = ( + UNDEFINED_SYMBOL + | MERGE_SYMBOL + | TEMPTABLE_SYMBOL + ) +; + +viewSuid: + SQL_SYMBOL SECURITY_SYMBOL (DEFINER_SYMBOL | INVOKER_SYMBOL) +; + +/* + * @FIX: + * Add missing "ifNotExists?". + */ +createTrigger: + definerClause? TRIGGER_SYMBOL ({serverVersion >= 80029}? ifNotExists?) triggerName timing = (BEFORE_SYMBOL | AFTER_SYMBOL) event = ( + INSERT_SYMBOL + | UPDATE_SYMBOL + | DELETE_SYMBOL + ) ON_SYMBOL tableRef FOR_SYMBOL EACH_SYMBOL ROW_SYMBOL triggerFollowsPrecedesClause? compoundStatement +; + +triggerFollowsPrecedesClause: + {serverVersion >= 50700}? ordering = (FOLLOWS_SYMBOL | PRECEDES_SYMBOL) textOrIdentifier // not a trigger reference! +; + +createEvent: + definerClause? EVENT_SYMBOL ifNotExists? eventName ON_SYMBOL SCHEDULE_SYMBOL schedule ( + ON_SYMBOL COMPLETION_SYMBOL NOT_SYMBOL? PRESERVE_SYMBOL + )? (ENABLE_SYMBOL | DISABLE_SYMBOL (ON_SYMBOL SLAVE_SYMBOL)?)? 
( + COMMENT_SYMBOL textLiteral + )? DO_SYMBOL compoundStatement +; + +createRole: + // The server grammar has a clear_privileges rule here, which is only used to clear internal state. + ROLE_SYMBOL ifNotExists? roleList +; + +createSpatialReference: + OR_SYMBOL REPLACE_SYMBOL SPATIAL_SYMBOL REFERENCE_SYMBOL SYSTEM_SYMBOL real_ulonglong_number srsAttribute* + | SPATIAL_SYMBOL REFERENCE_SYMBOL SYSTEM_SYMBOL ifNotExists? real_ulonglong_number srsAttribute* +; + +srsAttribute: + NAME_SYMBOL TEXT_SYMBOL textStringNoLinebreak + | DEFINITION_SYMBOL TEXT_SYMBOL textStringNoLinebreak + | ORGANIZATION_SYMBOL textStringNoLinebreak IDENTIFIED_SYMBOL BY_SYMBOL real_ulonglong_number + | DESCRIPTION_SYMBOL TEXT_SYMBOL textStringNoLinebreak +; + +//---------------------------------------------------------------------------------------------------------------------- + +dropStatement: + DROP_SYMBOL ( + dropDatabase + | dropEvent + | dropFunction + | dropProcedure + | dropIndex + | dropLogfileGroup + | dropServer + | dropTable + | dropTableSpace + | dropTrigger + | dropView + | {serverVersion >= 80000}? dropRole + | {serverVersion >= 80011}? dropSpatialReference + | {serverVersion >= 80014}? dropUndoTablespace + ) +; + +dropDatabase: + DATABASE_SYMBOL ifExists? schemaRef +; + +dropEvent: + EVENT_SYMBOL ifExists? eventRef +; + +dropFunction: + FUNCTION_SYMBOL ifExists? functionRef // Including UDFs. +; + +dropProcedure: + PROCEDURE_SYMBOL ifExists? procedureRef +; + +dropIndex: + onlineOption? type = INDEX_SYMBOL indexRef ON_SYMBOL tableRef indexLockAndAlgorithm? +; + +dropLogfileGroup: + LOGFILE_SYMBOL GROUP_SYMBOL logfileGroupRef ( + dropLogfileGroupOption (COMMA_SYMBOL? dropLogfileGroupOption)* + )? +; + +dropLogfileGroupOption: + tsOptionWait + | tsOptionEngine +; + +dropServer: + SERVER_SYMBOL ifExists? serverRef +; + +dropTable: + TEMPORARY_SYMBOL? type = (TABLE_SYMBOL | TABLES_SYMBOL) ifExists? tableRefList ( + RESTRICT_SYMBOL + | CASCADE_SYMBOL + )? 
+; + +dropTableSpace: + TABLESPACE_SYMBOL tablespaceRef ( + dropLogfileGroupOption (COMMA_SYMBOL? dropLogfileGroupOption)* + )? +; + +dropTrigger: + TRIGGER_SYMBOL ifExists? triggerRef +; + +dropView: + VIEW_SYMBOL ifExists? viewRefList (RESTRICT_SYMBOL | CASCADE_SYMBOL)? +; + +dropRole: + ROLE_SYMBOL ifExists? roleList +; + +dropSpatialReference: + SPATIAL_SYMBOL REFERENCE_SYMBOL SYSTEM_SYMBOL ifExists? real_ulonglong_number +; + +dropUndoTablespace: + UNDO_SYMBOL TABLESPACE_SYMBOL tablespaceRef undoTableSpaceOptions? +; + +//---------------------------------------------------------------------------------------------------------------------- + +renameTableStatement: + RENAME_SYMBOL (TABLE_SYMBOL | TABLES_SYMBOL) renamePair (COMMA_SYMBOL renamePair)* +; + +renamePair: + tableRef TO_SYMBOL tableName +; + +//---------------------------------------------------------------------------------------------------------------------- + +truncateTableStatement: + TRUNCATE_SYMBOL TABLE_SYMBOL? tableRef +; + +//---------------------------------------------------------------------------------------------------------------------- + +importStatement: + IMPORT_SYMBOL TABLE_SYMBOL FROM_SYMBOL textStringLiteralList +; + +//--------------- DML statements --------------------------------------------------------------------------------------- + +callStatement: + CALL_SYMBOL procedureRef (OPEN_PAR_SYMBOL exprList? CLOSE_PAR_SYMBOL)? +; + +deleteStatement: + ({serverVersion >= 80000}? withClause)? DELETE_SYMBOL deleteStatementOption* ( + FROM_SYMBOL ( + tableAliasRefList USING_SYMBOL tableReferenceList whereClause? // Multi table variant 1. + | tableRef ({serverVersion >= 80017}? tableAlias)? partitionDelete? + whereClause? orderClause? simpleLimitClause? // Single table delete. + ) + | tableAliasRefList FROM_SYMBOL tableReferenceList whereClause? // Multi table variant 2. + ) +; + +partitionDelete: + {serverVersion >= 50602}? 
PARTITION_SYMBOL OPEN_PAR_SYMBOL identifierList CLOSE_PAR_SYMBOL +; + +deleteStatementOption: // opt_delete_option in sql_yacc.yy, but the name collides with another rule (delete_options). + QUICK_SYMBOL + | LOW_PRIORITY_SYMBOL + | QUICK_SYMBOL + | IGNORE_SYMBOL +; + +/*doStatement: + DO_SYMBOL ( + {serverVersion < 50709}? exprList + | {serverVersion >= 50709}? selectItemList + ) +;*/ + +/* + * @FIX: + * Reorder "selectItemList" and "exprList" so that "selectItemList" is tried first, as we don't handle versions yet. + */ +doStatement: + DO_SYMBOL ( + {serverVersion >= 50709}? selectItemList + | {serverVersion < 50709}? exprList + ) +; + +handlerStatement: + HANDLER_SYMBOL ( + tableRef OPEN_SYMBOL tableAlias? + | identifier ( + CLOSE_SYMBOL + | READ_SYMBOL handlerReadOrScan whereClause? limitClause? + ) + ) +; + +handlerReadOrScan: + (FIRST_SYMBOL | NEXT_SYMBOL) // Scan function. + | identifier ( + // The rkey part. + (FIRST_SYMBOL | NEXT_SYMBOL | PREV_SYMBOL | LAST_SYMBOL) + | ( + EQUAL_OPERATOR + | LESS_THAN_OPERATOR + | GREATER_THAN_OPERATOR + | LESS_OR_EQUAL_OPERATOR + | GREATER_OR_EQUAL_OPERATOR + ) OPEN_PAR_SYMBOL values CLOSE_PAR_SYMBOL + ) +; + +//---------------------------------------------------------------------------------------------------------------------- + +insertStatement: + INSERT_SYMBOL insertLockOption? IGNORE_SYMBOL? INTO_SYMBOL? tableRef usePartition? ( + insertFromConstructor ({ serverVersion >= 80018}? valuesReference)? + | SET_SYMBOL updateList ({ serverVersion >= 80018}? valuesReference)? + | insertQueryExpression + ) insertUpdateList? +; + +insertLockOption: + LOW_PRIORITY_SYMBOL + | DELAYED_SYMBOL // Only allowed if no select is used. Check in the semantic phase. + | HIGH_PRIORITY_SYMBOL +; + +insertFromConstructor: + (OPEN_PAR_SYMBOL fields? CLOSE_PAR_SYMBOL)?
insertValues +; + +fields: + insertIdentifier (COMMA_SYMBOL insertIdentifier)* +; + +insertValues: + (VALUES_SYMBOL | VALUE_SYMBOL) valueList +; + +insertQueryExpression: + queryExpressionOrParens + | OPEN_PAR_SYMBOL fields? CLOSE_PAR_SYMBOL queryExpressionOrParens +; + +valueList: + OPEN_PAR_SYMBOL values? CLOSE_PAR_SYMBOL ( + COMMA_SYMBOL OPEN_PAR_SYMBOL values? CLOSE_PAR_SYMBOL + )* +; + +values: + (expr | DEFAULT_SYMBOL) (COMMA_SYMBOL (expr | DEFAULT_SYMBOL))* +; + +valuesReference: + AS_SYMBOL identifier columnInternalRefList? +; + +insertUpdateList: + ON_SYMBOL DUPLICATE_SYMBOL KEY_SYMBOL UPDATE_SYMBOL updateList +; + +//---------------------------------------------------------------------------------------------------------------------- + +loadStatement: + LOAD_SYMBOL dataOrXml (LOW_PRIORITY_SYMBOL | CONCURRENT_SYMBOL)? LOCAL_SYMBOL? INFILE_SYMBOL textLiteral ( + REPLACE_SYMBOL + | IGNORE_SYMBOL + )? INTO_SYMBOL TABLE_SYMBOL tableRef usePartition? charsetClause? xmlRowsIdentifiedBy? fieldsClause? linesClause? + loadDataFileTail +; + +dataOrXml: + DATA_SYMBOL + | XML_SYMBOL +; + +xmlRowsIdentifiedBy: + ROWS_SYMBOL IDENTIFIED_SYMBOL BY_SYMBOL textString +; + +loadDataFileTail: + (IGNORE_SYMBOL INT_NUMBER (LINES_SYMBOL | ROWS_SYMBOL))? loadDataFileTargetList? ( + SET_SYMBOL updateList + )? +; + +loadDataFileTargetList: + OPEN_PAR_SYMBOL fieldOrVariableList? CLOSE_PAR_SYMBOL +; + +fieldOrVariableList: + (columnRef | userVariable) (COMMA_SYMBOL (columnRef | userVariable))* +; + +//---------------------------------------------------------------------------------------------------------------------- + +replaceStatement: + REPLACE_SYMBOL (LOW_PRIORITY_SYMBOL | DELAYED_SYMBOL)? INTO_SYMBOL? tableRef usePartition? 
( + insertFromConstructor + | SET_SYMBOL updateList + | insertQueryExpression + ) +; + +//---------------------------------------------------------------------------------------------------------------------- + +/*selectStatement: + queryExpression lockingClauseList? + | queryExpressionParens + | selectStatementWithInto +;*/ + +/* + * @FIX: + * Fix "selectStatement" to solve conflicts between "queryExpressionParens" and "selectStatementWithInto". + * Since "queryExpression" already contains "queryExpressionParens" as a subrule, we can remove it here. + */ +selectStatement: + queryExpression lockingClauseList? + | selectStatementWithInto +; + +/* + From the server grammar: + + MySQL has a syntax extension that allows into clauses in any one of two + places. They may appear either before the from clause or at the end. All in + a top-level select statement. This extends the standard syntax in two + ways. First, we don't have the restriction that the result can contain only + one row: the into clause might be INTO OUTFILE/DUMPFILE in which case any + number of rows is allowed. Hence MySQL does not have any special case for + the standard's <select statement: single row>. Secondly, and this has more + severe implications for the parser, it makes the grammar ambiguous, because + in a from-clause-less select statement with an into clause, it is not clear + whether the into clause is the leading or the trailing one. + + While it's possible to write an unambiguous grammar, it would force us to + duplicate the entire <select statement> syntax all the way down to the <into + clause>. So instead we solve it by writing an ambiguous grammar and use + precedence rules to sort out the shift/reduce conflict. + + The problem is when the parser has seen SELECT