apache · xiaokang · Jan 25, 2025 · Jan 25, 2025 · morrySnow · Feb 6, 2025
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/multi-match-any.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/multi-match-any.md
@@ -24,31 +24,41 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## multi_match_any
-### Description
-#### Syntax
+## Description
 
-`TINYINT multi_match_any(VARCHAR haystack, ARRAY<VARCHAR> patterns)`
+Returns whether the string matches any of the given regular expressions.
 
+## Syntax
 
-Checks whether the string `haystack` matches the regular expressions `patterns` in re2 syntax. returns 0 if none of the regular expressions are matched and 1 if any of the patterns matches.
+```sql
+MULTI_MATCH_ANY(<haystack>, <patterns>)
+```
 
-### example
+## Parameters
 
-```
-mysql> select multi_match_any('Hello, World!', ['hello', '!', 'world']);
+| Parameter | Description |
+| -- | -- |
+| `<haystack>` | The string to be checked |
+| `<patterns>` | Array of regular expressions |
+
+## Return Value
+
+Returns 1 if the string `<haystack>` matches any of the regular expressions in the `<patterns>` array, otherwise returns 0.
+
+## Examples
+
+```sql
+SELECT multi_match_any('Hello, World!', ['hello', '!', 'world']);
 +-----------------------------------------------------------+
 | multi_match_any('Hello, World!', ['hello', '!', 'world']) |
 +-----------------------------------------------------------+
 | 1                                                         |
 +-----------------------------------------------------------+
 
-mysql> select multi_match_any('abc', ['A', 'bcd']);
+mysql> SELECT multi_match_any('abc', ['A', 'bcd']);
 +--------------------------------------+
 | multi_match_any('abc', ['A', 'bcd']) |
 +--------------------------------------+
 | 0                                    |
 +--------------------------------------+
 ```
-### keywords
-    MULTI_MATCH,MATCH,ANY
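Note on the semantics documented above: the return value can be modeled with a short Python sketch. This is only an illustration, not the Doris implementation. The helper name `multi_match_any` simply mirrors the SQL function, and Python's `re` module stands in for the regex engine Doris uses, whose syntax may differ for more complex patterns.

```python
import re


def multi_match_any(haystack: str, patterns: list[str]) -> int:
    """Return 1 if any pattern matches somewhere in haystack, else 0."""
    # re.search finds a match anywhere in the string; matching is case-sensitive.
    return int(any(re.search(p, haystack) for p in patterns))


print(multi_match_any('Hello, World!', ['hello', '!', 'world']))  # 1 ('!' matches)
print(multi_match_any('abc', ['A', 'bcd']))                       # 0 (nothing matches)
```

Both calls reproduce the results shown in the documented examples.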
diff --git a/...l/sql-functions/scalar-functions/string-functions/multi-search-all-positions.md b/...l/sql-functions/scalar-functions/string-functions/multi-search-all-positions.md
@@ -24,31 +24,41 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## multi_search_all_positions
-### Description
-#### Syntax
+## Description
 
-`ARRAY<INT> multi_search_all_positions(VARCHAR haystack, ARRAY<VARCHAR> needles)`
+Returns the position of the first occurrence of each of a set of regular expressions in a string.
 
-Returns an `ARRAY` where the `i`-th element is the position of the `i`-th element in `needles`(i.e. `needle`)'s **first** occurrence in the string `haystack`. Positions are counted from 1, with 0 meaning the element was not found. **Case-sensitive**.
-
-### example
+## Syntax
 
+```sql
+ARRAY<INT> multi_search_all_positions(VARCHAR haystack, ARRAY<VARCHAR> patterns)
 ```
-mysql> select multi_search_all_positions('Hello, World!', ['hello', '!', 'world']);
+
+## Parameters
+
+| Parameter | Description |
+| -- | -- |
+| `haystack` | The string to be checked |
+| `patterns` | Array of regular expressions |
+
+## Return Value
+
+Returns an `ARRAY` where the `i`-th element represents the position of the first occurrence of the `i`-th element (regular expression) in the `patterns` array within the string `haystack`. Positions are counted starting from 1, and 0 indicates that the element was not found.
+
+## Examples
+
+```sql
+mysql> SELECT multi_search_all_positions('Hello, World!', ['hello', '!', 'world']);
 +----------------------------------------------------------------------+
 | multi_search_all_positions('Hello, World!', ['hello', '!', 'world']) |
 +----------------------------------------------------------------------+
-| [0,13,0]                                                             |
+| [0, 13, 0]                                                           |
 +----------------------------------------------------------------------+
 
-select multi_search_all_positions("Hello, World!", ['hello', '!', 'world', 'Hello', 'World']);
+mysql> SELECT multi_search_all_positions("Hello, World!", ['hello', '!', 'world', 'Hello', 'World']);
 +---------------------------------------------------------------------------------------------+
 | multi_search_all_positions('Hello, World!', ARRAY('hello', '!', 'world', 'Hello', 'World')) |
 +---------------------------------------------------------------------------------------------+
 | [0, 13, 0, 1, 8]                                                                            |
 +---------------------------------------------------------------------------------------------+
 ```
-
-### keywords
-    MULTI_SEARCH,SEARCH,POSITIONS
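Note on the return value documented above: the 1-based positions, with 0 for "not found", can be sketched in Python. This follows the updated description, which treats each array element as a regular expression. It is an illustration under assumptions, not the Doris implementation: Python's `re` stands in for the actual regex engine, and the input is assumed to be ASCII so that character and byte positions coincide.

```python
import re


def multi_search_all_positions(haystack: str, patterns: list[str]) -> list[int]:
    """For each pattern, return the 1-based position of its first match, or 0."""
    positions = []
    for p in patterns:
        m = re.search(p, haystack)
        # m.start() is 0-based, so add 1; 0 is reserved for "not found".
        positions.append(m.start() + 1 if m else 0)
    return positions


print(multi_search_all_positions(
    'Hello, World!', ['hello', '!', 'world', 'Hello', 'World']))
# [0, 13, 0, 1, 8]
```

The lowercase patterns yield 0 because matching is case-sensitive, which agrees with the documented output.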
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/ngram-search.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/ngram-search.md
@@ -26,42 +26,52 @@ under the License.
 
 ## Description
 
-Calculate the N-gram similarity between `text` and `pattern`. The similarity ranges from 0 to 1, where a higher similarity indicates greater similarity between the two strings. 
+Calculates the N-gram similarity between two strings.
 
-Both `pattern` and `gram_num` must be constants. If the length of either `text` or `pattern` is less than `gram_num`, return 0.
+N-gram similarity is a method for measuring text similarity based on N-grams. It ranges from 0 to 1, where a higher value indicates greater similarity between the two strings.
 
-N-gram similarity is a method for calculating text similarity based on N-grams. An N-gram is a set of continuous N characters or words extracted from a text string. For example, for the string "text" with N=2 (bigram), the bigrams are: {"te", "ex", "xt"}.
+An N-gram is a contiguous sequence of N characters or words from a text. For example, for the string "text" with N=2, the bigrams are: {"te", "ex", "xt"}.
 
-The N-gram similarity is calculated as:
+The N-gram similarity is calculated as:  
+**2 * |Intersection| / (|haystack set| + |pattern set|)**  
 
-2 * |Intersection| / (|text set| + |pattern set|)
+Where |haystack set| and |pattern set| are the sizes of the N-gram sets of `haystack` and `pattern`, respectively, and |Intersection| is the size of the intersection of the two sets.
 
-where |text set| and |pattern set| are the N-grams of `text` and `pattern`, and `Intersection` is the intersection of the two sets.
+Note that, by definition, a similarity of 1 does not necessarily mean the two strings are identical.
 
-Note that, by definition, a similarity of 1 does not necessarily mean the two strings are identical.
+## Syntax
+
+```sql
+DOUBLE ngram_search(VARCHAR haystack, VARCHAR pattern, INT gram_num)
+```
 
-Only supports ASCII encoding.
+## Parameters
 
-## Syntax
+| Parameter | Description |
+| -- | -- |
+| `haystack` | The string to be checked, supports only ASCII encoding |
+| `pattern`  | The string used for similarity comparison, must be a constant, supports only ASCII encoding |
+| `gram_num` | The `N` in N-gram, must be a constant |
+
+## Return Value
 
-`DOUBLE ngram_search(VARCHAR text,VARCHAR pattern,INT gram_num)`
+Returns the N-gram similarity between `haystack` and `pattern`.  
+Special case: If the length of `haystack` or `pattern` is less than `gram_num`, returns 0.
 
-## Example
+## Examples
 
 ```sql
-mysql> select ngram_search('123456789' , '12345' , 3);
+mysql> SELECT ngram_search('123456789' , '12345' , 3);
 +---------------------------------------+
 | ngram_search('123456789', '12345', 3) |
 +---------------------------------------+
 |                                   0.6 |
 +---------------------------------------+
 
-mysql> select ngram_search("abababab","babababa",2);
+mysql> SELECT ngram_search('abababab', 'babababa', 2);
 +-----------------------------------------+
 | ngram_search('abababab', 'babababa', 2) |
 +-----------------------------------------+
 |                                       1 |
 +-----------------------------------------+
 ```
-## keywords
-    NGRAM_SEARCH,NGRAM,SEARCH
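Note on the formula documented above: a minimal Python sketch of `2 * |Intersection| / (|haystack set| + |pattern set|)` reproduces both documented examples. It is an illustration only, not the Doris implementation. It assumes character-level N-grams, set (not multiset) semantics, ASCII input, and `gram_num >= 1`.

```python
def ngram_search(haystack: str, pattern: str, gram_num: int) -> float:
    """N-gram similarity of two strings, following the documented formula."""
    # Special case from the docs: either string shorter than gram_num gives 0.
    if len(haystack) < gram_num or len(pattern) < gram_num:
        return 0.0

    def grams(s: str) -> set[str]:
        # All contiguous substrings of length gram_num, de-duplicated.
        return {s[i:i + gram_num] for i in range(len(s) - gram_num + 1)}

    h, p = grams(haystack), grams(pattern)
    return 2 * len(h & p) / (len(h) + len(p))


print(ngram_search('123456789', '12345', 3))    # 0.6
print(ngram_search('abababab', 'babababa', 2))  # 1.0
```

The second call shows why a similarity of 1 does not imply identical strings: both inputs produce the same bigram set {"ab", "ba"}.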
diff --git a/docs/sql-manual/sql-functions/scalar-functions/string-functions/tokenize.md b/docs/sql-manual/sql-functions/scalar-functions/string-functions/tokenize.md
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "tokenize",
+    "title": "TOKENIZE",
     "language": "en"
 }
 ---
@@ -23,3 +23,36 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+## Description
+
+Returns the result of text tokenization. Tokenization is the process of splitting text into a set of tokens.
+
+## Syntax
+
+```sql
+ARRAY<VARCHAR> tokenize(VARCHAR txt, VARCHAR tokenizer_args)
+```
+
+## Parameters
+
+| Parameter | Description |
+| -- | -- |
+| `txt`            | The text to be tokenized |
+| `tokenizer_args` | Tokenizer arguments, a Doris PROPERTIES format string. For detailed information, refer to the inverted index documentation. |
+
+## Return Value
+
+Returns the tokenization result of the text `txt` based on the tokenizer arguments `tokenizer_args`.
+
+## Examples
+
+```sql
+mysql> SELECT tokenize('I love Doris', '"parser"="english"');
++------------------------------------------------+
+| tokenize('I love Doris', '"parser"="english"') |
++------------------------------------------------+
+| ["i", "love", "doris"]                         |
++------------------------------------------------+
+1 row in set (0.02 sec)
+```
diff --git a/...t/sql-manual/sql-functions/scalar-functions/string-functions/multi-match-any.md b/...t/sql-manual/sql-functions/scalar-functions/string-functions/multi-match-any.md
@@ -24,31 +24,46 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## multi_match_any
 ## Description
+
+Returns whether the string matches any of a given set of regular expressions.
+
+
 ## Syntax
 
-`TINYINT multi_match_any(VARCHAR haystack, ARRAY<VARCHAR> patterns)`
+```sql
+TINYINT multi_match_any(VARCHAR haystack, ARRAY<VARCHAR> patterns)
+```
+
+
+## Parameters
 
+| Parameter | Description |
+| -- | -- |
+| `haystack` | The string to be checked |
+| `patterns` | Array of regular expressions |
+
+
+## Return Value
+
+Returns 1 if the string `haystack` matches any of the regular expressions in the `patterns` array, otherwise returns 0.
 
-Checks whether the string `haystack` matches the regular expressions `patterns` in re2 syntax. Returns 0 if none of the regular expressions match, otherwise returns 1.
 
 ## Examples
 
-```
-mysql> select multi_match_any('Hello, World!', ['hello', '!', 'world']);
+```sql
+mysql> SELECT multi_match_any('Hello, World!', ['hello', '!', 'world']);
 +-----------------------------------------------------------+
 | multi_match_any('Hello, World!', ['hello', '!', 'world']) |
 +-----------------------------------------------------------+
 | 1                                                         |
 +-----------------------------------------------------------+
 
-mysql> select multi_match_any('abc', ['A', 'bcd']);
+mysql> SELECT multi_match_any('abc', ['A', 'bcd']);
 +--------------------------------------+
 | multi_match_any('abc', ['A', 'bcd']) |
 +--------------------------------------+
 | 0                                    |
 +--------------------------------------+
 ```
-### keywords
-    MULTI_MATCH,MATCH,ANY
+
diff --git a/...l/sql-functions/scalar-functions/string-functions/multi-search-all-positions.md b/...l/sql-functions/scalar-functions/string-functions/multi-search-all-positions.md
@@ -24,31 +24,45 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## multi_search_all_positions
 ## Description
+
+Returns the position of the first occurrence of each of a set of regular expressions in a string.
+
+
 ## Syntax
 
-`ARRAY<INT> multi_search_all_positions(VARCHAR haystack, ARRAY<VARCHAR> needles)`
+```sql
+ARRAY<INT> multi_search_all_positions(VARCHAR haystack, ARRAY<VARCHAR> patterns)
+```
+
+
+## Parameters
+
+| Parameter | Description |
+| -- | -- |
+| `haystack` | The string to be checked |
+| `patterns` | Array of regular expressions |
+
+
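+## Return Value
+
+Returns an `ARRAY` where the `i`-th element is the position of the **first** occurrence of the `i`-th element (a regular expression) of the `patterns` array in the string `haystack`. Positions are counted from 1, and 0 means the element was not found.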
 
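-Returns an `ARRAY` where the `i`-th element is the position of the **first** occurrence of the `i`-th element of `needles` (i.e. `needle`) in the string `haystack`. Positions are counted from 1, and 0 means the element was not found. **Case-sensitive**.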
 
 ## Examples
 
-```
-mysql> select multi_search_all_positions('Hello, World!', ['hello', '!', 'world']);
+```sql
+mysql> SELECT multi_search_all_positions('Hello, World!', ['hello', '!', 'world']);
 +----------------------------------------------------------------------+
 | multi_search_all_positions('Hello, World!', ['hello', '!', 'world']) |
 +----------------------------------------------------------------------+
-| [0,13,0]                                                             |
+| [0, 13, 0]                                                           |
 +----------------------------------------------------------------------+
 
-select multi_search_all_positions("Hello, World!", ['hello', '!', 'world', 'Hello', 'World']);
+mysql> SELECT multi_search_all_positions("Hello, World!", ['hello', '!', 'world', 'Hello', 'World']);
 +---------------------------------------------------------------------------------------------+
 | multi_search_all_positions('Hello, World!', ARRAY('hello', '!', 'world', 'Hello', 'World')) |
 +---------------------------------------------------------------------------------------------+
 | [0, 13, 0, 1, 8]                                                                            |
 +---------------------------------------------------------------------------------------------+
 ```
-
-### keywords
-    MULTI_SEARCH,SEARCH,POSITIONS