Add doc for APPROX_DISTINCT_COUNT aggregate function (#19732) #19779

Merged
28 changes: 27 additions & 1 deletion functions-and-operators/aggregate-group-by-functions.md
@@ -63,7 +63,33 @@ The MySQL `GROUP BY` aggregate functions supported by TiDB are as follows:
1 row in set (0.00 sec)
```

All of the above aggregate functions, except `GROUP_CONCAT()` and `APPROX_PERCENTILE()`, can be used as [window functions](/functions-and-operators/window-functions.md).
+ `APPROX_COUNT_DISTINCT(expr, [expr...])`

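This function is similar to `COUNT(DISTINCT)`: it counts the number of distinct values, but returns an approximate result. It uses the `BJKST` algorithm, which significantly reduces memory consumption when processing large datasets with a power-law distribution. In addition, for low-cardinality data, the function is highly accurate and uses the CPU efficiently.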

The following example shows how to use this function:

```sql
DROP TABLE IF EXISTS t;
CREATE TABLE t(a INT, b INT, c INT);
INSERT INTO t VALUES(1, 1, 1), (2, 1, 1), (2, 2, 1), (3, 1, 1), (5, 1, 2), (5, 1, 2), (6, 1, 2), (7, 1, 2);
```

```sql
SELECT APPROX_COUNT_DISTINCT(a, b) FROM t GROUP BY c;
```

```
+-----------------------------+
| approx_count_distinct(a, b) |
+-----------------------------+
|                           3 |
|                           4 |
+-----------------------------+
2 rows in set (0.00 sec)
```
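
To check how close the estimate is, the approximate count can be placed next to the exact one. The following is a minimal sketch that assumes the table `t` from the example above; the column aliases `exact_cnt` and `approx_cnt` are illustrative names only, and `ORDER BY c` is added so that the row order is deterministic.

```sql
-- Assumes the table t created in the example above.
-- exact_cnt is the reference value computed by COUNT(DISTINCT ...);
-- approx_cnt is the estimate returned by APPROX_COUNT_DISTINCT().
SELECT
    c,
    COUNT(DISTINCT a, b)        AS exact_cnt,
    APPROX_COUNT_DISTINCT(a, b) AS approx_cnt
FROM t
GROUP BY c
ORDER BY c;
```

For this table, the exact number of distinct `(a, b)` pairs is 4 for `c = 1` and 3 for `c = 2`, which matches the approximate values shown in the preceding output. On larger, high-cardinality data the two columns may differ.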

All of the above aggregate functions, except `GROUP_CONCAT()`, `APPROX_PERCENTILE()`, and `APPROX_COUNT_DISTINCT()`, can be used as [window functions](/functions-and-operators/window-functions.md).

## GROUP BY modifiers