diff --git a/docs/en/integrations/data-ingestion/clickpipes/kafka.md b/docs/en/integrations/data-ingestion/clickpipes/kafka.md index d33276cc64f..46ee8da3746 100644 --- a/docs/en/integrations/data-ingestion/clickpipes/kafka.md +++ b/docs/en/integrations/data-ingestion/clickpipes/kafka.md @@ -102,45 +102,29 @@ The supported formats are: - [JSON](../../../interfaces/formats.md/#json) - [AvroConfluent](../../../interfaces/formats.md/#data-format-avro-confluent) -### JSON +### Supported Data Types -#### Supported data types +The following ClickHouse data types are currently supported in ClickPipes: -The following ClickHouse types are currently supported for JSON payloads: - -- Base numeric types - - Int8 - - Int16 - - Int32 - - Int64 - - UInt8 - - UInt16 - - UInt32 - - UInt64 - - Float32 - - Float64 +- Base numeric types - \[U\]Int8/16/32/64 and Float32/64 +- Large integer types - \[U\]Int128/256 +- Decimal Types - Boolean - String - FixedString - Date, Date32 -- DateTime, DateTime64 +- DateTime, DateTime64 (UTC timezones only) - Enum8/Enum16 -- LowCardinality(String) +- UUID +- IPv4 +- IPv6 +- all ClickHouse LowCardinality types - Map with keys and values using any of the above types (including Nullables) - Tuple and Array with elements using any of the above types (including Nullables, one level depth only) -- JSON/Object('json'). experimental - -:::note -Nullable versions of the above are also supported with these exceptions: - -- Nullable Enums are **not** supported -- LowCardinality(Nullable(String)) is **not** supported - -::: ### Avro -#### Supported data types +#### Supported Avro Data Types ClickPipes supports all Avro Primitive and Complex types, and all Avro Logical types except `time-millis`, `time-micros`, `local-timestamp-millis`, `local_timestamp-micros`, and `duration`. Avro `record` types are converted to Tuple, `array` types to Array, and `map` to Map (string keys only). In general the conversions listed [here](../../../../en/interfaces/formats.md#data-types-matching) are available. We recommend using exact type matching for Avro numeric types, as ClickPipes does not check for overflow or precision loss on type conversion. @@ -152,7 +136,7 @@ Nullable types in Avro are defined by using a Union schema of `(T, null)` or `(n - An empty Map for a null Avro Map - A named Tuple with all default/zero values for a null Avro Record -ClickPipes does not currently support schemas that contain other Avro Unions (this may change in the future with the maturity of the new Variant data type). If the Avro schema contains a "non-null" union, ClickPipes will generate an error when attempting to calculate a mapping between the Avro schema and Clickhouse column types. +ClickPipes does not currently support schemas that contain other Avro Unions (this may change in the future with the maturity of the new ClickHouse Variant and JSON data types). If the Avro schema contains a "non-null" union, ClickPipes will generate an error when attempting to calculate a mapping between the Avro schema and Clickhouse column types. #### Avro Schema Management @@ -178,6 +162,10 @@ The following virtual columns are supported for Kafka compatible streaming data | _topic | Kafka Topic | String | | _header_keys | Parallel array of keys in the record Headers | Array(String) | | _header_values | Parallel array of headers in the record Headers | Array(String) | +| _raw_message | Full Kafka Message | String | + +Note that the _raw_message column is only recommended for JSON data. For use cases where only the JSON string is required (such as using ClickHouse [`JsonExtract*`](https://clickhouse.com/docs/en/sql-reference/functions/json-functions#jsonextract-functions) functions to populate a downstream materialized +view), it may improve ClickPipes performance to delete all the "non-virtual" columns. ## Limitations diff --git a/docs/en/integrations/data-ingestion/clickpipes/kinesis.md b/docs/en/integrations/data-ingestion/clickpipes/kinesis.md index 85ed410d6d7..32425e5d2d2 100644 --- a/docs/en/integrations/data-ingestion/clickpipes/kinesis.md +++ b/docs/en/integrations/data-ingestion/clickpipes/kinesis.md @@ -67,44 +67,44 @@ You have familiarized yourself with the [ClickPipes intro](./index.md) and setup 10. **Congratulations!** you have successfully set up your first ClickPipe. If this is a streaming ClickPipe it will be continuously running, ingesting data in real-time from your remote data source. Otherwise it will ingest the batch and complete. -## Supported data formats +## Supported Data Formats The supported formats are: - [JSON](../../../interfaces/formats.md/#json) -## Supported data types (JSON) - -The following ClickHouse types are currently supported for JSON payloads: - -- Base numeric types - - Int8 - - Int16 - - Int32 - - Int64 - - UInt8 - - UInt16 - - UInt32 - - UInt64 - - Float32 - - Float64 +## Supported Data Types + +The following ClickHouse data types are currently supported in ClickPipes: + +- Base numeric types - \[U\]Int8/16/32/64 and Float32/64 +- Large integer types - \[U\]Int128/256 +- Decimal Types - Boolean - String - FixedString - Date, Date32 -- DateTime, DateTime64 +- DateTime, DateTime64 (UTC timezones only) - Enum8/Enum16 -- LowCardinality(String) +- UUID +- IPv4 +- IPv6 +- all ClickHouse LowCardinality types - Map with keys and values using any of the above types (including Nullables) - Tuple and Array with elements using any of the above types (including Nullables, one level depth only) -- JSON/Object('json'). experimental -:::note -Nullable versions of the above are also supported with these exceptions: +## Kinesis Virtual Columns + +The following virtual columns are supported for Kinesis stream. When creating a new destination table virtual columns can be added by using the `Add Column` button. -- Nullable Enums are **not** supported -- LowCardinality(Nullable(String)) is **not** supported +| Name | Description | Recommended Data Type | +|--------------|---------------------------------------------------------------|-----------------------| +| _key | Kinesis Partition Key | String | +| _timestamp | Kinesis Approximate Arrival Timestamp (millisecond precision) | DateTime64(3) | +| _stream | Kafka Stream Name | String | +| _raw_message | Full Kinesis Message | String | -::: +The _raw_message field can be used in cases where only full Kinesis JSON record is required (such as using ClickHouse [`JsonExtract*`](https://clickhouse.com/docs/en/sql-reference/functions/json-functions#jsonextract-functions) functions to populate a downstream materialized +view). For such pipes, it may improve ClickPipes performance to delete all the "non-virtual" columns. ## Limitations