diff --git a/docs/en/integrations/data-ingestion/etl-tools/apache-beam.md b/docs/en/integrations/data-ingestion/etl-tools/apache-beam.md index c7fcaaa93aa..13555309b1b 100644 --- a/docs/en/integrations/data-ingestion/etl-tools/apache-beam.md +++ b/docs/en/integrations/data-ingestion/etl-tools/apache-beam.md @@ -97,31 +97,44 @@ public class Main { ## Supported Data Types -| ClickHouse | Apache Beam | Is Supported | Notes | -|--------------------------------------|------------------------------|--------------|----------------------------------------------------------------------------------------------------------------------------------------| -| `TableSchema.TypeName.FLOAT32` | `Schema.TypeName#FLOAT` | ✅ | | -| `TableSchema.TypeName.FLOAT64` | `Schema.TypeName#DOUBLE` | ✅ | | -| `TableSchema.TypeName.INT8` | `Schema.TypeName#BYTE` | ✅ | | -| `TableSchema.TypeName.INT16` | `Schema.TypeName#INT16` | ✅ | | -| `TableSchema.TypeName.INT32` | `Schema.TypeName#INT32` | ✅ | | -| `TableSchema.TypeName.INT64` | `Schema.TypeName#INT64` | ✅ | | -| `TableSchema.TypeName.STRING` | `Schema.TypeName#STRING` | ✅ | | -| `TableSchema.TypeName.UINT8` | `Schema.TypeName#INT16` | ✅ | | -| `TableSchema.TypeName.UINT16` | `Schema.TypeName#INT32` | ✅ | | -| `TableSchema.TypeName.UINT32` | `Schema.TypeName#INT64` | ✅ | | -| `TableSchema.TypeName.UINT64` | `Schema.TypeName#INT64` | ✅ | | -| `TableSchema.TypeName.DATE` | `Schema.TypeName#DATETIME` | ✅ | | -| `TableSchema.TypeName.DATETIME` | `Schema.TypeName#DATETIME` | ✅ | | -| `TableSchema.TypeName.ARRAY` | `Schema.TypeName#ARRAY` | ✅ | | -| `TableSchema.TypeName.ENUM8` | `Schema.TypeName#STRING` | ✅ | | -| `TableSchema.TypeName.ENUM16` | `Schema.TypeName#STRING` | ✅ | | -| `TableSchema.TypeName.BOOL` | `Schema.TypeName#BOOLEAN` | ✅ | | -| `TableSchema.TypeName.TUPLE` | `Schema.TypeName#ROW` | ✅ | | -| `TableSchema.TypeName.FIXEDSTRING` | `FixedBytes` | ✅ | `FixedBytes` is a `LogicalType` representing a fixed-length
byte array located at
`org.apache.beam.sdk.schemas.logicaltypes` |
-| | `Schema.TypeName#DECIMAL` | ❌ | |
-| | `Schema.TypeName#MAP` | ❌ | |
-
-
+| ClickHouse | Apache Beam | Is Supported | Notes |
+|------------------------------------|----------------------------|--------------|-------|
+| `TableSchema.TypeName.FLOAT32` | `Schema.TypeName#FLOAT` | ✅ | |
+| `TableSchema.TypeName.FLOAT64` | `Schema.TypeName#DOUBLE` | ✅ | |
+| `TableSchema.TypeName.INT8` | `Schema.TypeName#BYTE` | ✅ | |
+| `TableSchema.TypeName.INT16` | `Schema.TypeName#INT16` | ✅ | |
+| `TableSchema.TypeName.INT32` | `Schema.TypeName#INT32` | ✅ | |
+| `TableSchema.TypeName.INT64` | `Schema.TypeName#INT64` | ✅ | |
+| `TableSchema.TypeName.STRING` | `Schema.TypeName#STRING` | ✅ | |
+| `TableSchema.TypeName.UINT8` | `Schema.TypeName#INT16` | ✅ | |
+| `TableSchema.TypeName.UINT16` | `Schema.TypeName#INT32` | ✅ | |
+| `TableSchema.TypeName.UINT32` | `Schema.TypeName#INT64` | ✅ | |
+| `TableSchema.TypeName.UINT64` | `Schema.TypeName#INT64` | ✅ | |
+| `TableSchema.TypeName.DATE` | `Schema.TypeName#DATETIME` | ✅ | |
+| `TableSchema.TypeName.DATETIME` | `Schema.TypeName#DATETIME` | ✅ | |
+| `TableSchema.TypeName.ARRAY` | `Schema.TypeName#ARRAY` | ✅ | |
+| `TableSchema.TypeName.ENUM8` | `Schema.TypeName#STRING` | ✅ | |
+| `TableSchema.TypeName.ENUM16` | `Schema.TypeName#STRING` | ✅ | |
+| `TableSchema.TypeName.BOOL` | `Schema.TypeName#BOOLEAN` | ✅ | |
+| `TableSchema.TypeName.TUPLE` | `Schema.TypeName#ROW` | ✅ | |
+| `TableSchema.TypeName.FIXEDSTRING` | `FixedBytes` | ✅ | `FixedBytes` is a `LogicalType` representing a fixed-length<br/>byte array located at<br/>`org.apache.beam.sdk.schemas.logicaltypes` |
+| | `Schema.TypeName#DECIMAL` | ❌ | |
+| | `Schema.TypeName#MAP` | ❌ | |
+
+## ClickHouseIO.Write Parameters
+
+You can adjust the `ClickHouseIO.Write` configuration with the following setter functions:
+
+| Parameter Setter Function | Argument Type | Default Value | Description |
+|-----------------------------|-----------------------------|-------------------------------|------------------------------------------------------------------|
+| `withMaxInsertBlockSize` | `(long maxInsertBlockSize)` | `1000000` | Maximum size of a block of rows to insert. |
+| `withMaxRetries` | `(int maxRetries)` | `5` | Maximum number of retries for failed inserts. |
+| `withMaxCumulativeBackoff` | `(Duration maxBackoff)` | `Duration.standardDays(1000)` | Maximum cumulative backoff duration for retries. |
+| `withInitialBackoff` | `(Duration initialBackoff)` | `Duration.standardSeconds(5)` | Initial backoff duration before the first retry. |
+| `withInsertDistributedSync` | `(Boolean sync)` | `true` | If true, synchronizes insert operations for distributed tables. |
+| `withInsertQuorum` | `(Long quorum)` | `null` | The number of replicas required to confirm an insert operation. |
+| `withInsertDeduplicate` | `(Boolean deduplicate)` | `true` | If true, deduplication is enabled for insert operations. |
+| `withTableSchema` | `(TableSchema schema)` | `null` | Schema of the target ClickHouse table. |

 ## Limitations

diff --git a/docs/en/integrations/data-ingestion/google-dataflow/dataflow.md b/docs/en/integrations/data-ingestion/google-dataflow/dataflow.md
new file mode 100644
index 00000000000..8beb39adbf1
--- /dev/null
+++ b/docs/en/integrations/data-ingestion/google-dataflow/dataflow.md
@@ -0,0 +1,31 @@
+---
+sidebar_label: Integrating Dataflow with ClickHouse
+slug: /en/integrations/google-dataflow/dataflow
+sidebar_position: 1
+description: Users can ingest data into ClickHouse using Google Dataflow
+---
+
+# Integrating Google Dataflow with ClickHouse
+
+[Google Dataflow](https://cloud.google.com/dataflow) is a fully managed stream and batch data processing service. It supports pipelines written in Java or Python and is built on the Apache Beam SDK.
+
+There are two main ways to use Google Dataflow with ClickHouse, both of which leverage the [`ClickHouseIO` Apache Beam connector](../../apache-beam):
+
+## 1. Java Runner
+The [Java Runner](./java-runner) allows users to implement custom Dataflow pipelines using the Apache Beam SDK `ClickHouseIO` integration. This approach provides full flexibility and control over the pipeline logic, enabling users to tailor the ETL process to specific requirements.
+However, this option requires knowledge of Java programming and familiarity with the Apache Beam framework.
+
+### Key Features
+- High degree of customization.
+- Ideal for complex or advanced use cases.
+- Requires coding and understanding of the Beam API.
+
+## 2. Predefined Templates
+ClickHouse offers [predefined templates](./templates) designed for specific use cases, such as importing data from BigQuery into ClickHouse. These templates are ready to use and simplify the integration process, making them an excellent choice for users who prefer a no-code solution.
+
+### Key Features
+- No Beam coding required.
+- Quick and easy setup for simple use cases.
+- Also suitable for users with minimal programming expertise.
+ +Both approaches are fully compatible with Google Cloud and the ClickHouse ecosystem, offering flexibility depending on your technical expertise and project requirements. diff --git a/docs/en/integrations/data-ingestion/google-dataflow/images/dataflow-inqueue-job.png b/docs/en/integrations/data-ingestion/google-dataflow/images/dataflow-inqueue-job.png new file mode 100644 index 00000000000..8c56eece629 Binary files /dev/null and b/docs/en/integrations/data-ingestion/google-dataflow/images/dataflow-inqueue-job.png differ diff --git a/docs/en/integrations/data-ingestion/google-dataflow/java-runner.md b/docs/en/integrations/data-ingestion/google-dataflow/java-runner.md new file mode 100644 index 00000000000..379b5689005 --- /dev/null +++ b/docs/en/integrations/data-ingestion/google-dataflow/java-runner.md @@ -0,0 +1,20 @@ +--- +sidebar_label: Java Runner +slug: /en/integrations/google-dataflow/java-runner +sidebar_position: 2 +description: Users can ingest data into ClickHouse using Google Dataflow Java Runner +--- + +# Dataflow Java Runner + +The Dataflow Java Runner lets you execute custom Apache Beam pipelines on Google Cloud's Dataflow service. This approach provides maximum flexibility and is well-suited for advanced ETL workflows. + +## How It Works + +1. **Pipeline Implementation** + To use the Java Runner, you need to implement your Beam pipeline using the `ClickHouseIO` - our official Apache Beam connector. For code examples and instructions on how to use the `ClickHouseIO`, please visit [ClickHouse Apache Beam](../../apache-beam). + +2. **Deployment** + Once your pipeline is implemented and configured, you can deploy it to Dataflow using Google Cloud's deployment tools. Comprehensive deployment instructions are provided in the [Google Cloud Dataflow documentation - Java Pipeline](https://cloud.google.com/dataflow/docs/quickstarts/create-pipeline-java). + +**Note**: This approach assumes familiarity with the Beam framework and coding expertise. If you prefer a no-code solution, consider using [ClickHouse's predefined templates](./templates). \ No newline at end of file diff --git a/docs/en/integrations/data-ingestion/google-dataflow/templates.md b/docs/en/integrations/data-ingestion/google-dataflow/templates.md new file mode 100644 index 00000000000..7c358f9851c --- /dev/null +++ b/docs/en/integrations/data-ingestion/google-dataflow/templates.md @@ -0,0 +1,30 @@ +--- +sidebar_label: Templates +slug: /en/integrations/google-dataflow/templates +sidebar_position: 3 +description: Users can ingest data into ClickHouse using Google Dataflow Templates +--- + +# Google Dataflow Templates + +Google Dataflow templates provide a convenient way to execute prebuilt, ready-to-use data pipelines without the need to write custom code. These templates are designed to simplify common data processing tasks and are built using [Apache Beam](https://beam.apache.org/), leveraging connectors like `ClickHouseIO` for seamless integration with ClickHouse databases. By running these templates on Google Dataflow, you can achieve highly scalable, distributed data processing with minimal effort. + + + + +## Why Use Dataflow Templates? + +- **Ease of Use**: Templates eliminate the need for coding by offering preconfigured pipelines tailored to specific use cases. +- **Scalability**: Dataflow ensures your pipeline scales efficiently, handling large volumes of data with distributed processing. +- **Cost Efficiency**: Pay only for the resources you consume, with the ability to optimize pipeline execution costs. 
+
+## How to Run Dataflow Templates
+
+As of today, the official ClickHouse template is available via the Google Cloud CLI or the Dataflow REST API.
+For detailed step-by-step instructions, refer to the [Google Dataflow Run Pipeline From a Template Guide](https://cloud.google.com/dataflow/docs/templates/provided-templates).
+
+## List of ClickHouse Templates
+* [BigQuery To ClickHouse](./templates/bigquery-to-clickhouse)
+* [GCS To ClickHouse](https://github.com/ClickHouse/DataflowTemplates/issues/3) (coming soon!)
+* [Pub Sub To ClickHouse](https://github.com/ClickHouse/DataflowTemplates/issues/4) (coming soon!)
\ No newline at end of file
diff --git a/docs/en/integrations/data-ingestion/google-dataflow/templates/bigquery-to-clickhouse.md b/docs/en/integrations/data-ingestion/google-dataflow/templates/bigquery-to-clickhouse.md
new file mode 100644
index 00000000000..ee8ff6e9aa7
--- /dev/null
+++ b/docs/en/integrations/data-ingestion/google-dataflow/templates/bigquery-to-clickhouse.md
@@ -0,0 +1,154 @@
+---
+sidebar_label: BigQuery To ClickHouse
+sidebar_position: 1
+slug: /en/integrations/google-dataflow/templates/bigquery-to-clickhouse
+description: Users can ingest data from BigQuery into ClickHouse using Google Dataflow Template
+---
+
+import TOCInline from '@theme/TOCInline';
+
+# Dataflow BigQuery to ClickHouse template
+
+The BigQuery to ClickHouse template is a batch pipeline that ingests data from a BigQuery table into a ClickHouse table.
+The template can either read the entire table or read specific records using a provided query.
+
+
+## Pipeline requirements
+
+* The source BigQuery table must exist.
+* The target ClickHouse table must exist.
+* The ClickHouse host must be accessible from the Dataflow worker machines.
+
+## Template Parameters
+
+
+| Parameter Name | Parameter Description | Required | Notes |
+|----------------|------------------------|----------|-------|
+| `jdbcUrl` | The ClickHouse JDBC URL in the format `jdbc:clickhouse://<host>:<port>/<schema>`. | ✅ | Don't add the username and password as JDBC options. Any other JDBC option can be added at the end of the JDBC URL. For ClickHouse Cloud users, add `ssl=true&sslmode=NONE` to the `jdbcUrl`. |
+| `clickHouseUsername` | The ClickHouse username to authenticate with. | ✅ | |
+| `clickHousePassword` | The ClickHouse password to authenticate with. | ✅ | |
+| `clickHouseTable` | The target ClickHouse table name to insert the data to. | ✅ | |
+| `maxInsertBlockSize` | The maximum block size for insertion, if we control the creation of blocks for insertion (ClickHouseIO option). | | A `ClickHouseIO` option. |
+| `insertDistributedSync` | If enabled, insert queries into distributed tables wait until data has been sent to all nodes in the cluster (ClickHouseIO option). | | A `ClickHouseIO` option. |
+| `insertQuorum` | For INSERT queries in the replicated table, wait writing for the specified number of replicas and linearize the addition of the data. 0 - disabled. | | A `ClickHouseIO` option. This setting is disabled in default server settings. |
+| `insertDeduplicate` | For INSERT queries in the replicated table, specifies that deduplication of inserting blocks should be performed. | | A `ClickHouseIO` option. |
+| `maxRetries` | Maximum number of retries per insert. | | A `ClickHouseIO` option. |
+| `inputTableSpec` | The BigQuery table to read from. Specify either `inputTableSpec` or `query`. When both are set, the `query` parameter takes precedence. Example: `<project>:<dataset>.<table>`. | | Reads data directly from BigQuery storage using the [BigQuery Storage Read API](https://cloud.google.com/bigquery/docs/reference/storage). Be aware of the [Storage Read API limitations](https://cloud.google.com/bigquery/docs/reference/storage#limitations). |
+| `outputDeadletterTable` | The BigQuery table for messages that failed to reach the output table. If the table doesn't exist, it is created during pipeline execution. If not specified, `<outputTableSpec>_error_records` is used. For example, `<project>:<dataset>.<table>`. | | |
+| `query` | The SQL query to use to read data from BigQuery. If the BigQuery dataset is in a different project than the Dataflow job, specify the full dataset name in the SQL query, for example: `<project>.<dataset>.<table>`. Defaults to [GoogleSQL](https://cloud.google.com/bigquery/docs/introduction-sql) unless `useLegacySql` is true. | | You must specify either `inputTableSpec` or `query`. If you set both parameters, the template uses the `query` parameter. Example: `SELECT * FROM sampledb.sample_table`. |
+| `useLegacySql` | Set to `true` to use legacy SQL. This parameter only applies when using the `query` parameter. Defaults to `false`. | | |
+| `queryLocation` | Needed when reading from an authorized view without the underlying table's permission. For example, `US`. | | |
+| `queryTempDataset` | Set an existing dataset to create the temporary table to store the results of the query. For example, `temp_dataset`. | | |
+| `KMSEncryptionKey` | If reading from BigQuery using the query source, use this Cloud KMS key to encrypt any temporary tables created. For example, `projects/your-project/locations/global/keyRings/your-keyring/cryptoKeys/your-key`. | | |
+
+:::note
+All `ClickHouseIO` parameter default values can be found in the [`ClickHouseIO` Apache Beam connector](/docs/en/integrations/apache-beam#clickhouseiowrite-parameters) documentation.
+:::
+
+## Source and Target Tables Schema
+
+In order to load the BigQuery dataset into ClickHouse effectively, the template performs a column-matching process with the following phases:
+
+1. The template builds a schema object based on the target ClickHouse table.
+2. The template iterates over the BigQuery dataset and tries to match the columns based on their names.
+
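+Because the matching is purely name based, the BigQuery columns (or the aliases in your `query`) need to line up with the ClickHouse column names. Below is a minimal sketch of renaming a column on the fly via the `query` parameter (the project, dataset, table, column, and connection values are placeholders; if your query itself contains commas, you may need `gcloud`'s alternate delimiter syntax for `--parameters`):
+
+```bash
+# Alias a BigQuery column (customer_identifier) so it matches the ClickHouse column name (user_id)
+gcloud dataflow flex-template run "bigquery-clickhouse-rename-$(date +%Y%m%d-%H%M%S)" \
+ --template-file-gcs-location "gs://clickhouse-dataflow-templates/bigquery-clickhouse-metadata.json" \
+ --parameters query="SELECT customer_identifier AS user_id FROM my_project.my_dataset.customers",jdbcUrl="jdbc:clickhouse://<host>:<port>/<schema>?ssl=true&sslmode=NONE",clickHouseUsername="<username>",clickHousePassword="<password>",clickHouseTable="users"
+```
+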
+:::important
+In other words, your BigQuery dataset (either a table or a query) must have exactly the same column names as your
+ClickHouse target table.
+:::
+
+## Data Types Mapping
+
+The BigQuery types are converted based on your ClickHouse table definition. Therefore, the table below lists the
+recommended mapping you should have in your target ClickHouse table (for a given BigQuery table/query):
+
+| BigQuery Type | ClickHouse Type | Notes |
+|---------------|-----------------|-------|
+| [**Array Type**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#array_type) | [**Array Type**](../../../sql-reference/data-types/array) | The inner type must be one of the supported primitive data types listed in this table. |
+| [**Boolean Type**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type) | [**Bool Type**](../../../sql-reference/data-types/boolean) | |
+| [**Date Type**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date_type) | [**Date Type**](../../../sql-reference/data-types/date) | |
+| [**Datetime Type**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type) | [**Datetime Type**](../../../sql-reference/data-types/datetime) | |
+| [**String Type**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type) | [**String Type**](../../../sql-reference/data-types/string) | Works as well with `Enum8`, `Enum16` and `FixedString`. |
+| [**Numeric - Integer Types**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#numeric_types) | [**Integer Types**](../../../sql-reference/data-types/int-uint) | In BigQuery all Int types (`INT`, `SMALLINT`, `INTEGER`, `BIGINT`, `TINYINT`, `BYTEINT`) are aliases to `INT64`. We recommend setting the appropriate integer size in ClickHouse, as the template converts the column based on the defined column type (`Int8`, `Int16`, `Int32`, `Int64`). The template also converts unsigned Int types if they are used in the ClickHouse table (`UInt8`, `UInt16`, `UInt32`, `UInt64`). |
+| [**Numeric - Float Types**](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#numeric_types) | [**Float Types**](../../../sql-reference/data-types/float) | Supported ClickHouse types: `Float32` and `Float64`. |
+
+## Running the Template
+
+The BigQuery to ClickHouse template is available for execution via the Google Cloud CLI.
+
+:::note
+Be sure to review this document, and specifically the sections above, to fully understand the template's configuration
+requirements and prerequisites.
+:::
+
+### Install & Configure `gcloud` CLI
+
+- If not already installed, install the [`gcloud` CLI](https://cloud.google.com/sdk/docs/install).
+- Follow the `Before you begin` section
+  in [this guide](https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates#before-you-begin) to set
+  up the required configurations, settings, and permissions for running the Dataflow template.
+
+### Run Command
+
+Use the [`gcloud dataflow flex-template run`](https://cloud.google.com/sdk/gcloud/reference/dataflow/flex-template/run)
+command to run a Dataflow job that uses the Flex Template.
+
+Below is an example of the command:
+
+```bash
+gcloud dataflow flex-template run "bigquery-clickhouse-dataflow-$(date +%Y%m%d-%H%M%S)" \
+ --template-file-gcs-location "gs://clickhouse-dataflow-templates/bigquery-clickhouse-metadata.json" \
+ --parameters inputTableSpec="<project>:<dataset>.<table>",jdbcUrl="jdbc:clickhouse://<host>:<port>/<schema>?ssl=true&sslmode=NONE",clickHouseUsername="<username>",clickHousePassword="<password>",clickHouseTable="<table>"
+```
+
+### Command Breakdown
+
+- **Job Name:** The text following the `run` keyword is the unique job name.
+- **Template File:** The JSON file specified by `--template-file-gcs-location` defines the template structure and
+  details about the accepted parameters. The mentioned file path is public and ready to use.
+- **Parameters:** Parameters are separated by commas. For string-based parameters, enclose the values in double quotes.
+
+### Expected Response
+
+After running the command, you should see a response similar to the following:
+
+```bash
+job:
+  createTime: '2025-01-26T14:34:04.608442Z'
+  currentStateTime: '1970-01-01T00:00:00Z'
+  id: 2025-01-26_06_34_03-13881126003586053150
+  location: us-central1
+  name: bigquery-clickhouse-dataflow-20250126-153400
+  projectId: ch-integrations
+  startTime: '2025-01-26T14:34:04.608442Z'
+```
+
+### Monitor the Job
+
+Navigate to the [Dataflow Jobs tab](https://console.cloud.google.com/dataflow/jobs) in your Google Cloud Console to
+monitor the status of the job. You’ll find the job details, including progress and any errors:
+
+![DataFlow running job](../images/dataflow-inqueue-job.png)
+
+## Troubleshooting
+
+### Code: 241. DB::Exception: Memory limit (total) exceeded
+
+This error occurs when ClickHouse runs out of memory while processing large batches of data. To resolve this issue:
+
+* Increase the instance resources: Upgrade your ClickHouse server to a larger instance with more memory to handle the data processing load.
+* Decrease the batch size: Adjust the batch size in your Dataflow job configuration to send smaller chunks of data to ClickHouse, reducing memory consumption per batch (see the example command at the end of this page).
+
+These changes might help balance resource usage during data ingestion.
+
+## Template Source Code
+
+The template's source code is available in ClickHouse's [DataflowTemplates](https://github.com/ClickHouse/DataflowTemplates) fork.
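+
+As a reference for the tuning advice in the troubleshooting section above, the optional `ClickHouseIO` parameters
+listed under Template Parameters can be passed in the same `--parameters` list as the required ones. The values
+below are illustrative only and should be adapted to your workload:
+
+```bash
+# Same run command as above, with a smaller insert block size and a larger retry budget
+gcloud dataflow flex-template run "bigquery-clickhouse-dataflow-$(date +%Y%m%d-%H%M%S)" \
+ --template-file-gcs-location "gs://clickhouse-dataflow-templates/bigquery-clickhouse-metadata.json" \
+ --parameters inputTableSpec="<project>:<dataset>.<table>",jdbcUrl="jdbc:clickhouse://<host>:<port>/<schema>?ssl=true&sslmode=NONE",clickHouseUsername="<username>",clickHousePassword="<password>",clickHouseTable="<table>",maxInsertBlockSize="500000",maxRetries="10"
+```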
\ No newline at end of file diff --git a/docs/en/integrations/images/logos/dataflow_logo.png b/docs/en/integrations/images/logos/dataflow_logo.png new file mode 100644 index 00000000000..fa55e09bd2d Binary files /dev/null and b/docs/en/integrations/images/logos/dataflow_logo.png differ diff --git a/docs/en/integrations/index.mdx b/docs/en/integrations/index.mdx index 1bcfde36b71..6b3404f84f7 100644 --- a/docs/en/integrations/index.mdx +++ b/docs/en/integrations/index.mdx @@ -253,6 +253,7 @@ We are actively compiling this list of ClickHouse integrations below, so it's no |Chat-DBT| |AI Integration|Create ClickHouse queries using Chat GPT.|[GitHub](https://github.com/plmercereau/chat-dbt)| |ClickHouse Monitoring Dashboard||Dashboard|A simple monitoring dashboard for ClickHouse|[Github](https://github.com/duyet/clickhouse-monitoring)| |Common Lisp|clickhouse-cl Logo|Language client|Common Lisp ClickHouse Client Library|[GitHub](https://github.com/juliojimenez/clickhouse-cl)| +| Dataflow|Dataflow logo|Data ingestion|Google Dataflow is a serverless service for running batch and streaming data pipelines using Apache Beam.|[Documentation](https://clickhouse.com/docs/en/integrations/google-dataflow/dataflow)| |DBNet|Airflow logo|Software IDE|Web-based SQL IDE using Go as a back-end, and the browser as the front-end.|[Github](https://github.com/dbnet-io/dbnet)| |DataLens|Datalens logo|Data visualization|An open-source data analytics and visualization tool.|[Website](https://datalens.tech/),
[Documentation](https://datalens.tech/docs/en/)| |Dataease|Dataease logo|Data visualization|Open source data visualization analysis tool to help users analyze data and gain insight into business trends.|[Website](https://dataease.io/)| diff --git a/scripts/aspell-dict-file.txt b/scripts/aspell-dict-file.txt index afcc01d3ae1..2cb122b2b04 100644 --- a/scripts/aspell-dict-file.txt +++ b/scripts/aspell-dict-file.txt @@ -321,6 +321,18 @@ westus intra --docs/en/cloud/manage/backups.md-- slideout +--docs/en/integrations/data-ingestion/google-dataflow/templates/bigquery-to-clickhouse.md-- +DataFlow +Dataflow +DataflowTemplates +GoogleSQL +linearize +--docs/en/integrations/data-ingestion/google-dataflow/templates.md-- +Dataflow +--docs/en/integrations/data-ingestion/google-dataflow/dataflow.md-- +Dataflow +--docs/en/integrations/data-ingestion/google-dataflow/java-runner.md-- +Dataflow --docs/en/deployment-modes.md-- chDB's memoryview diff --git a/sidebars.js b/sidebars.js index 919f0f04699..9dbef2d4f8d 100644 --- a/sidebars.js +++ b/sidebars.js @@ -872,6 +872,28 @@ const sidebars = { "en/integrations/data-ingestion/etl-tools/airbyte-and-clickhouse", "en/integrations/data-ingestion/aws-glue/index", "en/integrations/data-ingestion/etl-tools/apache-beam", + { + type: "category", + label: "Google Dataflow", + className: "top-nav-item", + collapsed: true, + collapsible: true, + items: [ + "en/integrations/data-ingestion/google-dataflow/dataflow", + "en/integrations/data-ingestion/google-dataflow/java-runner", + "en/integrations/data-ingestion/google-dataflow/templates", + { + type: "category", + label: "Dataflow Templates", + className: "top-nav-item", + collapsed: true, + collapsible: true, + items: [ + "en/integrations/data-ingestion/google-dataflow/templates/bigquery-to-clickhouse", + ], + }, + ], + }, "en/integrations/data-ingestion/etl-tools/dbt/index", "en/integrations/data-ingestion/etl-tools/dlt-and-clickhouse", "en/integrations/data-ingestion/etl-tools/fivetran/index",