FEATURE - BigQuery support for iceberg materialisations #1416

Open
wants to merge 8 commits into
base: main
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20241126-215421.yaml
@@ -0,0 +1,6 @@
kind: Features
body: Adds Iceberg support as a new table format configuration
time: 2024-11-26T21:54:21.990317Z
custom:
Author: borjavb
Issue: "1370"
15 changes: 12 additions & 3 deletions dbt/include/bigquery/macros/adapters.sql
@@ -3,7 +3,7 @@
{%- set raw_partition_by = config.get('partition_by', none) -%}
{%- set raw_cluster_by = config.get('cluster_by', none) -%}
{%- set sql_header = config.get('sql_header', none) -%}

{%- set table_format = config.get('table_format', 'default') -%}
{%- set partition_config = adapter.parse_partition_by(raw_partition_by) -%}
{%- if partition_config.time_ingestion_partitioning -%}
{%- set columns = get_columns_with_types_in_query_sql(sql) -%}
@@ -23,10 +23,19 @@
{#-- cannot do contracts at the same time as time ingestion partitioning -#}
{{ columns }}
{% endif %}
{{ partition_by(partition_config) }}
{%- if table_format == "iceberg" and partition_config is not none -%}
{#-- Nov 2024 limitation: PARTITION BY cannot be used with Iceberg tables -#}
{% do exceptions.raise_compiler_error("PARTITION BY is not yet available for Iceberg tables; use cluster_by instead.") %}
{%- else -%}
{{ partition_by(partition_config) }}
{% endif %}

{{ cluster_by(raw_cluster_by) }}

{{ bigquery_table_options(config, model, temporary) }}
{% if table_format == "iceberg" %}
{{ bigquery_iceberg_connection(config) }}
{{ bigquery_iceberg_table_options(config, relation) }}
{% endif %}

{#-- PARTITION BY cannot be used with the AS query_statement clause.
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#partition_expression
22 changes: 22 additions & 0 deletions dbt/include/bigquery/macros/relations/table/options.sql
@@ -2,3 +2,25 @@
{% set opts = adapter.get_table_options(config, node, temporary) %}
{%- do return(bigquery_options(opts)) -%}
{%- endmacro -%}

{% macro bigquery_iceberg_table_options(config, relation) %}
{% set base_location = config.get('base_location') %}
{%- if not base_location -%}
{% do exceptions.raise_compiler_error("base_location is required when table_format is 'iceberg'") %}
{% endif %}
{% set sub_path = relation.identifier %}
{% set storage_uri = base_location~'/'~sub_path %}
{% set opts = {'file_format':'"parquet"',
'table_format':'"iceberg"',
'storage_uri':'"'~storage_uri~'"' }
%}
{%- do return(bigquery_options(opts)) -%}
{%- endmacro -%}

{% macro bigquery_iceberg_connection(config) %}
{% set connection = config.get('bl_connection') %}
{%- if not connection -%}
{% do exceptions.raise_compiler_error("bl_connection (BigLake connection) is required when table_format is 'iceberg'") %}
{% endif %}
{%- do return("WITH CONNECTION `"~connection~"`") %}
{%- endmacro -%}
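
For reviewers, a model that opts into this table format might be configured roughly as in the sketch below. The config keys `table_format`, `cluster_by`, `base_location` and `bl_connection` are the ones read by the macros in this diff. The bucket path, connection name, and column are hypothetical placeholders, and the `project.region.connection_id` shape of the connection value is an assumption, not something this diff confirms.

```sql
-- Hypothetical model file, e.g. models/my_iceberg_model.sql
{{ config(
    materialized='table',
    table_format='iceberg',
    -- partition_by is omitted on purpose: the macro raises a compiler error
    -- when partitioning is combined with table_format='iceberg'
    cluster_by=['id'],
    -- placeholder bucket; the macro appends '/' and the relation identifier
    base_location='gs://my-bucket/iceberg',
    -- placeholder BigLake connection name (format is an assumption)
    bl_connection='my-project.us.my-connection'
) }}

select 1 as id
```

With these placeholder values, `bigquery_iceberg_table_options` would build the storage URI `gs://my-bucket/iceberg/my_iceberg_model`. Because the macro joins with a literal `/`, a `base_location` that already ends in a slash would yield a double slash in the URI.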