Feature: Union schema compatibility (#21)
* MagicBot/add-union-schema updates

* add union schema

* update for databricks and version

* update identifiers

* update pkgs

* update materialization

* update grouping

* update changelog & readme

* add consistency tests

* update consistency report

* update changelog & add autoreleaser

* regen docs

* Update packages.yml
fivetran-catfritz authored Jul 24, 2024
1 parent be158f0 commit aae8b47
Showing 46 changed files with 627 additions and 57 deletions.
3 changes: 2 additions & 1 deletion .buildkite/hooks/pre-command
Original file line number Diff line number Diff line change
@@ -21,4 +21,5 @@ export CI_SNOWFLAKE_DBT_USER=$(gcloud secrets versions access latest --secret="C
export CI_SNOWFLAKE_DBT_WAREHOUSE=$(gcloud secrets versions access latest --secret="CI_SNOWFLAKE_DBT_WAREHOUSE" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HOST=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HOST" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_CATALOG=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_CATALOG" --project="dbt-package-testing-363917")
1 change: 1 addition & 0 deletions .buildkite/pipeline.yml
@@ -69,5 +69,6 @@ steps:
- "CI_DATABRICKS_DBT_HOST"
- "CI_DATABRICKS_DBT_HTTP_PATH"
- "CI_DATABRICKS_DBT_TOKEN"
- "CI_DATABRICKS_DBT_CATALOG"
commands: |
bash .buildkite/scripts/run_models.sh databricks
4 changes: 3 additions & 1 deletion .buildkite/scripts/run_models.sh
@@ -19,4 +19,6 @@ dbt deps
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
dbt run --vars '{apple_store__using_subscriptions: true, google_play__using_earnings: true, google_play__using_subscriptions: true}' --target "$db" --full-refresh
dbt test --vars '{apple_store__using_subscriptions: true, google_play__using_earnings: true, google_play__using_subscriptions: true}' --target "$db"
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
13 changes: 13 additions & 0 deletions .github/workflows/auto-release.yml
@@ -0,0 +1,13 @@
name: 'auto release'
on:
  pull_request:
    types:
      - closed
    branches:
      - main

jobs:
  call-workflow-passing-data:
    if: github.event.pull_request.merged
    uses: fivetran/dbt_package_automations/.github/workflows/auto-release.yml@main
    secrets: inherit
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,19 @@
# dbt_app_reporting v0.4.0
[PR #21](https://github.com/fivetran/dbt_app_reporting/pull/21) includes the following updates:

## 🚨 Breaking Changes 🚨
- Identifier variables for the following packages have been updated for consistency with the source name and compatibility with the union schema feature. See each package's changelog for a full list of changes.
  - [dbt_apple_store](https://github.com/fivetran/dbt_apple_store/blob/main/CHANGELOG.md#dbt_apple_store-v040)
  - [dbt_google_play](https://github.com/fivetran/dbt_google_play/blob/main/CHANGELOG.md#dbt_google_play-v040)

## Feature update 🎉
- Unioning capability! This adds the ability to union source data from multiple app_reporting connectors. Refer to the [README](https://github.com/fivetran/dbt_app_reporting/blob/main/README.md#union-multiple-connectors) for more details.
- Added a `source_relation` column in each upstream model for tracking the source of each record.
- The `source_relation` column is also persisted from the upstream models to the end models.

## Under the hood
- Included auto-releaser GitHub Actions workflow to automate future releases.

# dbt_app_reporting v0.3.2
## Bug Fixes
[PR #19](https://github.com/fivetran/dbt_app_reporting/pull/19) includes the following update:
30 changes: 23 additions & 7 deletions README.md
@@ -45,7 +45,7 @@ Include the following github package version in your `packages.yml`
```yaml
packages:
- package: fivetran/app_reporting
version: [">=0.3.0", "<0.4.0"] # we recommend using ranges to capture non-breaking changes automatically
version: [">=0.4.0", "<0.5.0"] # we recommend using ranges to capture non-breaking changes automatically
```
Do NOT include the individual app platform packages in this file. The app reporting package itself has dependencies on these packages and will install them as well.
@@ -114,15 +114,31 @@ models:
> Provide a blank `+schema: ` to write to the `target_schema` without any suffix.

## (Optional) Step 7: Additional configurations
<details><summary>Expand to view configurations</summary>
<details open><summary>Expand/collapse configurations</summary>

### Union multiple connectors
If you have multiple app reporting connectors in Fivetran and want to run this package on all of them at once, you can union them. The package unions the source data and passes the unioned tables into the transformations; the `source_relation` column of each model shows which source a record came from. To enable this, set either the `<package_name>_union_schemas` or the `<package_name>_union_databases` variable (not both for the same package) in your root `dbt_project.yml` file. The variables for each connector, with example values, are:

```yml
vars:
apple_store_union_schemas: ['apple_store_usa','apple_store_canada']
apple_store_union_databases: ['apple_store_usa','apple_store_canada']
google_play_union_schemas: ['google_play_usa','google_play_canada']
google_play_union_databases: ['google_play_usa','google_play_canada']
```
Please be aware that the native `source.yml` connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined `source.yml`.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.
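
As a minimal sketch of the either/or rule above, the configuration below sets only one union variable per package. It assumes the two packages resolve their union variables independently, and the schema and database names are placeholders for illustration, not values the package expects.

```yml
# Root dbt_project.yml (sketch; all names below are hypothetical)
vars:
  # Apple Store connectors synced to separate schemas within one database
  apple_store_union_schemas: ['apple_store_usa', 'apple_store_canada']
  # Google Play connectors synced to separate databases
  google_play_union_databases: ['google_play_usa', 'google_play_canada']
```

With a setup like this, the `source_relation` column in each model indicates which schema or database a given record came from.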

### Change the source table references
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
> IMPORTANT: See the Apple Store [`dbt_project.yml`](https://github.com/fivetran/dbt_apple_store_source/blob/main/dbt_project.yml) and Google Play [`dbt_project.yml`](https://github.com/fivetran/dbt_google_play_source/blob/main/dbt_project.yml) variable declarations to see the expected names.

```yml
vars:
<default_source_table_name>_identifier: your_table_name
apple_store_<default_source_table_name>_identifier: your_table_name
google_play_<default_source_table_name>_identifier: your_table_name
```
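
For example, if your destination tables carried a hypothetical `_v2` suffix, the overrides could look like the sketch below. The variable names follow the `<package>_<table>_identifier` pattern used in this package's integration tests; the table names on the right are made up.

```yml
vars:
  # Apple Store `app` table under a different name (hypothetical value)
  apple_store_app_identifier: "app_v2"
  # Google Play `earnings` table under a different name (hypothetical value)
  google_play_earnings_identifier: "earnings_v2"
```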

</details>
@@ -143,16 +159,16 @@ This dbt package is dependent on the following dbt packages. For more informatio
```yml
packages:
- package: fivetran/apple_store
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]
- package: fivetran/apple_store_source
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]
- package: fivetran/google_play
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]
- package: fivetran/google_play_source
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]
- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'app_reporting'
version: '0.3.2'
version: '0.4.0'
config-version: 2
models:
app_reporting:
1 change: 1 addition & 0 deletions docs/catalog.json

Large diffs are not rendered by default.

75 changes: 75 additions & 0 deletions docs/index.html

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/manifest.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/run_results.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion integration_tests/ci/sample.profiles.yml
@@ -45,7 +45,7 @@ integration_tests:
schema: app_reporting_integrations_test_5
threads: 8
databricks:
catalog: null
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: app_reporting_integrations_test_5
72 changes: 38 additions & 34 deletions integration_tests/dbt_project.yml
@@ -1,46 +1,49 @@
name: 'app_reporting_integration_tests'
version: '0.3.2'
version: '0.4.0'
profile: 'integration_tests'
config-version: 2
vars:
# apple_store__using_subscriptions: true # uncomment this line when generating docs!
# google_play__using_subscriptions: true # uncomment this line when regenerating docs!
# google_play__using_earnings: true # uncomment this line when regenerating docs!
google_play_schema: app_reporting_integrations_test_5
apple_store_schema: app_reporting_integrations_test_5
google_play_source:
stats_installs_app_version_identifier: "stats_installs_app_version"
stats_crashes_app_version_identifier: "stats_crashes_app_version"
stats_ratings_app_version_identifier: "stats_ratings_app_version"
stats_installs_device_identifier: "stats_installs_device"
stats_ratings_device_identifier: "stats_ratings_device"
stats_installs_os_version_identifier: "stats_installs_os_version"
stats_ratings_os_version_identifier: "stats_ratings_os_version"
stats_crashes_os_version_identifier: "stats_crashes_os_version"
stats_installs_country_identifier: "stats_installs_country"
stats_ratings_country_identifier: "stats_ratings_country"
stats_store_performance_country_identifier: "stats_store_performance_country"
stats_store_performance_traffic_source_identifier: "stats_store_performance_traffic_source"
stats_installs_overview_identifier: "stats_installs_overview"
stats_crashes_overview_identifier: "stats_crashes_overview"
stats_ratings_overview_identifier: "stats_ratings_overview"
earnings_identifier: "earnings"
financial_stats_subscriptions_country_identifier: "financial_stats_subscriptions_country"
google_play_stats_installs_app_version_identifier: "stats_installs_app_version"
google_play_stats_crashes_app_version_identifier: "stats_crashes_app_version"
google_play_stats_ratings_app_version_identifier: "stats_ratings_app_version"
google_play_stats_installs_device_identifier: "stats_installs_device"
google_play_stats_ratings_device_identifier: "stats_ratings_device"
google_play_stats_installs_os_version_identifier: "stats_installs_os_version"
google_play_stats_ratings_os_version_identifier: "stats_ratings_os_version"
google_play_stats_crashes_os_version_identifier: "stats_crashes_os_version"
google_play_stats_installs_country_identifier: "stats_installs_country"
google_play_stats_ratings_country_identifier: "stats_ratings_country"
google_play_stats_store_performance_country_identifier: "stats_store_performance_country"
google_play_stats_store_performance_traffic_source_identifier: "stats_store_performance_traffic_source"
google_play_stats_installs_overview_identifier: "stats_installs_overview"
google_play_stats_crashes_overview_identifier: "stats_crashes_overview"
google_play_stats_ratings_overview_identifier: "stats_ratings_overview"
google_play_earnings_identifier: "earnings"
google_play_financial_stats_subscriptions_country_identifier: "financial_stats_subscriptions_country"

apple_store_source:
app_identifier: "app"
app_store_platform_version_source_type_report_identifier: "app_store_platform_version_source_type"
app_store_source_type_device_report_identifier: "app_store_source_type_device"
app_store_territory_source_type_report_identifier: "app_store_territory_source_type"
crashes_app_version_device_report_identifier: "crashes_app_version"
crashes_platform_version_device_report_identifier: "crashes_platform_version"
downloads_platform_version_source_type_report_identifier: "downloads_platform_version_source_type"
downloads_source_type_device_report_identifier: "downloads_source_type_device"
downloads_territory_source_type_report_identifier: "downloads_territory_source_type"
sales_account_identifier: "sales_account"
sales_subscription_event_summary_identifier: "sales_subscription_events"
sales_subscription_summary_identifier: "sales_subscription_summary"
usage_app_version_source_type_report_identifier: "usage_app_version_source_type"
usage_platform_version_source_type_report_identifier: "usage_platform_version_source_type"
usage_source_type_device_report_identifier: "usage_source_type_device"
usage_territory_source_type_report_identifier: usage_territory_source_type
apple_store_app_identifier: "app"
apple_store_app_store_platform_version_source_type_report_identifier: "app_store_platform_version_source_type"
apple_store_app_store_source_type_device_report_identifier: "app_store_source_type_device"
apple_store_app_store_territory_source_type_report_identifier: "app_store_territory_source_type"
apple_store_crashes_app_version_device_report_identifier: "crashes_app_version"
apple_store_crashes_platform_version_device_report_identifier: "crashes_platform_version"
apple_store_downloads_platform_version_source_type_report_identifier: "downloads_platform_version_source_type"
apple_store_downloads_source_type_device_report_identifier: "downloads_source_type_device"
apple_store_downloads_territory_source_type_report_identifier: "downloads_territory_source_type"
apple_store_sales_account_identifier: "sales_account"
apple_store_sales_subscription_event_summary_identifier: "sales_subscription_events"
apple_store_sales_subscription_summary_identifier: "sales_subscription_summary"
apple_store_usage_app_version_source_type_report_identifier: "usage_app_version_source_type"
apple_store_usage_platform_version_source_type_report_identifier: "usage_platform_version_source_type"
apple_store_usage_source_type_device_report_identifier: "usage_source_type_device"
apple_store_usage_territory_source_type_report_identifier: usage_territory_source_type

apple_store__subscription_events:
- 'Renew'
@@ -55,6 +58,7 @@ models:
+persist_docs:
relation: "{{ false if target.type in ('spark','databricks') else true }}"
columns: "{{ false if target.type in ('spark','databricks') else true }}"
+schema: "app_reporting_{{ var('directed_schema','dev') }}" ## To be used for validation testing

seeds:
app_reporting_integration_tests:
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__app_version_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__app_version_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__app_version_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__country_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__country_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__country_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__device_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__device_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__device_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
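
The three consistency tests above are disabled unless `fivetran_validation_tests_enabled` is true, and each compares a model in `<target.schema>_app_reporting_prod` against the same model in `<target.schema>_app_reporting_dev`. The fragment below is only a sketch of the variables involved. It assumes they are set in `integration_tests/dbt_project.yml` rather than passed with `--vars`, and that dbt's default schema naming (target schema plus custom suffix) is in effect.

```yml
vars:
  # Turns on the tests tagged `fivetran_validations` (the tests default this to false)
  fivetran_validation_tests_enabled: true
  # Feeds `+schema: "app_reporting_{{ var('directed_schema','dev') }}"` in the models config.
  # Assumed workflow: use `prod` when building the baseline version,
  # and leave it unset (defaulting to `dev`) for the branch under test.
  directed_schema: prod
```

Both schemas need to be built before the tests can return a meaningful result, since each test selects from the `prod` and `dev` relations directly.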