fix: get sqlmesh started without a default project ID (#2093)
* We should load the GCP project ID from .env
* Adds a README for how to get started with sqlmesh
ryscheng authored Sep 6, 2024
1 parent 010553a commit 67659c1
Showing 4 changed files with 43 additions and 6 deletions.
3 changes: 3 additions & 0 deletions .env.example
@@ -35,6 +35,9 @@ DAGSTER__CLICKHOUSE__HOST=
DAGSTER__CLICKHOUSE__USER=
DAGSTER__CLICKHOUSE__PASSWORD=

## sqlmesh
SQLMESH_DUCKDB_LOCAL_PATH=/tmp/oso.duckdb

###################
# DEPRECATED
###################
34 changes: 34 additions & 0 deletions warehouse/metrics_mesh/README.md
@@ -0,0 +1,34 @@
# OSO sqlmesh pipeline

## Setup

Make sure to set the following environment variables
in your .env file (at the root of the oso repo):

```
GOOGLE_PROJECT_ID=opensource-observer
SQLMESH_DUCKDB_LOCAL_PATH=/tmp/oso.duckdb
```

Make sure you've logged in to Google Cloud in your terminal:

```bash
gcloud auth application-default login
```

Now install dependencies and download playground data into
a local DuckDB instance.

```bash
poetry install
poetry shell
oso metrics local initialize
```

## Run

```bash
cd warehouse/metrics_mesh
sqlmesh plan dev --start 2024-07-01 --end 2024-08-01 # to run for a specific date range (fast)
sqlmesh plan # to run the entire pipeline (slow)
```
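
The Setup section of the README above names two variables that must be present in `.env`. As a rough sketch (not part of this commit), a small check like the following could confirm both are set before running `oso metrics local initialize`. The helper names `parse_env_text` and `missing_vars` are hypothetical, and the parser is deliberately simplified: it handles plain `KEY=VALUE` lines only and does not replicate everything python-dotenv supports, such as quoting or `export` prefixes.

```python
from typing import Dict, List

# The two variables the README asks for.
REQUIRED_VARS = ("GOOGLE_PROJECT_ID", "SQLMESH_DUCKDB_LOCAL_PATH")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vars(text: str) -> List[str]:
    """Return the required variables that are absent or empty (hypothetical helper)."""
    values = parse_env_text(text)
    return [name for name in REQUIRED_VARS if not values.get(name)]


example = (
    "## sqlmesh\n"
    "GOOGLE_PROJECT_ID=opensource-observer\n"
    "SQLMESH_DUCKDB_LOCAL_PATH=/tmp/oso.duckdb\n"
)
print(missing_vars(example))  # prints []
```

In practice the text would come from reading the `.env` file at the repo root; a non-empty result lists what still needs to be filled in.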
4 changes: 3 additions & 1 deletion warehouse/metrics_mesh/lib/local/utils.py
@@ -1,14 +1,16 @@
 import typing as t
 import duckdb
+import os
 from google.cloud import bigquery

+project_id = os.getenv("GOOGLE_PROJECT_ID")

 def bq_to_duckdb(table_mapping: t.Dict[str, str], duckdb_path: str):
     """Copies the tables in table_mapping to tables in duckdb
     The table_mapping is in the form { "bigquery_table_fqn": "duckdb_table_fqn" }
     """
-    bqclient = bigquery.Client()
+    bqclient = bigquery.Client(project=project_id)
     conn = duckdb.connect(duckdb_path)

     conn.sql("CREATE SCHEMA IF NOT EXISTS sources;")
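
In the hunk above, `os.getenv("GOOGLE_PROJECT_ID")` returns `None` when the variable is unset, and that value is passed straight to the client. One possible hardening, shown here only as a sketch and not as what the commit does, is to fail early with a readable message. The function name `resolve_project_id` and the wording of the error are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_project_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GOOGLE_PROJECT_ID from the environment, or raise if it is missing.

    Hypothetical helper: `env` defaults to os.environ and exists mainly so the
    function can be exercised without touching the real environment.
    """
    source = os.environ if env is None else env
    project_id = source.get("GOOGLE_PROJECT_ID", "").strip()
    if not project_id:
        raise RuntimeError(
            "GOOGLE_PROJECT_ID is not set; add it to the .env file at the repo root"
        )
    return project_id


print(resolve_project_id({"GOOGLE_PROJECT_ID": "opensource-observer"}))
```

The returned string could then be passed as the `project` argument in place of the module-level `project_id`.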
8 changes: 3 additions & 5 deletions warehouse/oso_lets_go/cli.py
@@ -2,15 +2,13 @@
 A catchall for development environment tools related to the python tooling.
 """

-import os
-
-import click
 import dotenv
+dotenv.load_dotenv()

+import os
+import click
 from metrics_mesh.lib.local.utils import initialize_local_duckdb, reset_local_duckdb

-dotenv.load_dotenv()
-

 @click.group()
 @click.option("--debug/--no-debug", default=False)
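
The reordering in `cli.py` matters because `utils.py` reads `GOOGLE_PROJECT_ID` at import time, so the `.env` file has to be loaded before that import runs. The sketch below illustrates the effect with the standard library only: it simulates a module whose top-level code snapshots the variable. `MODULE_SOURCE`, `import_snapshot`, and `example-project` are hypothetical stand-ins, not code or values from the repo.

```python
import os
import types

# Stand-in for a module that reads the variable once, when it is imported.
MODULE_SOURCE = 'import os\nproject_id = os.getenv("GOOGLE_PROJECT_ID")\n'


def import_snapshot() -> types.ModuleType:
    """Execute MODULE_SOURCE in a fresh module object, like a first import would."""
    module = types.ModuleType("fake_utils")
    exec(MODULE_SOURCE, module.__dict__)
    return module


# "Import" before the variable exists: the snapshot is None and stays None.
os.environ.pop("GOOGLE_PROJECT_ID", None)
early = import_snapshot()

# Setting the variable afterwards (as a late load_dotenv() would) is too late for `early`.
os.environ["GOOGLE_PROJECT_ID"] = "example-project"
late = import_snapshot()

print(early.project_id)  # prints None
print(late.project_id)   # prints example-project
```

This is why the commit calls `dotenv.load_dotenv()` immediately after `import dotenv`, ahead of the `metrics_mesh` import.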