source-bigquery-batch: Fix DATETIME serialization
BigQuery DATETIME columns only support microsecond precision, but
the default serialization of the Go `civil.DateTime` struct uses
nanosecond precision. That is fine for output, fine for cursors
until the capture restarts, and fine when the datetime values are
whole seconds.

But a fractional-second datetime used as a cursor will be stored
as a string in the state checkpoint, and trying to feed that back
to BigQuery after a restart will produce an error because there
are too many digits of precision.

The fix is to format `civil.DateTime` values into strings ourselves
and make sure the result has at most six fractional digits.
willdonnelly committed Oct 24, 2024
1 parent 6d78535 commit 4fa0a96
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions source-bigquery-batch/main.go
@@ -6,8 +6,10 @@ import (
 	"fmt"
 	"math"
 	"strings"
+	"time"
 
 	"cloud.google.com/go/bigquery"
+	"cloud.google.com/go/civil"
 	"github.com/estuary/connectors/go/schedule"
 	schemagen "github.com/estuary/connectors/go/schema-gen"
 	boilerplate "github.com/estuary/connectors/source-boilerplate"
@@ -62,8 +64,15 @@ func (c *Config) SetDefaults() {
 	}
 }
 
+const (
+	// Google Cloud DATETIME columns support microsecond precision at most
+	datetimeFormatMicros = "2006-01-02T15:04:05.000000"
+)
+
 func translateBigQueryValue(val any, fieldType bigquery.FieldType) (any, error) {
 	switch val := val.(type) {
+	case civil.DateTime:
+		return val.In(time.UTC).Format(datetimeFormatMicros), nil
 	case string:
 		if fieldType == "JSON" && json.Valid([]byte(val)) {
 			return json.RawMessage([]byte(val)), nil
