source-klaviyo: fix incremental pagination #2037

Alex-Bair · 2024-10-09T22:16:56Z

Description:

Incremental streams were using > instead of >= when filtering results with a specific datetime. Although subsequent requests use cursors to paginate through responses, it's possible to miss results within the same second on connector restarts (like for schema inference updates).

Snapshot names have also been updated to align with what's generated automatically by pytest.

Workflow steps:

(How does one use this feature, and how has it changed)

Documentation links affected:

(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)

Notes for reviewers:

Tested on a local stack. Confirmed more data is captured using greater-or-equal instead of greater-than in the filter query param.

This change is

Incremental streams were using `>` instead of `>=` when filtering results with a specific datetime. Although subsequent requests use cursors to paginate through responses, it's possible to miss results within the same second on connector restarts (like for schema inference updates).

williamhbaker

LGTM

There's the obvious consideration that rather than missing records, we'll end up with some duplicates, but that seems better than missing some. And there may be scenarios where the capture could get "stuck" if a huge number of records all have the same timestamp? But I'm not sure how likely that is, and it's probably not something that we can practically address right now.

Alex-Bair · 2024-10-10T02:26:50Z

Agreed, I figured for a quick fix, it'd be better to have some duplicates rather than miss records. And the way the IncrementalKlaviyoStream is set up, I don't think it'll get stuck in a loop. When the connector starts up, it uses the checkpointed datetime for the initial request, then it paginates through the remaining records up to the present with page cursors. Then the datetime of the last emitted record is checkpointed to be used when the connector starts up again. I'm not sure why the connector was built to checkpoint datetime values when Klaviyo does support page cursors.

Changing the checkpointed values to be those page cursors would help reduce duplicates, but threading that through & ensuring nothing breaks is a larger effort that'd take a bit longer than just changing the filter query param.

Alex-Bair added the change:unplanned Unplanned change, useful for things like doc updates label Oct 9, 2024

Alex-Bair requested a review from williamhbaker October 9, 2024 22:16

Alex-Bair added 2 commits October 9, 2024 18:38

source-klaviyo: update snapshot file names

d95d51d

Alex-Bair force-pushed the bair/source-klaviyo-gte branch from 2c38467 to d95d51d Compare October 9, 2024 22:38

williamhbaker approved these changes Oct 10, 2024

View reviewed changes

Alex-Bair merged commit f799669 into main Oct 10, 2024
71 of 76 checks passed

Alex-Bair deleted the bair/source-klaviyo-gte branch October 10, 2024 12:17

Alex-Bair mentioned this pull request Nov 1, 2024

source-klaviyo: filter Profiles with > instead of >= #2115

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

source-klaviyo: fix incremental pagination #2037

source-klaviyo: fix incremental pagination #2037

Alex-Bair commented Oct 9, 2024 •

edited

Loading

williamhbaker left a comment

Alex-Bair commented Oct 10, 2024 •

edited

Loading

source-klaviyo: fix incremental pagination #2037

source-klaviyo: fix incremental pagination #2037

Conversation

Alex-Bair commented Oct 9, 2024 • edited Loading

williamhbaker left a comment

Choose a reason for hiding this comment

Alex-Bair commented Oct 10, 2024 • edited Loading

Alex-Bair commented Oct 9, 2024 •

edited

Loading

Alex-Bair commented Oct 10, 2024 •

edited

Loading