Visit model not deterministic? #47

Open
katy-sadowski opened this issue Jul 13, 2024 · 6 comments

@katy-sadowski
Collaborator

katy-sadowski commented Jul 13, 2024

I observed differences between the ETL-Synthea output and the dbt-synthea output (1 visit existed in ETL-Synthea that didn't exist in dbt-synthea, and vice versa). It appears this might be because, in some cases, an arbitrary record is chosen from among 2 or more records in the process of populating visit_occurrence. I'm still not 100% sure what is going on, but things to look into include:

In general the visit logic feels very complicated; I wonder if there is a simpler way to generate IDs than what's being done here. Let's look into this as well :)
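
To illustrate the suspected failure mode, here is a minimal Python sketch of how picking the "first" record among tied candidates can depend on input order, and how a unique tiebreaker removes that dependence. The rows, column names, and the `pick_first` helper are all hypothetical and are not taken from the actual visit models.

```python
from operator import itemgetter

# Hypothetical encounter rows: two encounters for the same person
# that start on the same day, so they tie on (person_id, start).
rows = [
    {"encounter_id": "b", "person_id": 1, "start": "2020-01-01"},
    {"encounter_id": "a", "person_id": 1, "start": "2020-01-01"},
]


def pick_first(rows, sort_keys):
    """Return the first row per person after sorting by sort_keys."""
    chosen = {}
    for row in sorted(rows, key=itemgetter(*sort_keys)):
        # setdefault keeps only the first row seen for each person.
        chosen.setdefault(row["person_id"], row)
    return chosen


# Sorting on (person_id, start) leaves a tie, so the winner follows
# whatever order the rows happened to arrive in.
ambiguous_1 = pick_first(rows, ("person_id", "start"))
ambiguous_2 = pick_first(list(reversed(rows)), ("person_id", "start"))

# Adding a unique column as the last sort key breaks the tie, so the
# result no longer depends on input order.
stable_1 = pick_first(rows, ("person_id", "start", "encounter_id"))
stable_2 = pick_first(list(reversed(rows)), ("person_id", "start", "encounter_id"))
```

In SQL the analogous situation would be a window function or `ORDER BY` whose keys do not uniquely identify a row; whether that is what actually happens in the visit models still needs to be confirmed.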

@katy-sadowski
Collaborator Author

Encounter ID bd0581d7-28e9-30e9-7e4a-846556992490 appears in dbt's visit_occurrence table but not ETL-Synthea's. Encounter ID d910aaaf-872f-dabd-c0f1-742eacdde64c appears in ETL-Synthea's visit_occurrence but not dbt's. I think it has to do with the way IDs are being assigned here and then prioritization rules are assigned here.

This is also impacting the cost table, which links into the visit tables to get cost information for events. Thus costs are being calculated differently, and there are some discrepancies in the presence/absence of cost rows, as events are being linked to different encounters in the 2 runs.
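
For anyone trying to reproduce this, a comparison like the one above can be sketched in Python as a set difference over the encounter IDs from each run. The two mismatched IDs are the ones reported in this comment; the shared ID and the `diff_ids` helper are hypothetical placeholders, and loading the IDs from the two visit_occurrence tables is left out.

```python
def diff_ids(dbt_ids, etl_ids):
    """Report encounter IDs present in one output but not the other."""
    dbt_ids, etl_ids = set(dbt_ids), set(etl_ids)
    return {
        "only_in_dbt": sorted(dbt_ids - etl_ids),
        "only_in_etl": sorted(etl_ids - dbt_ids),
    }


# "shared-encounter" is a made-up ID standing in for the rows that match.
dbt_ids = ["shared-encounter", "bd0581d7-28e9-30e9-7e4a-846556992490"]
etl_ids = ["shared-encounter", "d910aaaf-872f-dabd-c0f1-742eacdde64c"]

result = diff_ids(dbt_ids, etl_ids)
```

The same check could be applied to the cost table keys to see which cost rows differ between the two runs.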

@lawrenceadams
Collaborator

Having trouble replicating this issue! When I compare both tables they look the same:

image

I have probably messed up though!

I do agree the logic could probably be cleaned up

@katy-sadowski
Collaborator Author

huh! i'm honestly not 100% sure how the logic works so need to dig in further to determine what the issue is... i think this can be accomplished via the refactors i'd like to do to move from the ETL-Synthea SQL into something more "dbt-esque". which i'd like to start on soon 😃

@katy-sadowski katy-sadowski self-assigned this Sep 21, 2024
@lawrenceadams
Collaborator

To be honest neither do I! but yes definitely worth doing!! 😄

@katy-sadowski
Collaborator Author

I have a new theory. I'm cleaning up typecasting, and realized that because ETL-Synthea truncates datetimes to dates and dbt-synthea does not, the visit logic is going to work differently. There are several comparisons of visit dates here which in ETL-Synthea compare dates, and in dbt-synthea compare datetimes.

TBD if it's better to use dates or datetimes for the purpose of this logic. I still need to dive in and understand exactly what's going on in this model.
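
As a rough illustration of why truncation matters, the Python sketch below applies the same comparison once on dates and once on datetimes. The timestamps and the `starts_by_end` rule are hypothetical; the real model's comparisons may differ, but any `<=` style check on visit boundaries would be affected in the same way.

```python
from datetime import datetime

# Hypothetical encounters for one person on the same calendar day.
previous_end = datetime(2020, 1, 1, 9, 0)   # earlier encounter ends 09:00
next_start = datetime(2020, 1, 1, 15, 0)    # later encounter starts 15:00


def starts_by_end(start, end, truncate):
    """Check whether `start` falls on or before `end`.

    With truncate=True both values are reduced to dates first, which is
    an assumption about how ETL-Synthea effectively compares them.
    """
    if truncate:
        start, end = start.date(), end.date()
    return start <= end


# Date comparison: same day, so the later encounter counts as starting
# on or before the earlier one's end.
as_dates = starts_by_end(next_start, previous_end, truncate=True)

# Datetime comparison: 15:00 is after 09:00, so the check fails.
as_datetimes = starts_by_end(next_start, previous_end, truncate=False)
```

If the visit logic uses a check like this to decide which encounters roll up into the same visit, the two pipelines would group same-day encounters differently.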

@lawrenceadams
Collaborator

> I have a new theory. I'm cleaning up typecasting, and realized that because ETL-Synthea truncates datetimes to dates and dbt-synthea does not, the visit logic is going to work differently

Good catch - I completely missed that!
