Visit model not deterministic? #47
Comments
Encounter ID bd0581d7-28e9-30e9-7e4a-846556992490 appears in dbt's visit_occurrence table but not ETL-Synthea's. Encounter ID d910aaaf-872f-dabd-c0f1-742eacdde64c appears in ETL-Synthea's visit_occurrence but not dbt's. I think it has to do with the way IDs are assigned and how the prioritization rules are then applied. This also affects the cost table, which links to the visit tables to get cost information for events. Because events are linked to different encounters in the 2 runs, costs are calculated differently, and some cost rows are present in one run and absent in the other.
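For anyone wanting to reproduce the comparison, a minimal sketch of diffing the encounter IDs from the two runs is below. The two IDs that differ are the ones reported above; `shared-encounter-id` and the variable names are made up for illustration, and in practice the sets would be loaded from each run's visit_occurrence source IDs.

```python
# Encounter IDs found in each run's visit_occurrence output.
# "shared-encounter-id" is a placeholder for the IDs both runs agree on.
dbt_ids = {
    "bd0581d7-28e9-30e9-7e4a-846556992490",
    "shared-encounter-id",
}
etl_ids = {
    "d910aaaf-872f-dabd-c0f1-742eacdde64c",
    "shared-encounter-id",
}

# Set difference in each direction isolates the mismatched encounters.
only_in_dbt = dbt_ids - etl_ids
only_in_etl = etl_ids - dbt_ids

print("only in dbt:", sorted(only_in_dbt))
print("only in ETL-Synthea:", sorted(only_in_etl))
```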
huh! i'm honestly not 100% sure how the logic works so need to dig in further to determine what the issue is... i think this can be accomplished via the refactors i'd like to do to move from the ETL-Synthea SQL into something more "dbt-esque". which i'd like to start on soon 😃
To be honest neither do I! but yes definitely worth doing!! 😄
I have a new theory. I'm cleaning up typecasting, and realized that because ETL-Synthea truncates datetimes to dates and dbt-synthea does not, the visit logic is going to work differently. There are several comparisons of visit dates in the visit logic that compare dates in ETL-Synthea and datetimes in dbt-synthea. TBD whether dates or datetimes are the better choice for this logic. I still need to dive in and understand exactly what's going on in this model.
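To illustrate why truncation matters, here is a small sketch. The "same visit" rule below is a simplified, hypothetical stand-in and not the actual logic in the model; the timestamps are made up. It only shows that the same comparison can flip once both sides are truncated to dates.

```python
from datetime import datetime


def same_visit_by_datetime(prev_stop: datetime, next_start: datetime) -> bool:
    """Hypothetical rule: the next encounter joins the previous visit
    if it starts no later than the previous one stops (full timestamps)."""
    return next_start <= prev_stop


def same_visit_by_date(prev_stop: datetime, next_start: datetime) -> bool:
    """Same hypothetical rule, with both values truncated to dates first."""
    return next_start.date() <= prev_stop.date()


# Two made-up encounters on the same day, a few hours apart.
prev_stop = datetime(2020, 3, 1, 9, 30)
next_start = datetime(2020, 3, 1, 14, 0)

merged_by_datetime = same_visit_by_datetime(prev_stop, next_start)
merged_by_date = same_visit_by_date(prev_stop, next_start)

print("datetime comparison merges:", merged_by_datetime)
print("date comparison merges:", merged_by_date)
```

With these inputs the datetime comparison keeps the encounters separate, while the date comparison treats them as overlapping, so the two pipelines could group encounters into visits differently.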
Good catch - I completely missed that!
I observed differences between the ETL-Synthea output and the dbt-synthea output (1 visit existed in ETL-Synthea that didn't exist in dbt-synthea, and vice versa). It appears this might be due to the fact that in some cases an arbitrary record is chosen from among 2 or more records in the process of populating visit_occurrence. I'm still not 100% sure what is going on, but things to look into include:
In general the visit logic feels very complicated; I wonder if there is a simpler way to generate IDs than what's being done here. Let's look into this as well :)
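A rough sketch of the "arbitrary record" problem described above, in Python rather than the project's SQL. The rows, field layout, and function names are hypothetical. The idea is that if candidates are ordered by a key that can tie, the winner depends on input order; adding a unique column (here the encounter ID) as a final tie-breaker makes the choice repeatable.

```python
from itertools import permutations

# Hypothetical candidate rows: (encounter_id, patient_id, start_date).
rows = [
    ("enc-b", "p1", "2020-03-01"),
    ("enc-a", "p1", "2020-03-01"),  # ties with enc-b on start_date
    ("enc-c", "p1", "2020-03-02"),
]


def pick_ambiguous(candidates):
    # Orders by start_date only, so ties are resolved by input order.
    return min(candidates, key=lambda r: r[2])[0]


def pick_deterministic(candidates):
    # Adds encounter_id as a tie-breaker, so input order no longer matters.
    return min(candidates, key=lambda r: (r[2], r[0]))[0]


# Try every input ordering and collect the distinct winners.
ambiguous_winners = {pick_ambiguous(list(p)) for p in permutations(rows)}
deterministic_winners = {pick_deterministic(list(p)) for p in permutations(rows)}

print("start_date only:", sorted(ambiguous_winners))
print("start_date + encounter_id:", sorted(deterministic_winners))
```

In SQL terms this would correspond to making sure any ordering used to pick a single row ends in a unique column, though whether that is the actual cause here still needs to be confirmed.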