-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue 1414 #1416
Fix issue 1414 #1416
Conversation
Test: test_2_rounds_1k_duckdbPercentage change: -0.5%
Test: test_2_rounds_1k_sqlitePercentage change: 3.7%
Click here for vega lite time series charts |
- Added logic to add the source dataset column to new records if required and not already present. - Added logic to add the unique ID column to new records if not already present.
# If our df_concat_with_tf table already exists, use backwards inference to | ||
# find all underlying term frequency tables. | ||
# If our df_concat_with_tf table already exists, derive the term frequency | ||
# tables from df_concat_with_tf rather than computing them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to the PR, but i wanted to clarify this comment, it took me a little while to understand 'backwards inference'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works as expected so is good to go in from my perspective.
This is also related to an issue that @aliceoleary0 has found (not yet written up in github) so there will likely be further developments on this function in the near future.
@RossKen yep, I will write up the issue you mentioned this week and post a link here for clarity. |
Issue
If you run code like:
you get an error like:
or
This is frustrating from the users point of view because:
See #1414 for bug report.
Testing script: