New polars and Duckdb Times #87
I wouldn't say that's fair. There is quite a performance benefit with regard to cache sizes on |
Hi @ritchie46, Sorry for the wait, but I had other duckdblabs matters to attend to, and wanted to be sure the deviations were within tolerance. I understand that the answer might be different because of the change from float64 to float32, but a tolerance of 0.001 (or 0.1%) is allowed between answers for the same question between versions. I think this is an acceptable tolerance, and solutions/answers should remain precise within this tolerance between versions. I've gone through many of the non-matching answers, and while most respect the tolerance, there are a few deviations that concern me. Take the following results for example (groupby q5).
chk is created using the following SQL queries
v1, v2, and v3 are then combined with a ';' separator. Polars using float64 vs. float32 have results of 47498842805.648 and 47271718912 respectively for other system results for q5. From what I've seen, no other systems use float32, most likely because the results would no longer be valid within the tolerance. Because of the deviation, I will not be merging results that use float32. I thought about partial results, but if one question is omitted, then the whole solution shows "exception/internal error etc." on the leaderboard and is not ranked, which I also don't think is fair. You can check these answers with the following SQL queries (in the branch Tmonster/polars_results_july_5)
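For reference, the size of that deviation can be checked with a few lines of Python. This is only a sketch: the helper names `relative_deviation` and `within_tolerance` are made up for illustration, and it assumes the tolerance is applied as a relative difference against the float64 answer, which may not be exactly how the benchmark's validation compares answers.

```python
def relative_deviation(reference: float, candidate: float) -> float:
    """Relative difference of candidate from reference (hypothetical helper)."""
    return abs(candidate - reference) / abs(reference)


def within_tolerance(reference: float, candidate: float, rel_tol: float = 0.001) -> bool:
    """True if candidate is within rel_tol of reference (assumed comparison rule)."""
    return relative_deviation(reference, candidate) <= rel_tol


# q5 values quoted in the comment above
polars_f64 = 47498842805.648
polars_f32 = 47271718912.0

deviation = relative_deviation(polars_f64, polars_f32)
print(f"relative deviation: {deviation:.4%}")
print("within 0.1% tolerance:", within_tolerance(polars_f64, polars_f32))
```

Under that assumption the float32 result is off by roughly 0.48%, several times the 0.1% allowance.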
|
I am going to merge these results today, and if you want to continue a discussion on moving to float32, we can move it to another issue/PR |
I assumed DuckDB used float32 because of this line: db-benchmark/duckdb/groupby-duckdb.R Line 47 in e54b17f
Of course I am fine with float64, and if you say all systems do so, great! I just want to make sure we all benchmark the same data types; otherwise it's not apples to apples. |
Keep v3 a float64 in polars, otherwise answers are no longer consistent with previous polars versions
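As a rough illustration of why the data type matters for a sum, the sketch below accumulates the same value one million times in float64 and in simulated float32. It uses only the Python standard library; `to_f32` and `sum_f32` are hypothetical helpers, and a naive sequential sum is assumed, which is not necessarily how Polars or DuckDB aggregate internally.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values) -> float:
    """Naive left-to-right sum, rounding the accumulator to float32 after every add."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + to_f32(v))
    return acc


values = [0.1] * 1_000_000

total_f64 = sum(values)      # Python floats are float64
total_f32 = sum_f32(values)  # simulated float32 accumulation
drift = abs(total_f32 - total_f64) / abs(total_f64)

print(f"float64 sum: {total_f64}")
print(f"float32 sum: {total_f32}")
print(f"relative drift: {drift:.4%}")
```

With this naive strategy the float32 total drifts past the 0.1% tolerance mentioned in the thread. Engines may use different summation strategies, so real deviations can differ.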