Comparison error while assertApproximateDataFrameEquality #55

Closed
belgacea opened this issue Jun 12, 2019 · 3 comments

@belgacea

Hello,
I'm trying to use the approximate equality assertion between two DataFrames, but I get an exception saying that the DataFrames don't have the same number of rows, which isn't true, as you can see.

assertApproximateDataFrameEquality(output, expected, 0.01, ignoreNullable = true)
[info]   com.github.mrpowers.spark.fast.tests.DatasetContentMismatch: Actual DataFrame Row Count: '61'
[info] Expected DataFrame Row Count: '61'
[info]   at com.github.mrpowers.spark.fast.tests.DatasetComparer$class.throwIfDatasetsAreUnequal$1(DatasetComparer.scala:197)
[info]   at com.github.mrpowers.spark.fast.tests.DatasetComparer$class.assertLargeDatasetEquality(DatasetComparer.scala:213)
[info]   at com.test.TestSpec.assertLargeDatasetEquality(TestSpec.scala:14)
[info]   at com.github.mrpowers.spark.fast.tests.DatasetComparer$class.assertApproximateDataFrameEquality(DatasetComparer.scala:240)
[info]   at com.test.TestSpec.assertApproximateDataFrameEquality(TestSpec.scala:14)

Note: There are a lot of DoubleType columns in those DataFrames. Is this related to #29?

@belgacea
Author

belgacea commented Jul 7, 2019

Hey @MrPowers! Could you take a look when you have some time?
I still don't understand what my issue is with the assertApproximateDataFrameEquality function :/

@MrPowers
Collaborator

@belgacea - I think this commit should fix the bug you noticed.

Can you bump your project to spark-fast-tests v0.20.0 and see if you get a better error message?

The library was throwing a count mismatch message when it really should have been throwing a content mismatch message. The error message won't be the best... I'll try to make it better soon! Let me know if this helps!
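
To make the distinction concrete, here is a minimal sketch of a comparison that reports a count mismatch only when the row counts really differ, and a content mismatch otherwise. It is not the spark-fast-tests source: rows are modelled as plain Scala sequences instead of Spark Rows, and the names ApproxCompareSketch, CountMismatch, ContentMismatch and assertApproxEqual are hypothetical, chosen only for illustration.

```scala
// Hypothetical sketch, NOT the spark-fast-tests implementation.
// Rows are plain Seq[Any] values so the example runs without Spark.
object ApproxCompareSketch {
  // Hypothetical error types standing in for the library's exceptions.
  final case class CountMismatch(msg: String) extends Exception(msg)
  final case class ContentMismatch(msg: String) extends Exception(msg)

  type Row = Seq[Any]

  // Doubles match when they are within `precision`; other values must be equal.
  def cellsMatch(a: Any, b: Any, precision: Double): Boolean = (a, b) match {
    case (x: Double, y: Double) => math.abs(x - y) <= precision
    case _                      => a == b
  }

  // Rows match when they have the same width and every cell matches.
  def rowsMatch(a: Row, b: Row, precision: Double): Boolean =
    a.length == b.length &&
      a.zip(b).forall { case (x, y) => cellsMatch(x, y, precision) }

  def assertApproxEqual(actual: Seq[Row], expected: Seq[Row], precision: Double): Unit = {
    // Report a count mismatch only when the counts actually differ.
    if (actual.length != expected.length)
      throw CountMismatch(
        s"Actual row count: ${actual.length}, expected row count: ${expected.length}")

    // Same counts: any remaining difference is a content mismatch.
    val firstBad = actual.zip(expected).indexWhere { case (a, e) => !rowsMatch(a, e, precision) }
    if (firstBad >= 0)
      throw ContentMismatch(
        s"Row $firstBad differs: actual=${actual(firstBad)}, expected=${expected(firstBad)}")
  }
}
```

With a check ordered like this, two 61-row inputs that differ in one Double value would produce a message pointing at the differing row rather than at the (identical) row counts.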

@belgacea
Author

Works like a charm! Thank you very much ☺️
