
Better readability for numbers in output #89

Merged

merged 4 commits into main on Aug 28, 2023

Conversation

YYYasin19
Contributor

Closes #88

Adds a method for cutting decimal numbers at their first differing place (cf. issue for a better explanation of the problem).

I also added a method that rounds numbers to their order of magnitude so the difference in large numbers can be seen more easily. Otherwise, you sometimes have an output like 12389239823 rows instead of 12259385943.
I just realized that the current method would round both to 120,000,... but an even better solution would be to round them to their first differing order of magnitude as well, i.e. 12,300,000,000 rows instead of 12,200,000,000.

I'll have to integrate them more though and add tests 👍
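The cutting idea can be sketched roughly like this (a hypothetical stdlib-only helper, not the code in this PR; it assumes non-negative integers and truncates rather than rounds, so the two results can never collapse into the same value):

```python
def round_to_first_diff(a: int, b: int) -> tuple[int, int]:
    """Truncate both numbers after the first digit at which they differ."""
    sa, sb = str(a), str(b)
    if a == b or len(sa) != len(sb):
        # equal values need no truncation; numbers of different
        # magnitude are already easy to tell apart
        return a, b
    first_diff = next(i for i, (x, y) in enumerate(zip(sa, sb)) if x != y)
    factor = 10 ** (len(sa) - first_diff - 1)
    # truncate (rather than round) so the two results stay distinguishable
    return (a // factor) * factor, (b // factor) * factor
```

For the example from the comment above, this yields 12,300,000,000 and 12,200,000,000 instead of the two eleven-digit blobs.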

@codecov

codecov bot commented Nov 20, 2022

Codecov Report

Merging #89 (b3e0ee7) into main (84eaeaa) will increase coverage by 1.91%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
+ Coverage   91.85%   93.76%   +1.91%     
==========================================
  Files          17       18       +1     
  Lines        1878     1894      +16     
==========================================
+ Hits         1725     1776      +51     
+ Misses        153      118      -35     
Files Changed                         Coverage Δ
src/datajudge/constraints/nrows.py    94.93% <100.00%> (+0.19%) ⬆️
src/datajudge/formatter.py            100.00% <100.00%> (ø)
src/datajudge/utils.py                100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

@YYYasin19 YYYasin19 marked this pull request as ready for review December 11, 2022 16:25
@YYYasin19
Contributor Author

Not quite ready for review yet, but I'd like to hear some thoughts.
The problem seems a little more complicated than I thought at first. The UX issue I have is the following:
How would you communicate to the user that the numbers are not perfectly accurate but rounded for better readability?
A global setting? A method arg? Could be an arg to the Requirement or Constraint itself, e.g.

req.add_numeric_mean_constraint(..., accurate_numbers = True)

or maybe an argument for the pytest_integration that then sets a global variable?

@YYYasin19
Contributor Author

As you can see in my usage of the two functions, I'm also not really sure what's the best way from a dev-experience POV.
It would be smart to use them at a 'higher-level' point in the abstraction tree we have in datajudge, e.g. apply the formatting once in get_factual_value.

@kklein
Collaborator

kklein commented Dec 12, 2022

The issue regarding UX I have is the following:
How would you communicate to user that the numbers are not perfectly accurate but rounded for better readability?
A global setting? A method arg? Could be an arg to the Requirement or Constraint itself

I'm not sure I find it totally necessary to communicate to the end user what kind of rounding has taken place, as long as the rounding works in the interest of the prototypical user. Potentially one could add a short phrase to assertion texts indicating what kind of rounding has taken place, though I'm not sure how bloated the texts might become then.

A global setting? A method arg? Could be an arg to the Requirement or Constraint itself, e.g.

If you truly want to make this parametrizable, the test method of a Constraint seems the most natural to me. It's not clear to me why this should be state of a Requirement or Constraint - I'd rather think of it as a parameter to the testing itself. If an end user is not using the pytest_integration module, they could pass it every time they call test. If they are using the pytest_integration module, they only need to pass it once.

As you can see in my usage of the two functions, I'm also not really sure what's the best way from a dev-experience POV.
It would be smart to use them at a 'higher-level' point in the abstraction tree we have in datajudge, e.g. apply the formatting once in get_factual_value.

I'm not really sure I know what you mean. Your exemplary use in NRowsMaxLoss doesn't look too bad to me. :)

@YYYasin19
Contributor Author

Nice! I think you're right about everything you said.
Thinking it through again, I don't think any user should need more than this, since we only round at the first differing place, i.e., there is enough precision to see the difference.

What do you think about percentages instead of floats?

@YYYasin19
Contributor Author

[screenshot: color-coded number diff in the test output]

Ignoring the messy code above: we could also color-code the output for better readability without a trade-off on the user's precision. Very nice idea from @jonashaag!

@ivergara
Collaborator

Great idea with the colorization! Just be cautious of cases where the output destination doesn't support it and you end up with garbage in the string.

@YYYasin19
Contributor Author

Great idea with the colorization! Just be cautious of cases where the output destination doesn't support it and you end up with garbage in the string.

Yep, routing that output into a file gives you a not-so-well-formatted string.

@YYYasin19
Contributor Author

YYYasin19 commented Dec 14, 2022

We can of course hack around like this

import sys

if sys.stdout.isatty():
    # assuming an ANSI-capable terminal; "\x1b[31m" is the red escape code
    print("\x1b[31mThis text is red.\x1b[0m")
else:
    print("This text is red.")

but that may not be worth the hurdle.

@jonashaag
Contributor

It's not a hack, actually; you might want to have a look at how other libraries identify color-readiness. I'd expect similar code.
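For reference, the detection many libraries converge on looks roughly like this (a stdlib-only sketch; NO_COLOR and FORCE_COLOR are common community conventions, not datajudge behavior):

```python
import os
import sys


def supports_color(stream=sys.stdout) -> bool:
    """Heuristic many CLI libraries use to decide whether to emit ANSI colors."""
    if os.environ.get("NO_COLOR"):        # https://no-color.org opt-out convention
        return False
    if os.environ.get("FORCE_COLOR"):     # explicit opt-in beats auto-detection
        return True
    if os.environ.get("TERM") == "dumb":  # terminal that ignores escape codes
        return False
    return hasattr(stream, "isatty") and stream.isatty()
```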

@YYYasin19
Contributor Author

[screenshot: colorized assertion output using colorama]

What do you think about the following implementation? It works using colorama, which is already a dependency of pytest, so it comes at no additional cost for us.

When piping output to a file, colorama strips all formatting, so something like the first example does not happen. It also works in f-strings, which are used in our assert statements.
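A minimal sketch of that setup (the try/except fallback to raw ANSI codes is only there to keep the snippet self-contained when colorama is missing; the cyan escape sequence is identical either way):

```python
try:
    from colorama import Fore, Style, init

    init()  # wraps stdout/stderr; strips ANSI codes when output is redirected
    CYAN, RESET = Fore.CYAN, Style.RESET_ALL
except ImportError:
    # fallback so the snippet also runs without colorama installed
    CYAN, RESET = "\x1b[36m", "\x1b[0m"

n_rows = 12_300_000_000
message = f"Expected at most {CYAN}{n_rows:,}{RESET} rows"
print(message)
```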

@kklein
Collaborator

kklein commented Jan 17, 2023

I still really like the idea of the coloring, yet it doesn't seem to work perfectly for my local machine just yet. When I use the given functionality in an assertion string, I obtain the following:

[screenshot: assertion output where pytest's coloring and the PR's coloring clash]

The default pytest coloring and the coloring introduced in this PR interfere with each other. :/

@YYYasin19 would you be open to making this PR about rounding and tackling the coloring in a follow-up PR? That would separate concerns and facilitate progress.

@YYYasin19
Contributor Author

YYYasin19 commented Jan 17, 2023

@YYYasin19 would you be open to making this PR about rounding and tackling the coloring in a follow-up PR?

I thought the idea was that this was the alternative to rounding. By adding colors, we can still show full complexity while making it very easy to read.

@YYYasin19
Contributor Author

YYYasin19 commented Jan 17, 2023

The default pytest coloring and the coloring introduced in this PR interfere with each other. :/

:(
I'll look for a workaround. Detecting pytest and disabling coloring is not an option, though.
Do you know whether pytest prints the red parts to stderr and the terminal colors them because of that, or whether pytest colors them itself?

@kklein
Collaborator

kklein commented Jan 17, 2023

I thought the idea was that this was the alternative to rounding. By adding colors, we can still show full complexity while making it very easy to read.

I see - sorry about the misunderstanding.

Do you know if pytest prints to stderr the red part and the terminal shows it colored because of that? Or do they color it themselves?

Sadly I don't know.

@YYYasin19
Contributor Author

Hmm, I have no clue so far.
pytest prints different parts separately (e.g., the short summary goes to regular stdout, which most terminals print in white).
We also can't disable the feature for pytest, as that is the main use case.

@YYYasin19
Contributor Author

YYYasin19 commented Jan 21, 2023

Okay, I did some digging, and it turns out that the coloring of the section is currently hard-coded in pytest.
There is a --color=no option in pytest that disables coloring entirely, and we could make changes upstream to make that customizable.
But, since our tests are called from pytest (and not the other way around), it's not really possible to modify the pytest call anyway.

The output seems fine in most UIs I have checked (e.g. PyCharm, VS Code, GitHub output) but breaks in low-contrast environments.

@YYYasin19
Contributor Author

YYYasin19 commented Jun 12, 2023

[screenshot: cyan-highlighted numbers with thousands separators]

New try: after talking with @0xbe7a about #153, we had a look at this PR as well, since it's pretty close to the topic.

TL;DR:

  • Rounding (for all use cases) is hard, not always obvious, and loses precision where it might be needed.
  • This approach uses coloring and thousands separators (cf. image).
  • The chosen color is cyan on purpose, since it's unlikely to collide with pytest colors.
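A rough sketch of what such a diff_color could look like (hypothetical, not the PR's actual implementation): format both numbers with thousands separators and color everything from the first differing character onward in cyan.

```python
CYAN = "\x1b[36m"   # the same escape sequence colorama exposes as Fore.CYAN
RESET = "\x1b[0m"


def diff_color(a: int, b: int) -> tuple[str, str]:
    """Format both numbers with thousands separators and color the
    differing suffix (everything after the common prefix) in cyan."""
    sa, sb = f"{a:,}", f"{b:,}"
    # index of the first character at which the two strings differ
    prefix = next(
        (i for i, (x, y) in enumerate(zip(sa, sb)) if x != y),
        min(len(sa), len(sb)),
    )
    return (
        sa[:prefix] + CYAN + sa[prefix:] + RESET,
        sb[:prefix] + CYAN + sb[prefix:] + RESET,
    )
```

The shared prefix stays in the default color, so the eye is drawn directly to the digits that actually matter.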

@ivergara
Collaborator

Awesome!

Collaborator

@kklein kklein left a comment


Looks great! Should we add colorama to environment.yaml and pyproject.toml?

src/datajudge/utils.py (outdated review thread, resolved)
@YYYasin19
Contributor Author

Looks great! Should we add colorama to environment.yaml and pyproject.toml?

Yes, I'll do that!

Should we also start re-writing some (all?) of the assertion messages to use this or have these few test cases, refine the feature further and refactor the others from time to time?

@kklein
Collaborator

kklein commented Jun 16, 2023

Should we also start re-writing some (all?) of the assertion messages to use this or have these few test cases, refine the feature further and refactor the others from time to time?

Both are fine with me. In the latter case we should probably just make sure to create some shared overview of what has already been done and what remains to be done.

@YYYasin19
Contributor Author

Should we also start re-writing some (all?) of the assertion messages to use this or have these few test cases, refine the feature further and refactor the others from time to time?

Both are fine with me. In the latter case we should probably just make sure to create some shared overview of what has already been done and what remains to be done.

I've created an issue to track this task for next time; I hope that's a good middle ground!

@kklein
Collaborator

kklein commented Jun 17, 2023

I've created an issue to track this task for next time; I hope that's a good middle ground!

Thanks! I'm not sure I understand 100%. What do you think about creating a concrete list of constraints with a binary indicator expressing completion?

Collaborator

@kklein kklein left a comment


Looking great! Just to be sure: has anyone tested this change with the HTML reports yet?

@@ -58,9 +61,10 @@ def compare(self, n_rows_factual: int, n_rows_target: int) -> Tuple[bool, str]:
class NRowsEquality(NRows):
def compare(self, n_rows_factual: int, n_rows_target: int) -> Tuple[bool, str]:
result = n_rows_factual == n_rows_target
n1, n2 = diff_color(n_rows_factual, n_rows_target)
Collaborator


I would either suggest

n_rows_factual_fmt, n_rows_target_fmt = diff_color(n_rows_factual, n_rows_target)

'as before' or

 factual_fmt, target_fmt = diff_color(n_rows_factual, n_rows_target)

as in other places in this PR.

@@ -145,10 +153,13 @@ def compare(self, n_rows_factual: int, n_rows_target: int) -> Tuple[bool, str]:
if n_rows_factual < n_rows_target:
return False, "Row loss."
relative_gain = (n_rows_factual - n_rows_target) / n_rows_target
relative_gain_fmt, min_gain_fmt = diff_color(
Collaborator


Suggested change
relative_gain_fmt, min_gain_fmt = diff_color(
relative_gain_fmt, min_relative_gain_fmt = diff_color(

@kklein
Collaborator

kklein commented Jun 21, 2023

I tested it myself; unfortunately, this doesn't work with reports. :/

[screenshots: HTML report output where the coloring does not render]

@ivergara
Collaborator

@YYYasin19 it should be possible to ask pytest whether the HTML report option is active and, if so, deactivate the coloring.

@YYYasin19
Contributor Author

Alternatively, what do you think about a global option COLORING: bool that is set during the initial call of our pytest_integration?

@kklein
Collaborator

kklein commented Jun 21, 2023

Alternatively, what do you think about a global option COLORING: bool that is set during the initial call of our pytest_integration?

Do I understand correctly that this wouldn't allow for coloring if people don't use pytest_integration but collect constraints themselves? I think that some people do the latter.

@0xbe7a
Contributor

0xbe7a commented Jun 21, 2023

What do you guys think about if we support Colored Tags in the Failure message like Normal Text <blue>Hello from Blue Text</blue>? Then we could extend TestResult.formatted_message(self) -> str to TestResult.formatted_message(self, formatter: Formatter = NoColor) -> str and provide for example the formatters AnsiColorFormatter, HTMLFormatter, NoColorFormatter etc. which replace the colors with platform specific Escape Sequences. Downstream users like the pytest_helper function could then choose which formatter to use based on their own settings and we would avoid global state / side effects. More importantly, all existing tests can still output normal ASCII text as failure messages and can be unaware of any formatters.
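A minimal sketch of that tag-replacement idea (the class and method names here are illustrative, not the ones from 567d6cd):

```python
import re

TAG_RE = re.compile(r"</?(\w+)>")


class Formatter:
    def fmt_str(self, string: str) -> str:
        raise NotImplementedError


class NoColorFormatter(Formatter):
    def fmt_str(self, string: str) -> str:
        return TAG_RE.sub("", string)  # drop the tags, keep the plain text


class AnsiColorFormatter(Formatter):
    CODES = {"red": "\x1b[31m", "blue": "\x1b[34m", "cyan": "\x1b[36m"}

    def fmt_str(self, string: str) -> str:
        def repl(match: re.Match) -> str:
            if match.group(0).startswith("</"):
                return "\x1b[0m"  # any closing tag resets the color
            return self.CODES.get(match.group(1), "")

        return TAG_RE.sub(repl, string)


class HtmlFormatter(Formatter):
    def fmt_str(self, string: str) -> str:
        def repl(match: re.Match) -> str:
            if match.group(0).startswith("</"):
                return "</span>"
            return f'<span style="color: {match.group(1)}">'

        return TAG_RE.sub(repl, string)


message = "Normal Text <blue>Hello from Blue Text</blue>"
```

The choice of formatter then lives entirely in the caller (e.g. the pytest helper), so constraints can keep producing plain tagged strings without any global state.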

@ivergara
Collaborator

What do you guys think about if we support Colored Tags in the Failure message like Normal Text <blue>Hello from Blue Text</blue>? Then we could extend TestResult.formatted_message(self) -> str to TestResult.formatted_message(self, formatter: Formatter = NoColor) -> str and provide for example the formatters AnsiColorFormatter, HTMLFormatter, NoColorFormatter etc. which replace the colors with platform specific Escape Sequences. Downstream users like the pytest_helper function could then choose which formatter to use based on their own settings and we would avoid global state / side effects. More importantly, all existing tests can still output normal ASCII text as failure messages and can be unaware of any formatters.

That's certainly a better solution. If you can pull it off, that'd be great!

@YYYasin19
Contributor Author

What do you guys think about if we support Colored Tags in the Failure message

Nice idea!
How would we do the number highlighting (which started this whole journey) in this system? Have diff_color output our tags, which then get formatted per platform?

@YYYasin19
Contributor Author

I am in favor (of at least trying it out) iff we have other use cases as well. Having more readable output is important, but it does not depend only on better formatting for "long" numbers.

@0xbe7a
Contributor

0xbe7a commented Jun 22, 2023

Nice idea! How would we do the number highlighting (which started this whole journey) in this system? Have diff_color output our tags, which then get formatted per platform?

Yes, exactly. We can even do proper HTML colouring using this approach.

@0xbe7a
Contributor

0xbe7a commented Jun 22, 2023

I implemented my concept in 567d6cd

@ivergara
Collaborator

@0xbe7a looks good!

@ivergara
Collaborator

@YYYasin19 How do we move forward with this PR after #170? Turn it into an example use in one constraint?

@YYYasin19
Contributor Author

@YYYasin19 How do we move forward with this PR after #170? Turn it into an example use in one constraint?

I'll rebase, refactor some code and then add some examples to the more popular constraints we use!

@YYYasin19
Contributor Author

I think the PR is mostly ready. It has some examples of the usage (that we can further extend) and some tests.
The only thing "missing" is the codecov target; any ideas on how to improve that? The uncovered new lines are the ones that apply the formatting in the test messages.

src/datajudge/utils.py (outdated review thread, resolved)
@ivergara
Collaborator

Thank you @YYYasin19 and @0xbe7a for this undertaking!

@ivergara ivergara merged commit 6df9980 into main Aug 28, 2023
43 checks passed
@ivergara ivergara deleted the fmt-float-increase branch August 28, 2023 08:03
Successfully merging this pull request may close these issues:

Output: Round numbers for cleaner output

5 participants