Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exclude_defaults when value's __eq__ method raises an exception. #1490

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

sneakers-the-rat
Copy link

@sneakers-the-rat sneakers-the-rat commented Oct 22, 2024

First opening this as a draft with just the failing test to demo that we fix it...
edit: and now added the fix :) ready to review

Change Summary

Catch errors in comparing a value to the default value when either the default or the provided value has some __eq__ method that raises an error, treat those as "not equal to the default."

This is especially important for types like arrays which traditionally raise a ValueError like "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()" when comparing equality - i.e. the exception in the __eq__ method is expected behavior, which breaks optional fields with a default None that can take an array.

Related issue number

Fix: pydantic/pydantic#10547

Checklist

  • Unit tests for the changes exist
  • Documentation reflects the changes where applicable
  • Pydantic tests pass with this pydantic-core (except for expected changes)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @davidhewitt

Copy link

codspeed-hq bot commented Oct 22, 2024

CodSpeed Performance Report

Merging #1490 will not alter performance

Comparing sneakers-the-rat:exclude-eq-exception (e80518a) with main (1ced3e6)

Summary

✅ 155 untouched benchmarks

Copy link

codecov bot commented Oct 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.11%. Comparing base (ab503cb) to head (e80518a).
Report is 197 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1490      +/-   ##
==========================================
- Coverage   90.21%   89.11%   -1.10%     
==========================================
  Files         106      112       +6     
  Lines       16339    17826    +1487     
  Branches       36       40       +4     
==========================================
+ Hits        14740    15886    +1146     
- Misses       1592     1920     +328     
- Partials        7       20      +13     
Files with missing lines Coverage Δ
src/serializers/fields.rs 96.13% <100.00%> (+0.49%) ⬆️

... and 51 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ced3e6...e80518a. Read the comment docs.

@sneakers-the-rat sneakers-the-rat marked this pull request as ready for review October 22, 2024 01:04
@sneakers-the-rat
Copy link
Author

I hope I am reading the PR template right that i am supposed to now make a comment saying "please review" because that is what i am doing :)

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This seems reasonable to me, though silently ignoring exceptions can always potentially be confusing.

It's a change of behaviour, and we might want to document that exclude_defaults will serialize anyway if the default comparison fails.

I wonder if, as a backwards-compatible alternative, we should expose a config value to override the default_eq comparison would be safer? Users could then supply a function which swallows exceptions (or does whatever else is suitable for their use case).

I suppose even more general, rather than just default_eq we could have an exclude_if function which checks the value and returns True / False. This would be a way to generalize exclude_defaults and exclude_none etc.

Or I guess rather than making exclude_if, we could just extend exclude to accept a callable as an alternative to a bool? 🤔

cc @sydney-runkle

src/serializers/fields.rs Outdated Show resolved Hide resolved
@sneakers-the-rat
Copy link
Author

though silently ignoring exceptions can always potentially be confusing.

This is fair - although in this case I think it makes sense semantically: "exclude things that are equal to the default. If something cannot be meaningfully compared to the default such that it emits an error when you try, then that means that it probably isn't equal to the default." I think worst case is that we include a few strange custom-classes values in the output that are technically equal to the default, but that is more of a burden on the implementers to implement equality operators (in this case, I can't, since i'm wrapping array libraries where there is strong and meaningful precedent to emit an error on ==).

Equality comparisons for all the standard pydantic types should not raise an exception and always return False if they are not equal, right? I am assuming since I couldn't find this issue having been raised before. So I wonder how many workflows we would break by allowing errored equality comparisons.

We should probably emit a warning when this happens though, that would be polite. If we want that I can read through the code and figure out how to do that (although it might take me a bit because i am new to rust)

we might want to document that exclude_defaults will serialize anyway if the default comparison fails.

I can of course add this to the docs if this is what we decide to go with! Since there are a few other ideas on the table (love the enthusiasm <3) i'll hold off for now.

we should expose a config value to override the default_eq comparison would be safer? Users could then supply a function which swallows exceptions (or does whatever else is suitable for their use case).

So are you thinking of something like this?

def my_comparator(default: Any, value: Any) -> bool:
    # do whatever i want here
    return random.random() >= 0.5
    

model.model_dump_json(exclude_default = True, default_eq = my_comparator)

If we go with that, it would be really great to be able to declare something like that in a SchemaSerializer or otherwise in a core_schema definition so that I could just declare a proper array equality function that's used by default when serializing NDArray types. It does sound like it has the potential to blow up in complexity tho, bc once it's there, you'd expect it in annotated field types, model config, etc...

All the eq checks are actually calls to python __eq__ methods, not some faster rust equality checking right? just asking because if they aren't, and there's some perf disadvantage, we might want to either make that behave like the python json "default" hook which only calls the function when there is an error using the normal serializer or add that as another option.

I suppose even more general, rather than just default_eq we could have an exclude_if function which checks the value and returns True / False. This would be a way to generalize exclude_defaults and exclude_none etc...

I would actually love this, but it doesn't necessarily resolve my goal of "make arrays behave as you would expect with all pydantic functionality" and would require an additional term for every serialization with arrays in it. maybe that's for another PR?

@sneakers-the-rat
Copy link
Author

Bump on this, I see discussion going forward about the functional exclude filter here, which sounds like a good idea: pydantic/pydantic#10728

But this PR fixes an orthogonal problem specifically with the exclude_defaults flag, it's still breaking my tests for a custom type being compliant with all pydantic features, and im not sure where we landed on what I need to change here.

I think emitting a warning when a field is considered not a default by way of an exception being caught in comparing equality makes sense, and if maintainers agree then ill go ahead and add it, or lmk if we want other changes :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle equality exceptions from exclude_default as False when serializing
2 participants