-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a memory leak caused by schematics json schemas #50
Conversation
Schematics types are instantiated before being used as a key to identify whether the json schema has already been extracted. As a result, every time the schema is extracted, the previous cached value is not used, causing the cache dict to grow indefinitely.
1af51a7
to
9db64a9
Compare
uranium
Outdated
@@ -10,8 +10,8 @@ URANIUM_GITHUB_ACCOUNT = os.getenv("URANIUM_GITHUB_ACCOUNT", "toumorokoshi") | |||
URANIUM_GITHUB_BRANCH = os.getenv("URANIUM_GITHUB_BRANCH", "master") | |||
URANIUM_STANDALONE_URL = "https://raw.githubusercontent.com/{account}/uranium/{branch}/uranium/scripts/uranium_standalone".format( | |||
account=URANIUM_GITHUB_ACCOUNT, branch=URANIUM_GITHUB_BRANCH | |||
) | |||
CACHE_DIRECTORY = os.path.join(ROOT, ".env", ".uranium") | |||
virtual) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this slipped in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, will clean that up.
assert original_payload is second_payload | ||
|
||
# classes are also valid output to recieve that the json schema | ||
original_payload = serializer.to_json_schema(SchematicsBody) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These SchematicsBody
classes do not get instantiated within the to_json_schema
method because it fails the issubclass(model_or_class, BaseType)
check in _enforce_instance
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SchematicsBody classes should not get instantiated. you could argue _enforce_instance is a misnomer here, as schematics has an incongruent apis between "types" and "models":
- types need to be an instance to call validate
- models need to be a class to instantiate and call validate
that incongruency leads to a lot of the code that exists in this file, dealing with specific edge cases around the various classes and their similar, but slightly mismatched, APIs.
to clarify, I could rename this method to "_enforce_instance_type_or_model". How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, got it, thanks for clarifying. That change sounds great!
instance = ModelType(SchematicsBody) | ||
original_payload = serializer.to_json_schema(instance) | ||
second_payload = serializer.to_json_schema(instance) | ||
assert original_payload is second_payload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we expect two different instances of the same thing to be the same payload? e.g.
instance = ModelType(SchematicsBody)
second_instance = ModelType(SchematicsBody)
original_payload = serializer.to_json_schema(instance)
second_payload = serializer.to_json_schema(second_instance)
assert original_payload is second_payload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as transmute is concerned, it would be a weird pattern to wrap your type in a ModelType in a non-nested scenario. But could definitely see people doing that.
This won't cause a memory leak as ModelTypes are only created dynamically once when the raw type is passed through, and cached from that point on. Assuming the user isn't dynamically creating modeltypes, that cache won't grow unbounded.
I think it's a good optimization, but if I was to clean this up properly I'd probably add more strict validation to reject weird edge cases like this. If you'd like to follow up with that optimization in a PR I encourage it, but I'm personally not sure the extra complexity is worth a few fewer cache entries.
fixing uranium typo _enforce_instance wasn't an accurate description of the desired behavior, so it should be renamed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for this!
Schematics types are instantiated before being used as a key to
identify whether the json schema has already been extracted. As
a result, every time the schema is extracted, the previous cached
value is not used, causing the cache dict to grow indefinitely.