Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a memory leak caused by schematics json schemas #50

Merged
merged 2 commits into from
Dec 19, 2019

Conversation

toumorokoshi
Copy link
Owner

Schematics types are instantiated before being used as a key to
identify whether the json schema has already been extracted. As
a result, every time the schema is extracted, the previous cached
value is not used, causing the cache dict to grow indefinitely.

Schematics types are instantiated before being used as a key to
identify whether the json schema has already been extracted. As
a result, every time the schema is extracted, the previous cached
value is not used, causing the cache dict to grow indefinitely.
@toumorokoshi toumorokoshi force-pushed the feature/fix-memory-leak branch from 1af51a7 to 9db64a9 Compare December 19, 2019 00:19
uranium Outdated
@@ -10,8 +10,8 @@ URANIUM_GITHUB_ACCOUNT = os.getenv("URANIUM_GITHUB_ACCOUNT", "toumorokoshi")
URANIUM_GITHUB_BRANCH = os.getenv("URANIUM_GITHUB_BRANCH", "master")
URANIUM_STANDALONE_URL = "https://raw.githubusercontent.com/{account}/uranium/{branch}/uranium/scripts/uranium_standalone".format(
account=URANIUM_GITHUB_ACCOUNT, branch=URANIUM_GITHUB_BRANCH
)
CACHE_DIRECTORY = os.path.join(ROOT, ".env", ".uranium")
virtual)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this slipped in.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, will clean that up.

assert original_payload is second_payload

# classes are also valid output to recieve that the json schema
original_payload = serializer.to_json_schema(SchematicsBody)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These SchematicsBody classes do not get instantiated within the to_json_schema method because it fails the issubclass(model_or_class, BaseType) check in _enforce_instance.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SchematicsBody classes should not get instantiated. you could argue _enforce_instance is a misnomer here, as schematics has an incongruent apis between "types" and "models":

  • types need to be an instance to call validate
  • models need to be a class to instantiate and call validate

that incongruency leads to a lot of the code that exists in this file, dealing with specific edge cases around the various classes and their similar, but slightly mismatched, APIs.

to clarify, I could rename this method to "_enforce_instance_type_or_model". How does that sound?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, got it, thanks for clarifying. That change sounds great!

instance = ModelType(SchematicsBody)
original_payload = serializer.to_json_schema(instance)
second_payload = serializer.to_json_schema(instance)
assert original_payload is second_payload

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expect two different instances of the same thing to be the same payload? e.g.

    instance = ModelType(SchematicsBody)
    second_instance = ModelType(SchematicsBody)
    original_payload = serializer.to_json_schema(instance)
    second_payload = serializer.to_json_schema(second_instance)
    assert original_payload is second_payload

Copy link
Owner Author

@toumorokoshi toumorokoshi Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as transmute is concerned, it would be a weird pattern to wrap your type in a ModelType in a non-nested scenario. But could definitely see people doing that.

This won't cause a memory leak as ModelTypes are only created dynamically once when the raw type is passed through, and cached from that point on. Assuming the user isn't dynamically creating modeltypes, that cache won't grow unbounded.

I think it's a good optimization, but if I was to clean this up properly I'd probably add more strict validation to reject weird edge cases like this. If you'd like to follow up with that optimization in a PR I encourage it, but I'm personally not sure the extra complexity is worth a few fewer cache entries.

fixing uranium typo

_enforce_instance wasn't an accurate description of the desired
behavior, so it should be renamed.
@coveralls
Copy link

Coverage Status

Coverage remained the same at ?% when pulling 811978c on feature/fix-memory-leak into d69ad41 on master.

Copy link

@colinjmiller colinjmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for this!

@toumorokoshi toumorokoshi merged commit 4282d08 into master Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants