Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic Serialization #323

Draft
wants to merge 4 commits into
base: dev
Choose a base branch
from
Draft

Automatic Serialization #323

wants to merge 4 commits into from

Conversation

LarsKue
Copy link
Contributor

@LarsKue LarsKue commented Feb 19, 2025

The goals of this PR are:

  • Remove the need to implement from_config and get_config methods for most objects
  • Remove the need to save a config object in the constructor of most objects
  • Unify the serialization of types, functions, and other objects
  • Improve the serialization of functions, especially with respect to numpy

To achieve the above goals, this PR introduces:

  • A custom serialize and deserialize function that wrap around the keras utilities
  • A custom Serializable class that automatically implements from_config and get_config

@LarsKue LarsKue self-assigned this Feb 19, 2025
@LarsKue LarsKue added the sugar Syntactical sugar or quality of life improvements label Feb 19, 2025
@LarsKue LarsKue added this to the BayesFlow 2.0 milestone Feb 19, 2025
@stefanradev93
Copy link
Contributor

That's a super nice utility! Which networks are already tested through?

@LarsKue
Copy link
Contributor Author

LarsKue commented Feb 19, 2025

It's still in progress, so not functional for all networks yet. But I made changes such that most networks should already be tested.

@vpratz
Copy link
Collaborator

vpratz commented Feb 20, 2025

This looks really nice! Please take care when removing get_config functions, some attributes might accumulate state that is not automatically serialized by Keras, so already serializing them when __init__ is called might not be sufficient. I'm not totally sure, but I think this might apply for example for ContinuousApproximator.adapter.

@LarsKue LarsKue force-pushed the automatic-serialization branch from b5c5369 to e2304e7 Compare February 25, 2025 17:47
@LarsKue
Copy link
Contributor Author

LarsKue commented Feb 25, 2025

@vpratz Thank you for raising the issue. After investigating further, I opted to go with a semi-automatic and more classical OOP approach of using a Serializable base class that implements from_config and get_config and allows semi-automatic saving of stateless and stateful variables with a single self.initialize_config() call. Could you take a brief look?

@vpratz
Copy link
Collaborator

vpratz commented Feb 26, 2025

Thanks for taking this into account. I currently don't have a clear opinion, so I'll just share some thoughts:

  • can we somehow get rid of the self.initialize_config() without arguments, e.g. by moving it to __init__subclass? Instead I would opt for a register_stateful_variables function that appends to self.config['statefule_fields'].
  • if we make the creation of config implicit, do we want to make it private/protected, i.e., rename it to self._config?
  • currently, I think in most places we follow the convention that everything that is not Keras weights but needed to construct an object is passed through __init__. With those changes we do away with this convention, which is okay in my opinion, but it comes at the cost that we can no longer point at how Keras does it to explain how we do it.
  • I liked that the decorator was somewhat explicit. The sub-sub-classing might make it hard to track what is going on, so we need to document it prominently for devs.

@LarsKue
Copy link
Contributor Author

LarsKue commented Feb 26, 2025

Thanks for sharing. To address your points:

  • No, it is not possible to safely get rid of the call entirely. There is no way to inspect the actual values that are passed to the constructor from the outside without major hacking (e.g. replacing the constructor function entirely) or additional user code (e.g. another decorator on the __init__)
  • __init_subclass__ is called when the class is created, not the instance.
  • We could make a register_stateful_fields function, but I would opt to make it in addition to the initialize_config

make self.config protected

  • I have thought about this, but I don't mind either solution. I think it is ok to have the config attribute public, since it is only created semi-automatically.

everything is passed through init

  • This is incorrect. Any class that internally changes state violates this property. We still mostly follow the keras idiom in that we remember the constructor arguments and by default assume no such internal state changes.

  • I agree that the decorator is more explicit, but also not particularly typical for OOP and too limiting for the automation of this process. In particular, if we create the from_config and get_config methods in a decorator, static type checkers and auto completion will have a hard time figuring out what is going on. Allowing to override the methods is then also complicated. Essentially, we would emulate OOP by non-OOP methods, which is just not idiomatic.

@vpratz
Copy link
Collaborator

vpratz commented Feb 26, 2025

Thanks a lot for the detailed response. With this additional information on the constraints, I'm happy with the proposed solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sugar Syntactical sugar or quality of life improvements
Projects
Status: Future
Development

Successfully merging this pull request may close these issues.

3 participants