Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor model class and runners to be more independent #494
refactor model class and runners to be more independent #494
Changes from 26 commits
8df6acb
622461c
1fc5c22
cb9f59c
9fe3cf5
134e026
a29661e
c612812
24c2595
f2db76f
ddc384a
2312951
921ad10
a05a8b3
e851890
86a07f5
991ede3
b1337d7
7a4a8a9
ee5987e
8fd2217
556ab26
1123b63
1e20682
e317b91
829a55a
a3738a4
ed00d68
ab52d07
12daf0e
7f34602
27fb1ec
7cb44d5
5fe1e7a
2d9af83
cd45884
9974c4d
c0716ba
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These parameters would provide much greater flexibility in defining the model. However, I'm uncertain whether we should include these additional parameters in the config.yaml file or establish a fixed structure for defining the model
Could this potentially confuse the users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main reason not to allow this, would be if we don't want to support and maintain these options. However if we allow init args in the config, it feels odd and incomplete without also being able to specify module and class. I found it convenient to be able to specify args in particular, but also the model file for tests. Note the defaults are what we have now, where it looks for a ModelClass in model.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm less stuff to customize in config.yaml is better I think. Right now we expect 1/model.py as the model file, it hink that can atleast be a default here and then let people override it.
We also need to badly get this config spec into a proto so it's defined what the fields are and we can parse it from the proto. Otherwise we'll have what we had before with config maps which got out of hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed them for now. We can figure out a way to better include init args in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, what these args could be for initialising model class?
because we are not using any
__init__
method in ModelClassThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should allow init args for specifying checkpoint files, paths to config files or other params. This would allow us to more easily init models for testing to point them at different paths. The other option would be to hardcode paths in a top-level main python file, and use those to init another model that has init args, but I like being able to specify them along with the rest of the config. That way we can write those configs with the rest of the model config, and not require to write a templated python file as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it's a model specific thing why not just put it in model.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or these args are the inspected arts of the python function and not something we put into config.yaml right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be able to pass in some config args to the model, like
checkpoint_path
or potentially preprocessing params or a path to another config file. That way when you're developing, you can init the model directly with different settings, and deploy it with a particular set of args for the model upload.The alternative, would be to essentially make model.py a config file written in python with some extra boilerplate, rather than have the model definition. That would turn model.py into something like the following, which I think is worse than putting args in the config.yaml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, it seems we're a little unsure of how to handle these sorts of things now. I think allowing args or a path to another config file for init would be best. Otherwise we'll end up with boilerplate python files that are config files. But we can try to do that in a separate PR, and I can limit this one to just the ModelClass inheritance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want to guide people into always have to implement this, even if their implementation is "pass"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. I don't know if we need this now. However, I'll change back to NotImplementedError for now, and the examples can keep the checkpoint loads there to guide users to that, as you mention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does model.load_model() get called now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That happens here, when the model is constructed. I'm actually not sure if we need to separate
__init__
fromload_model
now, we could also have it load in its init. To me it looks like reason we didn't do that in the old class structure, was because the init function was for the Runner, not the Model, and so had all the params for runner ids etc. as its init args, which we didn't want the user to have to deal with in their implementation. Now that the model is not inheriting from runner, we can use init instead and remove load_model. The only advantage to keeping it would be if we wanted to allow for construction before checkpoint loading, e.g. for examining config without loading the checkpoint; there isn't a place we do that yet, though.https://github.com/Clarifai/clarifai-python/pull/494/files#diff-265e698c90224658ed2a2213722fc5ddac5cec98e352802afcbf75760789e383R105
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i think we should keep load_model to make it more clear what you have to implement as a user.