refactor model class and runners to be more independent #494

deigen · 2025-01-23T17:41:10Z

Currently, the ModelRunner is the primary object of instantiation for a model in a server. This has several drawbacks:

The user can't directly create their model without also creating a Runner, unless always including boilerplate workarounds. Creating a model directly is useful for testing and local use.
It's cumbersome to create gRPC-served models, as they also require instantating a Runner class.
The instantiation process of the Model in the server isn't clear, at no point is there a straightforward call to model = Model(**args); the init is instead the Runner init.

Instead, we should define the model implementation using ModelClass directly, and create Runner and gRPC servers that can each call the Model object. This separates the model class from the server method, while still maintaining the same interfaces.

NOTE: This change also requires us to change the base class of all models in the examples repo, to use ModelClass instead of ModelRunner as their base class.

luv-bansal

Looks good to me!! Just commented few doubts.

I like the idea of creating Model from ModelClass instead of ModelRunner and separating it from the server.

clarifai/cli/model.py

luv-bansal · 2025-01-27T13:20:49Z

clarifai/runners/models/model_builder.py

+        )
+      model_class = classes[0]
+
+    model_args = class_config.get("args", {})


Not sure, what these args could be for initialising model class?
because we are not using any __init__ method in ModelClass

I think we should allow init args for specifying checkpoint files, paths to config files or other params. This would allow us to more easily init models for testing to point them at different paths. The other option would be to hardcode paths in a top-level main python file, and use those to init another model that has init args, but I like being able to specify them along with the rest of the config. That way we can write those configs with the rest of the model config, and not require to write a templated python file as well.

if it's a model specific thing why not just put it in model.py?

or these args are the inspected arts of the python function and not something we put into config.yaml right?

I'd like to be able to pass in some config args to the model, like checkpoint_path or potentially preprocessing params or a path to another config file. That way when you're developing, you can init the model directly with different settings, and deploy it with a particular set of args for the model upload.

The alternative, would be to essentially make model.py a config file written in python with some extra boilerplate, rather than have the model definition. That would turn model.py into something like the following, which I think is worse than putting args in the config.yaml.

import yaml from my_model import MyModel model_config_file = 'model_config.yaml' model_config = yaml.safe_load(open(model_config_file)) class MyModelClass(MyModel): def __init__(self): super().__init__(**model_config)

However, it seems we're a little unsure of how to handle these sorts of things now. I think allowing args or a path to another config file for init would be best. Otherwise we'll end up with boilerplate python files that are config files. But we can try to do that in a separate PR, and I can limit this one to just the ModelClass inheritance.

luv-bansal · 2025-01-27T13:24:24Z

clarifai/runners/models/model_builder.py

+    class_config = self.config.get("class_info", {})
+
+    model_file = class_config.get("file_path")


These parameters would provide much greater flexibility in defining the model. However, I'm uncertain whether we should include these additional parameters in the config.yaml file or establish a fixed structure for defining the model

Could this potentially confuse the users?

I think the main reason not to allow this, would be if we don't want to support and maintain these options. However if we allow init args in the config, it feels odd and incomplete without also being able to specify module and class. I found it convenient to be able to specify args in particular, but also the model file for tests. Note the defaults are what we have now, where it looks for a ModelClass in model.py.

hmm less stuff to customize in config.yaml is better I think. Right now we expect 1/model.py as the model file, it hink that can atleast be a default here and then let people override it.

We also need to badly get this config spec into a proto so it's defined what the fields are and we can parse it from the proto. Otherwise we'll have what we had before with config maps which got out of hand.

I removed them for now. We can figure out a way to better include init args in a separate PR.

luv-bansal · 2025-01-27T13:54:37Z

clarifai/runners/server.py

+  parser.add_argument(
+      '--download_checkpoints',
+      action='store_true',
+      help='Downloads any remote checkpoints specified in the config',
+  )


Not sure when this would be use. I think when local testing download_checkpoints should always be called

I put this in because the old switch --start_dev_server called download_checkpoints. I replaced that with --grpc which just enables grpc server instead of the runner, but doesn't download checkpoints on its own. So now --start_dev_server would be equivalent to --grpc --download_checkpoints. I don't know if that's useful or needed, but kept it for that reason. However, I deleted this download_checkpoints option now, since now there is a new cli command for that. If you think --grpc --download_checkpoints in this one command would still be useful to correspond to the old --start_dev_server we can put it back in.

zeiler · 2025-01-29T17:04:56Z

clarifai/runners/models/model_runner.py

@@ -43,7 +41,7 @@ def __init__(
        num_parallel_polls,
        **kwargs,
    )
-    self.load_model()
+    self.model = model


where does model.load_model() get called now?

That happens here, when the model is constructed. I'm actually not sure if we need to separate __init__ from load_model now, we could also have it load in its init. To me it looks like reason we didn't do that in the old class structure, was because the init function was for the Runner, not the Model, and so had all the params for runner ids etc. as its init args, which we didn't want the user to have to deal with in their implementation. Now that the model is not inheriting from runner, we can use init instead and remove load_model. The only advantage to keeping it would be if we wanted to allow for construction before checkpoint loading, e.g. for examining config without loading the checkpoint; there isn't a place we do that yet, though.

https://github.com/Clarifai/clarifai-python/pull/494/files#diff-265e698c90224658ed2a2213722fc5ddac5cec98e352802afcbf75760789e383R105

yeah i think we should keep load_model to make it more clear what you have to implement as a user.

zeiler · 2025-01-29T17:05:37Z

clarifai/runners/models/model_class.py

  def load_model(self):
-    raise NotImplementedError("load_model() not implemented")
+    pass


don't we want to guide people into always have to implement this, even if their implementation is "pass"?

See above. I don't know if we need this now. However, I'll change back to NotImplementedError for now, and the examples can keep the checkpoint loads there to guide users to that, as you mention.

github-actions · 2025-01-31T17:54:05Z

Package	Line Rate	Health
clarifai	43%	❌
clarifai.cli	64%	➖
clarifai.client	71%	➖
clarifai.client.auth	74%	➖
clarifai.constants	100%	✔
clarifai.datasets	100%	✔
clarifai.datasets.export	80%	✔
clarifai.datasets.upload	75%	➖
clarifai.datasets.upload.loaders	37%	❌
clarifai.models	100%	✔
clarifai.modules	0%	❌
clarifai.rag	72%	➖
clarifai.runners	13%	❌
clarifai.runners.models	56%	➖
clarifai.runners.utils	65%	➖
clarifai.schema	100%	✔
clarifai.urls	80%	✔
clarifai.utils	77%	➖
clarifai.utils.evaluation	67%	➖
clarifai.workflows	94%	✔
Summary	68% (4423 / 6541)	➖

Minimum allowed line rate is 50%

deigen added 10 commits January 22, 2025 21:39

get model class from config

8df6acb

download checkpoints command

622461c

pass model to server class, change grpc flag name

1fc5c22

download_checkpoints option in server

cb9f59c

rename ModelUploader to ModelBuilder

9fe3cf5

create_model_instance function

134e026

include main for now to reduce diff

a29661e

refactor upload model main to reduce diff

c612812

fix arg

24c2595

fixes

f2db76f

deigen requested review from zeiler and luv-bansal January 23, 2025 17:41

deigen added 4 commits January 23, 2025 17:55

fix formatting

ddc384a

make class_info section optional

2312951

edit unittest

921ad10

edit tests to use model class

a05a8b3

deigen force-pushed the model-class-refactor branch 3 times, most recently from ebd1ca8 to 19e18db Compare January 24, 2025 03:05

init model directly for unittest

e851890

deigen force-pushed the model-class-refactor branch from 19e18db to e851890 Compare January 24, 2025 03:08

deigen added 4 commits January 24, 2025 16:58

update cmdline

86a07f5

Merge branch 'master' into model-class-refactor

991ede3

fix runner inheritence and simplify test

b1337d7

update calls in grpc server

7a4a8a9

deigen requested a review from ackizilkale January 24, 2025 19:04

deigen added 4 commits January 24, 2025 11:07

update names in test

ee5987e

fix test

8fd2217

fix test

556ab26

formatting

1123b63

deigen added 3 commits January 24, 2025 14:36

fix name in test

1e20682

fix use of modelclass in tests

e317b91

move load_model into default construction

829a55a

luv-bansal approved these changes Jan 27, 2025

View reviewed changes

deigen added 2 commits January 27, 2025 14:50

Merge branch 'master' into model-class-refactor

a3738a4

remove download_checkpoints arg

ed00d68

deigen force-pushed the model-class-refactor branch from 3a75123 to ed00d68 Compare January 27, 2025 21:49

deigen added 4 commits January 27, 2025 18:16

set uuid in test setup

ab52d07

write configs to tempfiles to keep tests independent

12daf0e

set compute cluster id in nodepool test

7f34602

revert unrelated changes to test_compute_orchestration

27fb1ec

zeiler reviewed Jan 29, 2025

View reviewed changes

deigen added 6 commits January 29, 2025 13:23

add error message for updaing runner class

7cb44d5

put back load_model as abstract method

5fe1e7a

remove init args for this pr

2d9af83

fix tests

cd45884

copy all of 1/ dir

9974c4d

Merge branch 'master' into model-class-refactor

c0716ba

deigen force-pushed the model-class-refactor branch from f6baf39 to c0716ba Compare January 31, 2025 17:45

deigen mentioned this pull request Jan 31, 2025

Updates examples for new ModelClass base Clarifai/examples#115

Merged

deigen merged commit 324491e into master Jan 31, 2025
17 checks passed

deigen deleted the model-class-refactor branch January 31, 2025 18:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor model class and runners to be more independent #494

refactor model class and runners to be more independent #494

deigen commented Jan 23, 2025 •

edited

Loading

luv-bansal left a comment

luv-bansal Jan 27, 2025 •

edited

Loading

deigen Jan 27, 2025 •

edited

Loading

zeiler Jan 29, 2025

zeiler Jan 29, 2025

deigen Jan 29, 2025

deigen Jan 29, 2025

luv-bansal Jan 27, 2025

deigen Jan 27, 2025

zeiler Jan 29, 2025

deigen Jan 29, 2025

luv-bansal Jan 27, 2025

deigen Jan 27, 2025

zeiler Jan 29, 2025

deigen Jan 29, 2025

zeiler Jan 29, 2025

zeiler Jan 29, 2025

deigen Jan 29, 2025

github-actions bot commented Jan 31, 2025

		class_config = self.config.get("class_info", {})

		model_file = class_config.get("file_path")

refactor model class and runners to be more independent #494

refactor model class and runners to be more independent #494

Conversation

deigen commented Jan 23, 2025 • edited Loading

luv-bansal left a comment

Choose a reason for hiding this comment

luv-bansal Jan 27, 2025 • edited Loading

Choose a reason for hiding this comment

deigen Jan 27, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Jan 31, 2025

deigen commented Jan 23, 2025 •

edited

Loading

luv-bansal Jan 27, 2025 •

edited

Loading

deigen Jan 27, 2025 •

edited

Loading