Upload only predictions instead of trained model via run #1231
Comments
Yes / no. Currently all runs must be linked to a flow (which is created automatically within that function). Here is a tutorial on creating a custom flow and using that to upload run results without using Hopefully that should work for you :)
Thank you for the quick answer! I am trying to follow the tutorial you referenced. I am using a flow that contains a KNeighborsClassifier (Flow ID: 90800, https://test.openml.org/f/90800) but I'm not sure how to define the parameter 'components' as listed here https://openml.github.io/openml-python/main/generated/openml.flows.OpenMLFlow.html#openml.flows.OpenMLFlow I defined all other parameters required by How can I specify the used flow in
Inspecting the JSON file from the autosklearn_flow provided in the tutorial I see a component identifier
Heyho,
So I have looked into this and have come to the following conclusion (@PGijsbers feel free to correct me): You can specify If you want to reference an already existing flow in your new flow, you would need to first get the flow and then reference it as part of the
some_flow = openml.flows.get_flow(X)
components=OrderedDict(some_key=some_flow)
However, I think you can only reference one sub-flow. While the Python API technically allows specifying multiple sub-flows ( The API also calls this key
@ArturDev42, I hope this answers your questions. Feel free to ask any other questions regarding custom flows so that this issue may function as additional temporary documentation for custom flows.
Hi @LennartPurucker, thanks a lot for your response! I had actually already tried In my previous comment #1231 (comment) I was trying to use an already existing flow, because I thought this was necessary for a custom flow. But for my specific use case, I am not sure why I would need an already existing flow. Basically, I only want to upload predictions (without a trained model) for a given task and compare them with the ground truth from that task. Is my understanding correct that I can create a custom flow without referencing any already existing flow and basically only use it to be able to upload run results for a given task? It seemed to work for me so far. If yes, then I was wondering why it is still necessary to specify Thanks!
I do not think the key itself would be enough, you would need to use
Yes, to my understanding, that should work, and I do not see a reason why it would fail. I think a lot of flows do not reference a subflow anyways. Your use case seems to be a good example of sharing prediction data without a trained model.
From a code perspective, if you set
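As a rough illustration of the use case discussed here (a custom flow with no sub-flow and no trained model), the keyword arguments for `openml.flows.OpenMLFlow` could be collected as in the sketch below. The helper `custom_flow_kwargs` and every value passed to it are hypothetical, and the field names are assumed to match those used in the custom flow tutorial; this is not a confirmed recipe from the maintainers.

```python
from collections import OrderedDict


def custom_flow_kwargs(name, description, external_version):
    """Collect constructor arguments for a flow without sub-flows or a model.

    Hypothetical helper: the keys are assumed to mirror the OpenMLFlow
    arguments shown in the custom flow tutorial.
    """
    return dict(
        name=name,
        description=description,
        external_version=external_version,
        language="English",
        tags=[],
        dependencies=external_version,  # assumption: same string as the version
        parameters=OrderedDict(),
        parameters_meta_info=OrderedDict(),
        components=OrderedDict(),  # empty: no sub-flow is referenced
        model=None,  # no trained model is attached
    )


kwargs = custom_flow_kwargs(
    name="predictions_only_flow",  # placeholder name
    description="Flow used only to upload predictions.",  # placeholder text
    external_version="my_pipeline==0.1",  # placeholder version
)
# With openml installed and an API key configured, these could be passed on:
# flow = openml.flows.OpenMLFlow(**kwargs)
# flow.publish()
```

Keeping the arguments in a plain dictionary makes it easy to inspect them before anything is sent to the server.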
Can I also use an empty dict for
I get the following error when running it:
You would need to use If you define some hyperparameters with
from collections import OrderedDict
import openml

# `general`, `task_id`, `dataset_id`, and `predictions` are assumed to be
# defined beforehand, as in the custom flow tutorial.
some_flow = openml.flows.OpenMLFlow(
    **general,
    parameters=OrderedDict(your_hp="1"),  # default value is 1
    parameters_meta_info=OrderedDict(
        your_hp=OrderedDict(description="A hyperparameter", data_type="int")
    ),
    components=OrderedDict(),  # no sub-flows
    model=None,  # no trained model is attached
)
some_flow.publish()
my_run = openml.runs.OpenMLRun(
    task_id=task_id,
    flow_id=some_flow.flow_id,  # ID assigned to the flow by publish()
    dataset_id=dataset_id,
    parameter_settings=[OrderedDict([("oml:name", "your_hp"), ("oml:value", 2)])],  # set value to 2 for this run
    data_content=predictions,
    description_text="Run generated by the Custom Flow tutorial.",
)
my_run.publish()
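The `parameter_settings` list in the snippet above is written out by hand. When a run sets several hyperparameters, a small helper can build it from a plain dictionary. The sketch below assumes the `oml:name` / `oml:value` entry layout shown above; the helper name `to_parameter_settings` is hypothetical, and the optional `oml:component` key is an assumption taken from the custom flow tutorial.

```python
from collections import OrderedDict


def to_parameter_settings(values, component=None):
    """Turn {name: value} into a list of parameter entries for a run.

    Hypothetical helper. `component` is optional and, if given, is stored
    under "oml:component" (assumed to be the ID of the flow that owns the
    parameter).
    """
    settings = []
    for name, value in values.items():
        entry = OrderedDict([("oml:name", name), ("oml:value", value)])
        if component is not None:
            entry["oml:component"] = component
        settings.append(entry)
    return settings


settings = to_parameter_settings({"your_hp": 2})
```

The resulting list could then be passed as `parameter_settings=settings` when constructing the run.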
Description
My understanding is that in order to upload the results for a particular task, I need to create a run with a trained model and task as follows:
Would it be possible to upload only the predictions (e.g. as a CSV file) for a given task, instead of needing to upload a trained model, as is done in Kaggle competitions or challenges on https://eval.ai/?
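For the CSV scenario described here, the predictions would first have to be read into memory before they can be attached to a run. The sketch below shows one way to do that with the standard library only. The function name `read_predictions` and the column names (`repeat`, `fold`, `index`, `prediction`, `truth`) are hypothetical; they are not a format defined by OpenML.

```python
import csv
import io


def read_predictions(csv_file):
    """Read prediction rows from an open CSV file object.

    Hypothetical helper: the column names are illustrative assumptions.
    """
    rows = []
    for record in csv.DictReader(csv_file):
        rows.append(
            {
                "repeat": int(record["repeat"]),
                "fold": int(record["fold"]),
                "index": int(record["index"]),
                "prediction": record["prediction"],
                "truth": record["truth"],
            }
        )
    return rows


# In-memory stand-in for a real file such as open("predictions.csv").
example = io.StringIO(
    "repeat,fold,index,prediction,truth\n"
    "0,0,12,yes,yes\n"
    "0,0,47,no,yes\n"
)
rows = read_predictions(example)
```

Each row would still need to be converted into the layout that a run's `data_content` expects. The custom flow tutorial appears to use `openml.runs.functions.format_prediction` for this step, so that is probably the place to look.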