Improve Kedro run as a Package (2023) #3237
Comments
So I see there are two different ways of going about this:
I. Improve the current approach with CLI entrypoints - Parent #3237. It consists of 3 sub-tasks: […]
II. Use […]
|
I'd like to explore the idea of using the […]. In the end, users will want to define their own scripts and ways of launching the […].
Could you explain this in a bit more detail? |
Extended questions: […]
|
Oh, I commented on #2682 after I saw your comment on #3680. Is there anything left over? To clarify, I was withdrawing my opposition in case we ever have to do that. For this particular issue, I still think pursuing the […]
Yes, that's what I understood. Isn't it possible to extend the CLI by developing a normal plugin, like […]? |
Correct, you can extend it and add subcommands, but you cannot change the existing ones. |
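As an aside, a minimal sketch of what such a plugin could look like, assuming a made-up `kedro-hello` package (the `kedro.project_commands` entry point is the mechanism Kedro uses to discover plugin commands):

```python
# cli.py of a hypothetical "kedro-hello" plugin
import click


@click.group(name="hello-plugin")
def commands():
    """Command group that Kedro collects from the plugin's entry point."""


@commands.command(name="hello")
@click.option("--name", default="world", help="Who to greet.")
def hello(name: str) -> None:
    """A new subcommand that appears alongside the built-in `kedro` commands."""
    click.echo(f"Hello, {name}!")


# Registered in the plugin's pyproject.toml, e.g.:
# [project.entry-points."kedro.project_commands"]
# kedro_hello = "kedro_hello.cli:commands"
```

Commands added this way sit next to the built-in ones, but the built-in `kedro run` itself cannot be changed this way.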
Got it. Assuming something like this (from your comment above):

```python
from pathlib import Path

from kedro.framework.project import configure_project
from kedro.framework.session import KedroSession


def main(*args, **kwargs):
    package_name = Path(__file__).parent.name
    configure_project(package_name)
    session = KedroSession.create()  # need to handle `env` and `extra_params`
    result = session.run(*args, **kwargs)
```

Can't the users add any CLI arguments they want, with argparse, click, fire, tyro, or anything else? |
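To make the question concrete, a sketch of the kind of wrapper a user could write around the `main` above with plain argparse (the argument names here are illustrative, not a Kedro API):

```python
import argparse


def cli() -> None:
    parser = argparse.ArgumentParser(description="Run my packaged Kedro project")
    parser.add_argument("--pipeline", default=None, help="Name of the pipeline to run")
    parser.add_argument("--tags", nargs="*", default=None, help="Only run nodes with these tags")
    args = parser.parse_args()
    # `pipeline_name` and `tags` are forwarded by `main` to `KedroSession.run`
    main(pipeline_name=args.pipeline, tags=args.tags)


if __name__ == "__main__":
    cli()
```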
I think a redesign of the […]. Is it possible to do […] in a non-breaking way first and then tackle the […]? |
Enabling a Kedro project/package to be run programmatically was one of my main focuses while developing kedro-boot. I enumerated these three entry points for running Kedro: […]
These three entry points could reuse the same `run_function`:

```python
from typing import Any

import click
from kedro.framework.session import KedroSession


def run_function(**kwargs):
    # some args preprocessing (sketch: normalise the raw CLI kwargs)
    kedro_args = kwargs
    tuple_tags = tuple(kedro_args.get("tags") or ())
    tuple_node_names = tuple(kedro_args.get("node_names") or ())
    runner = kedro_args.get("runner")  # an AbstractRunner instance, or None for the default

    # Session creation & running
    with KedroSession.create(
        env=kedro_args.get("env", ""),
        extra_params=kedro_args.get("params", ""),
        conf_source=kedro_args.get("conf_source", ""),
    ) as session:
        return session.run(
            tags=tuple_tags,
            runner=runner,
            node_names=tuple_node_names,
            from_nodes=kedro_args.get("from_nodes", ""),
            to_nodes=kedro_args.get("to_nodes", ""),
            from_inputs=kedro_args.get("from_inputs", ""),
            to_outputs=kedro_args.get("to_outputs", ""),
            load_versions=kedro_args.get("load_versions", {}),
            pipeline_name=kedro_args.get("pipeline", ""),
            namespace=kedro_args.get("namespace", ""),
        )


@click.command(name="run", short_help="")
def run(**kwargs) -> Any:
    return run_function(**kwargs)


run_params = [
    click.option("--pipeline", type=str, help=""),
    click.option("--env", type=str, help=""),
    # ... one click.option per remaining run argument
]
for param in run_params:
    run = param(run)
```
|
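As a usage note, the same `run_function` can then be called programmatically (from a notebook, a test, or an orchestrator task) as well as through the packaged CLI; for example, with hypothetical pipeline and environment names:

```python
# Programmatic entry point: same code path as the CLI `run` command above.
result = run_function(pipeline="data_processing", env="databricks")
```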
All subtasks are completed, so I'm closing this issue as complete as well! 🎉 |
Context
I sat down with @idanov today to try to recall our memory of #1423, which was made by @antonymilne. Completing this PR would make Kedro easier to integrate anywhere (particularly Databricks) and could potentially simplify our Databricks documentation.
#1423 summarizes how `kedro run` is currently supported. In summary, there are 3 things that #1423 attempts to fix, and we can break them down:
- `click` is emitting a `sys.exit`, which makes it hard to integrate Kedro and causes Databricks Job "failure" despite a successful run - `databricks_run.py` to keep a single `__main__` entrypoint across the project - Move `_find_run_command` to Framework #3051
- The incomplete change in Improve kedro run as a package #1423 - `kedro/framework/project/__init__.py` is trying to address this.
- `kedro run` or Kedro's entrypoint does not return anything - `features/steps/test_starter/{{ cookiecutter.repo_name }}/src/{{ cookiecutter.python_package }}/__main__.py`
Added one more:
- `run(standalone_mode=True)` to fix packaged Kedro project getting a `sys.exit` #2682
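For background on the `sys.exit` problem above: click runs commands in "standalone mode" by default, which discards the return value and terminates the process; passing `standalone_mode=False` makes the call return instead. A generic click illustration (not Kedro's exact code):

```python
import click


@click.command()
@click.option("--pipeline", default="__default__")
def run(pipeline: str) -> dict:
    """Stand-in for a packaged project's `run` command."""
    click.echo(f"running {pipeline}")
    return {"status": "success"}


if __name__ == "__main__":
    # run(["--pipeline", "ds"]) would end the process via sys.exit() here;
    # with standalone_mode=False the return value comes back to the caller.
    result = run.main(["--pipeline", "ds"], standalone_mode=False)
    print(result)  # {'status': 'success'}
```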