Revise documentation in the "Kedro for notebook users" section #2845
Comments
Let's do this together; I've done this process a dozen times or more already. It's far from perfect, but there are already several issues tracking how to make it easier.
So far I haven't seen many direct requests, but by casually walking around the office I see lots of data scientists using Kedro in Jupyter. And we should pay attention to Databricks as well.
Do you see anything we can build to simplify the process? @astrojuanlu
Same as my experience. I would add that many are not using it in the most efficient way: "I don't know which nodes I need to re-run, so I just re-run the whole pipeline." Although kedro run supports many different options, they are not well used; I wonder if we can show some examples in the notebook section.
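For instance, a hedged sketch of the kind of partial re-run we could demonstrate from a notebook, using the injected `session` object; the node names here are invented for illustration, and the equivalent `kedro run` CLI flags vary slightly between Kedro versions:

```python
# Sketch: re-run only part of a pipeline from a notebook launched with
# `kedro jupyter notebook`, instead of re-running everything.
session.run(
    pipeline_name="__default__",
    from_nodes=["preprocess_companies_node"],        # illustrative: start from this node
    to_nodes=["create_model_input_table_node"],      # illustrative: stop after this node
)
# Note: a KedroSession can only be run once; use the %reload_kedro line magic
# to get a fresh session before another run.
```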
I may reorder session -> catalog -> pipeline -> context, since
Fab, thanks for your help on this @noklam and @astrojuanlu. I think I can start on Pages 1 & 3 but will leave Page 2 until your return @astrojuanlu (so I have made a separate ticket for that work: #2855).
I opened a few over time, see for example #2583, #2593, #2700, #2777, #2819.
All done and released in 0.18.4 |
Child of #2799
Description
Looking into popular content (and content that could be popular if it were any good), I have identified this section on notebook/Kedro usage as problematic.
Context
There is lots of potential to help notebook users cross the Rubicon to using Kedro.
Possible Implementation
Currently we have two pages, but in my view they're the wrong way around, and there's a big chunk missing on the conversion from notebook -> Kedro and/or the phased introduction of Kedro support to notebooks. I think we should go with this ordering:
Looking at the pages in more detail:
Page 1: Phased support to use the Kedro `DataCatalog` as a data registry (terminology TBC)

- `DataCatalog` within your existing notebook (see the sketch below)
- `standalone-datacatalog` starter
- `pandas-iris` example
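To illustrate the kind of content Page 1 could show, a minimal, hedged sketch of using the `DataCatalog` as a standalone data registry inside an existing notebook. The dataset name and filepath are invented for illustration, the pandas dataset dependencies are assumed to be installed, and the dataset class naming (`CSVDataSet` vs `CSVDataset`) differs between Kedro versions:

```python
# Sketch: a standalone DataCatalog as a lightweight data registry in a notebook.
import yaml
from kedro.io import DataCatalog

# The same configuration would normally live in conf/base/catalog.yml.
catalog_config = yaml.safe_load(
    """
    companies:
      type: pandas.CSVDataSet
      filepath: data/01_raw/companies.csv
    """
)

catalog = DataCatalog.from_config(catalog_config)
companies = catalog.load("companies")  # returns a pandas DataFrame
```

The point of the phased approach would be that a notebook user gets declarative, catalog-driven data loading without having to adopt the rest of a Kedro project structure first.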
Page 2: How to convert your existing notebook to a Kedro project
Holy grail example. TBD. I need to pair with someone on this to work out how to write it up (it has potential to be a blog post too). I have a separate ticket for this work #2855.
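Not the holy grail example itself, but as a placeholder for the kind of step it would involve, a hedged sketch of a notebook cell refactored into a function and registered as a Kedro node; the function, dataset and node names are illustrative:

```python
# Sketch: a former notebook cell wrapped as a function and added to a pipeline.
from kedro.pipeline import node, pipeline


def preprocess_companies(companies):
    # Formerly a notebook cell operating on a raw dataframe.
    companies["iata_approved"] = companies["iata_approved"] == "t"
    return companies


data_processing = pipeline(
    [
        node(
            func=preprocess_companies,
            inputs="companies",                # dataset name from the catalog
            outputs="preprocessed_companies",
            name="preprocess_companies_node",
        )
    ]
)
```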
Page 3: How to use Kedro and a notebook side-by-side
Tidy up what we have on this page https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks.html to illustrate how to use a notebook for exploration side-by-side with your Kedro project.
Remove the complexity in the early part of the page under "A custom Kedro kernel" and just summarise what you get (a short usage sketch follows the list below):
- `catalog` (type `DataCatalog`): Data Catalog instance that contains all defined datasets; this is a shortcut for `context.catalog`
- `context` (type `KedroContext`): Kedro project context that provides access to Kedro's library components
- `pipelines` (type `Dict[str, Pipeline]`): Pipelines defined in your pipeline registry
- `session` (type `KedroSession`): Kedro session that orchestrates a pipeline run

The page should also cover:

- Iris dataset example: shows, with the `pandas-iris` starter:
  - how to add a notebook with `kedro jupyter notebook`
  - `catalog`, `context`, `pipelines` and `session`
  - the `%reload_kedro` line magic
  - the `%run_viz` line magic
- How to convert functions from Jupyter notebooks into Kedro nodes
- Work with managed services
- Connect an IPython shell to a Kedro project kernel
- Create a custom Jupyter kernel that automatically loads the extension and launches JupyterLab / QtConsole
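To make the summary above concrete, a hedged sketch of how the injected variables might be used in a notebook started with `kedro jupyter notebook`; the dataset name is invented for illustration and the pipeline shown is the registered default:

```python
# Sketch: typical cells using the variables injected by the Kedro IPython extension.
df = catalog.load("companies")               # load a dataset defined in the catalog
print(context.project_path)                  # project metadata via the KedroContext
print(pipelines["__default__"].describe())   # inspect the registered default pipeline
session.run(pipeline_name="__default__")     # run it end to end

# After editing project code or completing a run, refresh everything with the
# %reload_kedro line magic (and %run_viz opens Kedro-Viz, if it is installed).
```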
Page 4: Jupyter notebook/Kedro FAQs
A page that covers some of the commonly asked questions that we get.
How does this look? I'm interested in hearing from those who field or see questions coming in, or who generally have a vision on how we should present ourselves when it comes to notebook support: @astrojuanlu @merelcht @noklam @deepyaman