Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bundle ipykernel #6294

Merged
merged 6 commits into from
Feb 14, 2025
Merged

Bundle ipykernel #6294

merged 6 commits into from
Feb 14, 2025

Conversation

seeM
Copy link
Contributor

@seeM seeM commented Feb 11, 2025

Addresses #6233. I still need to add some tests, but thought I'd create this for some early feedback and testing. Will also update the release notes once the PR is finalized.

Release Notes

New Features

  • N/A

Bug Fixes

  • N/A

QA Notes

Try out a few different Python environments. Try with/without ipykernel installed, and with the setting enabled/disabled.

Copy link

github-actions bot commented Feb 11, 2025

E2E Tests 🚀
This PR will run tests tagged with: @:critical

readme  valid tags


// See https://github.com/microsoft/vscode-python/issues/7136
gulp.task('installDebugpy', async (done) => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was used anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➕ ➕ (its been removed upstream as well)

'--upgrade',
'-t',
'./python_files/lib/temp',
async function installPythonScriptRequirements() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cleaned this up a bit. I learned that gulp can accept named functions as tasks rather than the gulp.task('name', ...) syntax. They can also be async functions, and any errors will be handled appropriately without having to call back to done.

gulp.task(
'installPythonLibs',
// Run in parallel since vendoring rewrites imports which is somewhat CPU-bound.
gulp.parallel(vendorPythonKernelRequirements, gulp.series(installPythonScriptRequirements, bundleIPykernel)),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By bundling ipykernel + running this in parallel we get back to close to the same install time as before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to watch out for is Windows based builds and EMFILE errors with limits on concurrent files being accessed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads up. I suspect it would be fine since (I think) the parallel tasks each access files serially.

@@ -104,6 +104,7 @@
"python.venvFolders.description": "Folders in your home directory to look into for virtual environments (supports pyenv, direnv and virtualenvwrapper by default).",
"python.venvPath.description": "Path to folder with a list of Virtual Environments (e.g. ~/.pyenv, ~/Envs, ~/.virtualenvs).",
"python.installModulesInTerminal.description": "Whether to install Python modules (such as `ipykernel`) in the Terminal, instead of in a background process. Installing modules in the Terminal allows you to see the output of the installation command.",
"python.useBundledIpykernel.description": "Whether to prepend the bundled `ipykernel` (and its dependencies) to the Python search path for supported interpreters. When enabled, bundled dependencies will be used instead of those installed in the active environment.",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any feedback on this wording?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a link to a longer-format writeup of this decision would be useful to give more context to those who are interested. Maybe hosted on positron.posit.co?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think we will need to update what we have at https://positron.posit.co/start.html#python-prerequisites anyway, and it may be a good idea to do a bit more explanation there and link from here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my sanity check for the wording was "can ChatGPT explain this setting to me correctly?" and it did a great job 💯

my only tweak would be from When enabled, bundled dependencies will be used to When enabled, a bundled ipykernel installation will be used, or something similar to be extra clear this is just the one package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I'd like to work on that after merging this PR. Will make a follow-up PR here adding the link once the website is updated.

Updated wording in 793ac4a.

# wheels for CPython and PyPy, and we require impementation-agnostic wheels (via the 'pip install
# --implementation py' arg in scripts/vendoring.py). We inherited that requirement from upstream,
# but we could revisit it.
# Stick to pydantic v1 since it has a pure Python implementation which is much easy to vendor.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now understand why we require --implementation py!


export interface PythonRuntimeExtraData {
pythonPath: string;
pythonEnvironmentId: string;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was used anywhere.

// This shouldn't happen. Did something go wrong during `npm install`?
throw new Error(`Bundled ipykernel dependencies not found at: ${bundlePath}`);
}
this._envVarsService.appendPythonPath(kernelSpec.env, bundlePath);
Copy link
Contributor Author

@seeM seeM Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to confirm where exactly this puts things in the path relative to the user's site packages.

EDIT: This appends to the PYTHONPATH environment variable, which still puts it before the user's site packages. For example:

$ PYTHONPATH=foo python -c 'import sys; print(sys.path)'
['', '/Users/seem/posit/positron/foo', '/Users/seem/.pyenv/versions/3.11.9/lib/python311.zip', '/Users/seem/.pyenv/versions/3.11.9/lib/python3.11', '/Users/seem/.pyenv/versions/3.11.9/lib/python3.11/lib-dynload', '/Users/seem/.pyenv/versions/3.11.9/lib/python3.11/site-packages']

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing the following in a conda env that has ipykernel installed already:

['/Users/isabelizimm/code/positron/extensions/positron-python/python_files/posit',
 '/Users/isabelizimm/code/positron/extensions/positron-python/python_files/lib/ipykernel/cp312',
 '/Users/isabelizimm/code/positron/extensions/positron-python/python_files/lib/ipykernel/cp3',
 '/Users/isabelizimm/code/positron/extensions/positron-python/python_files/lib/ipykernel/py3',
 '/Users/isabelizimm/miniconda3/lib/python312.zip',
 '/Users/isabelizimm/miniconda3/lib/python3.12',
 '/Users/isabelizimm/miniconda3/lib/python3.12/lib-dynload',
 '',
 '/Users/isabelizimm/miniconda3/lib/python3.12/site-packages']

a few nits for you to take or leave:

  • If I have ipykernel installed already, can we not put another one on the path?
  • Is it possible to not have 4 items prepended? eg, only 1 ipykernel? not sure if this is even possible, but it does feel surprising to have so many there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have ipykernel installed already, can we not put another one on the path?

IIRC we said we'd always prefer the bundled ipykernel since users may have installed an incompatible version.

Is it possible to not have 4 items prepended? eg, only 1 ipykernel? not sure if this is even possible, but it does feel surprising to have so many there

python_files/posit was always there since it's the folder containing the entrypoint positron_language_server.py.

As for the 3 ipykernel paths, it is possible to have one but there are some trade-offs. The three different paths py3, cp3, and cp312 actually house disjoint subsets of ipykernel's dependencies for different implementation and version constraints. We split them that way so that we don't have to bundle duplicates of dependencies that have looser constraints (e.g. pure Python packages which live in py3). That took the storage requirement down from ~400MB to ~100MB.

Maybe we could symlink things so that we can add one path without actually duplicating storage but I'd like to leave that as a future optimization.

@seeM
Copy link
Contributor Author

seeM commented Feb 12, 2025

I've tested a local release build and it seems to work fine.

Full test suite running here: https://github.com/posit-dev/positron/actions/runs/13286091031 (EDIT: only failure is an unrelated to this PR)

@isabelizimm
Copy link
Contributor

Some general comments:

In general, after playing around I can't find anything in particular that is breaking dramatically! I tested out lots of plotting and widgets libraries that might bring up ipython/ipykernel weirdness.

I am getting giving warnings about unavailable magics for plotting in an env that does not have ipykernel installed:

pip install hvplot pandas plotly

import hvplot.pandas
import pandas as pd
hvplot.extension('plotly')
pd.DataFrame(dict(x=[1,2,3], y=[4,5,6])).hvplot.scatter(x="x", y="y")

returns:

%opts magic unavailable (pyparsing cannot be imported)
%compositor magic unavailable (pyparsing cannot be imported)

@seeM
Copy link
Contributor Author

seeM commented Feb 13, 2025

@isabelizimm I'm unable to repro the warning. Perhaps we can file that for a future fix?

Copy link
Contributor

@isabelizimm isabelizimm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a good place to start and get feedback! Thanks for taking this on, I am personally looking forward to not having the install ipykernel popup 🚀

@seeM seeM merged commit aaabaa3 into main Feb 14, 2025
28 checks passed
@seeM seeM deleted the bundle-ipykernel branch February 14, 2025 12:48
@github-actions github-actions bot locked and limited conversation to collaborators Feb 14, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants