Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NotebookResolver.resolve_notebook should verify if the resolved path is a notebook #3704

Open
1 task done
JCZuurmond opened this issue Feb 17, 2025 · 1 comment
Open
1 task done
Labels
2026-BUG migrate/python Pull requests that update Python code python Pull requests that update Python code

Comments

@JCZuurmond
Copy link
Member

JCZuurmond commented Feb 17, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

NotebookResolver.resolve_notebook should verify if the resolved path is a file.

def resolve_notebook(self, path_lookup: PathLookup, path: Path, inherit_context: bool) -> MaybeDependency:
absolute_path = self._notebook_loader.resolve(path_lookup, path)
if not absolute_path:
return self._fail('notebook-not-found', f"Notebook not found: {path.as_posix()}")
dependency = Dependency(self._notebook_loader, absolute_path, inherit_context)
return MaybeDependency(dependency, [])

Currently, if a directory is passed to the NotebookResolver it is resolved successfully.

Implementation suggestion

The NotebookResolver uses NotebookLoader.resolve which checks if the path is a notebook if can be resolved in the current working directory:

def resolve(self, path_lookup: PathLookup, path: Path) -> Path | None:
"""Resolve the notebook path.
If the path is a Python file, return the path to the Python file. If the path is neither,
return None.
"""
# check current working directory first
absolute_path = path_lookup.cwd / path
absolute_path = absolute_path.resolve()
if is_a_notebook(absolute_path):
return absolute_path
# When exported through Git, notebooks are saved with a .py extension. So check with and without extension
candidates = (path, self._adjust_path(path)) if not path.suffix else (path,)
for candidate in candidates:
a_path = path_lookup.resolve(candidate)
if not a_path:
continue
return a_path
return None

That should be updated to always verify if a found path is a notebook, also with the adjusted path

Expected Behavior

A directory should not be resolved successfully by NotebookResolver.resolve_notebook

Steps To Reproduce

No response

Cloud

AWS

Operating System

macOS

Version

latest via Databricks CLI

Relevant log output

@JCZuurmond JCZuurmond added migrate/python Pull requests that update Python code needs-triage python Pull requests that update Python code labels Feb 17, 2025
@JCZuurmond JCZuurmond added this to UCX Feb 17, 2025
@github-project-automation github-project-automation bot moved this to Todo in UCX Feb 17, 2025
@JCZuurmond
Copy link
Member Author

I gave this issue some thought. Maybe, when resolving a notebook, we should only allow absolute paths or relative paths to the notebook that runs a "child" notebook as per the documentation.

Currently, we allow notebooks to be resolved in the library roots, which is not what is allowed in practice and it could lead to unsuspected side effects when a notebook cannot be found (relatively to another notebook) but there is a match (by accident) with a library name.

github-merge-queue bot pushed a commit that referenced this issue Feb 19, 2025
## Changes
Broaden safe read text exception scope to be more safe when reading a
text file. Also the tests are extended.

### Linked issues

Relates to #3703
Relates to #3704

### Functionality

- [x] Changes in the code linting parts

### Tests

- [x] added unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2026-BUG migrate/python Pull requests that update Python code python Pull requests that update Python code
Projects
Status: Todo
Development

No branches or pull requests

2 participants