Make it easy to get the correct file path of a dataset #3753

merelcht · 2024-02-09T15:44:01Z

merelcht
Feb 9, 2024
Maintainer

Description

Getting the correct (versioned) file path for a dataset is quite hard. There doesn't seem to be a single way that works for both non-versioned and versioned datasets, and for both local and remote datasets.

The main question here is why users need to access the file path. Not all datasets have a file path (e.g. APIDataset), so it's important to understand the true user need before diving into a solution.

Context

Ways to get the file path today:

dataset._filepath for non-versioned, local datasets
dataset._get_load_path() for versioned, local datasets
Remote datasets: get_filepath_str(self._get_load_path(), self._protocol)

This might not even be the full list of ways to get the file path.
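
To make the problem concrete, here is a minimal sketch of what a single lookup might look like if the approaches listed above were wrapped in one helper. The name `resolve_load_path` is hypothetical and not part of Kedro. The sketch relies on the private attributes mentioned above (`_filepath`, `_get_load_path`, `_protocol`), which are not a public API and may differ between dataset classes and Kedro versions. Prefixing every non-`file` protocol is a simplifying assumption and not necessarily what `get_filepath_str` does. The `FakeRemoteDataset` class is a stand-in used only for illustration.

```python
from pathlib import PurePosixPath
from typing import Any, Optional


def resolve_load_path(dataset: Any) -> Optional[str]:
    """Best-effort lookup of the path a dataset would load from.

    Hypothetical helper: it only inspects private attributes named in this
    discussion and returns None when the dataset has no file path at all.
    """
    # Prefer _get_load_path() when present, since it is the lookup
    # mentioned above for versioned datasets.
    get_load_path = getattr(dataset, "_get_load_path", None)
    if callable(get_load_path):
        path = get_load_path()
    else:
        # Fall back to the plain attribute used by non-versioned datasets.
        path = getattr(dataset, "_filepath", None)

    if path is None:
        # Datasets without a file path (an API-backed dataset, say).
        return None

    path_str = PurePosixPath(path).as_posix()

    # Assumption: remote datasets carry a protocol such as "s3" or "gs",
    # and local ones use "file" or no protocol at all.
    protocol = getattr(dataset, "_protocol", None)
    if protocol and protocol != "file":
        return f"{protocol}://{path_str}"
    return path_str


class FakeRemoteDataset:
    """Stand-in object for illustration only, not a real Kedro dataset."""

    _protocol = "s3"

    def _get_load_path(self) -> PurePosixPath:
        return PurePosixPath("bucket/data/cars.csv")


remote_path = resolve_load_path(FakeRemoteDataset())
```

A helper like this still depends on implementation details, which is exactly why a supported, public way to get the path would be preferable.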

Important

This idea is based on observations from several Kedro engineers (see e.g. #1778). However, we need a clear view of why users need the file path and what their use cases are. Any implementation should be preceded by user research: #1978

astrojuanlu · 2025-01-13T12:36:17Z

astrojuanlu
Jan 13, 2025
Maintainer

@Galileo-Galilei mentioned this idea in the context of a user problem with MLflow: https://kedro.hall.community/kedro-mlflow-netcdf-dataset-path-issue-dLxZxJzh1dsl

