Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type hints to EpochsArray constructor and BaseRaw.get_data() #12740

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hoechenberger
Copy link
Member

@hoechenberger hoechenberger commented Jul 20, 2024

This is WIP and I'm trying out some things while also simultaneously attempting not to go insane.

Whether we like it or not, as you know, most IDEs these days rely on static code analysis to assist writing code and offering tab completions etc.

So to provide a better UX, I started experimenting with type annotations in MNE-Python.

Now there are several design patterns we have in our code that make me pull out my hair – not because I necessarily think they're inherently bad, not all of them; but because some are extremely difficult or impossible to properly annotate such that static analysis tools will be able to infer the correct types and we don't have to bend over backwards

One of such recurring patterns, for example, is changing the type of a return value depending on a function parameter.

Let's look at the following code:

# %%

import numpy as np

import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / 'MEG' / 'sample' / 'sample_audvis_raw.fif'

raw = mne.io.read_raw_fif(sample_fname, preload=True)
raw.crop(tmax=60)

data = raw.get_data()
assert isinstance(data, np.ndarray)
raw_array = mne.io.RawArray(data=data, info=raw.info)

raw.get_data() has a switch, return_times, which will turn the return value from NDArray into a tuple[NDArray, NDArray].

This means that which type will actually be returned can only be determined at runtime, except if one uses @overloads, which we figured are a bit … heavy? Let's call it heavy :)

But without adding an overload to get_data() to handle the two cases – return_times=False -> NDArray, return_times=True -> tuple, the user is forced to perform a runtime type check. Without the assert in my MWE above, mne.io.RawArray(data=data, …) would (correctly!) get red squiggly lines, because Pyright cannot know whether it's dealing with an array or a tuple of arrays here, while the constructor clearly demands an array.

The only solution to that problem that wouldn't require a change or amendment to our API is adding overloads to get_data().

But since previous consensus was that we're not super happy about this approach, I'm right now a little bit lost and demotivated on how to best proceed. I want to provide an improved UX but I fully understand the reservations towards overloads.

That said … This PR includes some improvements to EpochsArray and BaseRaw.get_data() that already improve the UX a bit but I'm not happy about the unclear return type of get_data(). Sigh…

@hoechenberger hoechenberger changed the title Add type hints to EpochsArray constructor Add type hints to EpochsArray constructor and BaseRaw.get_data() Jul 20, 2024
@cbrnr
Copy link
Contributor

cbrnr commented Jul 20, 2024

I feel you! In certain cases overloads are totally fine, maybe even in this one (or rather, in all cases where return types vary as a function of arguments). If we don't want that, it's best to separate into two or more functions with stable return types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants