Allow automated metadata generation to be bounded by "row events" instead of explicit time windows (mne-tools#12118)
hoechenberger authored Oct 31, 2023
1 parent b9cab3c commit 60db738
Showing 3 changed files with 159 additions and 25 deletions.
1 change: 1 addition & 0 deletions doc/changes/devel.rst
@@ -43,6 +43,7 @@ Enhancements
- By default MNE-Python creates matplotlib figures with ``layout='constrained'`` rather than the default ``layout='tight'`` (:gh:`12050`, :gh:`12103` by `Mathieu Scheltienne`_ and `Eric Larson`_)
- Enhance :func:`~mne.viz.plot_evoked_field` with a GUI that has controls for time, colormap, and contour lines (:gh:`11942` by `Marijn van Vliet`_)
- Add :class:`mne.viz.ui_events.UIEvent` linking for interactive colorbars, allowing users to link figures and change the colormap and limits interactively. This supports :func:`~mne.viz.plot_evoked_topomap`, :func:`~mne.viz.plot_ica_components`, :func:`~mne.viz.plot_tfr_topomap`, :func:`~mne.viz.plot_projs_topomap`, :meth:`~mne.Evoked.plot_image`, and :meth:`~mne.Epochs.plot_image` (:gh:`12057` by `Santeri Ruuskanen`_)
- :func:`~mne.epochs.make_metadata` now accepts ``tmin=None`` and ``tmax=None``, which will bound the time window used for metadata generation by the row events (instead of a fixed time). This allows you, for example, to generate metadata spanning from one cue or fixation cross to the next, even if trial durations vary throughout the recording (:gh:`12118` by `Richard Höchenberger`_)

Bugs
~~~~
92 changes: 72 additions & 20 deletions mne/epochs.py
@@ -2664,17 +2664,18 @@ def make_metadata(
keep_first=None,
keep_last=None,
):
"""Generate metadata from events for use with `mne.Epochs`.
"""Automatically generate metadata for use with `mne.Epochs` from events.
This function mimics the epoching process (it constructs time windows
around time-locked "events of interest") and collates information about
any other events that occurred within those time windows. The information
is returned as a :class:`pandas.DataFrame` suitable for use as
is returned as a :class:`pandas.DataFrame`, suitable for use as
`~mne.Epochs` metadata: one row per time-locked event, and columns
indicating presence/absence and latency of each ancillary event type.
indicating presence or absence and latency of each ancillary event type.
The function will also return a new ``events`` array and ``event_id``
dictionary that correspond to the generated metadata.
dictionary that correspond to the generated metadata, which together can then be
readily fed into `~mne.Epochs`.
Parameters
----------
@@ -2687,25 +2688,37 @@ def make_metadata(
A mapping from event names (keys) to event IDs (values). The event
names will be incorporated as columns of the returned metadata
:class:`~pandas.DataFrame`.
tmin, tmax : float
Start and end of the time interval for metadata generation in seconds,
relative to the time-locked event of the respective time window.
tmin, tmax : float | None
Start and end of the time interval for metadata generation in seconds, relative
to the time-locked event of the respective time window (the "row events").
.. note::
If you are planning to attach the generated metadata to
`~mne.Epochs` and intend to include only events that fall inside
your epochs time interval, pass the same ``tmin`` and ``tmax``
values here as you use for your epochs.
If ``None``, the time window used for metadata generation is bounded by the
``row_events``. This can be particularly practical if trial duration varies
greatly, but each trial starts with a known event (e.g., a visual cue or
fixation).
.. note::
If ``tmin=None``, the first time window for metadata generation starts with
the first row event. If ``tmax=None``, the last time window for metadata
generation ends with the last event in ``events``.
.. versionchanged:: 1.6.0
Added support for ``None``.
sfreq : float
The sampling frequency of the data from which the events array was
extracted.
row_events : list of str | str | None
Event types around which to create the time windows / for which to
create **rows** in the returned metadata :class:`pandas.DataFrame`. If
provided, the string(s) must be keys of ``event_id``. If ``None``
(default), rows are created for **all** event types present in
``event_id``.
Event types around which to create the time windows. For each of these
time-locked events, we will create a **row** in the returned metadata
:class:`pandas.DataFrame`. If provided, the string(s) must be keys of
``event_id``. If ``None`` (default), rows are created for **all** event types
present in ``event_id``.
keep_first : str | list of str | None
Specify subsets of :term:`hierarchical event descriptors` (HEDs,
inspired by :footcite:`BigdelyShamloEtAl2013`) matching events of which
@@ -2780,8 +2793,10 @@ def make_metadata(
The time window used for metadata generation need not correspond to the
time window used to create the `~mne.Epochs`, to which the metadata will
be attached; it may well be much shorter or longer, or not overlap at all,
if desired. The can be useful, for example, to include events that occurred
before or after an epoch, e.g. during the inter-trial interval.
if desired. This can be useful, for example, to include events that
occurred before or after an epoch, e.g. during the inter-trial interval.
If ``tmin``, ``tmax``, or both are ``None``, the length of the time window will
typically vary from row to row, too.
.. versionadded:: 0.23
@@ -2791,7 +2806,11 @@ def make_metadata(
"""
pd = _check_pandas_installed()

_validate_type(events, types=("array-like",), item_name="events")
_validate_type(event_id, types=(dict,), item_name="event_id")
_validate_type(sfreq, types=("numeric",), item_name="sfreq")
_validate_type(tmin, types=("numeric", None), item_name="tmin")
_validate_type(tmax, types=("numeric", None), item_name="tmax")
_validate_type(row_events, types=(None, str, list, tuple), item_name="row_events")
_validate_type(keep_first, types=(None, str, list, tuple), item_name="keep_first")
_validate_type(keep_last, types=(None, str, list, tuple), item_name="keep_last")
@@ -2840,8 +2859,8 @@ def _ensure_list(x):

# First and last sample of each epoch, relative to the time-locked event
# This follows the approach taken in mne.Epochs
start_sample = int(round(tmin * sfreq))
stop_sample = int(round(tmax * sfreq)) + 1
start_sample = None if tmin is None else int(round(tmin * sfreq))
stop_sample = None if tmax is None else int(round(tmax * sfreq)) + 1

# Make indexing easier
# We create the DataFrame before subsetting the events so we end up with
@@ -2887,16 +2906,49 @@ def _ensure_list(x):
start_idx = stop_idx
metadata.iloc[:, start_idx:] = None

# We're all set, let's iterate over all eventns and fill in in the
# We're all set, let's iterate over all events and fill in the
# respective cells in the metadata. We will subset this to include only
# `row_events` later
for row_event in events_df.itertuples(name="RowEvent"):
row_idx = row_event.Index
metadata.loc[row_idx, "event_name"] = id_to_name_map[row_event.id]

# Determine which events fall into the current epoch
window_start_sample = row_event.sample + start_sample
window_stop_sample = row_event.sample + stop_sample
# Determine which events fall into the current time window
if start_sample is None:
# Lower bound is the current event.
window_start_sample = row_event.sample
else:
# Lower bound is determined by tmin.
window_start_sample = row_event.sample + start_sample

if stop_sample is None:
# Upper bound: next event of the same type, or the last event (of
# any type) if no later event of the same type can be found.
next_events = events_df.loc[
(events_df["sample"] > row_event.sample),
:,
]
if next_events.size == 0:
# We've reached the last event in the recording.
window_stop_sample = row_event.sample
elif next_events.loc[next_events["id"] == row_event.id, :].size > 0:
# There's still an event of the same type appearing after the
# current event. Stop one sample short: that next event should not be
# included here, but in the next iteration.
window_stop_sample = (
next_events.loc[next_events["id"] == row_event.id, :].iloc[0][
"sample"
]
- 1
)
else:
# There are still events after the current one, but not of the
# same type.
window_stop_sample = next_events.iloc[-1]["sample"]
else:
# Upper bound is determined by tmax.
window_stop_sample = row_event.sample + stop_sample

events_in_window = events_df.loc[
(events_df["sample"] >= window_start_sample)
& (events_df["sample"] <= window_stop_sample),
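The window-bounding rules introduced in the hunk above can be sketched without pandas. This is a minimal illustration, not the code from `mne/epochs.py`: the helper name `window_bounds`, the plain `(sample, id)` tuples, and the event IDs are assumptions made for the example. As in the diff, `start_sample` and `stop_sample` are the already-converted `tmin` and `tmax` (or `None`), and both returned bounds are inclusive.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, int]  # (sample, event ID); a stand-in for one row of events_df


def window_bounds(
    events: List[Event],
    row_event: Event,
    start_sample: Optional[int],
    stop_sample: Optional[int],
) -> Tuple[int, int]:
    """Return the inclusive (first, last) sample of one metadata time window."""
    row_sample, row_id = row_event

    # Lower bound: the row event itself if tmin was None, else offset by tmin.
    first = row_sample if start_sample is None else row_sample + start_sample

    # Upper bound determined by tmax.
    if stop_sample is not None:
        return first, row_sample + stop_sample

    # tmax was None: look at everything that happens after the row event.
    later = sorted(e for e in events if e[0] > row_sample)
    if not later:
        # The row event is the last event in the recording.
        return first, row_sample
    same_type = [sample for sample, event_id in later if event_id == row_id]
    if same_type:
        # Stop one sample before the next event of the same type.
        return first, same_type[0] - 1
    # No later event of the same type: extend to the last event of any type.
    return first, later[-1][0]


# Hypothetical events at 100 Hz: ID 1 = cue, 2 = stim, 3 = resp, 9 = rec_end
events = [(100, 1), (200, 2), (250, 3), (400, 1), (430, 2), (800, 3), (1490, 9)]

print(window_bounds(events, (100, 1), None, None))  # (100, 399)
print(window_bounds(events, (400, 1), None, None))  # (400, 1490)
print(window_bounds(events, (100, 1), -50, 151))    # (50, 251)
```

The last call corresponds to `tmin=-0.5`, `tmax=1.5` at 100 Hz, where `stop_sample` already includes the `+ 1` applied in the diff.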
91 changes: 86 additions & 5 deletions mne/tests/test_epochs.py
@@ -3914,29 +3914,36 @@ def assert_metadata_equal(got, exp):


@pytest.mark.parametrize(
("all_event_id", "row_events", "keep_first", "keep_last"),
("all_event_id", "row_events", "tmin", "tmax", "keep_first", "keep_last"),
[
(
{"a/1": 1, "a/2": 2, "b/1": 3, "b/2": 4, "c": 32}, # all events
None,
-0.5,
1.5,
None,
None,
),
({"a/1": 1, "a/2": 2}, None, None, None), # subset of events
(dict(), None, None, None), # empty set of events
({"a/1": 1, "a/2": 2}, None, -0.5, 1.5, None, None), # subset of events
(dict(), None, -0.5, 1.5, None, None), # empty set of events
(
{"a/1": 1, "a/2": 2, "b/1": 3, "b/2": 4, "c": 32},
("a/1", "a/2", "b/1", "b/2"),
-0.5,
1.5,
("a", "b"),
"c",
),
# Test when tmin, tmax are None
({"a/1": 1, "a/2": 2}, None, None, 1.5, None, None), # tmin is None
({"a/1": 1, "a/2": 2}, None, -0.5, None, None, None), # tmax is None
({"a/1": 1, "a/2": 2}, None, None, None, None, None), # tmin and tmax are None
],
)
def test_make_metadata(all_event_id, row_events, keep_first, keep_last):
def test_make_metadata(all_event_id, row_events, tmin, tmax, keep_first, keep_last):
"""Test that make_metadata works."""
pytest.importorskip("pandas")
raw, all_events, _ = _get_data()
tmin, tmax = -0.5, 1.5
sfreq = raw.info["sfreq"]
kwargs = dict(
events=all_events,
@@ -4005,6 +4012,80 @@ def test_make_metadata(all_event_id, row_events, keep_first, keep_last):
Epochs(raw, events=events, event_id=event_id, metadata=metadata, verbose="warning")


def test_make_metadata_bounded_by_row_events():
"""Test make_metadata() with tmin, tmax set to None."""
pytest.importorskip("pandas")

sfreq = 100
duration = 15
n_chs = 10

# Define events and generate annotations
experimental_events = [
# Beginning of recording until response (1st trial)
{"onset": 0.0, "description": "rec_start", "duration": 1 / sfreq},
{"onset": 1.0, "description": "cue", "duration": 1 / sfreq},
{"onset": 2.0, "description": "stim", "duration": 1 / sfreq},
{"onset": 2.5, "description": "resp", "duration": 1 / sfreq},
# 2nd trial
{"onset": 4.0, "description": "cue", "duration": 1 / sfreq},
{"onset": 4.3, "description": "stim", "duration": 1 / sfreq},
{"onset": 8.0, "description": "resp", "duration": 1 / sfreq},
# 3rd trial until end of the recording
{"onset": 10.0, "description": "cue", "duration": 1 / sfreq},
{"onset": 12.0, "description": "stim", "duration": 1 / sfreq},
{"onset": 13.0, "description": "resp", "duration": 1 / sfreq},
{"onset": 14.9, "description": "rec_end", "duration": 1 / sfreq},
]

annots = mne.Annotations(
onset=[e["onset"] for e in experimental_events],
description=[e["description"] for e in experimental_events],
duration=[e["duration"] for e in experimental_events],
)

# Generate raw data, attach the annotations, and convert to events
rng = np.random.default_rng()
data = 1e-5 * rng.standard_normal((n_chs, sfreq * duration))
info = mne.create_info(
ch_names=[f"EEG {i}" for i in range(n_chs)], sfreq=sfreq, ch_types="eeg"
)

raw = mne.io.RawArray(data=data, info=info)
raw.set_annotations(annots)
events, event_id = mne.events_from_annotations(raw=raw)

metadata, events_new, event_id_new = mne.epochs.make_metadata(
events=events,
event_id=event_id,
tmin=None,
tmax=None,
sfreq=raw.info["sfreq"],
row_events="cue",
)

# We should have 3 rows in the metadata table in total.
# rec_start occurred before the first row_event, so should not be included
# rec_end occurred after the last row_event and should be included

assert len(metadata) == 3
assert (metadata["event_name"] == "cue").all()
assert (metadata["cue"] == 0.0).all()

for row in metadata.itertuples():
assert row.cue < row.stim < row.resp
assert np.isnan(row.rec_start)

# Beginning of recording until end of 1st trial
assert np.isnan(metadata.iloc[0]["rec_end"])

# 2nd trial
assert np.isnan(metadata.iloc[1]["rec_end"])

# 3rd trial until end of the recording
assert metadata.iloc[2]["resp"] < metadata.iloc[2]["rec_end"]


def test_events_list():
"""Test that events can be a list."""
events = [[100, 0, 1], [200, 0, 1], [300, 0, 1]]
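To make the expectations in `test_make_metadata_bounded_by_row_events` concrete, the following sketch rebuilds the three cue-bounded rows in plain Python. It is an illustration only and does not call MNE: `rows_bounded_by` is a hypothetical helper that handles just the `tmin=None`, `tmax=None` case, uses `None` where the real metadata would hold NaN, and assumes that the first occurrence of each event type within a window is the one recorded. The sample numbers are the test's onsets multiplied by `sfreq = 100`.

```python
from typing import Dict, List, Optional, Tuple

SFREQ = 100  # Hz, as in the test

# (sample, name) pairs mirroring the annotations defined in the test
EVENTS: List[Tuple[int, str]] = [
    (0, "rec_start"), (100, "cue"), (200, "stim"), (250, "resp"),
    (400, "cue"), (430, "stim"), (800, "resp"),
    (1000, "cue"), (1200, "stim"), (1300, "resp"), (1490, "rec_end"),
]


def rows_bounded_by(
    events: List[Tuple[int, str]], row_name: str, sfreq: float
) -> List[Dict[str, Optional[float]]]:
    """Build one latency dict per row event, with windows bounded by row events."""
    names = sorted({name for _, name in events})
    row_samples = [sample for sample, name in events if name == row_name]
    last_sample = max(sample for sample, _ in events)

    rows = []
    for i, start in enumerate(row_samples):
        # The window ends one sample before the next row event, or at the
        # last event of the recording if there is no further row event.
        if i + 1 < len(row_samples):
            stop = row_samples[i + 1] - 1
        else:
            stop = last_sample
        row: Dict[str, Optional[float]] = {name: None for name in names}
        for sample, name in events:
            if start <= sample <= stop and row[name] is None:
                # Latency in seconds relative to the row event
                row[name] = (sample - start) / sfreq
        rows.append(row)
    return rows


rows = rows_bounded_by(EVENTS, "cue", SFREQ)
for row in rows:
    print(row)
```

In this sketch, `rec_start` is empty in every row because it precedes the first cue, and `rec_end` appears only in the third row, which matches the assertions in the test.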
