Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binning plugin #28

Merged
merged 24 commits into from
Sep 15, 2023
Merged

binning plugin #28

merged 24 commits into from
Sep 15, 2023

Conversation

kecnry
Copy link
Member

@kecnry kecnry commented Jun 13, 2023

This PR implements a binning plugin (wrapping existing lightkurve binning functionality), supporting binning in either time or phase-space, along with a live-preview that works in conjunction with changes to the underlying ephemeris (when phase-binning, the binning is recomputed with any change to the ephemeris that dictates that phase-folding). Once loaded into the app, the binned data remain fixed (and with spacetelescope/jdaviz#2312 would only be available to load in the respective viewer).

This PR currently only exposes the number of bins as an input parameter, but we might want to consider (now or as a follow-up effort) the ability to optionally provide a bin width in time/phase units instead.

Support for having live-updating binning in the data-collection object might be considered in the future, but is out-of-scope for now.

Screen.Recording.2023-07-27.at.4.15.33.PM.mov

There seems to be a slight bug with the reference time when the time-binned data is actually applied and then phased (the phasing in the preview and the finalized data appears different). I'm not sure if that is introduced here or an existing bug that this uncovers.

The change in color when adding the time-binned data to the second viewer is likely an upstream bug.

@codecov
Copy link

codecov bot commented Jul 27, 2023

Codecov Report

Patch coverage is 94.46% of modified lines.

Files Changed Coverage
lcviz/template_mixin.py 84.44%
lcviz/plugins/ephemeris/ephemeris.py 92.10%
lcviz/utils.py 93.75%
lcviz/plugins/binning/binning.py 97.05%
lcviz/events.py 100.00%
lcviz/helper.py 100.00%
lcviz/marks.py 100.00%
lcviz/plugins/__init__.py 100.00%
lcviz/plugins/binning/__init__.py 100.00%
lcviz/plugins/coords_info/coords_info.py 100.00%
... and 3 more

📢 Thoughts on this report? Let us know!.

@kecnry kecnry marked this pull request as ready for review July 27, 2023 20:19
@bmorris3
Copy link
Contributor

By default, the binned results have the same marker properties as the default (except an increment in the color cycle). This makes sense most of the time, but in the specific case of binning, it's useful to have the binned markers larger and/or with a different marker shape, since it can be hard to see the bins over the busy original, densely sampled data:

bqplot (23)

Is there a way to specify the marker size/shape for binned results specifically?

@kecnry
Copy link
Member Author

kecnry commented Jul 31, 2023

Is there a way to specify the marker size/shape for binned results specifically?

Sure, we can do that with layer defaults (see latest commit). If we want that to be more advanced (dependent on the number of bins, etc), then we'll either have to store more information in the metadata, or add support at the add_results_to_plugin level (but that would be forgotten when the data is removed and re-added to the viewer).

@kecnry
Copy link
Member Author

kecnry commented Jul 31, 2023

NOTE: preview won't work on jdaviz 3.6. I'm working on a separate PR to fix that and will rebase this on top once that's ready.

@bmorris3
Copy link
Contributor

I'm seeing the following error when I create binned data entries and select a subset:


---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
File ~/miniconda3/envs/jdaviz-only-5/lib/python3.11/site-packages/glue_jupyter/bqplot/common/viewer.py:185, in BqplotBaseView._on_mouse_interaction(self, interaction, data, buffers)
    183 events = self._events_for_callback.get(key, [])
    184 if data["event"] in events:
--> 185     callback(data)

File ~/git/jdaviz/jdaviz/core/template_mixin.py:199, in TemplateMixin._viewer_callback.<locals>.plugin_viewer_callback.<locals>.<lambda>(data)
    198 def plugin_viewer_callback(viewer, plugin_method):
--> 199     return lambda data: plugin_method(viewer, data)

File ~/git/jdaviz/jdaviz/configs/imviz/plugins/coords_info/coords_info.py:169, in CoordsInfo._viewer_mouse_event(self, viewer, data)
    166 # update last known cursor position (so another event like a change in layers can update
    167 # the coordinates with the last known position)
    168 self._x, self._y = x, y
--> 169 self.update_display(viewer, x=x, y=y)

File ~/git/lcviz/lcviz/plugins/coords_info/coords_info.py:38, in CoordsInfo.update_display(self, viewer, x, y)
     36 # TODO: update with display_unit when supported in lcviz
     37 x_unit = '' if is_phase else str(viewer.time_unit)
---> 38 y_unit = str(viewer.data()[0].flux.unit)
     40 def _cursor_fallback():
     41     self._dict['axes_x'] = x

File ~/git/lcviz/lcviz/viewers.py:86, in TimeScatterView.data(self, cls)
     84 handler, _ = data_translator.get_handler_for(_class)
     85 try:
---> 86     layer_data = handler.to_object(layer_data)
     87 except IncompatibleAttribute:
     88     continue

File ~/git/lcviz/lcviz/utils.py:179, in LightCurveHandler.to_object(self, data_or_subset)
    177     values = Time(values.base)
    178 elif hasattr(component, 'units') and component.units != "None":
--> 179     values = u.Quantity(values, component.units)
    181 columns.append(values)
    182 names.append(component_id.label)

File ~/git/astropy/astropy/units/quantity.py:550, in Quantity.__new__(cls, value, unit, dtype, copy, order, subok, ndmin)
    546     unit = value_unit = _structured_unit_like_dtype(value_unit, value.dtype)
    548 # check that array contains numbers or long int objects
    549 if value.dtype.kind in "OSU" and not (
--> 550     value.dtype.kind == "O" and isinstance(value.item(0), numbers.Number)
    551 ):
    552     raise TypeError("The value must be a valid Python or Numpy numeric type.")
    554 # by default, cast any integer, boolean, etc., to float

IndexError: index 0 is out of bounds for axis 0 with size 0

Not sure what's the culprit yet.

Comment on lines +35 to +44
<plugin-ephemeris-select
:items="ephemeris_items"
:selected.sync="ephemeris_selected"
:show_if_single_entry="false"
label="Ephemeris"
hint="Select the phase-folding as input."
/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's clear that using the "ephemeris selected" machinery is the correct choice here, but I think the naming is a bit confusing if you haven't yet specified an ephemeris. For example, if I load a light curve at 1 min cadence but I want to see what happens when binned on ~year timescales, it'll be unclear what "ephemeris" means, since I don't have a strict period or epoch in mind.

My instinct was that this selector could be a "viewer" selector, so that one option says "flux-vs-time". What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't even show in the UI if you haven't manually created an ephemeris, so I'm hoping it'll make more sense in that context. But this is definitely an unsolved problem from the API that might warrant some thinking 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the API, something like spacetelescope/jdaviz#2347 would help (but that hasn't seemed to get much traction/support yet).

lcviz/plugins/binning/binning.py Show resolved Hide resolved
Comment on lines +57 to +72
<plugin-add-results
:label.sync="results_label"
:label_default="results_label_default"
:label_auto.sync="results_label_auto"
:label_invalid_msg="results_label_invalid_msg"
:label_overwrite="results_label_overwrite"
label_hint="Label for the binned data."
:add_to_viewer_items="add_to_viewer_items"
:add_to_viewer_selected.sync="add_to_viewer_selected"
action_label="Bin"
action_tooltip="Bin data"
@click:action="apply"
></plugin-add-results>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we track a progress status with a spinner here, like in Imviz:

https://github.com/spacetelescope/jdaviz/blob/b391423294fb5e6cd5ef6d7cafbb7a0b05a66ca3/jdaviz/configs/imviz/plugins/links_control/links_control.vue#L65-L79

This is a bit slow for me with large datasets and the data looks strangely linear at first as it loads. Graying out the viewer during load would help there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the live-preview is enabled, then we can probably skip rebinning when clicking the button and it should be quite responsive (just the time to load into the data collection/viewer). But in that case we might want some sort of indication that the binning is updating in the preview. And if the live-preview is off, then an indication when clicking the button will still be necessary.

This same pattern should probably also apply to flattening 🤔

Comment on lines +44 to +57
<v-text-field
label="N Bins"
type="number"
v-model.number="n_bins"
:step="10"
:rules="[() => n_bins !== '' || 'This field is required',
() => n_bins > 0 || 'Number of bins must be positive']"
hint="Number of bins."
persistent-hint
>
</v-text-field>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

N bins is a good start, but it would be awesome/helpful to also offer a bin width in time units (10 minute bins, 5 day bins, etc), and do the arithmetic for the user. Your choice if that's in this PR or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! This gets more complicated when switching between time and phase-binning, so my vote would be to defer this until we get a better feel for what would feel intuitive for these cases

'''Generate a label and set the results field to that value'''
if not hasattr(self, 'ephemeris'):
return
label = f"binned {self.dataset_selected}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok for now, but it'd be helpful to add the number of bins in this string too. For example, here I've binned data in each viewer at different bin numbers in different data entries:

Screen Shot 2023-07-31 at 14 23 34

but the data labels in the drop down and in the data collection only show:

DataCollection (3 data sets)
	  0: KIC 10748390[pdcsap_flux]
	  1: binned KIC 10748390[pdcsap_flux]:default
	  2: binned KIC 10748390[pdcsap_flux]

You and I know that default means an ephemeris is being applied to 1, but that might not obvious for the user. We could do f'binned (N={self.nbins}) {self.dataset_selected}' as a start.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to open a ticket (🐱) to consider this in general across all plugins that export data back to the app. I think we need a consistent and somewhat comprehensive approach for (1) what information is most important to be in the default labels, and (2) a consistent "syntax" (for example, the ephemeris in the phase-viewer labels is consistent with the :default suffix right now, but we might want to consider changing that too).

Of course we probably can't cover every use-case without undesirably long labels, which is why the labels can be manually modified.

A really cool idea that we could implement upstream would be the concept of user-defined templates with some sort of drag and drop UI, but I'm not sure that's worth the effort and overhead. Or maybe any inputs that have been "touched" are included, but others that have remained fixed aren't? Or maybe we can store all the inputs in the metadata and have dynamic labels after the fact?

return
label = f"binned {self.dataset_selected}"
if self.ephemeris_selected not in self.ephemeris._manual_options:
label += f":{self.ephemeris_selected}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe this could be more like f", Ephemeris: {self.ephemeris_selected}" so it's clear what default refers to.

Copy link
Contributor

@bmorris3 bmorris3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in! 🚀

@kecnry kecnry merged commit ae113c0 into spacetelescope:main Sep 15, 2023
9 checks passed
@kecnry kecnry deleted the binning-plugin branch September 15, 2023 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants