Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data.split (and other methods?) do not pass parent's attrs to children #1058

Open
darienmorrow opened this issue Apr 3, 2022 · 4 comments
Open

Comments

@darienmorrow
Copy link
Member

darienmorrow commented Apr 3, 2022

Note, this issue originated in a slack conversation with @DLafayetteII

import numpy as np
import WrightTools as wt

d = wt.Data(name='test')
d.create_variable('t', values=np.linspace(0,1,51), units='eV')
d.attrs['key'] = 1
print(d.attrs['key'])
c = d.split('t', .5)
print(c[0].attrs['key']) #throws KeyError
@ksunden
Copy link
Member

ksunden commented Apr 3, 2022

This is certainly not critical...

I'm not even sure it should be the case that attrs are blindly applied to e.g. split et al...

It's a one liner to apply them when they should be applied:

c[0].attrs.update(d.attrs)

@DLafayetteII
Copy link
Contributor

Most use cases I am familiar with would benefit from attrs copying through to the split objects -
Split is commonly used to refine data (remove noise regions, focus on specific regions of data). When a user applies split, it is unlikely that their goal is to remove all attrs that they have created, as well as acquisition attrs like 'data info', 'data'origin'.

This issue came up in discussion because we have a need to tie some constant acquisition parameters to data objects (sample thickness, sample number, PMT voltage, heater temperature, etc.). I think this will come up increasingly often if the 'info' section of acquisitions becomes more used. Storing these parameters as variables causes split to fail. Storing in Data.attrs was @darienmorrow 's proposed solution, but it sounds like users will have to be careful to track their attrs through processing.

@ddkohler
Copy link
Contributor

ddkohler commented Jun 14, 2022

This issue came up in discussion because we have a need to tie some constant acquisition parameters to data objects (sample thickness, sample number, PMT voltage, heater temperature, etc. [...] Storing these parameters as variables causes split to fail.

I think this is not true? Perhaps this an issue of not broadcasting these parameters to the dimensionality of the data object?

In any case, it seems like we need a more concrete vision of what attrs should be to move forward with this. If attrs is about storing conditions of the experiment, it's more likely than not that a split/chopped dataset will retain have those conditions.

Should we implement a "kind" property for attrs members (e.g. could be "persistent" or "unique") so they can guide the attr inheritance process?

Note that this could raise issues for join or procedures where attrs need to be merged.

@DLafayetteII
Copy link
Contributor

DLafayetteII commented Jul 8, 2024

Could add a keyword to split that assigns the attrs of the initial object to all created objects in the resulting split list, where the default of this argument is false (to remain consistent with previous implementation). I still believe general use benefits from this argument being true.

@ddkohler ddkohler mentioned this issue Jul 10, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants