Add pint.Quantity units support
#881
base: master
Conversation
I've run into a couple of issues with this.

I've fixed problem 2, but problem 1 is a little trickier. One solution would be to simply not support step types, but I'd like to use them, so that doesn't really work for me. The simplest solution would be to add a […].

Another idea I've been playing around with is removing steps from metrics. Instead, the steps would be their own metric, and the user would declare that other metrics rely on a metric. QCoDeS uses this approach. For example, if there were a test that measured signal strength as frequency changed, you would have 2 metrics:
- Frequency
- Signal Strength

where Signal Strength depends on Frequency. We could have a time metric auto-created for each metric. We could also auto-generate a step metric for every metric that doesn't depend on another metric:

```mermaid
graph TD;
    frequency-->frequency.step;
    frequency-->frequency.timestamp;
    signal_strength-->frequency;
    signal_strength-->signal_strength.timestamp;
```
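To make that concrete, here is a minimal sketch of what logging such a pair of metrics might look like from user code. The `depends_on` keyword and the `measure` helper are hypothetical, purely illustrative of this proposal; only `log_scalar_metric` is Sacred's existing API:

```python
from sacred import Experiment

ex = Experiment("signal_sweep")


def measure(frequency_hz):
    """Stand-in for a real signal-strength measurement."""
    return 1.0 / frequency_hz


@ex.automain
def my_main(_run):
    for freq in range(1_000_000, 2_000_000, 100_000):
        _run.log_scalar_metric("frequency", freq)
        # hypothetical keyword declaring frequency as the independent variable
        _run.log_scalar_metric("signal_strength", measure(freq), depends_on="frequency")
```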
This starts to get a little messy for consumers of Sacred data. Consumers have to know to focus on non-step, non-timestamp relations when possible.

Or perhaps we keep the metrics as is and create an association table. Steps and timestamps would still be auto-generated as they currently are, but users would now be able to make their independent variables their own metrics. This shouldn't break existing code that depends on Sacred, and it would allow multidimensional support, which has been kind of a hack thus far. Consumers, like Omniboard, could be much more intelligent about displaying metrics.

The downside would be a lot of useless step data, and comparing step and timestamp data would be done differently from comparing metrics. If a heartbeat occurred in the middle of a measurement, you could wind up with a measurement that depends on a different measurement not having the same number of entries. This will complicate consumers of Sacred data, but only those that are looking at the new association metadata.

@Qwlouse, do you have any thoughts about this proposal? Sorry for the info dump!
I've come across a major flaw in my approach. In order to solve this, something would have to be done to preserve information about the first […].
Went ahead and fixed the issue with units changing in between heartbeats. I also added the measurement dependency concept described above. I can refactor that out into a separate PR if the maintainers would prefer.
Hi @Gracecr, sorry for the late response; I have been offline for two weeks. Thank you for your hard work! It's a lot of information to process, so I'll start with a few comments:
No need for an apology, hope you enjoyed your time offline!
I'm having some trouble picturing how metadata should be stored. Unlike other experiment frameworks, Sacred doesn't make the user create metrics explicitly; rather, the user can start logging to metrics right away. This is quite nice for the user, but not having a persistent representation of each measurement makes it difficult to have persistent metadata that refers to the metric. I could make a `Metric` dataclass:

```python
@dataclass
class Metric:
    name: str
    meta: dict
    entries: list[ScalarMetricLogEntry]  # may not want to persist entries in memory...
```
In terms of filling that […]:

```python
class MetricPlugin:
    def process_metric(
        self, metric_entry: ScalarMetricLogEntry, metric: Metric
    ) -> tuple[ScalarMetricLogEntry, Metric]:
        """Transforms `metric_entry` and `metric`.

        Args:
            metric_entry (ScalarMetricLogEntry)
            metric (Metric)

        Returns:
            tuple[ScalarMetricLogEntry, Metric]: Transformed metric entry, transformed metric
        """
```

The plugins would then be applied when the metrics are linearized:

```python
def linearize_metrics(self, logged_metrics):
    for index, metric_entry in enumerate(logged_metrics):
        for plugin in self.plugins:
            metric = self._metrics[metric_entry.name]
            self._metrics[metric_entry.name], logged_metrics[index] = plugin.process_metric(
                metric_entry, metric
            )
    ...
```
For that to work, […]. I can't really think of another use case for this plugin structure, but it would work well for my use case of adding unit information. I could even have the plugin only be added to […].

I assume most plugins would look like:

```python
class MyPlugin(MetricPlugin):
    def process_metric(self, metric_entry, metric):
        if not isinstance(metric_entry.value, MyTypeOfInterest):
            return metric_entry, metric
        ...
```

Not crazy about the name "MetricPlugin". Let me know your thoughts!
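For instance, the units use case might look roughly like this in that shape. This is a sketch against the `MetricPlugin` interface proposed above; only the `pint` calls (`Quantity`, `.units`, `.to()`) are existing API:

```python
import pint


class PintUnitsPlugin(MetricPlugin):
    """Sketch: convert pint quantities to the metric's first-seen units."""

    def process_metric(self, metric_entry, metric):
        if not isinstance(metric_entry.value, pint.Quantity):
            return metric_entry, metric
        # remember the units of the first logged entry in the metric's meta
        units = metric.meta.setdefault("units", str(metric_entry.value.units))
        # convert later entries to those units; raises pint.DimensionalityError
        # if the dimensions are incompatible
        metric_entry.value = metric_entry.value.to(units)
        return metric_entry, metric
```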
I like the idea of making this extensible. This could also be the right time to do some cleanup of the code related to metrics logging, where the […]. You could almost use the same code as now if you replace the queue in […]:

```python
@dataclass
class Metric:
    name: str
    meta: dict
    entries: Queue[ScalarMetricLogEntry]  # Queue instead of list, to not persist the values in memory
```

I'm not sure why the metrics are stored in the way they currently are stored. By moving the grouping backward into the […].

I'm not sure yet whether I like the […]. That would make the code longer where metrics are logged, but the "plugin" only appears in one place (with your approach we would have the plugin and the pint unit). (This was just a quick thought, so it could be garbage.)
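A quick sketch of how linearization could drain such per-metric queues, assuming a logger that keeps a dict of these `Metric` objects. The `steps`/`timestamps`/`values` shape matches Sacred's current linearized format; the `meta` key is the new part:

```python
from queue import Empty


def linearize_metrics(metrics):
    """Drain each metric's entries queue into the usual linearized dict."""
    linearized = {}
    for name, metric in metrics.items():
        steps, timestamps, values = [], [], []
        while True:
            try:
                entry = metric.entries.get_nowait()  # non-blocking drain
            except Empty:
                break
            steps.append(entry.step)
            timestamps.append(entry.timestamp)
            values.append(entry.value)
        if values:
            linearized[name] = {
                "steps": steps,
                "timestamps": timestamps,
                "values": values,
                "meta": metric.meta,
            }
    return linearized
```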
Things are looking much cleaner with this approach. Good idea with the `Queue`.

To push for the plugin idea a little more -- the nice thing about doing it this way is the user doesn't even have to know it's happening. In fact, we're already doing something similar with […]. Likewise, if the user logs a […].

I removed the scattered type hints. I'll put those in another PR once this is finished.

I implemented `create_metric`:

```python
@ex.automain
def my_main(_run: Run):
    test_metric = _run._metrics.create_metric("test_metric", meta={"my_metadata": 123})
    for i in range(100):
        test_metric.log(i)
    test_metric.meta["more_metadata"] = "example"
```

This way the user can add their own metadata whenever they like, and they don't have to keep including the name every time they log a new value. However, this would expose the entries Queue, which we definitely don't want the user touching. We also need to pass the plugins to the metrics if we're going to let the user call a […].
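One possible way to keep the queue out of the user's hands (a sketch only, built on the hypothetical `create_metric` API above): make `entries` private and have `Metric.log` delegate back to Sacred's `MetricsLogger`, so plugins still run in a single place:

```python
from dataclasses import dataclass, field
from queue import Queue


@dataclass
class Metric:
    name: str
    logger: "MetricsLogger"  # sacred.metrics_logger.MetricsLogger
    meta: dict = field(default_factory=dict)
    _entries: Queue = field(default_factory=Queue, repr=False)

    def log(self, value, step=None):
        # delegate to the logger so plugins are applied in one place
        self.logger.log_scalar_metric(self.name, value, step)
```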
Azure Pipelines doesn't seem to be running on my latest commit. I added the […]. @thequilo, is this something you can assist me with?
Hmm, I can't find a button to trigger the checks manually. I guess I don't have the rights on Azure, and it isn't possible to manually trigger a run on Azure from the GitHub interface. But if I remember correctly, the tests used to just run even when the Azure config was changed (as in, for example, #821).
Adds a `units` field to `linearize_metrics` output per discussion in #880.

I took a slightly different approach. Instead of adding `units` to `ScalarMetricLogEntry`, I added "units" to the linearized output and filled it in based on whether or not `value` is of type `pint.Quantity`. Not sure if it would be better to add `units=None` to `log_scalar_metric` and do the `pint` support in the background, or to do it as I've done. On the one hand, it would remove the hard dependency on `pint` and would be a little easier for users. On the other hand, users wouldn't have full access to `pint`'s features, so they couldn't define their own units.

I went ahead and added in the unit conversion, as `pint` makes it quite easy to do. Units will be converted to the unit of the first log entry. If the units cannot be converted, a custom exception is raised. If you submit some entries with units and some without, it assumes that the entries without units use the same units as the entries with units. Might be better to throw an exception there, as a user really shouldn't be doing that.

While working on this feature, I added some type hints where they were missing. Not my finest type hints, but it's better than nothing!
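A minimal usage sketch of the behavior described above (the `log_scalar_metric` call and `pint.UnitRegistry` are existing API; the `units` bookkeeping in the linearized output is what this PR adds):

```python
import pint
from sacred import Experiment

ureg = pint.UnitRegistry()
ex = Experiment("units_demo")


@ex.automain
def my_main(_run):
    for step, millivolts in enumerate([1.0, 2.5, 4.0]):
        # later entries would be converted to the units of the first entry
        _run.log_scalar_metric("voltage", millivolts * ureg.millivolt, step)
    # linearize_metrics output would then carry units for the "voltage" metric
```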
I have an update adding metric support (with units) to `SqlObserver` ready, but it relies on this PR.