
WIP: High-precision timer #382

Draft · wants to merge 15 commits into main

Conversation


@zzm634 zzm634 commented Mar 14, 2024

Adds a high-precision (1,024 Hz) reference timestamp that faces may use to measure elapsed time or schedule future events independently of calendar time (a.k.a. the RTC).

WORK IN PROGRESS. Still to do:

  • Testing overflows and other edge cases
  • Speed up timer synchronization
  • Invoke face code for background tasks outside of the timer ISR (or re-enable interrupts)
  • Better overflow ISR handling
  • General cleanup

Currently, most faces that need it use the RTC to measure elapsed time and schedule background tasks. Although the RTC offers tick rates higher than 1 Hz, the reference timestamp it provides is only precise to the second. Stopwatches and other timing faces need a timestamp with higher precision. Also, the value stored in the RTC can be modified by the user when they set the time or change time zones, so comparing RTC timestamps is not an entirely reliable way to measure elapsed time.

The High-Precision Timer (HPT) uses TC2 (and TC3) to keep track of a 32-bit counter, ticking upwards at 1,024 Hz. It is enabled on demand by faces or features that need access to the timestamp it provides. When no faces or features require this timestamp, the timer peripheral is disabled to save power. watch_hpt.h provides low-level access to the timer peripheral, while Movement provides an implementation to handle counter overflows and an API to share access to timer features across multiple faces.

It's sorta like the difference between performance.now() and Date.now() in JavaScript.
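
As a rough illustration of how a face would consume this (a sketch, not code from this PR: movement_hpt_request and movement_hpt_relenquish are names used later in this PR, the activate/resign signatures follow Movement's existing face API, and movement_hpt_get is assumed here as the accessor for the 64-bit timestamp):

```c
#include "movement.h"

// Hypothetical stopwatch-style usage of the shared HPT. Only the request/
// release bookkeeping and the monotonic timestamp read are shown.
static uint64_t start_ticks;

void demo_face_activate(movement_settings_t *settings, void *context) {
    (void) settings; (void) context;
    movement_hpt_request();              // power up TC2/TC3 if no one else has
    start_ticks = movement_hpt_get();    // assumed accessor; 1,024 ticks per second
}

void demo_face_resign(movement_settings_t *settings, void *context) {
    (void) settings; (void) context;
    // elapsed ticks are unaffected by the user changing the time or time zone
    uint64_t elapsed = movement_hpt_get() - start_ticks;
    (void) elapsed;
    movement_hpt_relenquish();           // counter is stopped once nobody needs it
}
```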

The RTC provides an "alarm" system that can trigger an interrupt at a specific calendar time, but this is also still tied to its one-second resolution. This means it is not suitable for scheduling background tasks that need to occur in the very near future. For example, buzzer tones are often under a full second long. Controlling the buzzer is currently handled using a "fast ticks" mode that wakes up the CPU using the 128 Hz tick provided by the RTC. If we want to play a tone for a quarter of a second, we enable the buzzer, turn on "fast ticks" mode, count 32 wakeups, and then disable the buzzer. The HPT provides access to a background wakeup event with an arbitrary delay. Using the HPT, we can instead enable the buzzer, schedule an HPT wakeup for 256 ticks later, go to sleep, wake up, and disable the buzzer, all without ever needing to wake the CPU between turning the buzzer on and off.
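
Sketching that quarter-second beep with the API in this PR (hedged: watch_set_buzzer_on/off are the existing low-level buzzer calls, movement_hpt_get is an assumed accessor, and movement_hpt_schedule is assumed here to take an absolute HPT timestamp, per the header comment quoted later in this thread):

```c
// In the face's event handler or a helper; illustrative only.
watch_set_buzzer_on();                              // start the tone
movement_hpt_request();                             // make sure the 1,024 Hz counter is running
movement_hpt_schedule(movement_hpt_get() + 256);    // 256 ticks = 250 ms from now
// ... return and let the CPU sleep; no 128 Hz fast-tick wakeups needed ...

// Later, when Movement delivers EVENT_HPT to this face:
watch_set_buzzer_off();
movement_hpt_relenquish();
```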

This PR adds:

  • the HPT API to the Movement framework
  • two new faces that use the HPT (a high-precision chronograph face and countdown timer face)
  • emscripten simulator support
  • refactors to existing watch faces or features that previously used TC2 or TC3 (stock chrono face, dual timer face, and buzzer melody system)

If this PR is accepted, I will begin work on refactoring movement.c to use the HPT for LED control and long-press timing, which should eliminate the need for "fast ticks" mode and allow the CPU to sleep more often. (It also means we could allow faces to use the 128 Hz RTC tick mode, if we really wanted to.) Depending on power consumption, we could even consider using this timer to handle face timeouts and low-energy timeouts, though these are not critical: they do not need subsecond precision, and they won't be affected by RTC modifications, because the user changing the time would invariably involve resetting these timeouts anyway.

Some outstanding concerns:

  • Power consumption - This feature requires activating more hardware peripherals: another clock generator in GCLK and two timers, TC2 and TC3. These have the potential to consume more power while they run in the background. However, I am inclined to believe that this change will save power overall, as we will now be using dedicated hardware to keep track of timing information that was previously implemented in software. Using a hardware timer to schedule future tasks should mean fewer CPU wakeups overall and more time spent in standby mode. Ultimately, power profiling may be necessary to truly answer this question. Many faces that currently use watch_buzzer_play_note can be refactored to use the HPT buzzer system and avoid a call to delay_ms, which would save power during those beeps.
  • Overflows and edge cases - At 1024 Hz, a 32-bit counter takes about 49 days to overflow. The code in this PR should handle this situation properly, but it does mean there is some extra logic in play to detect and compensate for timer overflows that occur while critical operations are being performed. These "safety" checks mean more CPU cycles, more power consumed, and more latency for the user. The timer can be reset to zero when no face is using it, so in normal use it is only likely to overflow if someone leaves a stopwatch running in a drawer for 49 days or sets a ridiculously long countdown timer. This lengthy overflow period makes it difficult (though not impossible) to test edge cases where a face reads a timestamp or schedules a future event exactly during a timer overflow. One alternative is to reduce the tick rate from 1024 Hz to 128 Hz, in which case the overflow would occur after about 1.06 years.
  • Synchronization - As the CPU and the timers are in different "peripheral domains", registers must be synchronized between them when reading data. In practice this means sending a command to the timer to synchronize its counter value and waiting for the synchronization to complete. Without synchronization, I was encountering an issue where I would set the timer to trigger an interrupt at some counter value N, but if I read the counter inside the interrupt handler, I would get back a value of N-5. Once I added the right synchronization code, I'd get back a value around N+3, which is okay, but has other implications. The synchronization can only be done through busy waiting, and the timer "commands" are clocked using the peripheral clock, not the main clock. This means that (unless I am mistaken) reading a synchronized value from the timer takes about 3 milliseconds of busy-waiting CPU time. I have since added some code to read unsynchronized values where timestamp accuracy is not critical, but still: 3 ms is a long time for the CPU to be spinning doing nothing. Further experimentation is required. (A sketch of the synchronized read and the overflow-safe 64-bit timestamp appears after this list.)
  • Simulator accuracy - The current simulator implementation of the HPT does not simulate counter overflows, due to the difficulty of scheduling a browser setTimeout() callback 49 days in the future (a known limitation in most browsers). While not an insurmountable problem, it didn't seem like a high-priority problem to solve at the time.
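
To make the synchronization and overflow points above concrete, here is a sketch (not the PR's actual code) of a synchronized counter read and an overflow-safe 64-bit timestamp, assuming the SAM L22 vendor-header register names, the hpt_overflows counter introduced in this PR, and movement_hpt_get as a hypothetical accessor name:

```c
#include <stdint.h>
#include "sam.h"   // SAM L22 vendor headers: TC2, TC_CTRLBSET_CMD_READSYNC, ...

extern volatile uint8_t hpt_overflows;  // incremented by the overflow ISR (from this PR)

static uint32_t watch_hpt_read_synchronized(void) {
    // Ask the timer to latch COUNT into a CPU-readable register, then busy-wait
    // until the command has propagated across the clock domains. This is the
    // ~3 ms wait discussed above, because the command is clocked by the slow
    // peripheral clock rather than the CPU clock.
    TC2->COUNT32.CTRLBSET.reg = TC_CTRLBSET_CMD_READSYNC;
    while (TC2->COUNT32.SYNCBUSY.bit.CTRLB || TC2->COUNT32.SYNCBUSY.bit.COUNT);
    return TC2->COUNT32.COUNT.reg;
}

uint64_t movement_hpt_get(void) {
    // Widen to 64 bits with the software overflow count. Re-read if an overflow
    // interrupt fired in the middle of the read; this is the kind of "safety"
    // check the overflow bullet above refers to.
    uint8_t before, after;
    uint32_t count;
    do {
        before = hpt_overflows;
        count = watch_hpt_read_synchronized();
        after = hpt_overflows;
    } while (before != after);
    return ((uint64_t)after << 32) | count;
}
```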

See also #381

@zzm634 zzm634 marked this pull request as draft March 14, 2024 16:27
Collaborator

@matheusmoreira matheusmoreira left a comment


Looking forward to seeing this develop.

Comment on lines 88 to 91
// bit flags indicating which watch face wants the HPT enabled.
// TODO make sure this number width is larger than the number of faces
#define HPT_REQUESTS_T uint16_t
HPT_REQUESTS_T hpt_enable_requests;
Collaborator


How about this?

#define HPT_REQUESTS_SIZE (MOVEMENT_NUM_FACES / CHAR_BIT)

unsigned char hpt_enable_requests[HPT_REQUESTS_SIZE + 1];

Author


I only really need boolean flags, and I was trying to come up with a good way to pack them into a smaller structure to save memory. (On the other hand, I went and allocated 64 bits for a bunch of clock face timestamps that probably aren't going to use them.)

Author


Oh I see what you're doing with that. That would also work, and I think once I get this feature working, I'll do another pass and try to optimize memory usage as much as possible. (Like, timestamps really only need to be 40 bits, even in 1024 Hz mode, but packing all that stuff up is a lot of extra math.)

Collaborator


With some extra preprocessor hackery, we could get rid of that +1 as well, which is there just to prevent zero-sized array errors in case the result is zero. That would save an extra byte.
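
For context on how the suggested byte array would be driven, a minimal sketch (the helper macros below are illustrative and not part of the PR or the suggestion):

```c
#include <limits.h>  // CHAR_BIT

#define HPT_REQUESTS_SIZE (MOVEMENT_NUM_FACES / CHAR_BIT)

unsigned char hpt_enable_requests[HPT_REQUESTS_SIZE + 1];

// One bit per face: set on request, cleared on release, tested when deciding
// whether the timer peripheral can be shut off.
#define HPT_REQUEST_SET(face)    (hpt_enable_requests[(face) / CHAR_BIT] |=  (1u << ((face) % CHAR_BIT)))
#define HPT_REQUEST_CLEAR(face)  (hpt_enable_requests[(face) / CHAR_BIT] &= ~(1u << ((face) % CHAR_BIT)))
#define HPT_REQUEST_IS_SET(face) ((hpt_enable_requests[(face) / CHAR_BIT] >> ((face) % CHAR_BIT)) & 1u)
```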

Comment on lines 96 to 97
// the number of times the high-precision timer has overflowed
uint8_t hpt_overflows = 0;
Collaborator


How many times are we expecting the timer to overflow?

Author


255 overflows of a 32-bit timer running at 1024 Hz is about 34 years. Even in BACKUP mode, I'm pretty sure the battery itself will self-discharge by then.

Comment on lines 93 to 94
// timestamps at which watch faces should have a HPT event triggered. UINT64_MAX for no callback
uint64_t hpt_scheduled_events[MOVEMENT_NUM_FACES];
Collaborator


Perhaps it would be possible to store this in a watch face structure, eliminating a global variable.

Author

@zzm634 zzm634 Mar 14, 2024


I had considered that, but either way, it's going to add up to one new 64-bit timestamp per watch face. I didn't see an obvious watch face struct to add it to in movement.c, but I can't say I looked very hard.

Comment on lines 717 to 731
/** Figures out the next scheduled event for the HPT to wake up for.
* Must be called whenever:
* - a new event is scheduled
* - an old event is deleted
* - an event is triggered
* - the timer overflows
*/
static void _movement_hpt_schedule_next_event() {
    uint64_t next = UINT64_MAX;
    for(uint8_t req_idx = 0; req_idx < MOVEMENT_NUM_FACES; ++req_idx) {
        uint64_t event = hpt_scheduled_events[req_idx];
        if(event < next) {
            next = event; // keep the earliest scheduled timestamp
        }
    }
Collaborator


It's critical that we make this function as efficient as possible. Hopefully we can eliminate this O(n) loop. What's your take on radix timers?

Author


I don't think it's fair to call this O(n), because MOVEMENT_NUM_FACES is a constant. Most people are probably going to have, like, 12 faces max on their watch, and even if they go hog wild, the most they can have is 256. Compilers are very good at optimizing simple loops like this; it will be fine.

Radix timers seem like a really "clever" solution to this very simple problem.

Comment on lines 752 to 755
void movement_hpt_relenquish(void) {
    movement_hpt_relenquish_face(movement_state.current_face_idx);
}
void movement_hpt_relenquish_face(uint8_t face_idx) {
Collaborator


Small correction: relinquish.

Author


I thought I spelled that wrong. I'm gonna rename it to "release" anyway, because "relinquish" is too fancy.

Comment on lines 367 to 372
void movement_hpt_relenquish(void);
/**
* A variant of "movement_hpt_relenquish" that can be used when your face is
* not running in the foreground.
*/
void movement_hpt_relenquish_face(uint8_t face_idx);
Collaborator


Small correction: relinquish.

* Faces may schedule an EVENT_HPT event to occur by calling
* "movement_hpt_schedule" and passing in a timestamp for the event to occur.
* The face must call "movement_hpt_request" before scheduling the event, and
* must not call "movement_hpt_relenquish" until after the event has occurred.
Collaborator


Why not? It won't be possible to cancel a running timer? What if the user cancels an alarm?

Author


Oh I mean, they can call "relenquish" if they want to, but it means their scheduled event is going to be cancelled, and some devs might be surprised by that.

Comment on lines 374 to 378
/**
* Schedules a future EVENT_HPT event to occur on or after the given timestamp.
* The Movement framework will do its best to trigger the event as close to the
* timestamp as possible, but it may be delayed if multiple faces or events are
* scheduled to occur on the same timestamp.
Collaborator


This implies timers are guaranteed to be handled, which means either real time code or a queue of pending timer events. Do you mean to guarantee this, and if so by which method?

I assume you meant real time constraints, as implied by your concerns in discussion #381:

If interrupt handling takes too long, it's possible for the timer value to change in the background while we're handling scheduled events. If events are scheduled too close together or too close in the future, it might be possible that by the time we set the timestamp value for the new interrupt, the counter has already passed it and the interrupt won't trigger. Now, if the counter is running at 128hz and the cpu is running at 4mhz, that gives us about 31,000 clock cycles to finish up whatever we were doing and check the timer again. Plenty of time... unless a face calls _delay_ms() or has to wait to activate a peripheral. Not a deal-breaking problem, but something we need to be aware of.

However, this comment was written before you updated your timer code to 1024 Hz. The deadlines will be much tighter.

Author


I think the Movement framework needs to guarantee these events, but I haven't figured out exactly how to do it. It might be messy.

Collaborator


We could check the nearest event to see if its timestamp is in the past. If so, that would mean its deadline was missed. We move forward in the event queue, processing them all in sequence until we find one with a timestamp in the future.

An efficient, hopefully constant time "next event" function would make it feasible. We'd be able to guarantee it would finish before the next timer window.
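
A minimal sketch of that catch-up loop, assuming hypothetical helpers for the current timestamp, the earliest scheduled deadline, dispatch, and arming the compare channel (none of these names are from the PR):

```c
// Run after any wakeup (timer compare match or otherwise); illustrative only.
uint64_t now  = movement_hpt_get();           // assumed current-timestamp accessor
uint64_t next = hpt_next_scheduled_event();   // hypothetical: earliest pending deadline

// Fire everything whose deadline has already passed, then arm the compare
// channel for the first event that is still in the future.
while (next != UINT64_MAX && next <= now) {
    hpt_dispatch_event(next);                 // hypothetical: deliver EVENT_HPT to its face
    now  = movement_hpt_get();                // time keeps moving while we dispatch
    next = hpt_next_scheduled_event();
}
if (next != UINT64_MAX) {
    hpt_arm_compare(next);                    // hypothetical: program the TC compare register
}
```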

Comment on lines 10 to 13
/**
* The callback is being invoked because of an error in the comparison (?)
*/
bool cmp_error :1;
Collaborator


What conditions could possibly cause this?

Author


Not sure, but it was something the datasheet called out as a possible interrupt source. Unless I read the datasheet wrong, the timer only has a single interrupt vector and you're supposed to check the interrupt status register to see what caused it.

Bit 1 – ERR: Error Interrupt Flag
This flag is set when a new capture occurs on a channel while the corresponding Match or Capture Channel x interrupt flag is set, in which case there is nowhere to store the new capture.

It's entirely irrelevant to the HPT though, so maybe I should get rid of the flag in this struct.
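
For reference, the single-vector dispatch being described looks roughly like this (a sketch assuming the SAM L22 INTFLAG register and macro names; not the PR's actual handler):

```c
extern volatile uint8_t hpt_overflows;  // overflow counter from this PR

void TC2_Handler(void) {
    uint8_t flags = TC2->COUNT32.INTFLAG.reg;
    if (flags & TC_INTFLAG_OVF) {
        hpt_overflows++;                            // the 32-bit counter rolled over
        TC2->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;  // write 1 to clear
    }
    if (flags & TC_INTFLAG_MC0) {
        // compare match: a scheduled HPT event came due
        TC2->COUNT32.INTFLAG.reg = TC_INTFLAG_MC0;
    }
    // TC_INTFLAG_ERR only matters for capture use, as discussed above,
    // so it can simply be cleared or ignored here.
}
```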

Collaborator


Match or capture? This has something to do with using these timers to generate waveforms. I don't think we'll be using that functionality. It could be safe to ignore that possibility.

Still some questions left in the code, and I had to disable the stock stopwatch and dual timer faces. Once I am sure this works properly, I'll update those faces to use it.
@zzm634
Author

zzm634 commented Mar 16, 2024

Just noticed that watch_buzzer.c uses TC3 for playing note sequences. Yet another thing for HPT to support I guess.

- Removed some debugging cruft and commented out code
- HPT requests are stored using a byte array instead of a single variable (allows for more than 64 HPT requests)
- Added "fast" mode to counter reads that skips the synchronization step
- Fixes and improvements to HPT timing faces.
Also refactors movement.c to use HPT buzzing for alarms, signals, and other short beeps.

Adds another feature to buzzer sequences: skip-aheads. (Not used by anything yet, but repeater faces might need it.)

Also enables TCC run-in-standby always. TCC is enabled on demand anyway; might as well let it run in standby.

Eventually, I would like to remove `watch_buzzer_play_note` because it uses delay_ms, but a lot of faces use it right now and they can be refactored later.
_movement_hpt_schedule_next_event();
if (canSleep)
{
// TODO put cpu to sleep?
Author


Could use some input from other devs on how to put the CPU back to sleep here. Or other feedback on running code in this ISR vs running it in the main app loop.

Author


This will all be reverted when this is ready for merge.

Author


this face is likely to be deleted before merge as well

/*
* MIT License
*
* Copyright (c) 2024 <#author_name#>
Author


fix author name

Author


needs copyright header

Author


copyright header

@wryun
Contributor

wryun commented Jul 21, 2024

Well, I'm a fan of higher-precision timing, and it all sounds good. But it's a little scary!

@joeycastillo - do you think this is a good approach?

@joeycastillo
Owner

I think I need to reserve an opinion on this until I can run power profiling on it, but I do like the idea of a shared high-precision timer that everything that needs high-precision timing can share, rather than having individual watch faces make use of TC peripherals. I'm also curious about the choice to use a 32-bit counter (which ties up two TCs) instead of a 16-bit counter. Using a 16-bit counter would mean overflowing every minute or so, but it would make handling overflows a matter of course instead of a thing that happens once in a blue moon. Especially since high-precision timing mostly relates to things happening in the near future.

@wryun your pinging me on this comes at an interesting moment too, because I'm currently doing a pretty big refactor of Movement in the background, tearing out a lot of low-level technical debt and trying to free up resources for forthcoming hardware. In particular, my refactor should free up TC0 and TC1, which I think we might need in order to free up TC2 and TC3 for use by the accelerometer. Anyway, I'm open to considering this for potential inclusion in the forthcoming second Movement firmware, again pending a more thorough power consumption analysis and a deeper review of the code.

@wryun
Contributor

wryun commented Jul 22, 2024

Thanks - happy to hear it's on your radar.

I guess the main advantage of the 32-bit counter from @zzm634's perspective is that one might be able to do away with some of the overflow handling:

The code in this PR should handle this situation properly, but it does mean there is some extra logic in play to detect and compensate for timer overflows that occur while critical operations are being performed. These "safety" checks mean more CPU cycles, more power consumed, more latency for the user.

i.e. rather than having this complexity, one could just schedule an event shortly before the overflow every time the timer is started (EVENT_HPT_MAX_TIME or something) and document that code has to handle it by aborting the timer action. It's unlikely one needs a 49+ day precision timing event, and the extra overflow handling might have an impact on battery life (and on EVENT_HPT accuracy, in that without it you might be able to get the event to user code faster).

But given that those checks are in there at the moment, I would also prefer a 16-bit counter, handling the overflow as a matter of course.
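
For illustration, the usual trick with a free-running 16-bit counter: unsigned wraparound arithmetic makes short intervals overflow-safe for free, so only intervals longer than 64 seconds (at 1,024 Hz) need an explicit software overflow count:

```c
#include <stdint.h>

// Elapsed ticks between two raw 16-bit counter reads. The subtraction wraps
// modulo 2^16, so the result is correct across an overflow as long as fewer
// than 65,536 ticks (64 s at 1,024 Hz) separate the two reads.
static inline uint16_t hpt16_elapsed(uint16_t start, uint16_t now) {
    return (uint16_t)(now - start);
}
```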


One of the interesting things about this PR as well is that (from a brief look) none of the existing faces have been modified to use EVENT_HPT, because they only really need an accurate time when they're reacting to another event. I'm not sure there's actually a clear use case for EVENT_HPT; it would be enough to be able to easily get the difference between timestamps when something else happens (i.e. a button press).
