
WIP: High-precision timer #382

Draft · wants to merge 15 commits into main

Conversation


@zzm634 zzm634 commented Mar 14, 2024

Adds a high-precision (1,024 Hz) reference timestamp that faces may use to measure elapsed time or schedule future events independently of calendar time (a.k.a. the RTC).

WORK IN PROGRESS. Still to do:

  • Testing overflows and other edge cases
  • Speed up timer synchronization
  • Invoke face code for background tasks outside of the timer ISR (or re-enable interrupts)
  • Better overflow ISR handling
  • General cleanup

Currently, most faces that need it use the RTC to measure elapsed time and schedule background tasks. Although the RTC offers tick rates higher than 1 Hz, the reference timestamp it provides is only precise to the second. Stopwatches and other timing faces need a timestamp with higher precision. Also, the value stored in the RTC can be modified by the user when they set the time or change time zones, so comparing RTC timestamps is not an entirely reliable way to measure elapsed time.

The High-Precision Timer (HPT) uses TC2 (and TC3) to keep track of a 32-bit counter, ticking upwards at 1,024 Hz. It is enabled on demand by faces or features that need access to the timestamp it provides. When no faces or features require this timestamp, the timer peripheral is disabled to save power. watch_hpt.h provides low-level access to the timer peripheral, while Movement provides an implementation to handle counter overflows and an API to share access to timer features across multiple faces.

It's sorta like the difference between performance.now() and Date.now() in JavaScript.
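
As a rough illustration of how a face would consume this (a sketch, not code from this PR: movement_hpt_request and movement_hpt_relenquish are names used later in this PR, the activate/resign signatures follow Movement's existing face API, and movement_hpt_get is assumed here as the accessor for the 64-bit timestamp):

```c
#include "movement.h"

// Hypothetical stopwatch-style usage of the shared HPT. Only the request/
// release bookkeeping and the monotonic timestamp read are shown.
static uint64_t start_ticks;

void demo_face_activate(movement_settings_t *settings, void *context) {
    (void) settings; (void) context;
    movement_hpt_request();              // power up TC2/TC3 if no one else has
    start_ticks = movement_hpt_get();    // assumed accessor; 1,024 ticks per second
}

void demo_face_resign(movement_settings_t *settings, void *context) {
    (void) settings; (void) context;
    // elapsed ticks are unaffected by the user changing the time or time zone
    uint64_t elapsed = movement_hpt_get() - start_ticks;
    (void) elapsed;
    movement_hpt_relenquish();           // counter is stopped once nobody needs it
}
```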

The RTC provides an "alarm" system that can trigger an interrupt at a specific calendar time, but this is also still tied to its one-second resolution. This means it is not suitable for scheduling background tasks that need to occur in the very near future. For example, buzzer tones are often under a full second long. Controlling the buzzer is currently handled using a "fast ticks" mode that wakes up the CPU using the 128 Hz tick provided by the RTC. If we want to play a tone for a quarter of a second, we enable the buzzer, turn on "fast ticks" mode, count 32 wakeups, and then disable the buzzer. The HPT provides access to a background wakeup event with an arbitrary delay. Using the HPT, we can instead enable the buzzer, schedule an HPT wakeup for 256 ticks later, go to sleep, wake up, and disable the buzzer, all without ever needing to wake the CPU between turning the buzzer on and off.
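
Sketching that quarter-second beep with the API in this PR (hedged: watch_set_buzzer_on/off are the existing low-level buzzer calls, movement_hpt_get is an assumed accessor, and movement_hpt_schedule is assumed here to take an absolute HPT timestamp, per the header comment quoted later in this thread):

```c
// In the face's event handler or a helper; illustrative only.
watch_set_buzzer_on();                              // start the tone
movement_hpt_request();                             // make sure the 1,024 Hz counter is running
movement_hpt_schedule(movement_hpt_get() + 256);    // 256 ticks = 250 ms from now
// ... return and let the CPU sleep; no 128 Hz fast-tick wakeups needed ...

// Later, when Movement delivers EVENT_HPT to this face:
watch_set_buzzer_off();
movement_hpt_relenquish();
```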

This PR adds:

  • the HPT API to the Movement framework
  • two new faces that use the HPT (a high-precision chronograph face and countdown timer face)
  • emscripten simulator support
  • refactors to existing watch faces or features that previously used TC2 or TC3 (stock chrono face, dual timer face, and buzzer melody system)

If this PR is accepted, I will begin work on refactoring movement.c to use the HPT for LED control and long-press timing, which should eliminate the need for "fast ticks" mode and allow the CPU to sleep more often. (It also means we could allow faces to use the 128 Hz RTC tick mode, if we really wanted to.) Depending on power consumption, we could even consider using this timer to handle face timeouts and low-energy timeouts, though these are not critical: they do not need subsecond precision, and they won't be affected by RTC modifications, because the user changing the time would invariably involve resetting these timeouts anyway.

Some outstanding concerns:

  • Power consumption - This feature requires activating more hardware peripherals: another clock generator in GCLK and two timers, TC2 and TC3. These have the potential to consume more power while they run in the background. However, I am inclined to believe that this change will save power overall, as we will now be using dedicated hardware to keep track of timing information that was previously implemented in software. Using a hardware timer to schedule future tasks should mean fewer CPU wakeups overall and more time spent in standby mode. Ultimately, power profiling may be necessary to truly answer this question. Many faces that currently use watch_buzzer_play_note can be refactored to use the HPT buzzer system and avoid a call to delay_ms, which would save power during those beeps.
  • Overflows and edge cases - At 1024 Hz, a 32-bit counter takes about 49 days to overflow. The code in this PR should handle this situation properly, but it does mean there is some extra logic in play to detect and compensate for timer overflows that occur while critical operations are being performed. These "safety" checks mean more CPU cycles, more power consumed, and more latency for the user. The timer can be reset to zero when no face is using it, so in normal use it is only likely to overflow if someone leaves a stopwatch running in a drawer for 49 days or sets a ridiculously long countdown timer. This lengthy overflow period makes it difficult (though not impossible) to test edge cases where a face reads a timestamp or schedules a future event exactly during a timer overflow. One alternative is to reduce the tick rate from 1024 Hz to 128 Hz, in which case the overflow would occur after about 1.06 years.
  • Synchronization - As the CPU and the timers are in different "peripheral domains", registers must be synchronized between them when reading data. In practice this means sending a command to the timer to synchronize its counter value and waiting for the synchronization to complete. Without synchronization, I was encountering an issue where I would set the timer to trigger an interrupt at some counter value N, but if I read the counter inside the interrupt handler, I would get back a value of N-5. Once I added the right synchronization code, I'd get back a value around N+3, which is okay, but has other implications. The synchronization can only be done through busy waiting, and the timer "commands" are clocked using the peripheral clock, not the main clock. This means that (unless I am mistaken) reading a synchronized value from the timer takes about 3 milliseconds of busy-waiting CPU time. I have since added some code to read unsynchronized values where timestamp accuracy is not critical, but still: 3 ms is a long time for the CPU to be spinning doing nothing. Further experimentation is required. (A sketch of the synchronized read and the overflow-safe 64-bit timestamp appears after this list.)
  • Simulator accuracy - The current simulator implementation of the HPT does not simulate counter overflows, due to the difficulty of scheduling a browser setTimeout() callback 49 days in the future (a known limitation in most browsers). While not an insurmountable problem, it didn't seem like a high-priority problem to solve at the time.
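
To make the synchronization and overflow points above concrete, here is a sketch (not the PR's actual code) of a synchronized counter read and an overflow-safe 64-bit timestamp, assuming the SAM L22 vendor-header register names, the hpt_overflows counter introduced in this PR, and movement_hpt_get as a hypothetical accessor name:

```c
#include <stdint.h>
#include "sam.h"   // SAM L22 vendor headers: TC2, TC_CTRLBSET_CMD_READSYNC, ...

extern volatile uint8_t hpt_overflows;  // incremented by the overflow ISR (from this PR)

static uint32_t watch_hpt_read_synchronized(void) {
    // Ask the timer to latch COUNT into a CPU-readable register, then busy-wait
    // until the command has propagated across the clock domains. This is the
    // ~3 ms wait discussed above, because the command is clocked by the slow
    // peripheral clock rather than the CPU clock.
    TC2->COUNT32.CTRLBSET.reg = TC_CTRLBSET_CMD_READSYNC;
    while (TC2->COUNT32.SYNCBUSY.bit.CTRLB || TC2->COUNT32.SYNCBUSY.bit.COUNT);
    return TC2->COUNT32.COUNT.reg;
}

uint64_t movement_hpt_get(void) {
    // Widen to 64 bits with the software overflow count. Re-read if an overflow
    // interrupt fired in the middle of the read; this is the kind of "safety"
    // check the overflow bullet above refers to.
    uint8_t before, after;
    uint32_t count;
    do {
        before = hpt_overflows;
        count = watch_hpt_read_synchronized();
        after = hpt_overflows;
    } while (before != after);
    return ((uint64_t)after << 32) | count;
}
```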

See also #381

@zzm634 zzm634 marked this pull request as draft March 14, 2024 16:27
Collaborator

@matheusmoreira matheusmoreira left a comment


Looking forward to seeing this develop.

Comment on lines 88 to 91
// bit flags indicating which watch face wants the HPT enabled.
// TODO make sure this number width is larger than the number of faces
#define HPT_REQUESTS_T uint16_t
HPT_REQUESTS_T hpt_enable_requests;
Collaborator


How about this?

#define HPT_REQUESTS_SIZE (MOVEMENT_NUM_FACES / CHAR_BIT)

unsigned char hpt_enable_requests[HPT_REQUESTS_SIZE + 1];

Author


I only really need boolean flags, and I was trying to come up with a good way to pack them into a smaller structure to save memory. (On the other hand, I went and allocated 64 bits for a bunch of clock face timestamps that probably aren't going to use them.)

Author


Oh I see what you're doing with that. That would also work, and I think once I get this feature working, I'll do another pass and try to optimize memory usage as much as possible. (Like, timestamps really only need to be 40 bits, even in 1024 Hz mode, but packing all that stuff up is a lot of extra math.)

Collaborator


With some extra preprocessor hackery, we could get rid of that +1 as well, which is there just to prevent zero-sized array errors in case the result is zero. That would save an extra byte.
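
For context on how the suggested byte array would be driven, a minimal sketch (the helper macros below are illustrative and not part of the PR or the suggestion):

```c
#include <limits.h>  // CHAR_BIT

#define HPT_REQUESTS_SIZE (MOVEMENT_NUM_FACES / CHAR_BIT)

unsigned char hpt_enable_requests[HPT_REQUESTS_SIZE + 1];

// One bit per face: set on request, cleared on release, tested when deciding
// whether the timer peripheral can be shut off.
#define HPT_REQUEST_SET(face)    (hpt_enable_requests[(face) / CHAR_BIT] |=  (1u << ((face) % CHAR_BIT)))
#define HPT_REQUEST_CLEAR(face)  (hpt_enable_requests[(face) / CHAR_BIT] &= ~(1u << ((face) % CHAR_BIT)))
#define HPT_REQUEST_IS_SET(face) ((hpt_enable_requests[(face) / CHAR_BIT] >> ((face) % CHAR_BIT)) & 1u)
```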

Comment on lines 96 to 97
// the number of times the high-precision timer has overflowed
uint8_t hpt_overflows = 0;
Collaborator


How many times are we expecting the timer to overflow?

Author


255 overflows of a 32-bit timer running at 1024 Hz is about 34 years. Even in BACKUP mode, I'm pretty sure the battery itself will self-discharge by then.

Comment on lines 93 to 94
// timestamps at which watch faces should have a HPT event triggered. UINT64_MAX for no callback
uint64_t hpt_scheduled_events[MOVEMENT_NUM_FACES];
Collaborator


Perhaps it would be possible to store this in a watch face structure, eliminating a global variable.

Author

@zzm634 zzm634 Mar 14, 2024


I had considered that, but either way, it's going to add up to one new 64-bit timestamp per watch face. I didn't see an obvious watch face struct to add it to in movement.c, but I can't say I looked very hard.

Comment on lines 717 to 731
/** Figures out the next scheduled event for the HPT to wake up for.
* Must be called whenever:
* - a new event is scheduled
* - an old event is deleted
* - an event is triggered
* - the timer overflows
*/
static void _movement_hpt_schedule_next_event() {
    uint64_t next = UINT64_MAX;
    for(uint8_t req_idx = 0; req_idx < MOVEMENT_NUM_FACES; ++req_idx) {
        uint64_t event = hpt_scheduled_events[req_idx];
        if(event < next) {
            next = event; // keep the earliest scheduled timestamp
        }
    }
Collaborator


It's critical that we make this function as efficient as possible. Hopefully we can eliminate this O(n) loop. What's your take on radix timers?

Author


I don't think it's fair to call this O(n), because MOVEMENT_NUM_FACES is a constant. Most people are probably going to have, like, 12 faces max on their watch, and even if they go hog wild, the most they can have is 256. Compilers are very good at optimizing simple loops like this; it will be fine.

Radix timers seem like a really "clever" solution to this very simple problem.

Comment on lines 752 to 755
void movement_hpt_relenquish(void) {
    movement_hpt_relenquish_face(movement_state.current_face_idx);
}
void movement_hpt_relenquish_face(uint8_t face_idx) {
Collaborator


Small correction: relinquish.

Author


I thought I spelled that wrong. I'm gonna rename it to "release" anyway, because "relinquish" is too fancy.

Comment on lines 367 to 372
void movement_hpt_relenquish(void);
/**
* A variant of "movement_hpt_relenquish" that can be used when your face is
* not running in the foreground.
*/
void movement_hpt_relenquish_face(uint8_t face_idx);
Collaborator


Small correction: relinquish.

* Faces may schedule an EVENT_HPT event to occur by calling
* "movement_hpt_schedule" and passing in a timestamp for the event to occur.
* The face must call "movement_hpt_request" before scheduling the event, and
* must not call "movement_hpt_relenquish" until after the event has occurred.
Collaborator


Why not? It won't be possible to cancel a running timer? What if the user cancels an alarm?

Author


Oh I mean, they can call "relenquish" if they want to, but it means their scheduled event is going to be cancelled, and some devs might be surprised by that.

Comment on lines 374 to 378
/**
* Schedules a future EVENT_HPT event to occur on or after the given timestamp.
* The Movement framework will do its best to trigger the event as close to the
* timestamp as possible, but it may be delayed if multiple faces or events are
* scheduled to occur on the same timestamp.
Collaborator


This implies timers are guaranteed to be handled, which means either real time code or a queue of pending timer events. Do you mean to guarantee this, and if so by which method?

I assume you meant real time constraints, as implied by your concerns in discussion #381:

If interrupt handling takes too long, it's possible for the timer value to change in the background while we're handling scheduled events. If events are scheduled too close together or too close in the future, it might be possible that by the time we set the timestamp value for the new interrupt, the counter has already passed it and the interrupt won't trigger. Now, if the counter is running at 128hz and the cpu is running at 4mhz, that gives us about 31,000 clock cycles to finish up whatever we were doing and check the timer again. Plenty of time... unless a face calls _delay_ms() or has to wait to activate a peripheral. Not a deal-breaking problem, but something we need to be aware of.

However, this comment was written before you updated your timer code to 1024 Hz. The deadlines will be much tighter.

Author


I think the Movement framework needs to guarantee these events, but I haven't figured out exactly how to do it. It might be messy.

Collaborator


We could check the nearest event to see if its timestamp is in the past. If so, that would mean its deadline was missed. We move forward in the event queue, processing them all in sequence until we find one with a timestamp in the future.

An efficient, hopefully constant time "next event" function would make it feasible. We'd be able to guarantee it would finish before the next timer window.
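
A minimal sketch of that catch-up loop, assuming hypothetical helpers for the current timestamp, the earliest scheduled deadline, dispatch, and arming the compare channel (none of these names are from the PR):

```c
// Run after any wakeup (timer compare match or otherwise); illustrative only.
uint64_t now  = movement_hpt_get();           // assumed current-timestamp accessor
uint64_t next = hpt_next_scheduled_event();   // hypothetical: earliest pending deadline

// Fire everything whose deadline has already passed, then arm the compare
// channel for the first event that is still in the future.
while (next != UINT64_MAX && next <= now) {
    hpt_dispatch_event(next);                 // hypothetical: deliver EVENT_HPT to its face
    now  = movement_hpt_get();                // time keeps moving while we dispatch
    next = hpt_next_scheduled_event();
}
if (next != UINT64_MAX) {
    hpt_arm_compare(next);                    // hypothetical: program the TC compare register
}
```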

Comment on lines 10 to 13
/**
* The callback is being invoked because of an error in the comparison (?)
*/
bool cmp_error :1;
Collaborator


What conditions could possibly cause this?

Author


Not sure, but it was something the datasheet called out as a possible interrupt source. Unless I read the datasheet wrong, the timer only has a single interrupt vector and you're supposed to check the interrupt status register to see what caused it.

Bit 1 – ERR: Error Interrupt Flag
This flag is set when a new capture occurs on a channel while the corresponding Match or Capture Channel x interrupt flag is set, in which case there is nowhere to store the new capture.

It's entirely irrelevant to the HPT though, so maybe I should get rid of the flag in this struct.
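
For reference, the single-vector dispatch being described looks roughly like this (a sketch assuming the SAM L22 INTFLAG register and macro names; not the PR's actual handler):

```c
extern volatile uint8_t hpt_overflows;  // overflow counter from this PR

void TC2_Handler(void) {
    uint8_t flags = TC2->COUNT32.INTFLAG.reg;
    if (flags & TC_INTFLAG_OVF) {
        hpt_overflows++;                            // the 32-bit counter rolled over
        TC2->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;  // write 1 to clear
    }
    if (flags & TC_INTFLAG_MC0) {
        // compare match: a scheduled HPT event came due
        TC2->COUNT32.INTFLAG.reg = TC_INTFLAG_MC0;
    }
    // TC_INTFLAG_ERR only matters for capture use, as discussed above,
    // so it can simply be cleared or ignored here.
}
```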

Collaborator


Match or capture? This has something to do with using these timers to generate waveforms. I don't think we'll be using that functionality. It could be safe to ignore that possibility.

Still some questions left in the code, and I had to disable the stock stopwatch and dual timer faces. Once I am sure this works properly, I'll update those faces to use it.
@zzm634
Author

zzm634 commented Mar 16, 2024

Just noticed that watch_buzzer.c uses TC3 for playing note sequences. Yet another thing for HPT to support I guess.

- Removed some debugging cruft and commented out code
- HPT requests are stored using a byte array instead of a single variable (allows for more than 64 HPT requests)
- Added "fast" mode to counter reads that skips the synchronization step
- Fixes and improvements to HPT timing faces.
Also refactors movement.c to use HPT buzzing for alarms, signals, and other short beeps.

Adds another feature to buzzer sequences: skip-aheads. (Not used by anything yet, but repeater faces might need it.)

Also enables TCC run-in-standby always. TCC is enabled on demand anyway; might as well let it run in standby.

Eventually, I would like to remove `watch_buzzer_play_note` because it uses delay_ms, but a lot of faces use it right now and they can be refactored later.
_movement_hpt_schedule_next_event();
if (canSleep)
{
// TODO put cpu to sleep?
Author


Could use some input from other devs on how to put the CPU back to sleep here. Or other feedback on running code in this ISR vs running it in the main app loop.

Author


This will all be reverted when this is ready for merge.

Author


this face is likely to be deleted before merge as well

/*
* MIT License
*
* Copyright (c) 2024 <#author_name#>
Author


fix author name

Author


needs copyright header

Author


copyright header

@wryun
Contributor

wryun commented Jul 21, 2024

Well, I'm a fan of higher-precision timing, and it all sounds good. But it's a little scary!

@joeycastillo - do you think this is a good approach?

@joeycastillo
Owner

I think I need to reserve an opinion on this until I can run power profiling on it, but I do like the idea of a shared high-precision timer that everything that needs high-precision timing can share, rather than having individual watch faces make use of TC peripherals. I'm also curious about the choice to use a 32-bit counter (which ties up two TCs) instead of a 16-bit counter. Using a 16-bit counter would mean overflowing every minute or so, but it would make handling overflows a matter of course instead of a thing that happens once in a blue moon. Especially since high-precision timing mostly relates to things happening in the near future.

@wryun your pinging me on this comes at an interesting moment too, because I'm currently doing a pretty big refactor of Movement in the background, tearing out a lot of low-level technical debt and trying to free up resources for forthcoming hardware. In particular, my refactor should free up TC0 and TC1, which I think we might need in order to free up TC2 and TC3 for use by the accelerometer. Anyway, I'm open to considering this for potential inclusion in the forthcoming second Movement firmware, again pending a more thorough power consumption analysis and a deeper review of the code.

@wryun
Contributor

wryun commented Jul 22, 2024

Thanks - happy to hear it's on your radar.

I guess the main advantage of the 32-bit counter from @zzm634's perspective is that one might be able to do away with some of the overflow handling:

The code in this PR should handle this situation properly, but it does mean there is some extra logic in play to detect and compensate for timer overflows that occur while critical operations are being performed. These "safety" checks mean more CPU cycles, more power consumed, more latency for the user.

i.e. rather than having this complexity, one could just schedule an event shortly before the overflow every time the timer is started (EVENT_HPT_MAX_TIME or something) and document that code has to handle it by aborting the timer action. It's unlikely one needs a 49+ day precision timing event, and the extra overflow handling might have an impact on battery life (and on EVENT_HPT accuracy, in that without it you might be able to get the event to user code faster).

But given that those checks are in there at the moment, I would also prefer a 16-bit counter, handling the overflow as a matter of course.
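
For illustration, the usual trick with a free-running 16-bit counter: unsigned wraparound arithmetic makes short intervals overflow-safe for free, so only intervals longer than 64 seconds (at 1,024 Hz) need an explicit software overflow count:

```c
#include <stdint.h>

// Elapsed ticks between two raw 16-bit counter reads. The subtraction wraps
// modulo 2^16, so the result is correct across an overflow as long as fewer
// than 65,536 ticks (64 s at 1,024 Hz) separate the two reads.
static inline uint16_t hpt16_elapsed(uint16_t start, uint16_t now) {
    return (uint16_t)(now - start);
}
```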


One of the interesting things about this PR as well is that (from a brief look) none of the existing faces have been modified to use EVENT_HPT, because they only really need an accurate time when they're reacting to another event. I'm not sure there's actually a clear use case for EVENT_HPT; it would be enough to be able to easily get the difference between timestamps when something else happens (i.e. a button press).
