-
Notifications
You must be signed in to change notification settings - Fork 19
/
Copy pathindex.bs
335 lines (258 loc) · 16.9 KB
/
index.bs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
<pre class='metadata'>
Title: HTMLVideoElement.requestVideoFrameCallback()
Repository: wicg/video-rvfc
Status: CG-DRAFT
ED: https://wicg.github.io/video-rvfc/
Shortname: video-rvfc
Level: 1
Group: wicg
Editor: Thomas Guilbert, w3cid 120583, Google Inc. https://google.com/
Abstract: <video>.requestVideoFrameCallback() allows web authors to be notified when a frame has been presented for composition.
!Participate: <a href="https://github.com/wicg/video-rvfc">Git Repository.</a>
!Participate: <a href="https://github.com/wicg/video-rvfc/issues/new">File an issue.</a>
!Version History: <a href="https://github.com/wicg/video-rvfc/commits">https://github.com/wicg/video-rvfc/commits</a>
Indent: 2
Markup Shorthands: markdown yes
</pre>
<pre class='anchors'>
spec: hr-timing; urlPrefix: https://w3c.github.io/hr-time/
type: dfn
for: Clock resolution; text: clock resolution; url: #clock-resolution
spec: html; urlPrefix: https://html.spec.whatwg.org/multipage/imagebitmap-and-animations.html
type: dfn
text: run the animation frame callbacks; url: #run-the-animation-frame-callbacks
type:attribute; for:CanvasRenderingContext2D; text:canvas
spec: css-values; urlPrefix: https://drafts.csswg.org/css-values/
type: dfn
text: CSS pixels; url: #px
spec: media-capabilities; urlPrefix: https://w3c.github.io/media-capabilities
type: dictionary
text: MediaCapabilitiesInfo; url: #dictdef-mediacapabilitiesinfo
spec: media-playback-quality; urlPrefix: https://w3c.github.io/media-playback-quality/
type: attribute
for: VideoPlaybackQuality; text: droppedVideoFrames; url: #dom-videoplaybackquality-droppedvideoframes
</pre>
# Introduction # {#introduction}
*This section is non-normative*
This is a proposal to add a {{requestVideoFrameCallback()}} method to the {{HTMLVideoElement}}.
This method allows web authors to register a {{VideoFrameRequestCallback|callback}} which runs in the
[=update the rendering|rendering steps=], when a new video frame is sent to the compositor. The
new {{VideoFrameRequestCallback|callbacks}} are executed immediately before existing
{{AnimationFrameProvider|window.requestAnimationFrame()}} callbacks. Changes made from within both
callback types within the same [=event loop processing model|turn of the event loop=] will be visible on
screen at the same time, with the next v-sync.
Drawing operations (e.g. drawing a video frame to a {{canvas}} via {{drawImage()}}) made through this
API will be synchronized as a *best effort* with the video playing on screen. *Best effort* in this case
means that, even with a normal work load, a {{VideoFrameRequestCallback|callback}} can occasionally be
fired one v-sync late, relative to when the new video frame was presented. This means that drawing
operations might occasionally appear on screen one v-sync after the video frame does. Additionally, if
there is a heavy load on the main thread, we might not get a callback for every frame (as measured by a
discontinuity in the {{presentedFrames}}).
Note: A web author could know if a callback is late by checking whether {{expectedDisplayTime}} is equal
to *now*, as opposed to roughly one v-sync in the future.
The {{VideoFrameRequestCallback}} also provides useful {{VideoFrameCallbackMetadata|metadata}} about the video
frame that was most recently presented for composition, which can be used for automated metrics analysis.
# VideoFrameCallbackMetadata # {#video-frame-callback-metadata}
<pre class='idl'>
dictionary VideoFrameCallbackMetadata {
required DOMHighResTimeStamp presentationTime;
required DOMHighResTimeStamp expectedDisplayTime;
required unsigned long width;
required unsigned long height;
required double mediaTime;
required unsigned long presentedFrames;
double processingDuration;
DOMHighResTimeStamp captureTime;
DOMHighResTimeStamp receiveTime;
unsigned long rtpTimestamp;
};
</pre>
## Definitions ## {#video-frame-callback-metadata-definitions}
<dfn>media pixels</dfn> are defined as a media resource's visible decoded pixels, without pixel aspect
ratio adjustments. They are different from [=CSS pixels=], which account for pixel aspect ratio
adjustments.
## Attributes ## {#video-frame-callback-metadata-attributes}
: <dfn for="VideoFrameCallbackMetadata" dict-member>presentationTime</dfn>
:: The time at which the user agent submitted the frame for composition.
: <dfn for="VideoFrameCallbackMetadata" dict-member>expectedDisplayTime</dfn>
:: The time at which the user agent expects the frame to be visible.
: <dfn for="VideoFrameCallbackMetadata" dict-member>width</dfn>
:: The width of the video frame, in [=media pixels=].
: <dfn for="VideoFrameCallbackMetadata" dict-member>height</dfn>
:: The height of the video frame, in [=media pixels=].
Note: {{width}} and {{height}} might differ from {{HTMLVideoElement/videoWidth|videoWidth}} and
{{HTMLVideoElement/videoHeight|videoHeight}} in certain cases (e.g, an anamorphic video might
have rectangular pixels). When a calling
<a href="https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/texImage2D">`texImage2D()`</a>,
{{width}} and {{height}} are the dimensions used to copy the video's [=media pixels=] to the texture,
while {{HTMLVideoElement/videoWidth|videoWidth}} and {{HTMLVideoElement/videoHeight|videoHeight}} can
be used to determine the aspect ratio to use, when using the texture.
: <dfn for="VideoFrameCallbackMetadata" dict-member>mediaTime</dfn>
:: The media presentation timestamp (PTS) in seconds of the frame presented (e.g.
its timestamp on the {{HTMLMediaElement/currentTime|video.currentTime}} timeline).
MAY have a zero value for live-streams or WebRTC applications.
: <dfn for="VideoFrameCallbackMetadata" dict-member>presentedFrames</dfn>
:: A count of the number of frames submitted for composition. Allows clients
to determine if frames were missed between {{VideoFrameRequestCallback}}s. MUST be monotonically
increasing.
: <dfn for="VideoFrameCallbackMetadata" dict-member>processingDuration</dfn>
:: The elapsed duration in seconds from submission of the encoded packet with the same
presentation timestamp (PTS) as this frame (e.g. same as the {{mediaTime}}) to the decoder
until the decoded frame was ready for presentation.
:: In addition to decoding time, may include processing time. E.g., YUV
conversion and/or staging into GPU backed memory.
:: SHOULD be present. In some cases, user-agents might not be able to surface this information since
portions of the media pipeline might be owned by the OS.
: <dfn for="VideoFrameCallbackMetadata" dict-member>captureTime</dfn>
:: For video frames coming from a local source, this is the time at which
the frame was captured by the camera.
For video frames coming from remote source, the capture time is based on
the RTP timestamp of the frame and estimated using clock synchronization.
This is best effort and can use methods like using RTCP SR as specified
in RFC 3550 Section 6.4.1, or by other alternative means if use by
RTCP SR isn't feasible.
:: SHOULD be present for WebRTC or getUserMedia applications, and absent otherwise.
: <dfn for="VideoFrameCallbackMetadata" dict-member>receiveTime</dfn>
:: For video frames coming from a remote source, this is the
time the encoded frame was received by the platform, i.e., the time at
which the last packet belonging to this frame was received over the network.
:: SHOULD be present for WebRTC applications that receive data from a remote source,
and absent otherwise.
: <dfn for="VideoFrameCallbackMetadata" dict-member>rtpTimestamp</dfn>
:: The RTP timestamp associated with this video frame.
:: SHOULD be present for WebRTC applications that receive data from a remote source,
and absent otherwise.
# VideoFrameRequestCallback # {#video-frame-request-callback}
<pre class='idl'>
callback VideoFrameRequestCallback = undefined(DOMHighResTimeStamp now, VideoFrameCallbackMetadata metadata);
</pre>
Each {{VideoFrameRequestCallback}} object has a <dfn>canceled</dfn> boolean initially set to false.
# HTMLVideoElement.requestVideoFrameCallback() # {#video-rvfc}
<pre class='idl'>
partial interface HTMLVideoElement {
unsigned long requestVideoFrameCallback(VideoFrameRequestCallback callback);
undefined cancelVideoFrameCallback(unsigned long handle);
};
</pre>
## Methods ## {#video-rvfc-methods}
Each {{HTMLVideoElement}} has a <dfn>list of video frame request callbacks</dfn>, which is initially
empty. It also has a <dfn>last presented frame identifier</dfn> and a <dfn>video frame request
callback identifier</dfn>, which are both numbers which are initially zero.
: <dfn for="HTMLVideoElement" method>requestVideoFrameCallback(|callback|)</dfn>
:: Registers a callback to be fired the next time a frame is presented to the compositor.
When `requestVideoFrameCallback` is called, the user agent MUST run the following steps:
1. Let |video| be the {{HTMLVideoElement}} on which `requestVideoFrameCallback` is
invoked.
1. Increment |video|'s {{ownerDocument}}'s [=video frame request callback identifier=] by one.
1. Let |callbackId| be |video|'s {{ownerDocument}}'s [=video frame request callback identifier=]
1. Append |callback| to |video|'s [=list of video frame request callbacks=], associated with |callbackId|.
1. Return |callbackId|.
: <dfn for="HTMLVideoElement" method>cancelVideoFrameCallback(|handle|)</dfn>
:: Cancels an existing video frame request callback given its handle.
When `cancelVideoFrameCallback` is called, the user agent MUST run the following steps:
1. Let |video| be the target {{HTMLVideoElement}} object on which `cancelVideoFrameCallback` is invoked.
1. Find the entry in |video|'s [=list of video frame request callbacks=] that is associated with the value |handle|.
1. If there is such an entry, set its [=canceled=] boolean to <code>true</code> and remove it from |video|'s [=list of video frame request callbacks=].
## Procedures ## {#video-rvfc-procedures}
An {{HTMLVideoElement}} is considered to be an <dfn>associated video element</dfn> of a {{Document}}
|doc| if its {{ownerDocument}} attribute is the same as |doc|.
<div algorithm="video-rvfc-rendering-step">
Issue: This spec should eventually be merged into the HTML spec, and we should directly call [=run the
video frame request callbacks=] from the [=update the rendering=] steps. This procedure describes
where and how to invoke the algorithm in the meantime.
When the [=update the rendering=] algorithm is invoked, run this new step:
+ For each [=fully active=] {{Document}} in |docs|, for each [=associated video element=] for that
{{Document}}, [=run the video frame request callbacks=] passing |now| as the timestamp.
immediately before this existing step:
+ "<i>For each [=fully active=] {{Document}} in |docs|, [=run the animation frame callbacks=] for that {{Document}}, passing in |now| as the timestamp</i>"
using the definitions for |docs| and |now| described in the [=update the rendering=] algorithm.
</div>
<div algorithm="run the video frame request callbacks">
Note: The effective rate at which {{VideoFrameRequestCallback|callbacks}} are run is the lesser rate
between the video's rate and the browser's rate. When the video rate is lower than the browser rate,
the {{VideoFrameRequestCallback|callbacks}}' rate is limited by the frequency at which new frames are
presented. When the video rate is greater than the browser rate, the
{{VideoFrameRequestCallback|callbacks}}' rate is limited by the frequency of the [=update the
rendering=] steps. This means, a 25fps video playing in a browser that paints at 60Hz would fire
callbacks at 25Hz; a 120fps video in that same 60Hz browser would fire callbacks at 60Hz.
To <dfn>run the video frame request callbacks</dfn> for a {{HTMLVideoElement}} |video| with a timestamp |now|, run the following steps:
1. If |video|'s [=list of video frame request callbacks=] is empty, abort these steps.
1. Let |metadata| be the {{VideoFrameCallbackMetadata}} dictionary built from |video|'s latest presented frame.
1. Let |presentedFrames| be the value of |metadata|'s {{presentedFrames}} field.
1. If the [=last presented frame identifier=] is equal to |presentedFrames|, abort these steps.
1. Set the [=last presented frame identifier=] to |presentedFrames|.
1. Let |callbacks| be the [=list of video frame request callbacks=].
1. Set |video|'s [=list of video frame request callbacks=] to be empty.
1. [=list/For each=] |entry| of |callbacks|:
1. If |entry|'s [=canceled=] boolean is <code>true</code>, continue.
1. [=Invoke=] |entry| with « |now|, |metadata| » and "`report`".
Note: There are **no strict timing guarantees** when it comes to how soon
{{VideoFrameRequestCallback|callbacks}} are run after a new video frame has been presented.
Consider the following scenario: a new frame is presented on the compositor thread, just as the user
agent aborts the [=run the video frame request callbacks|algorithm=] above, when it confirms that
there are no new frames. We therefore won't run the {{VideoFrameRequestCallback|callbacks}} in the
*current* [=update the rendering|rendering steps=], and have to wait until the *next* [=update the
rendering|rendering steps=], one v-sync later. In that case, visual changes to a web page made from
within the delayed {{VideoFrameRequestCallback|callbacks}} will appear on-screen one v-sync after the
video frame does.<br/>
<br/>
Offering stricter guarantees would likely force implementers to add cross-thread synchronization, which might be detrimental to video playback performance.
</div>
# Security and Privacy Considerations # {#security-and-privacy}
This specification does not expose any new privacy-sensitive information. However, the location
correlation opportunities outlined in the Privacy and Security section of [[webrtc-stats]] also hold
true for this spec: {{captureTime}}, {{receiveTime}}, and {{rtpTimestamp}} expose network-layer
information which can be correlated to location information. E.g., reusing the same example,
{{captureTime}} and {{receiveTime}} can be used to estimate network end-to-end travel time, which can
give indication as to how far the peers are located, and can give some location information about a peer
if the location of the other peer is known. Since this information is already available via the
[[webrtc-stats|RTCStats]], this specification doesn't introduce any novel privacy considerations.
This specification might introduce some new GPU fingerprinting opportunities. {{processingDuration}}
exposes some under-the-hood performance information about the video pipeline, which is otherwise
inaccessible to web developers. Using this information, one could correlate the performance of various
codecs and video sizes to a known GPU's profile. We therefore propose a resolution of 100μs, which is
still useful for automated quality analysis, but doesn't offer any new sources of high resolution
information. Still, despite a coarse clock, one could exploit the significant performance differences
between hardware and software decoders to infer information about a GPU's features. For example, this
would make it easier to fingerprint the newest GPUs, which have hardware decoders for the latest
codecs, which don't yet have widespread hardware decoding support. However, rather than measuring the
profiles themselves, one could directly get equivalent information from getting the
{{MediaCapabilitiesInfo}}.
This specification also introduces some new timing information. {{presentationTime}} and
{{expectedDisplayTime}} expose compositor timing information; {{captureTime}} and
{{receiveTime}} expose network timing information. The [=clock resolution=] of these fields should
therefore be coarse enough not to facilitate timing attacks.
# Examples # {#examples}
## Drawing frames at the video rate ## {#example-drawing}
*This section is non-normative*
Drawing video frames onto a {{canvas}} at the video rate (instead of the browser's animation rate)
can be done by using {{requestVideoFrameCallback()|video.requestVideoFrameCallback()}} instead of
{{AnimationFrameProvider|window.requestAnimationFrame()}}.
<pre class="example" highlight="js">
<body>
<video controls></video>
<canvas width="640" height="360"></canvas>
<span id="fps_text"/>
</body>
<script>
function startDrawing() {
var video = document.querySelector('video');
var canvas = document.querySelector('canvas');
var ctx = canvas.getContext('2d');
var paint_count = 0;
var start_time = 0.0;
var updateCanvas = function(now) {
if(start_time == 0.0)
start_time = now;
ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
var elapsed = (now - start_time) / 1000.0;
var fps = (++paint_count / elapsed).toFixed(3);
document.querySelector('#fps_text').innerText = 'video fps: ' + fps;
video.requestVideoFrameCallback(updateCanvas);
}
video.requestVideoFrameCallback(updateCanvas);
video.src = "http://example.com/foo.webm"
video.play()
}
</script>
</pre>