
Access to a different output device: AudioContext.setSinkId() #2400

Closed · cwilso opened this issue Nov 21, 2014 · 130 comments · Fixed by #2498

Labels: Needs Edits (decision has been made, the issue can be fixed) · Priority: Urgent (WG charter deliverables; "need to have")

Comments

@cwilso
Contributor

cwilso commented Nov 21, 2014

Should be able to specify different audio devices, using media device selectors.

@cwilso cwilso self-assigned this Dec 3, 2014
@hoch
Member

hoch commented Feb 6, 2015

http://w3c.github.io/mediacapture-output/#h-webaudio-extensions

Just to keep the reference here. The spec enumerates several options for implementation with pros/cons.

@bill-hofmann
Contributor

What's missing is any information about the characteristics of output devices (channel count, etc.) - this is a getUserMedia issue, but it's essential to solve.

@joeberkovitz
Contributor

This is currently in Media Capture Task Force's court; we are monitoring their progress.

@cwilso
Contributor Author

cwilso commented May 12, 2015

I don't think MCTF is just gonna fix our problem, though - and we need to provide a constructor that takes a different device.

@joeberkovitz
Contributor

@cwilso If MCTF provides a way to obtain a MediaStream for some desired device (which is what we asked for, and their updated API seems well on the way to giving it to us) then doesn't AudioContext::createMediaStreamDestination() provide a way to use that device as a destination?

@cwilso
Contributor Author

cwilso commented May 12, 2015

Sure. But that would absolutely be a HORRIBLE way to connect - because you want the AudioContext to run at the rate and clock of the device, not at some arbitrary other clock and have to be coerced to that device. MediaStream device in/out implies a potential for clock conversion.

@joeberkovitz
Contributor

@cwilso Thanks for explaining - I had not been aware of that very significant point (I wonder if others were?).

Perhaps something as simple as a class method on AudioContext would fill the bill, e.g. ctx = AudioContext.createMediaStreamContext(stream). This would not lock us into an optional constructor arg that we'd have to live with forever. I'm not sure whether this sort of static pseudo-constructor is cool in Web APIs.

Do you have a specific proposal in mind for us to discuss on Thursday?

@cwilso
Contributor Author

cwilso commented May 12, 2015

Yes, it's what I gave Justin to put in the MCTF spec:

3.1.1 Constructor argument
Option 1: AudioContext constructor argument
The sink ID is passed as an argument to the AudioContext constructor, e.g.

new AudioContext({ sinkId: requestedSinkId });

Requiring the sink ID to be set at construction time simplifies the implementation, since the output sample rate is fixed.
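
A minimal sketch of how Option 1 might be used from script, assuming the sink ID comes from MediaDevices.enumerateDevices() (the { sinkId } constructor option is only the proposal under discussion here, not shipped API):

async function createContextForFirstOutput() {
  const devices = await navigator.mediaDevices.enumerateDevices();
  // An output device's deviceId doubles as the requested sink ID.
  const output = devices.find((device) => device.kind === 'audiooutput');
  // Hypothetical: fixing the sink at construction time pins the output sample rate.
  return new AudioContext({ sinkId: output.deviceId });
}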

@joeberkovitz
Contributor

I see now, it's in the document that @hoch referenced above. Sorry for the thrash here. Yup, that all looks great.

@joeberkovitz
Contributor

So... apart from MCTF needing to approve, is there a reason this is not just "Ready for Editing"?

@joeberkovitz
Contributor

@cwilso So that I can make sure MCTF is focusing on the right stuff: exactly what elements of the Audio Output API other than the Web Audio extensions must exist for V1, from your point of view? Since a sinkId is just a deviceID from enumerateDevices() (which itself is not part of the Audio Output API), do we really need to make the whole Audio Output API proposal a dependency for V1 Web Audio?

@cwilso
Contributor Author

cwilso commented May 12, 2015

I think we (Web Audio) are responsible for making sure AO API and WA API work together to define bedrock (e.g. audio device access, audio bits access a la issue #359) vs. layers on top of that bedrock (e.g. biquadfilter). If we don't work through and make sense of these architectural layers now, it will never make sense.

@joeberkovitz
Contributor

Just to confirm: you are saying we shouldn't implement the constructor until the whole AO API is accepted by MCTF?

@cwilso
Contributor Author

cwilso commented May 12, 2015

No, that's not quite what I'm saying. I'm saying we shouldn't ship until the model for how access to devices - the bedrock - works is settled. We should be able to work through and prove how the Audio Output API is built on top of Web Audio, which is built on top of direct device access alongside getUserMedia (which provides device enumeration). The Audio Output API is actually a semantic layer on top of an implementation - we just need to prove that we could implement it (through device enumeration from gUM and redirecting of AudioContexts).

@joeberkovitz
Contributor

OK -- can you walk us through this on tomorrow's call? Let's discuss what "prove" means, in particular.

@cwilso
Contributor Author

cwilso commented May 13, 2015

Sure


@joeberkovitz
Contributor

This issue is Ready for Editing w/r/t the AudioContext constructor as described in the Audio Output API proposal. However, we still need to wait for the MCTF response re the ability to enumerate devices with an awareness of sample rate, latency, number of channels, etc.

@joeberkovitz joeberkovitz assigned joeberkovitz and unassigned cwilso Jun 1, 2015
@joeberkovitz
Contributor

We should also ask MCTF about Permissions API with respect to acquiring permission to access or enumerate devices.

@jasonmcaffee

Is there any workaround to get this functionality with AudioContext? With the appropriate flags enabled, I'm able to get the list of audio devices via navigator.mediaDevices.enumerateDevices(), but I'm at a loss as to how to set the output to a given device id. It would be a super useful feature.

@jasonmcaffee

I found a workaround, but I'm not sure how well it works yet. There seem to be pops and glitches even with a single sine oscillator.
https://jsfiddle.net/2k7gkdqw/1/

Basically, the Audio element has a setSinkId() method for setting the appropriate output device.
The sinkId can be obtained, after gaining audio permissions, via navigator.mediaDevices.enumerateDevices().

With the audio element set up to send to the appropriate output, you can stream from the AudioContext to the audio element by creating a MediaStreamAudioDestinationNode and passing its stream property to the audio element.

e.g.

var c = new AudioContext();
var o = c.createOscillator();
var m = c.createMediaStreamDestination();
o.connect(m);
var audioEl = new Audio();
audioEl.srcObject = m.stream; // srcObject replaces the now-removed URL.createObjectURL(m.stream)
audioEl.setSinkId('idFromEnumerateDevicesItem') // a deviceId from enumerateDevices()
  .then(function () { audioEl.play(); });       // setSinkId() returns a promise
o.start();

Not sure how well this will work when dealing with several oscillators, effects, etc. yet, but it appears to be somewhat functional when the appropriate flags are set.

UPDATE: I plugged this behavior into my synthesizer.
There are quite a few pops and clicks, especially for the first 30-60 seconds of playing. There are periods where pops and clicks don't occur, but they seem to recur if the sound is complicated (lots of notes and/or lots of oscillators).
Another interesting behavior is that there are periods where the sound is detuned.

@petkaantonov

after gaining audio permissions

Btw, the permission dialog is: "site wants to use your microphone", which is not going to get accepted if the user has just gone to, e.g., the "output device settings" section of an app. Something should probably be done about that, so that an app that simply wants to let the user choose between audio output devices can do so without having to deal with creepy microphone permissions.

@jan-ivar

You can get the deviceId of the output device without permission. Permission is only needed for the label.
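
A small sketch of this behavior: before any permission grant, enumerateDevices() still lists audiooutput entries with a deviceId, but their label comes back empty.

navigator.mediaDevices.enumerateDevices().then((devices) => {
  const outputs = devices.filter((device) => device.kind === 'audiooutput');
  for (const output of outputs) {
    // Without a prior permission grant, output.label is typically the empty string.
    console.log(output.deviceId, output.label || '(label hidden until permission is granted)');
  }
});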

@petkaantonov

I cannot imagine a use case where the label wouldn't be needed, other than malicious ones.

@jan-ivar

output 1, output 2.

@petkaantonov

It's unacceptable to present that to a normal user, who will just think the app is cheap/unfinished.

@jan-ivar

We are talking about audio people, right (didn't they invent output 1 and output 2)?

Seriously, though. What are you suggesting? That output device labels be in the clear?

@petkaantonov

I implied there should be a separate permission for seeing audio output labels, not "wants to use your microphone", which is creepy as hell when the app has no reason for it.

Applications that play sound are not for "audio people" only. Even if they were, that doesn't change the feeling of low quality and cheapness when an app cannot even get your devices right while all other apps can.


@hoch
Member

hoch commented Aug 29, 2022

More details on a solution that doesn't seem clean or practical:

setSinkId((DOMString or AudioContextOptions) sinkId);

Where AudioContextOptions has:

dictionary AudioContextOptions {
  (AudioContextLatencyCategory or double) latencyHint = "interactive";
  float sampleRate;
  (DOMString or AudioContextSinkOptions) sinkId;
};

dictionary AudioContextSinkOptions {
  boolean useSilentSink;
};

This way, we can change the latency hint and the sample rate when we change the sink.
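
For illustration, a hypothetical call under this shape (this is only the option being weighed here, not adopted API), where changing the sink also re-specifies the rendering parameters:

// Switch sinks and update latency hint and sample rate in one call.
audioContext.setSinkId({
  sinkId: 'device-unique-id',   // a deviceId from enumerateDevices()
  latencyHint: 'playback',
  sampleRate: 48000,
});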

@chrisguttandin
Contributor

Could the following be expressed in WebIDL?

new AudioContext() // uses the default device since sinkId is not defined

new AudioContext({ sinkId: null }) // uses no output device since sinkId is set to null
// or
const audioContext = new AudioContext();
audioContext.setSinkId(null);

new AudioContext({ sinkId: 'abcd' }) // uses the device with the sinkId called 'abcd'
// or
const audioContext = new AudioContext();
audioContext.setSinkId('abcd');

If I recall correctly any member of a dictionary is nullable by default. In that case it would just be a dictionary in WebIDL.

dictionary AudioContextOptions {
    DOMString sinkId;
};

Another option could be to use false instead of 'none' to select no output device.

new AudioContext({ sinkId: false })
// or
const audioContext = new AudioContext();
audioContext.setSinkId(false);

I'm not sure though if it is possible to define a union of a DOMString with a boolean in WebIDL.

@hoch
Member

hoch commented Aug 29, 2022

(DOMString or boolean) should be possible, but it's not descriptive enough.

Using null is an interesting idea, and then I believe it becomes a union of (DOMString and object). It should be doable but leaving the second field wide open to object doesn't feel great either.

I don't have a strong opinion, but the FooBarOptions approach is generally considered future-proof.

@bicknellr

Have you all considered putting this functionality on AudioDestinationNode instead? That way you'd be able to route different streams from the same graph to different outputs. This would be useful if you want to build something that supports separate master and monitor outputs when, for example, preparing upcoming tracks while DJing.

@hoch
Member

hoch commented Aug 30, 2022

There's a 1:1 association between AudioContext and AudioDestinationNode. Having multiple AudioDestinationNodes is an idea, but I am not sure we want to pursue it. Multiple devices mean that the system needs to handle sample rate and callback buffer differences across them.

Also - the multi-routing is already possible with multiple instances of MediaStreamAudioDestinationNode -(MediaStream)-> AudioElement. You'll lose sample-accurate synchronization between devices, but that's expected without device aggregation and an intermediary layer.
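
A sketch of that multi-routing pattern, assuming HTMLMediaElement.setSinkId() is available and that masterDeviceId and monitorDeviceId are placeholders for IDs obtained from enumerateDevices():

const context = new AudioContext();
const master = context.createMediaStreamDestination();
const monitor = context.createMediaStreamDestination();

const source = context.createOscillator();
source.connect(master);   // one graph branch per output device
source.connect(monitor);

const masterEl = new Audio();
masterEl.srcObject = master.stream;
const monitorEl = new Audio();
monitorEl.srcObject = monitor.stream;

await masterEl.setSinkId(masterDeviceId);    // placeholder device IDs
await monitorEl.setSinkId(monitorDeviceId);
masterEl.play();
monitorEl.play();
source.start();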

@hoch
Member

hoch commented Aug 31, 2022

This idea was also proposed by Chrome engineers:

audioContext.setSinkId("default");
audioContext.setSinkId("device-unique-id");
audioContext.setSinkId("silent");

No complicated types, just plain DOMStrings. This is easy and sensible, but IIUC there's no precedent for it in the Web Audio API. We've been using enums for this purpose.

@hoch
Member

hoch commented Sep 7, 2022

To recap, here are two proposals for configurability:

A. Using AudioSinkOptions pattern:

dictionary AudioContextOptions {
  ...
  (DOMString or AudioSinkOptions) sinkId;
};

dictionary AudioSinkOptions {
  boolean useSilentSink;
};

// example
audioContext.setSinkId("");
audioContext.setSinkId("5b79a953d8fb279...");
audioContext.setSinkId({useSilentSink: true});

B. Using plain strings:

dictionary AudioContextOptions {
  ...
  DOMString sinkId;
};

// example
audioContext.setSinkId("");
audioContext.setSinkId("5b79a953d8fb279...");
audioContext.setSinkId("silent");

@Sheraff

Sheraff commented Sep 8, 2022

Is it a guarantee that no device (sink) will ever have an ID of "silent" or "default"?

@hoch
Member

hoch commented Sep 8, 2022

See examples of typical IDs here:
https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/enumerateDevices#examples

This is what I get from my MacBook Pro + Chrome:

audioinput: Default - MacBook Pro Microphone (Built-in) id = default
audioinput: MacBook Pro Microphone (Built-in) id = 5b79a953d8fb279e717b108562f28b4d934473541367b4ae41b809fedb319a8d
audiooutput: Default - MacBook Pro Speakers (Built-in) id = default
audiooutput: MacBook Pro Speakers (Built-in) id = 94508a698f07a537453f6c37a9449817ff3321b8f0869588a94ee4661350f0a8

Is it a guarantee that no device (sink) will ever have an ID of "silent" or "default"?

So I would say no and yes:

  1. "No": you'll get default as an ID. This actually works nicely with this API: you can just pass default through and it'll work (see the sketch below).
  2. "Yes": you won't get silent as an ID unless there's a future spec change to MediaDevices.enumerateDevices(). Also, the first proposal (using AudioSinkOptions) would handle this corner case.
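
As a sketch of point 1, under proposal B the "default" ID from enumerateDevices() could simply be passed straight through (hypothetical usage; AudioContext.setSinkId() had not shipped at this point, and audioContext is assumed to be an existing context):

const outputs = (await navigator.mediaDevices.enumerateDevices())
  .filter((device) => device.kind === 'audiooutput');
// On Chromium the list above typically includes a deviceId of "default".
await audioContext.setSinkId(outputs[0].deviceId); // may literally be "default"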

@guest271314

This comment was marked as off-topic.

@padenot
Member

padenot commented Sep 9, 2022

The identifier is generated by the browser; it cannot be controlled by authors. It's different from the device name.

@guest271314

This comment was marked as off-topic.

@guest271314

This comment was marked as off-topic.

@hoch
Member

hoch commented Sep 14, 2022

As @padenot mentioned above, the device identifier and the label are different.

@hoch hoch self-assigned this Sep 14, 2022
@padenot
Member

padenot commented Sep 14, 2022

Update from discussing with @hoch at TPAC, this is what we're currently thinking:

enum AudioSinkType {
  "default",
  "none",
};

dictionary AudioSinkOptions {
  AudioSinkType type;
};

partial interface AudioContext {
  Promise<undefined> setSinkId((AudioSinkOptions or DOMString) sinkId);
};

AudioSinkType could grow other values. Something we thought about was the notion of "default device for communication use-case" (vs., say, listening to music). This is something that Android and Windows expose, at least.
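
Hypothetical calls under this shape (nothing here had shipped at the time, audioContext is assumed to be an existing AudioContext, and note that "default" as an AudioSinkType value is dropped in the next comment):

// Route output to a specific physical device by its deviceId.
await audioContext.setSinkId('device-unique-id');
// Explicitly select the default device.
await audioContext.setSinkId({ type: 'default' });
// Keep rendering, but detached from any physical device.
await audioContext.setSinkId({ type: 'none' });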

@guest271314

This comment was marked as off-topic.

@hoch
Member

hoch commented Sep 15, 2022

A slight update and one more discussion topic:

enum AudioSinkType {
  "none"
};

dictionary AudioSinkOptions {
  AudioSinkType type;
};

partial interface AudioContext {
  Promise<undefined> setSinkId((AudioSinkOptions or DOMString) sinkId);
};

audioContext.setSinkId("") is already available for the default device, and the concept of a "default device" doesn't really fall into a "type". A device being the default is more about its identity than about its characteristics.

Question: what would be the value of audioContext.sinkId when the current sink type is "none"?

@hoch
Member

hoch commented Sep 16, 2022

We agreed upon the sinkId getter design. The up-to-date API shape is:

enum AudioSinkType {
  "none"
};

dictionary AudioSinkOptions {
  AudioSinkType type;
};

partial interface AudioContext {
  readonly attribute (DOMString or AudioSinkOptions) sinkId;
  Promise<undefined> setSinkId((DOMString or AudioSinkOptions) sinkId);
};
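
A sketch of how the getter and setter would pair up under this shape; how the "none" case is reflected by the sinkId getter is exactly the follow-up question asked below:

const context = new AudioContext();

await context.setSinkId('some-device-id');   // placeholder for a real deviceId
console.log(context.sinkId);                 // expected: 'some-device-id'

await context.setSinkId({ type: 'none' });   // render silently, no physical device
console.log(context.sinkId);                 // expected: a value reflecting the "none" sink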

@guest271314

This comment was marked as off-topic.

@mjwilson-google
Contributor

hoch, would this generally mean that after a successful setSinkId we should get exactly the same argument back when we call sinkId?

guest271314, I think you are describing using PulseAudio commands to make a microphone / other input device appear as an output device, then setting that output device as the system default output device. I don't think this is the usual configuration, so if someone has set up their system that way it may be that they have a reason to and the browser should respect that. Are you concerned with the meaning of "default audio output device" in the spec? Would it make more sense to say something like the "system-reported" default audio output device?

It seems to me that if the user configures their system to have a particular default output audio device, then that is the "real" default audio output device (even if it happens to be a microphone, /dev/null, etc.). There isn't anything special that would make one audio device the natural default for a particular system.

Or am I misunderstanding your concern?

@hoch
Member

hoch commented Sep 20, 2022

Based on the algorithm, the internal slot changes only when the transition is successful. So:

await context.setSinkId('some-id'); // if this was successful
console.log(context.sinkId); // then this should be 'some-id'

@guest271314

This comment was marked as off-topic.
