
Suggest try submit these Audio Media Player enhancements/improvements to official "Home Assistant Voice PE" (home-assistant-voice-pe) fork of ESPHome? #3

Open
Hedda opened this issue Oct 17, 2024 · 7 comments


Hedda commented Oct 17, 2024

First of all, thank you for the enhancements that make ESPHome's Audio Media Player much better than what upstream offers by default today!

@rwrozelle I would like to make a request, but first a little backstory. As you and others following your Audio Media Player improvements to ESPHome in this repository may already be aware, Nabu Casa plans to soon release an official Home Assistant "Voice Satellite" appliance (a smart speaker voice assistant with media playback features) as an official voice assistant development platform and framework based on ESPHome. The hardware combines an ESP32-S3 with an xCORE chip from XMOS for advanced audio processing, and the PCB(s) include not only far-field microphones and a built-in speaker but by default also an audio-jack output for external speakers, as well as GPIO pins so it can be used as a development board.

For that reason, the lead ESPHome developers currently have an official "Home Assistant Voice PE" ("home-assistant-voice-pe") fork that ESPHome developers from Nabu Casa are actively working on at a relatively fast pace. The focus is on improving voice control, but the work also includes improving the media player and especially enhancing audio playback functionality in ESPHome. As I understand it, they plan to sooner or later backport all the stable code from that forked home-assistant-voice-pe repository upstream to main ESPHome once they feel the code is no longer experimental.

https://github.com/esphome/home-assistant-voice-pe

I would therefore like to ask if you and others here could consider backporting all or some of your Audio Media Player improvements and submitting code patches to that experimental "home-assistant-voice-pe" (Home Assistant Voice PE) fork as a stop-gap step before mainlining, as that might have a lower threshold for entry and acceptance. Optionally, you could still submit some stable improvements/enhancements directly to the main ESPHome repository if you feel the code is stable and will be accepted in mainline without much fuss. The ultimate end goal is to improve the out-of-the-box capabilities for all audio-related features in upstream mainline ESPHome.

Any thoughts on first trying to submit your Audio Media Player enhancements to that experimental fork repository of ESPHome?

FYI, for more info and reference, check out the "voice assistants" section in Home Assistant's Roadmap 2024 Midyear Update blog post:

https://www.home-assistant.io/blog/2024/06/12/roadmap-2024h1#voice-assistants


Voice assistants

Since last year, we have built our voice assistant framework from scratch with our “Year of the Voice” initiative. Now that the infrastructure is in place, we want to make sure that it will be usable for everyone (before the demise of Alexa and Google Assistant 😜).

Current priority 1: Improve Assist capabilities out of the box

Our research has shown users are most interested in us improving out-of-the-box capabilities of Assist, for instance, timers, reminders, and music controls.

Current priority 2: Make Assist easier to start with

At the moment, there are several things you need to install or configure to get started with voice. We want to make it easier to set up and onboard. There are already some good hardware choices to start using voice, but we’re exploring building our voice satellite hardware to create a more plug-and-play experience.


By the way, it should be obvious that Nabu Casa's development initially focuses on controlling your smart home via the Home Assistant platform and their incredible Assist voice control pipeline.

However, they are also looking at music playback via such "Voice Satellite" hardware streaming from Music Assistant to ESPHome as a core feature, and as such they are going to promote audio support for ESPHome and native media player functionality.

So, to eventually make enhanced/improved ESPHome features/functions related to audio output, voice input, and media playback useful even to average end users of Home Assistant, they have made it clear that their plan is not only to have them supported in the upstream ESPHome project by default. They also plan on standardizing voice assistant devices in ESPHome (including audio output and media player features/functions) as well as the matching functionality and integrations in the Home Assistant core. ESPHome + Nabu Casa developers are now working on several new components related to this, including a new entity component, the assist_satellite platform, which will represent a standard VoIP-based voice satellite for Home Assistant Assist voice control. As such, I also recommend that you check out these initial architecture discussions:

And the initial entity component for this new assist_satellite platform has been merged to Home Assistant core now:

Also follow related ongoing patches with many new related features submitted to both ESPHome and the Home Assistant core:

Bigger picture:

  • Standardize how voice satellites expose their capabilities
  • Standardize how voice satellites are configured
  • Automate based on the state of the satellite's pipeline
  • Control the behavior of a voice satellite from HA during the setup wizard
  • Skip wake word and listen for a command (with or without executing it)
  • Listen for a specific wake word (without running a pipeline)
  • Control a voice satellite from HA using service calls
  • Announce text using the TTS portion of the satellite's pipeline

Note also that the XMOS xCORE AI chip is technically not limited to audio input from the microphone; it can also be used on the audio output side to improve music playback, etc., using other custom AI models or algorithms to add EQ options and other features such as DRC (Digital Room Correction) to achieve improved sound fidelity. Many products use an XMOS chip just for music playback, for example network music streamers, to get great HiFi-quality audio at low cost.

PS: Other than the official Home Assistant Voice Satellite development hardware, there are also already some third parties working on ESPHome voice assistant hardware products. For example, FutureProofHomes have posted a new video on their YouTube channel showing off the current design of the ESP32-based hardware prototype for their upcoming FutureProofHomes Satellite1 voice control development board, which now looks to be using an externally connected XK-VOICE-L71 (XMOS Voice Reference Design Evaluation Kit) based on the XU316-1024-QF60A-C24 (which, by the way, features a 3.5 mm line-out jack for audio output to external speakers). Check it out:


Hedda commented Oct 17, 2024

Off-topic, but make sure that you do not miss this pull request with new related improvements upstream that was just merged:

And the matching pull request to implement use of that in the "Home Assistant Voice PE" ("home-assistant-voice-pe") fork repo:

Also used as proof-of-concept in the nabu component in the kahrendt-i2s-audio-approach branch of home-assistant-voice-pe:


Hedda commented Oct 18, 2024

@rwrozelle FYI, just last night they moved the new audio decoder and resampling libraries from that experimental repository into their own separate repo at https://github.com/esphome/esp-audio-libs

I suspect that they may make further refactoring changes that will shuffle things around before merging to mainline ESPHome.

Hopefully, splitting things up like that while still keeping the repos under the ESPHome organization on GitHub will make the code more readable, get more eyes on it, and make it less daunting to contribute upstream for mainlining.


Hedda commented Oct 19, 2024

By the way, I recommend that you guys check out the new "ReSpeaker Lite" Voice Assistant Development Kit hardware from Seeed Studio, which combines an ESP32-S3 with an XMOS xCORE XU316 MCU DSP chip for advanced audio acceleration and pre/post-processing. It features both far-field microphones for voice input and a 3.5mm audio output jack for external speakers, so it can be used as an ESPHome-based Home Assistant Assist Satellite devkit (as it has the same hardware components as the upcoming official voice-kit from Home Assistant and Nabu Casa):

@rwrozelle
Owner

Hedda, agree it would be great to merge. I'll wait to re-evaluate submitting anything until after HAVA-PE code is fully back ported to ESPHOME. I want to rebuild based upon the Nabu Media Player and move off of ADF prior to any submission. I'm also not sure what ESPHOME intends to do, seems like ESPHOME future state is tied to Music Assistant (MA), which I am not a fan of. My goal is to be able to play albums from Media Sources (either Local or DLNA) and not have to install MA which in my case duplicates what I'm getting from my Jellyfin installation.


Hedda commented Feb 5, 2025

@rwrozelle if you want to work on something more cutting edge today, then I suggest you also check out the FutureProofHomes projects:

They also have a cool related project to enhance the Wyoming Voice Assistant, which relates to intents and music playback + more:

Several other independent ESPHome developers look to have jumped in on the development of that Satellite1-ESPHome fork:


Hedda commented Feb 10, 2025

Hedda, agree it would be great to merge. I'll wait to re-evaluate submitting anything until after HAVA-PE code is fully back ported to ESPHOME. I want to rebuild based upon the Nabu Media Player and move off of ADF prior to any submission.

@rwrozelle FYI, a new speaker media player component, along with several other helper components, has now been backported and merged into ESPHome, see:

Note that these also depend on the new esp-audio-libs library available via PlatformIO:

PS: Also check out this PR, which has not yet been merged:
