[WIP] 0.1.112 docs (vocodedev#556)

* [docs sprint] Updates docs for using transcribers (#9)
* [docs sprint] phrase trigger documentation (#16)
* [docs sprint] update open source quickstarts (#15)
* [docs sprint] Add Documentation on Using Vocode's Loguru Implementation (#19)
* [docs sprint] Add Documentation on Using Vocode's Loguru Implementation
* Remove Tracing
---------
Co-authored-by: srhinos <[email protected]>
* [docs sprint] Updates docs for using synthesizers (#8)
* [docs sprint] using synthesizers docs update
* update docs for elevenlabs ws
* Apply suggestions from code review
Co-authored-by: Adnaan Sachidanandan <[email protected]>
---------
Co-authored-by: Adnaan Sachidanandan <[email protected]>
* [docs sprint] Updates docs for react quickstart (#10)
* [docs sprint] Updates docs for react quickstart
* PR feedback
* [docs sprint] Adds docs for conversation mechanics and moves endpointing docs from transcribers (#11)
* [docs sprint] Updates docs for using transcribers
* Adds docs for conversation mechanics and moves endpointing docs from transcribers
* Update docs/open-source/conversation-mechanics.md
Co-authored-by: Adnaan Sachidanandan <[email protected]>
* use mdx
* PR feedback
---------
Co-authored-by: Adnaan Sachidanandan <[email protected]>
* updates docs for events manager (#7)
* [docs sprint] python quickstart + working with phone calls (#27)
* deprecate SpeakerOutput
* remove play.ht default voice id
* rename open source quickstarts page
* remove building block reference
* update python quickstart
* extra steps to deprecate speakeroutput
* finish telephony docs
* fix some references + language in how-to-use-it
* fix test
* [docs sprint] Add Sentry Docs to OS (#20)
* Add Sentry Docs to OS
* Remove Tracing
* update docs and fix integration
* remove free
---------
Co-authored-by: srhinos <[email protected]>
Co-authored-by: Ajay Raj <[email protected]>
* update README
* make mark terminated sync instead of async (#28)
* [docs sprint] Add Docs on Creating and Using External Actions (#18)
Also updated example for action agents
* rename sentry + move around docs order
* update README paths to docs
* more updates to README
* [docs sprint] update agent and action docs and move legacy docs (#29)
---------
Co-authored-by: Adnaan Sachidanandan <[email protected]>
Co-authored-by: Mac Wilkinson <[email protected]>
Co-authored-by: srhinos <[email protected]>
ArtisanLabs · Jun 14, 2024 · 78f7b8f
1 parent 53b01da
Showing 38 changed files with 1,228 additions and 539 deletions.
diff --git a/README.md b/README.md
@@ -5,7 +5,7 @@
 [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/vocodehq.svg?style=social&label=Follow%20%40vocodehq)](https://twitter.com/vocodehq) [![GitHub Repo stars](https://img.shields.io/github/stars/vocodedev/vocode-python?style=social)](https://github.com/vocodedev/vocode-python)
 [![Downloads](https://static.pepy.tech/badge/vocode/month)](https://pepy.tech/project/vocode)
 
-[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev) | [Dashboard](https://app.vocode.dev)
+[Community](https://discord.gg/NaU4mMgcnC) | [Docs](https://docs.vocode.dev/open-source) | [Dashboard](https://app.vocode.dev)
 
 </div>
 
@@ -19,11 +19,11 @@ We're actively looking for community maintainers, so please reach out if interes
 
 # ⭐️ Features
 
-- 🗣 [Spin up a conversation with your system audio](https://docs.vocode.dev/python-quickstart)
-- ➡️ 📞 [Set up a phone number that responds with a LLM-based agent](https://docs.vocode.dev/telephony#inbound-calls)
-- 📞 ➡️ [Send out phone calls from your phone number managed by an LLM-based agent](https://docs.vocode.dev/telephony#outbound-calls)
-- 🧑‍💻 [Dial into a Zoom call](https://github.com/vocodedev/vocode-python/blob/main/vocode/streaming/telephony/hosted/zoom_dial_in.py)
-- 🤖 [Use an outbound call to a real phone number in a Langchain agent](https://docs.vocode.dev/langchain-agent)
+- 🗣 [Spin up a conversation with your system audio](https://docs.vocode.dev/open-source/python-quickstart)
+- ➡️ 📞 [Set up a phone number that responds with a LLM-based agent](https://docs.vocode.dev/open-source/telephony#inbound-calls)
+- 📞 ➡️ [Send out phone calls from your phone number managed by an LLM-based agent](https://docs.vocode.dev/open-source/telephony#outbound-calls)
+- 🧑‍💻 [Dial into a Zoom call](https://github.com/vocodedev/vocode-core/blob/53b01dab0b59f71961ee83dbcaf3653a6935c2e3/vocode/streaming/telephony/conversation/zoom_dial_in.py)
+- 🤖 [Use an outbound call to a real phone number in a Langchain agent](https://docs.vocode.dev/open-source/langchain-agent)
 - Out of the box integrations with:
   - Transcription services, including:
     - [AssemblyAI](https://www.assemblyai.com/)
@@ -34,19 +34,16 @@ We're actively looking for community maintainers, so please reach out if interes
     - [RevAI](https://www.rev.ai/)
     - [Whisper](https://openai.com/blog/introducing-chatgpt-and-whisper-apis)
     - [Whisper.cpp](https://github.com/ggerganov/whisper.cpp)
-
   - LLMs, including:
-    - [ChatGPT](https://openai.com/blog/chatgpt)
-    - [GPT-4](https://platform.openai.com/docs/models/gpt-4)
+    - [OpenAI](https://platform.openai.com/docs/models)
     - [Anthropic](https://www.anthropic.com/)
-    - [GPT4All](https://github.com/nomic-ai/gpt4all)
   - Synthesis services, including:
     - [Rime.ai](https://rime.ai)
     - [Microsoft Azure](https://azure.microsoft.com/en-us/products/cognitive-services/text-to-speech/)
     - [Google Cloud](https://cloud.google.com/text-to-speech)
     - [Play.ht](https://play.ht)
     - [Eleven Labs](https://elevenlabs.io/)
-    - [Coqui](https://coqui.ai/)
+    - [Cartesia](https://cartesia.ai/)
     - [Coqui (OSS)](https://github.com/coqui-ai/TTS)
     - [gTTS](https://gtts.readthedocs.io/)
     - [StreamElements](https://streamelements.com/)
@@ -59,9 +56,9 @@ Check out our React SDK [here](https://github.com/vocodedev/vocode-react-sdk)!
 
 We're an open source project and are extremely open to contributors adding new features, integrations, and documentation! Please don't hesitate to reach out and get started building with us.
 
-For more information on contributing, see our [Contribution Guide](https://github.com/vocodedev/vocode-python/blob/main/contributing.md).
+For more information on contributing, see our [Contribution Guide](https://github.com/vocodedev/vocode-core/blob/main/contributing.md).
 
-And check out our [Roadmap](https://github.com/vocodedev/vocode-python/blob/main/roadmap.md).
+And check out our [Roadmap](https://github.com/vocodedev/vocode-core/blob/main/roadmap.md).
 
 We'd love to talk to you on [Discord](https://discord.gg/NaU4mMgcnC) about new ideas and contributing!
 
@@ -73,31 +70,48 @@ pip install 'vocode'
 
 ```python
 import asyncio
-import logging
 import signal
-from vocode.streaming.streaming_conversation import StreamingConversation
+
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
 from vocode.helpers import create_streaming_microphone_input_and_speaker_output
-from vocode.streaming.transcriber import *
-from vocode.streaming.agent import *
-from vocode.streaming.synthesizer import *
-from vocode.streaming.models.transcriber import *
-from vocode.streaming.models.agent import *
-from vocode.streaming.models.synthesizer import *
+from vocode.logging import configure_pretty_logging
+from vocode.streaming.agent.chat_gpt_agent import ChatGPTAgent
+from vocode.streaming.models.agent import ChatGPTAgentConfig
 from vocode.streaming.models.message import BaseMessage
-import vocode
-
-# these can also be set as environment variables
-vocode.setenv(
-    OPENAI_API_KEY="<your OpenAI key>",
-    DEEPGRAM_API_KEY="<your Deepgram key>",
-    AZURE_SPEECH_KEY="<your Azure key>",
-    AZURE_SPEECH_REGION="<your Azure region>",
+from vocode.streaming.models.synthesizer import AzureSynthesizerConfig
+from vocode.streaming.models.transcriber import (
+    DeepgramTranscriberConfig,
+    PunctuationEndpointingConfig,
 )
+from vocode.streaming.streaming_conversation import StreamingConversation
+from vocode.streaming.synthesizer.azure_synthesizer import AzureSynthesizer
+from vocode.streaming.transcriber.deepgram_transcriber import DeepgramTranscriber
 
+configure_pretty_logging()
 
-logging.basicConfig()
-logger = logging.getLogger(__name__)
-logger.setLevel(logging.DEBUG)
+
+class Settings(BaseSettings):
+    """
+    Settings for the streaming conversation quickstart.
+    These parameters can be configured with environment variables.
+    """
+
+    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
+    azure_speech_key: str = "ENTER_YOUR_AZURE_KEY_HERE"
+    deepgram_api_key: str = "ENTER_YOUR_DEEPGRAM_API_KEY_HERE"
+
+    azure_speech_region: str = "eastus"
+
+    # This means a .env file can be used to override these settings
+    # ex: "OPENAI_API_KEY=my_key" will set openai_api_key over the default above
+    model_config = SettingsConfigDict(
+        env_file=".env",
+        env_file_encoding="utf-8",
+    )
+
+
+settings = Settings()
 
 
 async def main():
@@ -106,8 +120,6 @@ async def main():
         speaker_output,
     ) = create_streaming_microphone_input_and_speaker_output(
         use_default_devices=False,
-        logger=logger,
-        use_blocking_speaker_output=True
     )
 
     conversation = StreamingConversation(
@@ -116,24 +128,25 @@ async def main():
             DeepgramTranscriberConfig.from_input_device(
                 microphone_input,
                 endpointing_config=PunctuationEndpointingConfig(),
-            )
+                api_key=settings.deepgram_api_key,
+            ),
         ),
         agent=ChatGPTAgent(
             ChatGPTAgentConfig(
+                openai_api_key=settings.openai_api_key,
                 initial_message=BaseMessage(text="What up"),
                 prompt_preamble="""The AI is having a pleasant conversation about life""",
             )
         ),
         synthesizer=AzureSynthesizer(
-            AzureSynthesizerConfig.from_output_device(speaker_output)
+            AzureSynthesizerConfig.from_output_device(speaker_output),
+            azure_speech_key=settings.azure_speech_key,
+            azure_speech_region=settings.azure_speech_region,
         ),
-        logger=logger,
     )
     await conversation.start()
     print("Conversation started, press Ctrl+C to end")
-    signal.signal(
-        signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate())
-    )
+    signal.signal(signal.SIGINT, lambda _0, _1: asyncio.create_task(conversation.terminate()))
     while conversation.is_active():
         chunk = await microphone_input.get_audio()
         conversation.receive_audio(chunk)
@@ -145,8 +158,8 @@ if __name__ == "__main__":
 
 # 📞 Phone call quickstarts
 
-- [Telephony Server - Self-hosted](https://docs.vocode.dev/telephony)
+- [Telephony Server - Self-hosted](https://docs.vocode.dev/open-source/telephony)
 
 # 🌱 Documentation
 
-[docs.vocode.dev](https://docs.vocode.dev/)
+[docs.vocode.dev](https://docs.vocode.dev/open-source)
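
Review note on the README quickstart above: the new `Settings` class relies on pydantic-settings resolving each field from the environment (or the `.env` file) before falling back to the class default. The sketch below imitates only that precedence rule with the standard library. It is not pydantic-settings and does not read `.env` files; `QuickstartSettings` and `load_settings` are hypothetical names chosen for illustration. Names are matched case-insensitively, which mirrors the pydantic-settings default.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional


@dataclass(frozen=True)
class QuickstartSettings:
    """Hypothetical stand-in for the README's Settings class."""

    openai_api_key: str = "ENTER_YOUR_OPENAI_API_KEY_HERE"
    azure_speech_key: str = "ENTER_YOUR_AZURE_KEY_HERE"
    deepgram_api_key: str = "ENTER_YOUR_DEEPGRAM_API_KEY_HERE"
    azure_speech_region: str = "eastus"


def load_settings(env: Optional[Mapping[str, str]] = None) -> QuickstartSettings:
    """Build settings where an environment variable beats the class default."""
    source = os.environ if env is None else env
    # Compare names case-insensitively, so OPENAI_API_KEY fills openai_api_key.
    lowered = {name.lower(): value for name, value in source.items()}
    overrides = {
        f.name: lowered[f.name]
        for f in fields(QuickstartSettings)
        if f.name in lowered
    }
    return QuickstartSettings(**overrides)


if __name__ == "__main__":
    demo = load_settings({"OPENAI_API_KEY": "my_key"})
    print(demo.openai_api_key)       # my_key
    print(demo.azure_speech_region)  # eastus
```

Passing the environment as an argument keeps the sketch easy to test; calling `load_settings()` with no argument reads `os.environ` instead.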
diff --git a/apps/telephony_app/speller_agent.py b/apps/telephony_app/speller_agent.py
@@ -1,4 +1,3 @@
-import typing
 from typing import Optional, Tuple
 
 from vocode.streaming.agent.abstract_factory import AbstractAgentFactory
@@ -65,16 +64,10 @@ def create_agent(self, agent_config: AgentConfig) -> BaseAgent:
             Exception: If the agent configuration type is not recognized.
         """
         # If the agent configuration type is CHAT_GPT, create a ChatGPTAgent.
-        if agent_config.type == AgentType.CHAT_GPT:
-            return ChatGPTAgent(
-                # Cast the agent configuration to ChatGPTAgentConfig as we are sure about the type here.
-                agent_config=typing.cast(ChatGPTAgentConfig, agent_config)
-            )
+        if isinstance(agent_config, ChatGPTAgentConfig):
+            return ChatGPTAgent(agent_config=agent_config)
         # If the agent configuration type is agent_speller, create a SpellerAgent.
-        elif agent_config.type == "agent_speller":
-            return SpellerAgent(
-                # Cast the agent configuration to SpellerAgentConfig as we are sure about the type here.
-                agent_config=typing.cast(SpellerAgentConfig, agent_config)
-            )
+        elif isinstance(agent_config, SpellerAgentConfig):
+            return SpellerAgent(agent_config=agent_config)
         # If the agent configuration type is not recognized, raise an exception.
         raise Exception("Invalid agent config")
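
Review note on the factory change above: replacing the `type` comparison plus `typing.cast` with `isinstance` lets a type checker narrow `agent_config` to the concrete config class by itself. A minimal self-contained sketch of that pattern follows. Every class in it is a hypothetical stand-in, not Vocode's actual `AgentConfig`, `ChatGPTAgent`, or `SpellerAgent`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class AgentConfig:
    """Hypothetical base config (stand-in, not Vocode's class)."""


@dataclass
class ChatConfig(AgentConfig):
    prompt_preamble: str = ""


@dataclass
class SpellerConfig(AgentConfig):
    separator: str = " "


class ChatAgent:
    def __init__(self, agent_config: ChatConfig) -> None:
        self.agent_config = agent_config


class SpellerAgent:
    def __init__(self, agent_config: SpellerConfig) -> None:
        self.agent_config = agent_config


def create_agent(agent_config: AgentConfig) -> Union[ChatAgent, SpellerAgent]:
    # isinstance narrows the static type inside each branch,
    # so no typing.cast is needed before passing the config on.
    if isinstance(agent_config, ChatConfig):
        return ChatAgent(agent_config=agent_config)
    if isinstance(agent_config, SpellerConfig):
        return SpellerAgent(agent_config=agent_config)
    # Unrecognized config types fail loudly instead of returning None.
    raise ValueError(f"Invalid agent config: {type(agent_config).__name__}")
```

One consequence of this design: a config subclass is accepted wherever its parent is, so branch order matters if one config class inherits from another.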
diff --git a/docs/images/sentry.png b/docs/images/sentry.png
diff --git a/docs/mint.json b/docs/mint.json
@@ -49,7 +49,11 @@
   "navigation": [
     {
       "group": "Getting Started",
-      "pages": ["welcome", "hosted-quickstart", "open-source-quickstart"]
+      "pages": [
+        "welcome",
+        "hosted-quickstart",
+        "open-source-quickstarts"
+      ]
     },
     {
       "group": "Vocode 101",
@@ -65,26 +69,27 @@
         "open-source/python-quickstart",
         "open-source/telephony",
         "open-source/create-your-own-agent",
-        "open-source/langchain-agent",
-        "open-source/action-agents",
-        "open-source/local-conversation",
+        "open-source/agent-factory",
+        "open-source/agents-with-actions",
+        "open-source/action-phrase-triggers",
+        "open-source/external-action",
+        "open-source/conversation-mechanics",
         "open-source/events-manager",
         "open-source/using-synthesizers",
         "open-source/using-transcribers",
         "open-source/react-quickstart",
         "open-source/playground",
+        "open-source/sentry",
+        "open-source/logging-with-loguru",
         "open-source/turn-based-conversation",
-        "open-source/language-support",
-        "open-source/tracing",
-        "open-source/agent-factory"
+        "open-source/language-support"
       ]
     },
     {
-      "group": "Python",
+      "group": "Legacy (0.0.111) Guides",
       "pages": [
-        "open-source/transcriber-reference",
-        "open-source/agent-reference",
-        "open-source/synthesizer-reference"
+        "open-source/langchain-agent",
+        "open-source/local-conversation"
       ]
     },
     {
@@ -109,7 +114,9 @@
     },
     {
       "group": "Usage",
-      "pages": ["api-reference/usage/get-usage"]
+      "pages": [
+        "api-reference/usage/get-usage"
+      ]
     },
     {
       "group": "Actions",
@@ -223,4 +230,4 @@
     "twitter": "https://twitter.com/vocodehq",
     "website": "https://www.vocode.dev/"
   }
-}
+}
diff --git a/docs/open-source-quickstart.mdx → docs/open-source-quickstarts.mdx b/docs/open-source-quickstart.mdx → docs/open-source-quickstarts.mdx
@@ -1,12 +1,16 @@
 ---
-title: "Open Source Quickstart"
+title: "Open Source Quickstarts"
 description: "How to get Vocode up and running on your own machine"
 ---
 
 ## Start Developing
 
 <CardGroup>
-  <Card title="Python Quick Start" icon="circle-play" href="/open-source/python-quickstart">
+  <Card
+    title="Python Quick Start"
+    icon="circle-play"
+    href="/open-source/python-quickstart"
+  >
     Quickly get up and running with Vocode by following our Python quick start
     guide.
   </Card>

diff --git a/docs/open-source/action-agents.mdx b/docs/open-source/action-agents.mdx