🗣️ Azure Podcast Generator

Generate an engaging podcast based on your document using Azure OpenAI Service and Azure AI Speech.

This project leverages Streamlit for the front-end, Azure Document Intelligence for the document analysis, Azure OpenAI Service with structured outputs for the text generation, and Azure AI Speech for the text-to-speech. All data will be processed within your Azure subscription, ensuring it remains in your Azure environment and is not shared with any third-party.
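To make the last two stages of that pipeline more concrete, here is a minimal sketch of how a structured script could be turned into SSML for text-to-speech. It is not taken from the project's code: the `Line` shape, the `script_to_ssml` helper, the speaker labels, and the voice names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from xml.etree import ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


@dataclass
class Line:
    """One spoken line of the script (hypothetical shape, not the app's schema)."""
    speaker: str
    text: str


def script_to_ssml(lines, voices, language="en-US"):
    """Build a single SSML document with one <voice> element per script line."""
    # Serialize the SSML namespace as the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": language},
    )
    for line in lines:
        # Each speaker label maps to a voice name; a missing label raises KeyError.
        voice = ET.SubElement(
            speak, f"{{{SSML_NS}}}voice", {"name": voices[line.speaker]}
        )
        voice.text = line.text  # ElementTree escapes &, < and > on output
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Illustrative voice names; pick voices available in your Speech region.
    voices = {
        "host": "en-US-AvaMultilingualNeural",
        "guest": "en-US-AndrewMultilingualNeural",
    }
    script = [
        Line("host", "Welcome to the show."),
        Line("guest", "Thanks for having me."),
    ]
    print(script_to_ssml(script, voices))
```

Keeping the script as a list of typed lines is the kind of shape structured outputs make easy to request from the model, since every line is guaranteed to carry a speaker and a text field.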

Note: This application is an example implementation and is not intended for production use. It is provided as-is and is not supported.

Demo video: demo-video-podcast-generator.mp4

Getting started

You can run the application locally or deploy it to Azure, such as on Azure Container Apps. For development, the easiest approach is to use the included Dev Container, which installs all necessary dependencies and tools to get you started.

Prerequisites

This project utilizes several Azure services, requiring an active Azure subscription. The services used include:

Azure Document Intelligence
Azure OpenAI Service with a gpt-4o deployment (model version 2024-08-06; earlier model versions do not support structured outputs). Check model availability per region.
Azure AI Speech (deploy in East US, West Europe, or Southeast Asia to use Azure HD voices)

Local deployment

Make sure you have Python 3.12+, uv and optionally the Azure CLI installed on your machine.

You can install the required dependencies via the command below using uv.

uv sync

Configure the required environment variables either in a .env file or in your environment settings. The .env.sample file lists the required values: create a new .env file in the app directory and fill them in. Since this project supports (managed) identity-based authentication for Azure services, it is recommended not to store any keys in the .env file.
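A quick way to catch a missing setting before starting the app is a small check like the sketch below. The variable names are taken from the deployment command in this README, so treat the list as an assumption; the authoritative list is the .env.sample file. The `missing_settings` helper is hypothetical, and it only inspects the process environment, so values kept in a .env file must already have been loaded or exported.

```python
import os

# Names copied from the `az containerapp up` example in this README.
# Assumption: .env.sample may list more (or different) variables.
REQUIRED_VARS = (
    "DOCUMENTINTELLIGENCE_ENDPOINT",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_MODEL_DEPLOYMENT",
    "AZURE_SPEECH_RESOURCE_ID",
    "AZURE_SPEECH_REGION",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing a plain dictionary instead of the real environment, for example `missing_settings({"AZURE_SPEECH_REGION": "westeurope"})`, makes the check easy to try without changing your shell.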

(optional) Identity Based Authentication

It is recommended to use managed identity-based authentication for Azure services, even during development. This project leverages the DefaultAzureCredential, which supports multiple authentication methods, such as environment variables, managed identity, and more.

Log in to Azure using the Azure CLI and select your subscription.

az login

Assign roles. Ensure your user account has the necessary roles to access the Azure services. You can assign these roles using the Azure Portal (IAM) or the Azure CLI.

Azure Resource	Roles
Azure Document Intelligence	Cognitive Services User
Azure OpenAI Service	Cognitive Services OpenAI User or Cognitive Services User
Azure AI Speech	Cognitive Services Speech User or Cognitive Services User

# Assign roles using Azure CLI
az role assignment create --assignee <your-user-id> --role "Cognitive Services User" --scope <resource-scope>
az role assignment create --assignee <your-user-id> --role "Cognitive Services OpenAI User" --scope <resource-scope>
az role assignment create --assignee <your-user-id> --role "Cognitive Services Speech User" --scope <resource-scope>

To support identity-based authentication with Azure AI Speech, you need to create a custom domain name for your Speech resource.
Then retrieve the resource ID of your Speech resource and set it as the AZURE_SPEECH_RESOURCE_ID environment variable.
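The resource ID matters because the Speech SDK documentation describes Microsoft Entra authentication as passing an authorization token of the form `aad#<resource-id>#<access-token>`. The sketch below builds that string with the standard library only. Whether this project assembles the token in exactly this way is an assumption, and `build_speech_authorization_token` is a hypothetical helper name.

```python
def build_speech_authorization_token(resource_id: str, aad_token: str) -> str:
    """Combine a Speech resource ID and a Microsoft Entra access token.

    The access token would typically be obtained elsewhere, for example with
    DefaultAzureCredential().get_token("https://cognitiveservices.azure.com/.default").token
    from the azure-identity package (not imported here to keep this runnable).
    """
    resource_id = resource_id.strip()
    lowered = resource_id.lower()  # Azure resource IDs are case-insensitive
    looks_valid = lowered.startswith("/subscriptions/") and (
        "/providers/microsoft.cognitiveservices/accounts/" in lowered
    )
    if not looks_valid:
        raise ValueError(
            "Expected a full resource ID such as /subscriptions/<sub>/resourceGroups/"
            "<rg>/providers/Microsoft.CognitiveServices/accounts/<name>"
        )
    if not aad_token:
        raise ValueError("An access token is required")
    return f"aad#{resource_id}#{aad_token}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example_id = (
        "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg"
        "/providers/Microsoft.CognitiveServices/accounts/my-speech"
    )
    print(build_speech_authorization_token(example_id, "<access-token>"))
```

The early validation is a design choice: a common mistake is to set AZURE_SPEECH_RESOURCE_ID to the resource name or a key rather than the full resource ID, and failing fast gives a clearer error than a rejected request.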

Start the development server

Start the development Streamlit server using the command below. Unless the port is overridden in the Streamlit configuration, it launches on Streamlit's default port, 8501.

uv run streamlit run app/app.py

Deploy on Azure

This repository includes the code for the Azure Podcast Generator, but infrastructure-as-code is not currently provided. You can use the Azure CLI to deploy the container to Azure Container Apps.

az containerapp up --resource-group your-rg-name \
--name your-app-name --location westeurope \
--ingress external --target-port 9000 --source . \
--env-vars DOCUMENTINTELLIGENCE_ENDPOINT="" AZURE_OPENAI_ENDPOINT="" AZURE_OPENAI_MODEL_DEPLOYMENT="gpt-4o" AZURE_SPEECH_RESOURCE_ID="" AZURE_SPEECH_REGION="westeurope"

It is advised to enable sticky sessions (session affinity) using the command below to prevent issues with file uploads.

az containerapp ingress sticky-sessions set --affinity sticky --name your-app-name --resource-group your-rg-name

Inspired by

Google NotebookLM
New HD voices preview in Azure AI Speech

iMicknl/azure-podcast-generator
