A TypeScript agent that listens for tasks via the Nevermined Payments framework, automatically generates lyrics and other metadata using LangChain + OpenAI, and then produces a final song audio track through Suno’s AI Music Generation API. It manages multiple steps internally, uses a modular architecture, and can be easily scaled or extended.
The Song Generator Agent is designed to:
- Receive prompts or “ideas” for songs (e.g., “A futuristic R&B track about neon cities”).
- Optionally generate missing metadata (e.g., lyrics, title, tags) using LangChain and OpenAI.
- Invoke the Suno API to synthesize an audio track (MP3) based on the prompt + metadata.
- Output the final track’s URL, title, duration, and lyrics.
- Integrate seamlessly with Nevermined Payments, listening for “step-updated” events and updating steps as they progress or fail.
This agent is well-suited for multi-step AI workflows where you want to automate music production.
This Song Generator Agent is part of a larger ecosystem of AI-driven media creation. For a complete view of how multiple agents work together, see:
- Music Video Orchestrator Agent: Coordinates end-to-end workflows: collects user prompts, splits them into tasks, pays agents in multiple tokens, and merges the final output.
- Script generation agent: Generates cinematic scripts, extracts scene info, and identifies settings and characters, producing prompts for video generation.
- Image/video generation agent: Produces images and video using third-party wrapper APIs (Fal.ai and TTapi, wrapping Flux and Kling.ai).
Workflow Example:
[ User Prompt ] --> [Music Orchestrator] --> [Song Generation] --> [Script Generation] --> [Image/Video Generation] --> [Final Compilation]
- Features
- Prerequisites
- Installation
- Environment Variables
- Project Structure
- Architecture & Workflow
- Usage
- How It Works Internally
- Development & Testing
- License
- Nevermined Integration: Subscribes to tasks via step-updated events and updates them automatically.
- Automatic Metadata Generation: Uses LangChain + OpenAI for lyrics, titles, and tag creation.
- Suno Music Generation: Calls Suno’s AI for track synthesis, monitors progress, and retrieves the final MP3.
- Concurrent Step Handling: Splits tasks into multiple steps (e.g., autoGenerateMetadata, buildSong), each with its own logic.
- Configurable: Customize your prompts, model versions, or usage of OpenAI.
- Logging & Error Handling: Comprehensive logs (info, success, warn, error) via a custom Logger.
- SOLID, Modular Architecture: Each function or class has a single responsibility, ensuring maintainability.
- Node.js (>= 18.0.0 recommended)
- TypeScript (project built on ^5.7.0 or later)
- Nevermined credentials (API key, environment settings, and an AGENT_DID)
- Suno API Key (for music generation)
- OpenAI API Key (for metadata/lyrics generation via LangChain)
- Clone the repository:
  git clone https://github.com/nevermined-io/song-generation-agent.git
  cd song-generation-agent
- Install dependencies:
  yarn install
- Build the project (optional for production):
  yarn build
Rename .env.example to .env and set the required variables:
SUNO_API_KEY=your_suno_api_key
OPENAI_API_KEY=your_openai_api_key
NVM_API_KEY=your_nevermined_api_key
NVM_ENVIRONMENT=testing
AGENT_DID=did:nv:xxx-song-agent
IS_DUMMY=false
DUMMY_JOB_ID=foobar
- SUNO_API_KEY
- OPENAI_API_KEY
- NVM_API_KEY
- NVM_ENVIRONMENT (e.g., testing, staging, or production)
- AGENT_DID (identifies this Song Generator Agent)
- IS_DUMMY / DUMMY_JOB_ID (optional testing flags)
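The startup check described for utils/checkEnv.ts could look roughly like the sketch below. It is an illustration only: the function names findMissingEnvVars and assertEnv are hypothetical, and the real file may validate differently. It assumes the first five variables above are required and treats IS_DUMMY and DUMMY_JOB_ID as optional.

```typescript
// Hypothetical sketch of startup validation; the real utils/checkEnv.ts may differ.
const REQUIRED_VARS = [
  "SUNO_API_KEY",
  "OPENAI_API_KEY",
  "NVM_API_KEY",
  "NVM_ENVIRONMENT",
  "AGENT_DID",
] as const;

type Env = Record<string, string | undefined>;

/** Returns the names of required variables that are unset or blank. */
function findMissingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

/** Throws one error that lists every missing variable at once. */
function assertEnv(env: Env): void {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js entry point, this would typically be called once as assertEnv(process.env) before any client is created, so a misconfigured deployment fails immediately rather than on the first task.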
.
├── clients/
│ └── sunoClient.ts # Client for interacting with the Suno API
├── config/
│ └── env.ts # Loads environment variables from .env
├── interfaces/
│ └── apiResponses.ts # Type definitions for Suno API responses
├── utils/
│ ├── logger.ts # Logging utility with color-coded levels
│ ├── utils.ts # General helpers (e.g., track duration)
│ └── checkEnv.ts # Validates environment variables on startup
├── songMetadataGenerator.ts # Class that uses LangChain+OpenAI to generate metadata
├── main.ts # Main entry, listens to step-updated events & routes steps
├── package.json
├── tsconfig.json
└── README.md # This file
Key highlights:
- main.ts: Entry point that initializes Nevermined Payments, subscribes to steps for this agent’s DID, and routes to step handlers (handleInitStep, handleAutoGenerateMetadataStep, handleBuildSongStep).
- songMetadataGenerator.ts: Orchestrates LangChain + OpenAI calls to produce lyrics, a title, and tags.
- clients/sunoClient.ts: Talks to the Suno API for generating music. It can poll for status, retrieve the final track URL, and handle errors gracefully.
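Since songMetadataGenerator.ts turns a model reply into a title, lyrics, and tags, it helps to validate that reply before using it. The sketch below shows one way to do so without any LangChain dependency. The SongMetadata interface and parseSongMetadata function are hypothetical names chosen for illustration, and the assumption that the model replies with a single JSON object is not confirmed by the repository.

```typescript
// Hypothetical shape of the generated metadata (field names follow the README).
interface SongMetadata {
  title: string;
  lyrics: string;
  tags: string[];
}

/**
 * Extracts the JSON object from a model reply and validates its fields.
 * Text surrounding the object (e.g., a preamble) is ignored.
 */
function parseSongMetadata(raw: string): SongMetadata {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model reply");
  }

  const data = JSON.parse(raw.slice(start, end + 1)) as Record<string, unknown>;
  const { title, lyrics, tags } = data;

  if (typeof title !== "string" || title.trim() === "") {
    throw new Error("Metadata is missing a title");
  }
  if (typeof lyrics !== "string" || lyrics.trim() === "") {
    throw new Error("Metadata is missing lyrics");
  }
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    throw new Error("Metadata tags must be an array of strings");
  }

  return { title: title.trim(), lyrics: lyrics.trim(), tags: tags as string[] };
}
```

Rejecting malformed replies at this point keeps bad metadata from reaching the audio generation step, where a failure would be slower and harder to diagnose.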
When the Song Generator Agent receives a new task (usually labeled init for the first step), it checks whether the user provided metadata (lyrics, title, tags). If not, the agent creates an intermediate step to auto-generate metadata via SongMetadataGenerator. Finally, it proceeds to the buildSong step, which:
- Calls Suno to start a music generation job.
- Periodically checks the status until it’s either SUCCESS or FAILED.
- Logs and returns the final audio URL, duration, and metadata to Nevermined.
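The status polling in the buildSong step can be sketched as a small loop. This is a minimal illustration rather than the actual SunoClient code: waitForJob, JobState, and the PENDING status are assumptions, as are the interval and attempt limits. Only the SUCCESS and FAILED states come from the description above. The status lookup is injected as a function so the loop can be exercised without calling the Suno API.

```typescript
// SUCCESS and FAILED come from the README; PENDING is an assumed in-progress state.
type JobStatus = "PENDING" | "SUCCESS" | "FAILED";

interface JobState {
  status: JobStatus;
  audioUrl?: string;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Polls fetchState until the job succeeds, fails, or the attempt limit is hit.
 * Resolves with the final state on SUCCESS and rejects otherwise.
 */
async function waitForJob(
  fetchState: () => Promise<JobState>,
  intervalMs = 5000,
  maxAttempts = 60,
): Promise<JobState> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await fetchState();
    if (state.status === "SUCCESS") return state;
    if (state.status === "FAILED") {
      throw new Error("Music generation job failed");
    }
    // Wait between checks, but not after the last one.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job still pending after ${maxAttempts} checks`);
}
```

Capping the number of attempts means a job that never leaves the pending state eventually surfaces as an error, so the step can be marked as failed instead of hanging.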
- init
  - Checks for existing metadata in step.input_artifacts.
  - If missing, creates autoGenerateMetadata then buildSong.
  - Otherwise, creates buildSong directly.
- autoGenerateMetadata
  - Invokes the SongMetadataGenerator to produce a new title, lyrics, and tags.
  - Stores them in output_artifacts.
- buildSong
  - Uses SunoClient to create a music generation job.
  - Waits for completion (by periodically checking status).
  - Retrieves the final audio file, calculates duration, and updates output_artifacts.
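The branching performed in the init step can be sketched as a pure function. The names SongInput, hasCompleteMetadata, and planSteps are hypothetical, and the sketch assumes that all three fields (title, lyrics, tags) must be present to skip metadata generation; the agent may apply a different rule.

```typescript
// Hypothetical shape of the data carried in step.input_artifacts.
interface SongInput {
  prompt: string;
  title?: string;
  lyrics?: string;
  tags?: string[];
}

type StepName = "autoGenerateMetadata" | "buildSong";

/** True only when title, lyrics, and at least one tag were all supplied. */
function hasCompleteMetadata(input: SongInput): boolean {
  return (
    Boolean(input.title?.trim()) &&
    Boolean(input.lyrics?.trim()) &&
    (input.tags?.length ?? 0) > 0
  );
}

/** Decides which follow-up steps the init step should create, in order. */
function planSteps(input: SongInput): StepName[] {
  return hasCompleteMetadata(input)
    ? ["buildSong"]
    : ["autoGenerateMetadata", "buildSong"];
}
```

Keeping this decision separate from the code that creates steps in Nevermined makes the rule easy to unit test on its own.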
- Configure .env with the relevant keys.
- Start the agent in development mode:
  yarn dev
  The agent will then log into Nevermined and wait for any step-updated events targeting its AGENT_DID.
- Send a Prompt
  - Typically, a higher-level Orchestrator (e.g., the Music Video Orchestrator) dispatches tasks that mention this Song Generator’s DID.
  - Once triggered, the agent spawns steps for metadata creation (if needed) and final audio generation.
- Nevermined Subscription
  - Payments.getInstance({...}) authenticates with the Nevermined server.
  - payments.query.subscribe(processSteps(payments), {...}) sets up an event listener.
- Processing Steps
  - A function processSteps(...) receives each step-updated event.
  - It fetches the latest step info with payments.query.getStep(...).
  - Based on step.name, it calls the corresponding handler function.
- Handlers
  - handleInitStep(): Checks for existing metadata. If missing, creates two sub-steps: autoGenerateMetadata, then buildSong. If present, only creates buildSong.
  - handleAutoGenerateMetadataStep(): Uses LangChain + OpenAI to produce a JSON object with title, lyrics, tags.
  - handleBuildSongStep(): Calls Suno using SunoClient. Waits until the job is complete, then stores the final track details.
- Logging & Error Handling
  - Each function logs relevant info or errors with the custom Logger.
  - If any step fails (e.g., Suno returns an error), the handler updates the step to Failed.
- Output Artifacts
  - Agents store data in output_artifacts (e.g., an array of objects describing the final song).
  - This is how other steps or orchestrators retrieve the MP3 URL, duration, or lyrics.
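The routing and failure handling described above can be sketched independently of the Nevermined SDK. Everything here is illustrative: createRouter, the Step and StepResult shapes, the step_id and step_status field names, and the Completed status are assumptions. Only step.name, output_artifacts, and the Failed status appear in the description, and the real agent reports results through the Payments library rather than returning them.

```typescript
// Hypothetical step shape; only `name` and `input_artifacts` come from the README.
interface Step {
  step_id: string;
  name: string;
  input_artifacts?: unknown;
}

// Hypothetical result that a caller would send back to Nevermined.
interface StepResult {
  step_status: "Completed" | "Failed";
  output_artifacts: unknown[];
  message?: string;
}

type StepHandler = (step: Step) => Promise<unknown[]>;

/**
 * Builds a dispatcher that picks a handler by step.name and converts
 * thrown errors (or unknown step names) into a Failed result.
 */
function createRouter(handlers: Record<string, StepHandler>) {
  return async (step: Step): Promise<StepResult> => {
    const handler = handlers[step.name];
    if (!handler) {
      return {
        step_status: "Failed",
        output_artifacts: [],
        message: `No handler for step "${step.name}"`,
      };
    }
    try {
      const artifacts = await handler(step);
      return { step_status: "Completed", output_artifacts: artifacts };
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return { step_status: "Failed", output_artifacts: [], message };
    }
  };
}
```

A router built this way would be given one entry per step name (init, autoGenerateMetadata, buildSong), and the function passed to the subscription would forward each fetched step to it. Handling errors in one place means individual handlers can simply throw.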
- Start the service in dev mode:
  yarn dev
- By default, it subscribes to the AGENT_DID in your .env.
- Build the project:
  yarn build
Apache License 2.0
(C) 2025 Nevermined AG
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions
and limitations under the License.