This repository provides examples of using AWS Step Functions and Amazon Bedrock to build complex, serverless, and highly scalable generative AI applications with prompt chaining.
Prompt chaining is a technique for building complex generative AI applications and accomplishing complex tasks with large language models (LLMs). With prompt chaining, you break the overall task you want the LLM to complete into a set of smaller subtasks, each expressed as an individual prompt. To accomplish the overall task, your application feeds each subtask prompt to the LLM in a pre-defined order or according to a set of defined rules.
For applications using prompt chaining, Step Functions can orchestrate complex chains of prompts and invoke foundation models in Bedrock. Beyond simple ordered chains of prompts, Step Functions workflows can contain loops, map jobs, parallel jobs, conditions, and input/output manipulation. Workflows can also chain together steps that invoke a foundation model in Bedrock, steps that invoke custom code in AWS Lambda functions, and steps that interact with over 220 AWS services. Both Bedrock and Step Functions are serverless, so you don't have to manage any infrastructure to deploy and scale up your application.
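At its core, prompt chaining only requires feeding one prompt's response into the next prompt. The minimal boto3 sketch below (not taken from this repository) chains two prompts directly against the Bedrock Runtime `converse` API; the model ID and prompts are illustrative, and the Step Functions examples later in this README implement the same idea as managed workflow steps.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# Example model ID; any Bedrock text model available in your account and region works.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"


def ask(prompt: str) -> str:
    """Send a single prompt to a Bedrock model and return the text of its reply."""
    response = bedrock_runtime.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 500},
    )
    return response["output"]["message"]["content"][0]["text"]


# Subtask 1: generate an outline.
outline = ask("Write a three-point outline for a blog post about prompt chaining.")

# Subtask 2: the next prompt includes the previous response as context.
post = ask(f"Expand this outline into a short blog post:\n{outline}")
print(post)
```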
This repository contains several working examples of using the prompt chaining techniques described above, as part of a demo generative AI application. The Streamlit-based demo application executes each example's Step Functions state machine and displays the results, including the content generated by foundation models in Bedrock. The Step Functions state machines are defined using AWS CDK in Python.
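Under the hood, running one of the examples from an application like this amounts to starting a state machine execution and reading its output once it completes. A minimal boto3 sketch of that interaction is shown below; the state machine ARN and input are placeholders, not values from this repository.

```python
import json
import time

import boto3

sfn_client = boto3.client("stepfunctions")

# Placeholder ARN; the demo app uses the ARNs of the deployed example state machines.
STATE_MACHINE_ARN = "arn:aws:states:us-west-2:123456789012:stateMachine:PromptChainExample"

execution = sfn_client.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps({"novel": "Pride and Prejudice"}),
)

# Poll until the execution finishes, then display the generated content.
while True:
    result = sfn_client.describe_execution(executionArn=execution["executionArn"])
    if result["status"] != "RUNNING":
        break
    time.sleep(2)

print(result.get("output"))
```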
This example generates an analysis of a given novel for a literature blog.
This task is broken down into multiple subtasks: first, individual paragraphs are generated for specific areas of analysis, and then a final subtask pulls the generated content together into a single blog post. The workflow is a simple, sequential chain of prompts. The previous prompts and LLM responses are carried forward and included as context in the prompt for the next step in the chain.
CDK code: stacks/blog_post_stack.py
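For a sense of what a sequential chain like this looks like in CDK, here is a simplified sketch of a two-step chain. It is not the stack above: the model choice, construct names, prompts, and JSON paths are placeholders, and the response field (`Body.completion`) assumes an Anthropic Claude text-completion response.

```python
from aws_cdk import Stack
from aws_cdk import aws_bedrock as bedrock
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks
from constructs import Construct


class SequentialChainStack(Stack):
    """Simplified two-step sequential prompt chain (illustrative, not the blog post stack)."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        model = bedrock.FoundationModel.from_foundation_model_id(
            self, "Model", bedrock.FoundationModelIdentifier.ANTHROPIC_CLAUDE_INSTANT_V1
        )

        # Step 1: analyze one aspect of the novel; keep the response on the state at $.themes.
        analyze_themes = tasks.BedrockInvokeModel(
            self, "AnalyzeThemes",
            model=model,
            body=sfn.TaskInput.from_object({
                "prompt": sfn.JsonPath.format(
                    "\n\nHuman: Write a paragraph analyzing the themes of {}.\n\nAssistant:",
                    sfn.JsonPath.string_at("$.novel"),
                ),
                "max_tokens_to_sample": 500,
            }),
            result_path="$.themes",
        )

        # Step 2: the next prompt carries the previous response forward as context.
        write_post = tasks.BedrockInvokeModel(
            self, "WriteBlogPost",
            model=model,
            body=sfn.TaskInput.from_object({
                "prompt": sfn.JsonPath.format(
                    "\n\nHuman: Turn this analysis into a blog post:\n{}\n\nAssistant:",
                    sfn.JsonPath.string_at("$.themes.Body.completion"),
                ),
                "max_tokens_to_sample": 1000,
            }),
        )

        sfn.StateMachine(
            self, "BlogPostChain",
            definition_body=sfn.DefinitionBody.from_chainable(analyze_themes.next(write_post)),
        )
```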
This example generates a short story about a given topic.
This task is broken down into multiple subtasks: first generate a list of characters for the story, then generate each character's arc, and finally generate the short story using the character descriptions and arcs. This example illustrates using a loop in a Step Functions state machine to process a list generated by a foundation model in Bedrock, in this case to process the generated list of characters.
CDK code: stacks/story_writer_stack.py
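A rough sketch of that looping pattern is below. It assumes a `model` and stack scope like the earlier sketch, a recent version of `aws-cdk-lib`, and that an earlier prompt left a JSON array of character names at `$.characters`; the construct names and JSON paths are illustrative, not the story writer stack's actual definitions.

```python
from aws_cdk import aws_bedrock as bedrock
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks
from constructs import Construct


def character_loop(scope: Construct, model: bedrock.IModel) -> sfn.Map:
    """Run the character-arc prompt once for each character in a model-generated list."""
    generate_arc = tasks.BedrockInvokeModel(
        scope, "GenerateCharacterArc",
        model=model,
        body=sfn.TaskInput.from_object({
            "prompt": sfn.JsonPath.format(
                "\n\nHuman: Write a story arc for this character: {}\n\nAssistant:",
                sfn.JsonPath.string_at("$.character"),
            ),
            "max_tokens_to_sample": 300,
        }),
    )

    # The Map state iterates over the list produced by an earlier prompt.
    return sfn.Map(
        scope, "ForEachCharacter",
        items_path="$.characters",
        item_selector={"character": sfn.JsonPath.string_at("$$.Map.Item.Value")},
        result_path="$.character_arcs",
    ).item_processor(generate_arc)
```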
This example generates an itinerary for a weekend vacation to a given destination.
This task is broken down into multiple subtasks, which first generate suggestions for hotels, activities, and restaurants, and then combine that content into a single daily itinerary. This example illustrates the ability to parallelize multiple distinct prompts in a Step Functions state machine, in this case to generate the hotel, activity, and restaurant recommendations in parallel. This example also illustrates the ability to chain together prompts and custom code. The final step in the state machine is a Lambda function that creates a PDF of the itinerary and uploads it to S3, with no generative AI interactions.
CDK code: stacks/trip_planner_stack.py
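A simplified sketch of this shape follows: three independent suggestion prompts run in a `Parallel` state, a combining prompt follows, and a plain Lambda step closes the chain. The prompts are static placeholders, and the data plumbing between steps (passing the destination and prior results via JSON paths) is deliberately omitted; it is not the trip planner stack's actual definition.

```python
from aws_cdk import aws_bedrock as bedrock
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks
from constructs import Construct


def trip_planner_chain(scope: Construct, model: bedrock.IModel,
                       pdf_function: lambda_.IFunction) -> sfn.IChainable:
    """Parallel suggestion prompts, a combining prompt, then a plain-code Lambda step."""

    def suggest(name: str, prompt: str) -> sfn.TaskStateBase:
        return tasks.BedrockInvokeModel(
            scope, name,
            model=model,
            body=sfn.TaskInput.from_object({
                "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
                "max_tokens_to_sample": 500,
            }),
        )

    # The three suggestion prompts are independent, so they run in parallel.
    suggestions = (
        sfn.Parallel(scope, "SuggestInParallel", result_path="$.suggestions")
        .branch(suggest("SuggestHotels", "Suggest hotels for a weekend trip."))
        .branch(suggest("SuggestActivities", "Suggest activities for a weekend trip."))
        .branch(suggest("SuggestRestaurants", "Suggest restaurants for a weekend trip."))
    )

    combine = suggest(
        "CombineIntoItinerary",
        "Combine the hotel, activity, and restaurant suggestions into a daily itinerary.",
    )

    # Final step is custom code, not a prompt: render the itinerary as a PDF and upload it to S3.
    create_pdf = tasks.LambdaInvoke(scope, "CreateItineraryPdf", lambda_function=pdf_function)

    return suggestions.next(combine).next(create_pdf)
```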
This example acts as an AI screenwriter pitching movie ideas to the human user acting as a movie producer. The movie producer can greenlight the AI's movie idea to get a longer one-page pitch for the idea, or they can reject the movie idea and the AI will generate a new idea to pitch.
This task is broken down into multiple subtasks: first, the prompt for generating a movie idea is invoked three times in parallel, with a different temperature setting each time, to generate three possible ideas. The next prompt in the chain chooses the best idea to pitch to the movie producer. The chain pauses while waiting for human input from the movie producer, either "greenlight" or "pass". If the movie producer greenlights the idea, the final prompt in the chain will generate a longer pitch using the chosen movie idea as context. If not, the chain loops back to the beginning and generates three new ideas.
This example illustrates the ability to parallelize the same prompt with different inference parameters, to have multiple possible paths in the chain based on conditions, and to backtrack to a previous step in the chain. This example also illustrates the ability to require human user input as part of the workflow, using a Step Functions task token to wait for a callback containing the human user's decision.
CDK code: stacks/movie_pitch_stack.py
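The sketch below shows how those pieces can fit together in CDK: the same prompt at three temperatures inside a `Parallel` state, a task-token pause for the producer's decision, and a `Choice` state that either continues to the pitch or loops back. It is illustrative only; the construct names, prompts, JSON paths, Lambda function, and the assumed callback payload shape (`{"greenlight": "yes" | "no"}`) are placeholders, not the movie pitch stack's actual definitions.

```python
from aws_cdk import aws_bedrock as bedrock
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_stepfunctions as sfn
from aws_cdk import aws_stepfunctions_tasks as tasks
from constructs import Construct


def movie_pitch_chain(scope: Construct, model: bedrock.IModel,
                      notify_producer: lambda_.IFunction) -> sfn.IChainable:
    """Same prompt at three temperatures, a human-in-the-loop pause, and a loop-back."""

    def idea_task(name: str, temperature: float) -> sfn.TaskStateBase:
        # Identical prompt in every branch; only the inference temperature differs.
        return tasks.BedrockInvokeModel(
            scope, name,
            model=model,
            body=sfn.TaskInput.from_object({
                "prompt": "\n\nHuman: Pitch a movie idea in one paragraph.\n\nAssistant:",
                "max_tokens_to_sample": 300,
                "temperature": temperature,
            }),
        )

    generate_ideas = (
        sfn.Parallel(scope, "GenerateThreeIdeas", result_path="$.ideas")
        .branch(idea_task("IdeaLowTemperature", 0.3))
        .branch(idea_task("IdeaMediumTemperature", 0.7))
        .branch(idea_task("IdeaHighTemperature", 1.0))
    )

    choose_idea = tasks.BedrockInvokeModel(
        scope, "ChooseBestIdea",
        model=model,
        body=sfn.TaskInput.from_object({
            "prompt": sfn.JsonPath.format(
                "\n\nHuman: Pick the best of these movie ideas: {}\n\nAssistant:",
                sfn.JsonPath.json_to_string(sfn.JsonPath.object_at("$.ideas")),
            ),
            "max_tokens_to_sample": 300,
        }),
        result_path="$.chosen_idea",
    )

    # Pause the workflow until a callback (SendTaskSuccess) delivers the producer's decision.
    wait_for_producer = tasks.LambdaInvoke(
        scope, "WaitForProducerDecision",
        lambda_function=notify_producer,
        integration_pattern=sfn.IntegrationPattern.WAIT_FOR_TASK_TOKEN,
        payload=sfn.TaskInput.from_object({
            "token": sfn.JsonPath.task_token,
            "idea": sfn.JsonPath.string_at("$.chosen_idea.Body.completion"),
        }),
        result_path="$.decision",
    )

    write_pitch = tasks.BedrockInvokeModel(
        scope, "WriteOnePagePitch",
        model=model,
        body=sfn.TaskInput.from_object({
            "prompt": sfn.JsonPath.format(
                "\n\nHuman: Write a one-page pitch for this movie idea: {}\n\nAssistant:",
                sfn.JsonPath.string_at("$.chosen_idea.Body.completion"),
            ),
            "max_tokens_to_sample": 1000,
        }),
    )

    decision = (
        sfn.Choice(scope, "GreenlightDecision")
        .when(sfn.Condition.string_equals("$.decision.greenlight", "yes"), write_pitch)
        .otherwise(generate_ideas)  # "pass": loop back and generate three new ideas
    )

    return generate_ideas.next(choose_idea).next(wait_for_producer).next(decision)
```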
This example generates a recipe for the user, based on a few given ingredients they have on hand.
This task is broken down into multiple subtasks: first, two possible meal ideas are generated in parallel, with the two prompts acting as two different "AI chefs". The next prompt in the chain scores each meal suggestion on a scale of 0 to 100 for "tastiness". The AI chefs receive the scores for both meal suggestions and attempt to improve their own score relative to the other chef by making another meal suggestion. Another prompt determines whether the chefs have reached a consensus and suggested the same meal. If not, the chain loops back to generate new scores and new meal suggestions from the chefs. A small Lambda function with no LLM interaction chooses the highest-scoring meal, and the final prompt in the chain generates a recipe for the winning meal.
This example illustrates how prompt chains can incorporate two distinct AI conversations, and have two AI personas engage in a "society of minds" debate with each other to improve the final outcome. This example also illustrates the ability to chain together prompts and custom code, in this case choosing the highest meal score.
CDK code: stacks/meal_planner_stack.py
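The highest-score selection step is ordinary code rather than a prompt. A hypothetical Lambda handler for such a step might look like the following; the event shape is an assumption for this sketch, not the actual contract used by the meal planner stack.

```python
# Hypothetical Lambda handler for a "choose the highest-scoring meal" step.
# Assumes the state machine passes an event shaped like:
#   {"meals": [{"name": "...", "score": 87}, {"name": "...", "score": 92}]}
def handler(event, context):
    meals = event["meals"]
    winner = max(meals, key=lambda meal: meal["score"])
    # The returned value becomes this state's output, feeding the final recipe prompt.
    return {"winning_meal": winner["name"], "score": winner["score"]}
```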
This example summarizes the highest-trending open source repository on GitHub today.
This task is broken down into multiple subtasks to first determine which open source repository is the highest trending on GitHub today (based on the GitHub Trending page), and then generate a summary of the repository based on its README file. This example illustrates chaining two ReAct prompts (also known as ReAct agents) together in a two-step sequential pipeline to answer the question 'What is the top trending repository on GitHub today?'. The two ReAct agents each interact with GitHub APIs to get current information; otherwise, they would not be able to identify the current trending repository or its current purpose. The first agent scrapes the GitHub Trending page, and the second agent calls the GitHub API to retrieve the repository's README.
Note that this particular example does not require chaining - it could be implemented as a single agent. LLMs are capable of reasoning about and answering this relatively simple question in a single ReAct agent, by calling multiple GitHub APIs within the same agent. This example is simply meant to illustrate how a complex task requiring ReAct prompting can be broken down into multiple chained subtasks.
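The agents themselves are implemented with Bedrock Agents or Langchain (see the two versions below), but the tools they call are ordinary GitHub lookups. The sketch below is only an approximation of those tools: instead of scraping the Trending page as this example's first agent does, it uses the public GitHub search API as a stand-in for "trending", and it fetches a README through the documented repos API.

```python
import base64
from datetime import date, timedelta

import requests


def top_repo_this_week() -> str:
    """Approximate 'trending' with the most-starred repository created in the last week.
    (The example's first agent actually scrapes https://github.com/trending instead.)"""
    since = (date.today() - timedelta(days=7)).isoformat()
    response = requests.get(
        "https://api.github.com/search/repositories",
        params={"q": f"created:>{since}", "sort": "stars", "order": "desc", "per_page": 1},
        timeout=10,
    )
    return response.json()["items"][0]["full_name"]


def repo_readme(full_name: str) -> str:
    """Fetch a repository's README through the GitHub API (content is base64-encoded)."""
    response = requests.get(f"https://api.github.com/repos/{full_name}/readme", timeout=10)
    return base64.b64decode(response.json()["content"]).decode("utf-8")


# A chained pipeline would then summarize repo_readme(top_repo_this_week()) with an LLM.
```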
This example is implemented in two different versions: one uses Bedrock Agents, and the other uses Langchain agents.
CDK code for Bedrock Agents version: stacks/most_popular_repo_bedrock_agent_stack.py
Note that Bedrock Agents does not yet support CloudFormation. The CDK stack creates all other resources required by the agent, but the agent itself must be created manually. Instructions for manually creating the agent in Bedrock Agents are here.
CDK code for Langchain version: stacks/most_popular_repo_langchain_stack.py
See the development guide for instructions on how to deploy the demo application.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.