xrx state machine initial draft #24

Draft
wants to merge 4 commits into
base: main


mprast
Collaborator

@mprast mprast commented Oct 10, 2024

XRX State Machine

The purpose of this PR is to add a backing state machine to the xrx reasoning agent. A sample state machine has been fully integrated into shopify-app. With this change the agent (mostly) appears to be able to:

  • Understand the state it's in and infer its current objective
  • Understand what states it can transition into
  • Transition to the next state when appropriate
  • Guide the user back to the current objective if they try to go out of bounds
  • Switch flows if the user indicates they want to do something different

Testing

No special setup is needed for the state machine; just pull the branches down for xrx-sample-apps and xrx-core and play around with shopify-app as usual. The agent will log what state it's in and will use a 'transition-state' node to transition when appropriate.

A flow is a graph of steps. Each flow has an 'initial' step, which is the step the agent starts in when it starts the flow. There are three sample flows in shopify-app/reasoning/app/agent/flows.yaml - one for buying a product from the store, one for submitting an app to be listed in the store, and one initial flow for figuring out what the user wants to do. The agent will move between these flows as necessary. It will abandon the flow it's on and start a new one if you ask it to.
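A minimal sketch of the flow/step model described above, for readers who want a mental picture before opening flows.yaml. It is not the code in this PR: the class names, the `objective` and `transitions` fields, and the step names in the example are all assumptions made for illustration, and the real flows are loaded from YAML rather than built inline.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One node in a flow: an objective plus the steps reachable from it."""
    name: str
    objective: str
    transitions: list[str] = field(default_factory=list)


@dataclass
class Flow:
    """A graph of steps with a designated initial step."""
    name: str
    initial: str
    steps: dict[str, Step]


class StateMachine:
    """Tracks the current flow and step (hypothetical, not the PR's class)."""

    def __init__(self, flows: dict[str, Flow], start_flow: str) -> None:
        self.flows = flows
        self.flow = flows[start_flow]
        self.step = self.flow.steps[self.flow.initial]

    def transition(self, target: str) -> None:
        # Only moves along an edge of the current step are allowed.
        if target not in self.step.transitions:
            raise ValueError(f"cannot go from {self.step.name!r} to {target!r}")
        self.step = self.flow.steps[target]

    def switch_flow(self, flow_name: str) -> None:
        # Abandon the current flow and start the new one at its initial step.
        self.flow = self.flows[flow_name]
        self.step = self.flow.steps[self.flow.initial]


# Hypothetical flows; the step names and objectives are illustrative only.
buy = Flow(
    name="buy",
    initial="productDetails",
    steps={
        "productDetails": Step(
            "productDetails", "Find out what the user wants to buy", ["confirmation"]
        ),
        "confirmation": Step("confirmation", "Confirm the order with the user"),
    },
)
query = Flow(
    name="query",
    initial="ask",
    steps={"ask": Step("ask", "Find out what the user wants to do")},
)

machine = StateMachine({"buy": buy, "query": query}, start_flow="query")
machine.switch_flow("buy")
machine.transition("confirmation")
print(machine.flow.name, machine.step.name)
```

Keeping `transition` strict and `switch_flow` unconditional mirrors the behaviour described here: moves inside a flow follow the graph, while switching flows always restarts at the new flow's initial step.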

Feel free to tinker with flows.yaml to add your own flows. The format should be self-explanatory, but if you have questions just shoot me a slack!

To demonstrate the capabilities of the state machine, I recorded four sample conversations I had with shopify-app using interactive-test.py. These are in shopify-app/reasoning/app/agent/sample_conversations. Feel free to replicate these yourself. If you can't, or if anything looks weird, please let me know!

TODOs & Cleanup Work

  • Slim down the prompts: I kept adding stuff until I got to a point I liked; a lot of it can probably be removed
  • Figure out how to pass the state machine to the client via session details: I rigged something that worked but I need to find out if it's the right way to do it
  • Build general functionality to trim state machine info in interactive-test.py: the response logging was outputting the entire structure of the state machine - including all flows, states, and transitions - on every response, which made responses very hard to read. I rigged something to redact the state machine from the session variable on output; this can probably be made configurable if we want a general way to say "this variable is huge so don't output it please" (chris - what do you think?)
  • Remove various debug cruft (mostly pdb imports)
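
One possible shape for the configurable redaction mentioned in the third TODO, as a sketch only. The function name, the `hidden_keys` parameter, the placeholder string, and the `state_machine` key are assumptions; the actual session structure in interactive-test.py may differ.

```python
import json

REDACTED = "<redacted>"


def redact_session(session: dict, hidden_keys: set[str]) -> dict:
    """Return a shallow copy of `session` with bulky values replaced.

    The original dict is left untouched so the agent keeps working with
    the full state machine; only the logged copy is trimmed.
    """
    return {
        key: (REDACTED if key in hidden_keys else value)
        for key, value in session.items()
    }


# Hypothetical session payload, for illustration only.
session = {
    "guid": "abc123",
    "state_machine": {"flows": {"buy": {"steps": ["productDetails"]}}},
}
print(json.dumps(redact_session(session, {"state_machine"}), indent=2))
```

The set of keys to hide could come from a command-line flag or an environment variable so that other large session variables can be suppressed the same way.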

Next Steps

  • Remember history when transitioning out of a flow and allow the user to return to where they were if they want
  • Auto-gen initial flow so it doesn't have to be specified in flows.yaml
  • More testing with more complicated state graphs
  • Individual objectives for each transition
  • Constrain tool calls based on current state and/or flow
  • Anything else I'm forgetting!
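
For the "constrain tool calls" item above, one simple approach would be an allow-list keyed by flow and step, sketched below. Everything here is hypothetical: the tool names, the flow and step names, and the choice to leave unlisted states unconstrained are assumptions, not behaviour in this PR.

```python
# Hypothetical allow-list: (flow, step) -> tools the agent may call there.
ALLOWED_TOOLS: dict[tuple[str, str], set[str]] = {
    ("buy", "productDetails"): {"get_products", "get_product_details"},
    ("buy", "checkout"): {"submit_order"},
}


def tools_for_state(flow: str, step: str, all_tools: list[str]) -> list[str]:
    """Filter `all_tools` down to those permitted in the given state.

    States with no entry in ALLOWED_TOOLS are treated as unconstrained.
    """
    allowed = ALLOWED_TOOLS.get((flow, step))
    if allowed is None:
        return list(all_tools)
    return [tool for tool in all_tools if tool in allowed]


all_tools = ["get_products", "get_product_details", "submit_order"]
print(tools_for_state("buy", "checkout", all_tools))
```

The filtered list would then be what gets shown to the tool-selection prompt, so the model never sees tools that are out of scope for the current state.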

@mprast mprast requested a review from chrislott October 10, 2024 23:29
Collaborator Author

mprast commented Oct 15, 2024

Extra notes: the agent appears to get "stuck" after applying guardrails a single time. That is to say, if you try to do something unrelated to the current state and the agent stops you, you can no longer switch between flows - the agent will stop you every time. I think it's getting too hung up on the conversation history.

I think the cleanest way around this may be to have a separate graph node that intercepts the output of the RespondToUser node and replaces it if the latest question and answer are unrelated to the objective of the current state machine node. We'd explicitly not include the rest of the conversation here to make sure the agent doesn't get thrown off.
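
A rough sketch of what that interception step could look like, to make the proposal concrete. The function name, its parameters, and the fallback wording are assumptions; `is_related` stands in for the LLM call that would judge relevance, and nothing here reflects the actual graph API.

```python
from typing import Callable


def apply_guardrail(
    objective: str,
    user_message: str,
    draft_reply: str,
    is_related: Callable[[str, str, str], bool],
) -> str:
    """Pass the drafted reply through, or replace it if it is off-objective.

    Only the latest user message and the drafted reply are inspected; the
    rest of the conversation history is deliberately not passed in, so an
    earlier guardrail hit cannot bias the decision.
    """
    if is_related(objective, user_message, draft_reply):
        return draft_reply
    return f"Let's get back on track. Right now we need to: {objective}"


# Stand-in relevance check for demonstration; a real one would call the LLM.
def naive_check(objective: str, question: str, answer: str) -> bool:
    return "order" in question.lower() or "order" in answer.lower()


print(apply_guardrail("Confirm the order", "What's the weather?", "Sunny.", naive_check))
```

Because the check is a plain function of three strings, it can be unit tested with canned inputs before it is wired into the graph.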

@chrislott what do you think?

Collaborator Author

mprast commented Oct 15, 2024

Also, I think it's probably worth having a separate node type for the initial 'query' flow (the one that describes the options to the user and asks the user what they want to do). As it stands, modeling this as its own flow seems to confuse the agent because the flow is so vague; the agent often uses it to circumvent the state guardrails.

@alessandro-neri
Contributor

[Feature] Enhance Shopify App with Comprehensive Reasoning Agent Capabilities

1. Overview

  • What is the feature?
    This feature introduces significant enhancements to the Shopify app within the xRx framework by adding comprehensive reasoning agent capabilities. It includes the integration of new reasoning flows, state machine guardrails, tool execution modules, and sample conversations to facilitate advanced AI-powered user interactions. Additionally, updates to the Docker configuration ensure seamless deployment and scalability of the reasoning services.

  • What changed?

    • Reasoning Agent Integration: Added new modules and flows to handle complex user interactions and state management.
    • Docker Configuration: Updated docker-compose.yaml to include new services and dependencies required for the enhanced reasoning capabilities.

2. Files Modified

| File Name | Context | Changes |
| --- | --- | --- |
| docker-compose.yaml | Configuration file for Docker services. | [EDIT] Added two new services for the enhanced reasoning agents, updated environment variables, and modified network settings to support the new components. |
| agent/executor.py | Executes agent operations within the reasoning framework. | [EDIT] Refactored to incorporate new state machine integrations, added logging for enhanced observability, and optimized asynchronous operations for better performance. |
| agent/flows.yaml | Defines the conversation flows for the reasoning agent. | [NEW] Introduced new flows for handling product purchases and submissions, including states like productDetails, addOnSuggestions, confirmation, and checkout for buying, as well as productDetails and contactDetails for submitting items. |
| agent/graph/main.py | Manages the graph traversal logic for conversation flows. | [EDIT] Incorporated state_machine parameter to manage conversational states, updated node definitions to include new reasoning nodes, and enhanced error handling during graph traversal. |
| agent/graph/nodes/choose_tool.py | Node responsible for selecting appropriate tools based on conversation context. | [EDIT] Enhanced tool selection logic to consider state machine information, updated prompts for better tool invocation, and refined JSON response formatting for tool selection. |
| agent/graph/nodes/convert_natural_language.py | Converts user input into a structured format for processing. | [EDIT] Improved natural language processing to handle more complex queries, integrated state machine prompts, and optimized message formatting for consistency. |
| agent/graph/nodes/customer_response.py | Generates responses to customer inquiries. | [EDIT] Enhanced response generation with state machine awareness, added detailed reasoning for response content, and implemented stricter adherence to flow objectives. |
| agent/graph/nodes/execute_tool.py | Executes selected tools based on agent decisions. | [EDIT] Refined tool execution logic to handle state and flow transitions, improved error handling for tool responses, and optimized asynchronous execution for better performance. |
| agent/graph/nodes/identify_tool_params.py | Identifies parameters required for tool execution. | [EDIT] Enhanced parameter identification with state machine context, improved JSON formatting for parameter responses, and refined tool selection criteria based on current state. |
| agent/graph/nodes/routing.py | Routes conversation flow based on user input and agent decisions. | [EDIT] Updated routing logic to incorporate new flows and states, enhanced tool listing with parameters, and improved JSON response structure for routing decisions. |
| agent/graph/nodes/state_machine_guardrails_check.py | Ensures conversation aligns with state machine rules. | [NEW] Introduced a new node to validate state machine adherence, implemented JSON-based guardrail checks, and integrated LLM-based validation for response alignment. |
| agent/graph/nodes/transition_state.py | Manages state transitions within the conversation. | [NEW] Added a new node to handle state and flow transitions, integrated LLM prompts for determining appropriate transitions, and implemented JSON response handling for transition decisions. |
| sample_conversations/basic1.txt | Sample conversation script. | [NEW] Added initial conversation scenarios to demonstrate basic agent interactions. |
| sample_conversations/basic2.txt | Additional sample conversation script. | [NEW] Included more complex conversation flows to showcase advanced agent capabilities. |
| sample_conversations/change_flow.txt | Sample conversation demonstrating flow changes. | [NEW] Provided scenarios where the agent transitions between different conversation flows based on user input. |
| sample_conversations/state_guardrails.txt | Sample conversation highlighting state guardrails. | [NEW] Showcased how the agent maintains conversation alignment with state machine rules. |
| requirements.txt | Lists project dependencies. | [EDIT] Added new dependencies required for enhanced reasoning capabilities and state machine management. |
| test/interactive_test.py | Interactive testing script for the reasoning agent. | [EDIT] Updated to accommodate new conversation flows, enhanced logging for debugging, and improved session handling for accurate state management. |
| xrx-core | Core library submodule for xRx. | [SUBMODULE UPDATE] Updated xrx-core submodule with 3 new files to support the enhanced reasoning functionalities, including state management utilities and TTS integrations. |

3. Issues/Improvements

  • Security: Potential exposure of sensitive state information.
    • Specific security concern: The state machine maintains sensitive user session data which, if exposed, could lead to data breaches.
    • Specific mitigation needed: Implement encryption for state data at rest and in transit. Ensure access controls are in place to restrict unauthorized access to session information.

  • Performance: Increased load due to additional reasoning processes.
    • Specific performance impact: The introduction of new reasoning nodes and state machine checks may lead to higher CPU and memory usage, potentially affecting response times.
    • Specific optimization needed: Optimize asynchronous operations and implement caching strategies where feasible. Conduct performance testing to identify and address bottlenecks.

  • Maintainability: Complexity introduced by extensive state management.
    • Specific maintainability concern: The expanded state machine and additional reasoning nodes increase the codebase complexity, making it harder to maintain and extend.
    • Specific improvement needed: Refactor code to modularize state management logic, add comprehensive documentation, and implement unit tests to ensure code reliability.

  • Simplification: Redundant state checks in multiple nodes.
    • Specific simplification opportunity: Multiple nodes perform similar state validations, leading to redundant code.
    • Specific refactoring needed: Consolidate state validation logic into a shared utility or base class to reduce duplication and streamline maintenance.
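
As an illustration of the consolidation suggested in the last item, a shared helper might look like the sketch below. The function name and the shape of `flows` (flow name mapped to a dict of step name to outgoing transitions) are assumptions and are not taken from the PR's code.

```python
def validate_state(
    flows: dict[str, dict[str, list[str]]], flow_name: str, step_name: str
) -> None:
    """Raise ValueError if the flow or step is unknown.

    Intended as a single place for the check that individual nodes would
    otherwise each repeat before using the current state.
    """
    if flow_name not in flows:
        raise ValueError(f"unknown flow: {flow_name!r}")
    if step_name not in flows[flow_name]:
        raise ValueError(f"unknown step {step_name!r} in flow {flow_name!r}")


# Hypothetical flow table for demonstration.
flows = {"buy": {"productDetails": ["confirmation"], "confirmation": []}}
validate_state(flows, "buy", "productDetails")  # passes silently
```

Each node would call this once at the top of its handler instead of carrying its own copy of the check.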


2 participants