Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bye langchain 😢 #35

Merged
merged 48 commits into from
Oct 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
65a8f78
updating dependencies for github workflows
ThibaultLSDC Sep 11, 2024
a5a2000
Merge branch 'main' of github.com:ServiceNow/AgentLab
ThibaultLSDC Sep 13, 2024
510b835
Merge branch 'main' of github.com:ServiceNow/AgentLab into tracking
ThibaultLSDC Sep 17, 2024
ab0f4ec
openrouter tracker poc
ThibaultLSDC Sep 18, 2024
ca50598
adding openai pricing request
ThibaultLSDC Sep 18, 2024
0cde22b
switching back to langchain community for openai pricing
ThibaultLSDC Sep 19, 2024
060e1e7
renaming launch_command.py to main.py
ThibaultLSDC Sep 19, 2024
7eafcf8
Merge branch 'main' into tracking
ThibaultLSDC Sep 19, 2024
600cfca
typo
ThibaultLSDC Sep 19, 2024
708bde5
tracking is thread safe and mostly tested
ThibaultLSDC Sep 20, 2024
7c20945
added pricy tests for ChatModels
ThibaultLSDC Sep 20, 2024
18a45e0
separating get_pricing function
ThibaultLSDC Sep 20, 2024
9d12cdf
updating function names
ThibaultLSDC Sep 20, 2024
8e5e5f9
updating function names
ThibaultLSDC Sep 20, 2024
17e8ff8
ciao retry_parallel
ThibaultLSDC Sep 20, 2024
57b3913
london 1666 (removing all (most) traces of langchain)
ThibaultLSDC Sep 20, 2024
4dd4fc9
Merge branch 'main' of github.com:ServiceNow/AgentLab into bye_langchain
ThibaultLSDC Sep 24, 2024
592c676
renaming langchain_utils to huggingface_utils
ThibaultLSDC Sep 24, 2024
bdab2fb
deps update
ThibaultLSDC Sep 24, 2024
9708961
removing last langchain traces
ThibaultLSDC Sep 24, 2024
631057d
import typo
ThibaultLSDC Sep 24, 2024
560f3e5
adding retry functionality to ChatModel
ThibaultLSDC Sep 24, 2024
ea3eacf
fixing tests
ThibaultLSDC Sep 24, 2024
88797d3
typos
ThibaultLSDC Sep 25, 2024
fe0db06
formatting
ThibaultLSDC Sep 26, 2024
3407341
fixing imports
ThibaultLSDC Sep 26, 2024
84f3e66
Merge branch 'main' of github.com:ServiceNow/AgentLab into bye_langchain
ThibaultLSDC Sep 26, 2024
6a77407
retrocompat xray
ThibaultLSDC Sep 26, 2024
df6842c
no rounding in stats
ThibaultLSDC Sep 26, 2024
04bb915
retrocompat xray
ThibaultLSDC Sep 26, 2024
42ab650
made message helper functions
ThibaultLSDC Sep 26, 2024
67f36ad
Update src/agentlab/llm/chat_api.py
ThibaultLSDC Sep 26, 2024
307f4f2
doc
ThibaultLSDC Sep 26, 2024
cf8720b
specific retry exception
ThibaultLSDC Sep 26, 2024
42515dc
bye retry
ThibaultLSDC Sep 26, 2024
5dee506
welcome back retry
ThibaultLSDC Sep 26, 2024
7e0adbc
moving API errors to ChatModel, restructuring ChatModels
ThibaultLSDC Sep 27, 2024
4c24bd1
Merge branch 'bye_langchain' of github.com:ServiceNow/AgentLab into b…
ThibaultLSDC Sep 27, 2024
a964123
Merge branch 'main' of github.com:ServiceNow/AgentLab into bye_langchain
ThibaultLSDC Sep 27, 2024
5bd7be6
fix test
ThibaultLSDC Sep 27, 2024
1bf3943
fix error handling
ThibaultLSDC Sep 27, 2024
b00a15d
updating hf llm class
ThibaultLSDC Sep 30, 2024
80308a0
testing Retry/Parse Error behaviors
ThibaultLSDC Oct 1, 2024
8e2c60e
Merge branch 'main' of github.com:ServiceNow/AgentLab into bye_langchain
ThibaultLSDC Oct 2, 2024
b8a1517
moving functions around
ThibaultLSDC Oct 2, 2024
71fa700
Merge branch 'main' into bye_langchain
ThibaultLSDC Oct 2, 2024
13cdd48
Merge branch 'main' of github.com:ServiceNow/AgentLab into bye_langchain
ThibaultLSDC Oct 2, 2024
7851565
Merge branch 'bye_langchain' of github.com:ServiceNow/AgentLab into b…
ThibaultLSDC Oct 2, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,6 @@ distributed
browsergym>=0.7.1
joblib>=1.2.0
openai>=1.7,<2
langchain>=0.1,<1
langchain_openai
langchain_community
tiktoken
huggingface_hub
Expand All @@ -20,3 +18,4 @@ pyyaml>=6
pandas
gradio
gitpython # for the reproducibility script
requests
4 changes: 1 addition & 3 deletions src/agentlab/agents/dynamic_prompting.py
Original file line number Diff line number Diff line change
Expand Up @@ -245,9 +245,7 @@ def fit_tokens(
additional_prompts = [additional_prompts]

for prompt in additional_prompts:
max_prompt_tokens -= (
count_tokens(prompt, model=model_name) + 1
) # +1 accounts for LangChain token
max_prompt_tokens -= count_tokens(prompt, model=model_name) + 1 # +1 because why not ?
ThibaultLSDC marked this conversation as resolved.
Show resolved Hide resolved

for _ in range(max_iterations):
prompt = shrinkable.prompt
Expand Down
30 changes: 16 additions & 14 deletions src/agentlab/agents/generic_agent/generic_agent.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,11 @@
from warnings import warn

from browsergym.experiments.agent import Agent, AgentInfo
from langchain.schema import HumanMessage, SystemMessage

from agentlab.agents import dynamic_prompting as dp
from agentlab.agents.agent_args import AgentArgs
from agentlab.llm.chat_api import BaseModelArgs
from agentlab.llm.llm_utils import RetryError, retry_raise
from agentlab.llm.chat_api import BaseModelArgs, make_system_message, make_user_message
from agentlab.llm.llm_utils import ParseError, retry
from agentlab.llm.tracking import cost_tracker_decorator

from .generic_agent_prompt import GenericPromptFlags, MainPrompt
Expand Down Expand Up @@ -92,30 +91,33 @@ def get_action(self, obs):
max_iterations=max_trunc_itr,
additional_prompts=system_prompt,
)

stats = {}
try:
# TODO, we would need to further shrink the prompt if the retry
# cause it to be too long

chat_messages = [
SystemMessage(content=system_prompt),
HumanMessage(content=prompt),
make_system_message(system_prompt),
make_user_message(prompt),
]
ans_dict = retry_raise(
ans_dict = retry(
self.chat_llm,
chat_messages,
n_retry=self.max_retry,
parser=main_prompt._parse_answer,
)
ans_dict["busted_retry"] = 0
# inferring the number of retries, TODO: make this less hacky
stats["n_retry"] = (len(chat_messages) - 3) / 2
stats["busted_retry"] = 0
except RetryError as e:
ans_dict = {"action": None}
stats["busted_retry"] = 1
ans_dict["n_retry"] = (len(chat_messages) - 3) / 2
except ParseError as e:
ans_dict = dict(
action=None,
n_retry=self.max_retry + 1,
busted_retry=1,
)

stats["n_retry"] = self.max_retry + 1
stats = self.chat_llm.get_stats()
stats["n_retry"] = ans_dict["n_retry"]
stats["busted_retry"] = ans_dict["busted_retry"]

self.plan = ans_dict.get("plan", self.plan)
self.plan_step = ans_dict.get("step", self.plan_step)
Expand Down
34 changes: 9 additions & 25 deletions src/agentlab/agents/generic_agent/reproducibility_agent.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,22 +10,22 @@
answers. Load the this reproducibility study in agent-xray to compare the results.
"""

import difflib
import logging
import time
from copy import copy
from dataclasses import dataclass
import logging
from pathlib import Path
import time

from browsergym.experiments.agent import AgentInfo
from browsergym.experiments.loop import ExpArgs, ExpResult, yield_all_exp_results
from bs4 import BeautifulSoup

from agentlab.agents.agent_args import AgentArgs
from .generic_agent import GenericAgentArgs, GenericAgent
from browsergym.experiments.loop import ExpResult, ExpArgs, yield_all_exp_results
from browsergym.experiments.agent import AgentInfo
import difflib
from agentlab.llm.chat_api import make_assistant_message
from agentlab.llm.llm_utils import messages_to_dict

from langchain.schema import BaseMessage, AIMessage
from langchain_community.adapters.openai import convert_message_to_dict
from .generic_agent import GenericAgent, GenericAgentArgs


class ReproChatModel:
Expand All @@ -45,8 +45,7 @@ def invoke(self, messages: list):

if len(messages) >= len(self.old_messages):
# if for some reason the llm response was not saved
# TODO(thibault): convert this to dict instead of AIMessage in the bye langchain PR.
return AIMessage(content="""<action>None</action>""")
return make_assistant_message("""<action>None</action>""")

old_response = self.old_messages[len(messages)]
self.new_messages.append(old_response)
Expand Down Expand Up @@ -108,21 +107,6 @@ def get_action(self, obs):
)


# TODO(thibault): move this to llm utils in bye langchain PR.
def messages_to_dict(messages: list[dict] | list[BaseMessage]) -> dict:
new_messages = []
for m in messages:
if isinstance(m, dict):
new_messages.append(m)
elif isinstance(m, str):
new_messages.append({"role": "<unknown role>", "content": m})
elif isinstance(m, BaseMessage):
new_messages.append(convert_message_to_dict(m))
else:
raise ValueError(f"Unknown message type: {type(m)}")
return new_messages


def _make_agent_stats(action, agent_info, step_info, old_chat_messages, new_chat_messages):
if isinstance(agent_info, dict):
agent_info = AgentInfo(**agent_info)
Expand Down
11 changes: 7 additions & 4 deletions src/agentlab/agents/most_basic_agent/most_basic_agent.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,10 @@
from browsergym.core.action.highlevel import HighLevelActionSet
from browsergym.experiments.agent import Agent, AgentInfo
from browsergym.experiments.loop import AbstractAgentArgs, EnvArgs, ExpArgs
from langchain.schema import AIMessage, HumanMessage, SystemMessage

from agentlab.llm.chat_api import make_system_message, make_user_message
from agentlab.llm.llm_configs import CHAT_MODEL_ARGS_DICT
from agentlab.llm.llm_utils import ParseError, extract_code_blocks, retry_raise
from agentlab.llm.llm_utils import ParseError, extract_code_blocks, retry
from agentlab.llm.tracking import cost_tracker_decorator

if TYPE_CHECKING:
Expand Down Expand Up @@ -84,7 +84,10 @@ def get_action(self, obs: Any) -> tuple[str, dict]:
Provide a chain of thoughts reasoning to decompose the task into smaller steps. And execute only the next step.
"""

messages = [SystemMessage(content=system_prompt), HumanMessage(content=prompt)]
messages = [
make_system_message(system_prompt),
make_user_message(prompt),
]

def parser(response: str) -> tuple[dict, bool, str]:
blocks = extract_code_blocks(response)
Expand All @@ -94,7 +97,7 @@ def parser(response: str) -> tuple[dict, bool, str]:
thought = response
return {"action": action, "think": thought}

ans_dict = retry_raise(self.chat, messages, n_retry=3, parser=parser)
ans_dict = retry(self.chat, messages, n_retry=3, parser=parser)

action = ans_dict.get("action", None)
thought = ans_dict.get("think", None)
Expand Down
23 changes: 0 additions & 23 deletions src/agentlab/agents/utils.py

This file was deleted.

32 changes: 22 additions & 10 deletions src/agentlab/analyze/agent_xray.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,12 +12,13 @@
import pandas as pd
from attr import dataclass
from browsergym.experiments.loop import ExpResult, StepInfo
from langchain.schema import BaseMessage
from langchain_openai import ChatOpenAI
from langchain.schema import BaseMessage, HumanMessage
from openai import OpenAI
from PIL import Image

from agentlab.analyze import inspect_results
from agentlab.experiments.exp_utils import RESULTS_DIR
from agentlab.llm.chat_api import make_system_message, make_user_message
from agentlab.llm.llm_utils import image_to_jpg_base64_url

select_dir_instructions = "Select Experiment Directory"
Expand Down Expand Up @@ -569,7 +570,7 @@ def update_chat_messages():
chat_messages = agent_info.get("chat_messages", ["No Chat Messages"])
messages = []
for i, m in enumerate(chat_messages):
if isinstance(m, BaseMessage):
if isinstance(m, BaseMessage): # TODO remove once langchain is deprecated
m = m.content
elif isinstance(m, dict):
m = m.get("content", "No Content")
Expand Down Expand Up @@ -653,11 +654,24 @@ def submit_action(input_text):
global info
agent_info = info.exp_result.steps_info[info.step].agent_info
chat_messages = deepcopy(agent_info.get("chat_messages", ["No Chat Messages"])[:2])
assert isinstance(chat_messages[1], BaseMessage), "Messages should be langchain messages"
if isinstance(chat_messages[1], BaseMessage): # TODO remove once langchain is deprecated
assert isinstance(chat_messages[1], HumanMessage), "Second message should be user"
chat_messages = [
make_system_message(chat_messages[0].content),
make_user_message(chat_messages[1].content),
]
elif isinstance(chat_messages[1], dict):
assert chat_messages[1].get("role", None) == "user", "Second message should be user"
else:
raise ValueError("Chat messages should be a list of BaseMessage or dict")

chat = ChatOpenAI(name="gpt-4o-mini")
chat_messages[1].content = input_text
result_text = chat(chat_messages).content
client = OpenAI()
chat_messages[1]["content"] = input_text
completion = client.chat.completions.create(
model="gpt-4o-mini",
messages=chat_messages,
)
result_text = completion.choices[0].message.content
return result_text


Expand All @@ -666,9 +680,7 @@ def update_prompt_tests():
agent_info = info.exp_result.steps_info[info.step].agent_info
chat_messages = agent_info.get("chat_messages", ["No Chat Messages"])
prompt = chat_messages[1]
if isinstance(prompt, BaseMessage):
prompt = prompt.content
elif isinstance(prompt, dict):
if isinstance(prompt, dict):
prompt = prompt.get("content", "No Content")
return prompt, prompt

Expand Down
4 changes: 2 additions & 2 deletions src/agentlab/analyze/inspect_results.py
Original file line number Diff line number Diff line change
Expand Up @@ -302,9 +302,9 @@ def summarize_stats(sub_df):
key_ = key.split(".")[1]
op = key_.split("_")[0]
if op == "cum":
record[key_] = sub_df[key].sum(skipna=True).round(3)
record[key_] = sub_df[key].sum(skipna=True)
elif op == "max":
record[key_] = sub_df[key].max(skipna=True).round(3)
record[key_] = sub_df[key].max(skipna=True)
else:
raise ValueError(f"Unknown stats operation: {op}")
return pd.Series(record)
Expand Down
6 changes: 0 additions & 6 deletions src/agentlab/llm/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -95,12 +95,6 @@ TODO
- in their demo, they queried the SNOW UI!



## Relevant agentic tools

- [Langchain Agents](https://python.langchain.com/docs/modules/agents/)


## Relevant Benchmarks

- [bigcode/bigcode-models-leaderboard](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)
Expand Down
Loading