Commit

Merge branch 'All-Hands-AI:main' into main
RainRat authored Mar 5, 2025
2 parents f38428a + 1ffee80 commit cb75170
Showing 19 changed files with 176 additions and 34 deletions.
10 changes: 8 additions & 2 deletions .github/workflows/ghcr-build.yml
Original file line number Diff line number Diff line change
@@ -41,8 +41,10 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.4.0
uses: docker/setup-qemu-action@v3.6.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
@@ -90,8 +92,10 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3.4.0
uses: docker/setup-qemu-action@v3.6.0
with:
image: tonistiigi/binfmt:latest
- name: Login to GHCR
@@ -154,6 +158,8 @@ jobs:
base_image: ['nikolaik']
steps:
- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
- name: Cache Poetry dependencies
uses: actions/cache@v4
with:
8 changes: 4 additions & 4 deletions docs/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/package.json
@@ -31,7 +31,7 @@
"@docusaurus/module-type-aliases": "^3.5.1",
"@docusaurus/tsconfig": "^3.7.0",
"@docusaurus/types": "^3.5.1",
"typescript": "~5.7.3"
"typescript": "~5.8.2"
},
"browserslist": {
"production": [
5 changes: 3 additions & 2 deletions evaluation/benchmarks/aider_bench/README.md
@@ -56,9 +56,10 @@ You can update the arguments in the script
./evaluation/benchmarks/aider_bench/scripts/run_infer.sh eval_gpt35_turbo HEAD CodeActAgent 100 1 "1,3,10"
```

### Run Inference on `RemoteRuntime` (experimental)
### Run Inference on `RemoteRuntime`

This is in beta. Fill out [this form](https://docs.google.com/forms/d/e/1FAIpQLSckVz_JFwg2_mOxNZjCtr7aoBFI2Mwdan3f75J_TrdMS1JV2g/viewform) to apply if you want to try this out!

This is in limited beta. Contact Xingyao over slack if you want to try this out!

```bash
./evaluation/benchmarks/aider_bench/scripts/run_infer.sh [model_config] [git-version] [agent] [eval_limit] [eval-num-workers] [eval_ids]
5 changes: 3 additions & 2 deletions evaluation/benchmarks/commit0_bench/README.md
@@ -58,9 +58,10 @@ then your command would be:
./evaluation/benchmarks/commit0_bench/scripts/run_infer.sh lite llm.eval_sonnet HEAD CodeActAgent 10 30 1 wentingzhao/commit0_combined test
```

### Run Inference on `RemoteRuntime` (experimental)
### Run Inference on `RemoteRuntime`

This is in beta. Fill out [this form](https://docs.google.com/forms/d/e/1FAIpQLSckVz_JFwg2_mOxNZjCtr7aoBFI2Mwdan3f75J_TrdMS1JV2g/viewform) to apply if you want to try this out!

This is in limited beta. Contact Xingyao over slack if you want to try this out!

```bash
./evaluation/benchmarks/commit0_bench/scripts/run_infer.sh [repo_split] [model_config] [git-version] [agent] [eval_limit] [max_iter] [num_workers] [dataset] [dataset_split]
5 changes: 3 additions & 2 deletions evaluation/benchmarks/miniwob/README.md
@@ -19,9 +19,10 @@ Access with browser the above MiniWoB URLs and see if they load correctly.
./evaluation/benchmarks/miniwob/scripts/run_infer.sh llm.claude-35-sonnet-eval
```

### Run Inference on `RemoteRuntime` (experimental)
### Run Inference on `RemoteRuntime`

This is in beta. Fill out [this form](https://docs.google.com/forms/d/e/1FAIpQLSckVz_JFwg2_mOxNZjCtr7aoBFI2Mwdan3f75J_TrdMS1JV2g/viewform) to apply if you want to try this out!

This is in limited beta. Contact Xingyao over slack if you want to try this out!

```bash
./evaluation/benchmarks/miniwob/scripts/run_infer.sh [model_config] [git-version] [agent] [note] [eval_limit] [num_workers]
8 changes: 4 additions & 4 deletions evaluation/benchmarks/swe_bench/README.md
@@ -65,9 +65,9 @@ then your command would be:
./evaluation/benchmarks/swe_bench/scripts/run_infer.sh llm.eval_gpt4_1106_preview HEAD CodeActAgent 10
```

### Run Inference on `RemoteRuntime` (experimental)
### Run Inference on `RemoteRuntime`

This is in limited beta. Contact Xingyao over slack if you want to try this out!
This is in beta. Fill out [this form](https://docs.google.com/forms/d/e/1FAIpQLSckVz_JFwg2_mOxNZjCtr7aoBFI2Mwdan3f75J_TrdMS1JV2g/viewform) to apply if you want to try this out!

```bash
./evaluation/benchmarks/swe_bench/scripts/run_infer.sh [model_config] [git-version] [agent] [eval_limit] [max_iter] [num_workers] [dataset] [dataset_split]
@@ -163,9 +163,9 @@ The final results will be saved to `evaluation/evaluation_outputs/outputs/swe_be
- `report.json`: a JSON file that contains keys like `"resolved_ids"` pointing to instance IDs that are resolved by the agent.
- `logs/`: a directory of test logs
### Run evaluation with `RemoteRuntime` (experimental)
### Run evaluation with `RemoteRuntime`
This is in limited beta. Contact Xingyao over slack if you want to try this out!
This is in beta. Fill out [this form](https://docs.google.com/forms/d/e/1FAIpQLSckVz_JFwg2_mOxNZjCtr7aoBFI2Mwdan3f75J_TrdMS1JV2g/viewform) to apply if you want to try this out!
```bash
./evaluation/benchmarks/swe_bench/scripts/eval_infer_remote.sh [output.jsonl filepath] [num_workers]
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
import { Autocomplete, AutocompleteItem } from "@heroui/react";
import { ReactNode } from "react";
import { OptionalTag } from "./optional-tag";

interface SettingsDropdownInputProps {
testId: string;
label: string;
label: ReactNode;
name: string;
items: { key: React.Key; label: string }[];
showOptionalTag?: boolean;
@@ -29,7 +30,7 @@ export function SettingsDropdownInput({
{showOptionalTag && <OptionalTag />}
</div>
<Autocomplete
aria-label={label}
aria-label={typeof label === "string" ? label : name}
data-testid={testId}
name={name}
defaultItems={items}
21 changes: 12 additions & 9 deletions frontend/src/components/shared/modals/settings/settings-form.tsx
@@ -43,15 +43,18 @@ export function SettingsForm({ settings, models, onClose }: SettingsFormProps) {
const handleFormSubmission = async (formData: FormData) => {
const newSettings = extractSettings(formData);

await saveUserSettings(newSettings);
onClose();
resetOngoingSession();

posthog.capture("settings_saved", {
LLM_MODEL: newSettings.LLM_MODEL,
LLM_API_KEY: newSettings.LLM_API_KEY ? "SET" : "UNSET",
REMOTE_RUNTIME_RESOURCE_FACTOR:
newSettings.REMOTE_RUNTIME_RESOURCE_FACTOR,
await saveUserSettings(newSettings, {
onSuccess: () => {
onClose();
resetOngoingSession();

posthog.capture("settings_saved", {
LLM_MODEL: newSettings.LLM_MODEL,
LLM_API_KEY: newSettings.LLM_API_KEY ? "SET" : "UNSET",
REMOTE_RUNTIME_RESOURCE_FACTOR:
newSettings.REMOTE_RUNTIME_RESOURCE_FACTOR,
});
},
});
};

3 changes: 0 additions & 3 deletions frontend/src/hooks/query/use-settings.ts
@@ -3,7 +3,6 @@ import React from "react";
import posthog from "posthog-js";
import OpenHands from "#/api/open-hands";
import { useAuth } from "#/context/auth-context";
import { useConfig } from "#/hooks/query/use-config";
import { DEFAULT_SETTINGS } from "#/services/settings";

const getSettingsQueryFn = async () => {
@@ -27,12 +26,10 @@ const getSettingsQueryFn = async () => {

export const useSettings = () => {
const { setGitHubTokenIsSet, githubTokenIsSet } = useAuth();
const { data: config } = useConfig();

const query = useQuery({
queryKey: ["settings", githubTokenIsSet],
queryFn: getSettingsQueryFn,
enabled: config?.APP_MODE !== "saas" || githubTokenIsSet,
// Only retry if the error is not a 404 because we
// would want to show the modal immediately if the
// settings are not found
10 changes: 9 additions & 1 deletion frontend/src/routes/account-settings.tsx
@@ -278,7 +278,15 @@ function AccountSettings() {
<SettingsDropdownInput
testId="runtime-settings-input"
name="runtime-settings-input"
label="Runtime Settings"
label={
<>
Runtime Settings (
<a href="mailto:[email protected]">
get in touch for access
</a>
)
</>
}
items={REMOTE_RUNTIME_OPTIONS}
defaultSelectedKey={settings.REMOTE_RUNTIME_RESOURCE_FACTOR?.toString()}
isDisabled
5 changes: 4 additions & 1 deletion openhands/agenthub/codeact_agent/prompts/additional_info.j2
@@ -8,8 +8,9 @@ At the user's request, repository {{ repository_info.repo_name }} has been clone
{{ repository_instructions }}
</REPOSITORY_INSTRUCTIONS>
{% endif %}
{% if runtime_info and runtime_info.available_hosts -%}
{% if runtime_info and (runtime_info.available_hosts or runtime_info.additional_agent_instructions) -%}
<RUNTIME_INFORMATION>
{% if runtime_info.available_hosts %}
The user has access to the following hosts for accessing a web application,
each of which has a corresponding port:
{% for host, port in runtime_info.available_hosts.items() -%}
@@ -18,5 +19,7 @@ each of which has a corresponding port:
When starting a web server, use the corresponding ports. You should also
set any options to allow iframes and CORS requests, and allow the server to
be accessed from any host (e.g. 0.0.0.0).
{% endif %}
{{ runtime_info.additional_agent_instructions }}
</RUNTIME_INFORMATION>
{% endif %}
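The new condition means the `<RUNTIME_INFORMATION>` block is rendered when either hosts or extra instructions are present, while the host list itself stays behind its own check. A minimal plain-Python sketch of that branching follows. The function `render_runtime_block` and the exact host line format are assumptions made for illustration (the template's own host line is not visible in this hunk), and unlike the template, the sketch skips an empty instruction string instead of emitting a blank line.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RuntimeInfo:
    # Mirrors the two fields the template reads.
    available_hosts: dict[str, int] = field(default_factory=dict)
    additional_agent_instructions: str = ''


def render_runtime_block(info: RuntimeInfo | None) -> str:
    """Hypothetical plain-Python approximation of the template's branching."""
    # Outer check: render nothing unless hosts OR extra instructions exist.
    if not info or not (
        info.available_hosts or info.additional_agent_instructions
    ):
        return ''
    lines = ['<RUNTIME_INFORMATION>']
    # Inner check: the host list only appears when hosts are configured.
    if info.available_hosts:
        lines.append('Hosts:')
        for host, port in info.available_hosts.items():
            # Illustrative line format, not the template's actual wording.
            lines.append(f'* {host} (port {port})')
    if info.additional_agent_instructions:
        lines.append(info.additional_agent_instructions)
    lines.append('</RUNTIME_INFORMATION>')
    return '\n'.join(lines)
```

Before this change, a runtime with instructions but no hosts would have produced no block at all, which is the case the widened outer condition appears to address.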
12 changes: 12 additions & 0 deletions openhands/core/config/condenser_config.py
@@ -27,6 +27,17 @@ class ObservationMaskingCondenserConfig(BaseModel):
model_config = {'extra': 'forbid'}


class BrowserOutputCondenserConfig(BaseModel):
"""Configuration for the BrowserOutputCondenser."""

type: Literal['browser_output_masking'] = Field('browser_output_masking')
attention_window: int = Field(
default=1,
description='The number of most recent browser output observations that will not be masked.',
ge=1,
)


class RecentEventsCondenserConfig(BaseModel):
"""Configuration for RecentEventsCondenser."""

@@ -115,6 +126,7 @@ class LLMAttentionCondenserConfig(BaseModel):
CondenserConfig = (
NoOpCondenserConfig
| ObservationMaskingCondenserConfig
| BrowserOutputCondenserConfig
| RecentEventsCondenserConfig
| LLMSummarizingCondenserConfig
| AmortizedForgettingCondenserConfig
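The `ge=1` constraint on `attention_window` means a configuration that would mask every browser observation is rejected at construction time. The sketch below imitates that behavior with the standard library only, so it runs without Pydantic installed. `BrowserOutputCondenserSettings` is a hypothetical stand-in, not the class added in this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowserOutputCondenserSettings:
    """Hypothetical stdlib stand-in for BrowserOutputCondenserConfig."""

    attention_window: int = 1
    type: str = 'browser_output_masking'

    def __post_init__(self) -> None:
        # Mirrors the `ge=1` constraint: at least one browser observation
        # must remain unmasked.
        if self.attention_window < 1:
            raise ValueError('attention_window must be >= 1')
```

With this shape, `BrowserOutputCondenserSettings()` keeps the single most recent browser observation, and `BrowserOutputCondenserSettings(attention_window=0)` raises `ValueError`.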
4 changes: 4 additions & 0 deletions openhands/memory/condenser/impl/__init__.py
@@ -1,6 +1,9 @@
from openhands.memory.condenser.impl.amortized_forgetting_condenser import (
AmortizedForgettingCondenser,
)
from openhands.memory.condenser.impl.browser_output_condenser import (
BrowserOutputCondenser,
)
from openhands.memory.condenser.impl.llm_attention_condenser import (
ImportantEventSelection,
LLMAttentionCondenser,
@@ -23,5 +26,6 @@
'LLMSummarizingCondenser',
'NoOpCondenser',
'ObservationMaskingCondenser',
'BrowserOutputCondenser',
'RecentEventsCondenser',
]
48 changes: 48 additions & 0 deletions openhands/memory/condenser/impl/browser_output_condenser.py
@@ -0,0 +1,48 @@
from __future__ import annotations

from openhands.core.config.condenser_config import BrowserOutputCondenserConfig
from openhands.events.event import Event
from openhands.events.observation import BrowserOutputObservation
from openhands.events.observation.agent import AgentCondensationObservation
from openhands.memory.condenser.condenser import Condenser


class BrowserOutputCondenser(Condenser):
"""A condenser that masks the observations from browser outputs outside of a recent attention window.
The intent here is to mask just the browser outputs and leave everything else untouched. This is important because currently we provide screenshots and accessibility trees as input to the model for browser observations. These are really large and consume a lot of tokens without any benefits in performance. So we want to mask all such observations from all previous timesteps, and leave only the most recent one in context.
"""

def __init__(self, attention_window: int = 1):
self.attention_window = attention_window
super().__init__()

def condense(self, events: list[Event]) -> list[Event]:
"""Replace the content of browser observations outside of the attention window with a placeholder."""
results: list[Event] = []
cnt: int = 0
for event in reversed(events):
if (
isinstance(event, BrowserOutputObservation)
and cnt >= self.attention_window
):
results.append(
AgentCondensationObservation(
f'Current URL: {event.url}\nContent Omitted'
)
)
else:
results.append(event)
if isinstance(event, BrowserOutputObservation):
cnt += 1

return list(reversed(results))

@classmethod
def from_config(
cls, config: BrowserOutputCondenserConfig
) -> BrowserOutputCondenser:
return BrowserOutputCondenser(**config.model_dump(exclude=['type']))


BrowserOutputCondenser.register_config(BrowserOutputCondenserConfig)
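The masking loop walks the history newest-first, counts browser observations, and replaces any beyond the attention window with a short placeholder that keeps only the URL. The following self-contained sketch reproduces that logic with stand-in classes. `Event`, `BrowserOutput`, `Placeholder`, and `mask_old_browser_outputs` are hypothetical names for illustration and are not the OpenHands types.

```python
from dataclasses import dataclass


@dataclass
class Event:
    content: str


@dataclass
class BrowserOutput(Event):
    url: str


@dataclass
class Placeholder(Event):
    """Stand-in for the condensation observation used in the commit."""


def mask_old_browser_outputs(events, attention_window=1):
    """Keep the newest `attention_window` browser outputs, mask older ones."""
    results = []
    seen = 0  # browser outputs encountered so far, counting from the newest
    for event in reversed(events):
        is_browser = isinstance(event, BrowserOutput)
        if is_browser and seen >= attention_window:
            # Outside the window: keep only the URL as a breadcrumb.
            results.append(
                Placeholder(f'Current URL: {event.url}\nContent Omitted')
            )
        else:
            results.append(event)
        if is_browser:
            seen += 1
    # The loop ran newest-first, so restore chronological order.
    return list(reversed(results))


history = [
    BrowserOutput('page A', 'https://a.example'),
    Event('ran ls'),
    BrowserOutput('page B', 'https://b.example'),
]
condensed = mask_old_browser_outputs(history)
```

Here the older browser output for `https://a.example` becomes a placeholder, while the command event and the newest browser output pass through unchanged.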
4 changes: 4 additions & 0 deletions openhands/runtime/base.py
@@ -487,3 +487,7 @@ def vscode_url(self) -> str | None:
@property
def web_hosts(self) -> dict[str, int]:
return {}

@property
def additional_agent_instructions(self) -> str:
return ''
4 changes: 4 additions & 0 deletions openhands/runtime/impl/daytona/daytona_runtime.py
@@ -260,3 +260,7 @@ def vscode_url(self) -> str | None:
)

return self._vscode_url

@property
def additional_agent_instructions(self) -> str:
return f'When showing endpoints to access applications for any port, e.g. port 3000, instead of localhost:3000, use this format: {self._construct_api_url(3000)}.'
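The pattern here is a base-class property that returns an empty string, which individual runtimes override when the agent needs runtime-specific guidance. A rough sketch of that pattern is shown below. `ProxiedRuntime`, `_url_for_port`, and the URL layout are assumptions made for illustration and do not reflect how the Daytona runtime builds its URLs.

```python
from __future__ import annotations


class Runtime:
    """Simplified stand-in for the base runtime class."""

    @property
    def web_hosts(self) -> dict[str, int]:
        return {}

    @property
    def additional_agent_instructions(self) -> str:
        # Default: no extra guidance for the agent.
        return ''


class ProxiedRuntime(Runtime):
    """Hypothetical runtime whose ports are reachable through a proxy URL."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url

    def _url_for_port(self, port: int) -> str:
        # Assumed URL layout, chosen only for this example.
        return f'{self._base_url}/{port}'

    @property
    def additional_agent_instructions(self) -> str:
        return f'Instead of localhost:3000, use {self._url_for_port(3000)}.'
```

Because the default is an empty string, callers can copy the property into the prompt state unconditionally and let the template decide whether anything is rendered.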
8 changes: 7 additions & 1 deletion openhands/utils/prompt.py
@@ -19,6 +19,7 @@
@dataclass
class RuntimeInfo:
available_hosts: dict[str, int]
additional_agent_instructions: str


@dataclass
@@ -56,7 +57,9 @@ def __init__(
self.user_template: Template = self._load_template('user_prompt')
self.additional_info_template: Template = self._load_template('additional_info')
self.microagent_info_template: Template = self._load_template('microagent_info')
self.runtime_info = RuntimeInfo(available_hosts={})
self.runtime_info = RuntimeInfo(
available_hosts={}, additional_agent_instructions=''
)

self.knowledge_microagents: dict[str, KnowledgeMicroAgent] = {}
self.repo_microagents: dict[str, RepoMicroAgent] = {}
@@ -113,6 +116,9 @@ def get_system_message(self) -> str:

def set_runtime_info(self, runtime: Runtime) -> None:
self.runtime_info.available_hosts = runtime.web_hosts
self.runtime_info.additional_agent_instructions = (
runtime.additional_agent_instructions
)

def set_repository_info(
self,