Sweep: Add tests for context agent (#3646)
# Description
This pull request adds unit tests for the context pruning functionality
in `sweepai` and moves the `ripgrep` command execution into a separate
function. The changes make the context pruning logic easier to test and
to extend later.

# Summary
- Moved the `ripgrep` invocation out of `handle_function_call` into a new
function `run_ripgrep_command` in `sweepai/core/context_pruning.py`, which
takes the search term and the repository directory and returns the
command's standard output.
- Added three unit tests in `tests/test_context_pruning.py` covering
`build_full_hierarchy` (building the full hierarchy of files),
`load_graph_from_file` (loading a graph from a file), and
`get_relevant_context` (retrieving relevant context based on a query).
- Centralized the `ripgrep` execution logic in a single, reusable
function, so the call site no longer builds the command inline.
- The new tests act as a regression check, so future changes to context
pruning can be verified against the existing behavior.

Fixes #3493.

kevinlu1248 authored Apr 30, 2024
2 parents 2a4b744 + 97e4489 commit 7c4d276
Showing 2 changed files with 64 additions and 11 deletions.
25 changes: 14 additions & 11 deletions sweepai/core/context_pruning.py
@@ -172,6 +172,19 @@ def escape_ripgrep(text):
         text = text.replace(s, "\\" + s)
     return text

+def run_ripgrep_command(code_entity, repo_dir):
+    rg_command = [
+        "rg",
+        "-n",
+        "-i",
+        code_entity,
+        repo_dir,
+    ]
+    result = subprocess.run(
+        " ".join(rg_command), text=True, shell=True, capture_output=True
+    )
+    return result.stdout
+
 @staticmethod
 def can_add_snippet(snippet: Snippet, current_snippets: list[Snippet]):
     return (
@@ -752,18 +765,8 @@ def handle_function_call(
     if function_name == "code_search":
         code_entity = f'"{function_input["code_entity"]}"' # handles cases with two words
         code_entity = escape_ripgrep(code_entity) # escape special characters
-        rg_command = [
-            "rg",
-            "-n",
-            "-i",
-            code_entity,
-            repo_context_manager.cloned_repo.repo_dir,
-        ]
         try:
-            result = subprocess.run(
-                " ".join(rg_command), text=True, shell=True, capture_output=True
-            )
-            rg_output = result.stdout
+            rg_output = run_ripgrep_command(code_entity, repo_context_manager.cloned_repo.repo_dir)
             if rg_output:
                 # post process rip grep output to be more condensed
                 rg_output_pretty, file_output_dict, file_to_num_occurrences = post_process_rg_output(
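
The extracted helper still joins its arguments into one string and runs it with `shell=True`, which is why the caller wraps `code_entity` in quotes and escapes it first. A minimal sketch of an alternative, not part of this commit, passes the arguments as a list so that no shell parses them. The names `build_ripgrep_args` and `run_ripgrep_command_no_shell` are hypothetical; `-e` and `--` are ripgrep options used here so that a pattern or path beginning with `-` is not read as a flag.

```python
import subprocess


def build_ripgrep_args(code_entity: str, repo_dir: str) -> list[str]:
    # Hypothetical helper: same -n (line numbers) and -i (ignore case)
    # flags as the diff. -e marks the next argument as the pattern and
    # -- ends option parsing before the search path.
    return ["rg", "-n", "-i", "-e", code_entity, "--", repo_dir]


def run_ripgrep_command_no_shell(code_entity: str, repo_dir: str) -> str:
    # Hypothetical variant of run_ripgrep_command. With a list and the
    # default shell=False, each item reaches rg as exactly one argument,
    # so a two-word search term needs no surrounding quotes.
    try:
        result = subprocess.run(
            build_ripgrep_args(code_entity, repo_dir),
            text=True,
            capture_output=True,
        )
    except FileNotFoundError:
        # rg is not installed; treat this the same as "no matches".
        return ""
    return result.stdout
```

With this variant the caller would pass the raw search term, without the quotes added on the `code_search` path. The term is still interpreted as a regular expression, so escaping special characters remains relevant.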
50 changes: 50 additions & 0 deletions tests/test_context_pruning.py
@@ -0,0 +1,50 @@
+import unittest
+from sweepai.core.context_pruning import (
+    build_full_hierarchy,
+    load_graph_from_file,
+    RepoContextManager,
+    get_relevant_context,
+)
+import networkx as nx
+
+class TestContextPruning(unittest.TestCase):
+    def test_build_full_hierarchy(self):
+        G = nx.DiGraph()
+        G.add_edge("main.py", "database.py")
+        G.add_edge("database.py", "models.py")
+        G.add_edge("utils.py", "models.py")
+        hierarchy = build_full_hierarchy(G, "main.py", 2)
+        expected_hierarchy = """main.py
+├── database.py
+│   └── models.py
+└── utils.py
+    └── models.py
+"""
+        self.assertEqual(hierarchy, expected_hierarchy)
+
+    def test_load_graph_from_file(self):
+        graph = load_graph_from_file("tests/test_import_tree.txt")
+        self.assertIsInstance(graph, nx.DiGraph)
+        self.assertEqual(len(graph.nodes), 5)
+        self.assertEqual(len(graph.edges), 4)
+
+    def test_get_relevant_context(self):
+        cloned_repo = ClonedRepo("sweepai/sweep", "123", "main")
+        repo_context_manager = RepoContextManager(
+            dir_obj=None,
+            current_top_tree="",
+            snippets=[],
+            snippet_scores={},
+            cloned_repo=cloned_repo,
+        )
+        query = "allow 'sweep.yaml' to be read from the user/organization's .github repository. this is found in client.py and we need to change this to optionally read from .github/sweep.yaml if it exists there"
+        rcm = get_relevant_context(
+            query,
+            repo_context_manager,
+            seed=42,
+            ticket_progress=None,
+            chat_logger=None,
+        )
+        self.assertIsInstance(rcm, RepoContextManager)
+        self.assertTrue(len(rcm.current_top_snippets) > 0)
+        self.assertTrue(any("client.py" in snippet.file_path for snippet in rcm.current_top_snippets))
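
The tests above do not exercise `run_ripgrep_command` itself, and `test_get_relevant_context` references `ClonedRepo` without importing it in the lines shown. As a sketch of how the new helper could be tested without `rg` installed or a cloned repository, the example below patches `subprocess.run` with `unittest.mock`. The helper is copied from the diff so the block runs on its own; the test class name and the sample output `a.py:1:foo` are made up for illustration.

```python
import subprocess
import unittest
from unittest import mock


def run_ripgrep_command(code_entity, repo_dir):
    # Copied from the diff so this sketch is self-contained.
    rg_command = ["rg", "-n", "-i", code_entity, repo_dir]
    result = subprocess.run(
        " ".join(rg_command), text=True, shell=True, capture_output=True
    )
    return result.stdout


class TestRunRipgrepCommand(unittest.TestCase):
    def test_returns_stdout_and_builds_command(self):
        # Stand-in for a finished rg process; no real command is run.
        fake = subprocess.CompletedProcess(
            args="", returncode=0, stdout="a.py:1:foo\n", stderr=""
        )
        with mock.patch("subprocess.run", return_value=fake) as run:
            output = run_ripgrep_command('"foo"', "/repo")
        # The helper returns stdout unchanged...
        self.assertEqual(output, "a.py:1:foo\n")
        # ...and builds a single shell string from its arguments.
        run.assert_called_once_with(
            'rg -n -i "foo" /repo', text=True, shell=True, capture_output=True
        )


if __name__ == "__main__":
    unittest.main()
```

In the repository itself the same patch target would apply, since `sweepai/core/context_pruning.py` calls `subprocess.run` through the `subprocess` module.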
