Merge pull request #68 from luxiangtaoya/luxiangtaoya-patch-1
data viz, rm img bg, math problem solving
Showing 11 changed files with 512 additions and 2 deletions.
83 changes: 83 additions & 0 deletions
src/en/guide/use_cases/agent/code_interpreter/data_visualization.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# Data Visualization | ||
## Overview | ||
Data visualization is the process of representing data in visual form, such as charts and graphs. It helps us discover patterns, trends, and correlations in the data, communicate and explain results, and support data-driven decision making. | ||
## Example | ||
### Task | ||
Use `CodeInterpreter` to perform a simple data analysis and visualize the sklearn Iris dataset. | ||
### Code | ||
```python | ||
import asyncio | ||
from metagpt.roles.code_interpreter import CodeInterpreter | ||
|
||
async def main(requirement: str = ""): | ||
code_interpreter = CodeInterpreter(use_tools=False, goal=requirement) | ||
await code_interpreter.run(requirement) | ||
|
||
if __name__ == "__main__": | ||
requirement = ( | ||
"Run data analysis on sklearn Iris dataset, include a plot." | ||
) | ||
asyncio.run(main(requirement)) | ||
``` | ||
### Execution process | ||
1. `CodeInterpreter` proposes the following solution tasks: | ||
```json | ||
[ | ||
{ | ||
"task_id": "1", | ||
"dependent_task_ids": [], | ||
"instruction": "Load the Iris dataset from sklearn." | ||
}, | ||
{ | ||
"task_id": "2", | ||
"dependent_task_ids": ["1"], | ||
"instruction": "Perform exploratory data analysis on the Iris dataset." | ||
}, | ||
{ | ||
"task_id": "3", | ||
"dependent_task_ids": ["2"], | ||
"instruction": "Create a plot visualizing the Iris dataset features." | ||
} | ||
] | ||
``` | ||
`CodeInterpreter` divides the problem into logical tasks and runs them in order: loading the data, analyzing the data, and plotting the chart. | ||
|
||
2. `CodeInterpreter` writes the following code: | ||
```python | ||
# ----------------------------------task1------------------------------------ | ||
from sklearn.datasets import load_iris | ||
iris_data = load_iris() | ||
iris_data.keys() | ||
!pip install scikit-learn | ||
from sklearn.datasets import load_iris | ||
iris_data = load_iris() | ||
iris_data.keys() | ||
# ----------------------------------task2------------------------------------ | ||
import pandas as pd | ||
|
||
# Create a DataFrame from the iris dataset | ||
iris_df = pd.DataFrame(iris_data['data'], columns=iris_data['feature_names']) | ||
iris_df['species'] = pd.Categorical.from_codes(iris_data['target'], iris_data['target_names']) | ||
|
||
# Summary statistics | ||
summary_statistics = iris_df.describe() | ||
|
||
# Check for missing values | ||
missing_values = iris_df.isnull().sum() | ||
|
||
(summary_statistics, missing_values) | ||
# ----------------------------------task3------------------------------------ | ||
import matplotlib.pyplot as plt | ||
import seaborn as sns | ||
|
||
# Use seaborn's pairplot to visualize the dataset features | ||
sns.set(style='whitegrid', context='notebook') | ||
iris_pairplot = sns.pairplot(iris_df, hue='species', height=2.5) | ||
plt.show() | ||
``` | ||
During task 1, the first execution failed because scikit-learn was not installed in the environment. `CodeInterpreter` analyzed the error and resolved it by installing scikit-learn, then ran the same code again. In task 3, `CodeInterpreter` uses seaborn's `pairplot` function to create a scatterplot matrix, which shows the relationships between pairs of features in the dataset and distinguishes the species by color. Finally, `plt.show()` displays the figure. | ||
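
The `dependent_task_ids` fields in the plan above form a small dependency graph. The following is a minimal sketch of how such a plan could be turned into an execution order, using the standard-library `graphlib` module (Python 3.9+). It is only an illustration: the function name `execution_order` is hypothetical, and this is not necessarily how `CodeInterpreter` schedules tasks internally.

```python
from graphlib import TopologicalSorter

# Plan copied from the JSON shown in step 1.
plan = [
    {"task_id": "1", "dependent_task_ids": [],
     "instruction": "Load the Iris dataset from sklearn."},
    {"task_id": "2", "dependent_task_ids": ["1"],
     "instruction": "Perform exploratory data analysis on the Iris dataset."},
    {"task_id": "3", "dependent_task_ids": ["2"],
     "instruction": "Create a plot visualizing the Iris dataset features."},
]


def execution_order(tasks):
    """Return task ids so that every task comes after the tasks it depends on.

    Hypothetical helper for illustration; not part of MetaGPT.
    """
    # Map each task id to the set of ids that must finish first.
    graph = {t["task_id"]: set(t["dependent_task_ids"]) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(plan)
print(order)
```

For this plan the dependencies form a simple chain, so the only valid order is task 1, then task 2, then task 3, which matches the order of the code sections above.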
### Output | ||
Below is the figure produced when `CodeInterpreter` ran the code. The code executed successfully and generated a scatterplot matrix, which helps us analyze the features of the dataset more effectively. | ||
<div align=center> | ||
<img src="../../../../../public/image/guide/use_cases/CodeInterpreter/output.png" width="1000" height="1000"> | ||
</div> |
76 changes: 76 additions & 0 deletions
src/en/guide/use_cases/agent/code_interpreter/image_removebg.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Remove the Background of an Image | ||
|
||
## Overview | ||
Image background removal is a technique used to separate the main objects from the background in an image. It finds applications in various fields such as image editing, person segmentation, product showcasing, and computer vision. By removing the background, it highlights the subject, enhances the visual appeal of the image, and provides a cleaner base for further processing and analysis. | ||
## Example | ||
### Task | ||
Use `CodeInterpreter` to remove the background from a picture of a dog. | ||
### Code | ||
```python | ||
import asyncio | ||
from metagpt.roles.code_interpreter import CodeInterpreter | ||
|
||
async def main(requirement: str = ""): | ||
code_interpreter = CodeInterpreter(use_tools=False, goal=requirement) | ||
await code_interpreter.run(requirement) | ||
|
||
if __name__ == "__main__": | ||
image_path = '/data/luxiangtao/data_agents_opt-code_intepreter/dog.JPEG' | ||
save_path = '/data/luxiangtao/data_agents_opt-code_intepreter/dog_rmg.png' | ||
requirement = ( | ||
f"This is a image, you need to use python toolkit rembg to remove the background of the image and save the result. image path:{image_path}; save path:{save_path}." | ||
) | ||
asyncio.run(main(requirement)) | ||
``` | ||
### Execution process | ||
1. `CodeInterpreter` proposes the following solution steps: | ||
```json | ||
[ | ||
{ | ||
"task_id": "1", | ||
"dependent_task_ids": [], | ||
"instruction": "Install the rembg package using pip." | ||
}, | ||
{ | ||
"task_id": "2", | ||
"dependent_task_ids": ["1"], | ||
"instruction": "Use the rembg package to remove the background from the image at the specified path." | ||
}, | ||
{ | ||
"task_id": "3", | ||
"dependent_task_ids": ["2"], | ||
"instruction": "Save the image with the background removed to the specified save path." | ||
} | ||
] | ||
``` | ||
`CodeInterpreter` divides the problem into logical tasks. Here the first step is to install the Python library `rembg`. | ||
|
||
2. `CodeInterpreter` writes the following code: | ||
```python | ||
# -----------------------------task1------------------------------- | ||
!pip install rembg | ||
# -----------------------------task2------------------------------- | ||
from rembg import remove | ||
input_path = '/data/luxiangtao/data_agents_opt-code_intepreter/dog.JPEG' | ||
output_path = '/data/luxiangtao/data_agents_opt-code_intepreter/dog_rmg.png' | ||
|
||
# Read the input image | ||
with open(input_path, 'rb') as i: | ||
input_image = i.read() | ||
|
||
# Remove the background | ||
output_image = remove(input_image) | ||
|
||
# ------------------------------task3------------------------------- | ||
# Write the output image | ||
with open(output_path, 'wb') as o: | ||
o.write(output_image) | ||
``` | ||
`rembg` is an open-source Python toolkit that removes image backgrounds automatically and can run on a CPU. When the requirement names this toolkit, `CodeInterpreter` installs it and uses it correctly without further guidance. (This is likely because the LLM learned how to use the `rembg` library during training.) | ||
### Output | ||
Below are the input image of the dog and the same image with the background removed. The background is removed cleanly, and `CodeInterpreter` completes the task without problems. | ||
<div align=center> | ||
<img src="../../../../../public/image/guide/use_cases/CodeInterpreter/dog.JPEG" width="500" height="300"> | ||
<img src="../../../../../public/image/guide/use_cases/CodeInterpreter/dog_rmg.png" width="500" height="300"> | ||
</div> | ||
|
88 changes: 88 additions & 0 deletions
src/en/guide/use_cases/agent/code_interpreter/solve_mathematical_problems.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
# Solve Mathematical Problems | ||
|
||
## Overview | ||
Use `CodeInterpreter` to solve mathematical problems randomly selected from Level 5 of the MATH dataset. | ||
## Example | ||
|
||
### Problem | ||
At a school, all 60 students play on at least one of three teams: Basketball, Soccer, and Mathletics. 8 students play all three sports, half the students play basketball, and the ratio of the size of the math team to the size of the basketball team to the size of the soccer team is $4:3:2$. How many students at the school play on exactly two teams? | ||
### Code | ||
```python | ||
import asyncio | ||
|
||
from metagpt.roles.code_interpreter import CodeInterpreter | ||
|
||
async def main(requirement: str = ""): | ||
code_interpreter = CodeInterpreter(use_tools=False, goal=requirement) | ||
await code_interpreter.run(requirement) | ||
|
||
if __name__ == "__main__": | ||
problem = "At a school, all 60 students play on at least one of three teams: Basketball, Soccer, and Mathletics. 8 students play all three sports, half the students play basketball, and the ratio of the size of the math team to the size of the basketball team to the size of the soccer team is $4:3:2$. How many students at the school play on exactly two teams?" | ||
requirement = ( | ||
f"This is a math problem:{problem}. You can analyze and solve it step by step or use Python code to solve it." | ||
) | ||
|
||
asyncio.run(main(requirement)) | ||
``` | ||
### Execution process | ||
1. `CodeInterpreter` proposes the following solution steps: | ||
```json | ||
[ | ||
{ | ||
"task_id": "1", | ||
"dependent_task_ids": [], | ||
"instruction": "Calculate the total number of students playing basketball." | ||
}, | ||
{ | ||
"task_id": "2", | ||
"dependent_task_ids": ["1"], | ||
"instruction": "Determine the number of students on each team using the given ratios." | ||
}, | ||
{ | ||
"task_id": "3", | ||
"dependent_task_ids": ["1", "2"], | ||
"instruction": "Use the inclusion-exclusion principle to find the number of students playing on exactly two teams." | ||
} | ||
] | ||
``` | ||
Since half the students are on the basketball team, the size of the basketball team follows immediately. The sizes of the math and soccer teams then follow from the given 4:3:2 ratio of math to basketball to soccer. Finally, the answer is obtained with the inclusion-exclusion principle. The steps planned by `CodeInterpreter` are reasonable and match how a person would approach the problem. | ||
|
||
2. `CodeInterpreter` writes the following code: | ||
```python | ||
# ------------------------------task1------------------------------ | ||
total_students = 60 | ||
half_basketball = total_students / 2 | ||
half_basketball | ||
# -------------------------------task2----------------------------- | ||
basketball_team = 30 | ||
ratio_math_basketball_soccer = [4, 3, 2] | ||
|
||
# Calculate the total ratio units | ||
total_ratio_units = sum(ratio_math_basketball_soccer) | ||
|
||
# Calculate the number of students per ratio unit | ||
students_per_unit = basketball_team / ratio_math_basketball_soccer[1] | ||
|
||
# Calculate the number of students on each team | ||
math_team = students_per_unit * ratio_math_basketball_soccer[0] | ||
soccer_team = students_per_unit * ratio_math_basketball_soccer[2] | ||
|
||
(math_team, basketball_team, soccer_team) | ||
|
||
# ------------------------------task3------------------------------- | ||
# Given values from previous tasks | ||
math_team = 40.0 | ||
basketball_team = 30 | ||
soccer_team = 20.0 | ||
students_all_three = 8 | ||
|
||
# Total number of students | ||
total_students = 60 | ||
|
||
# Calculate the number of students playing in exactly two teams using the inclusion-exclusion principle | ||
students_two_teams = (math_team + basketball_team + soccer_team) - total_students - (2 * students_all_three) | ||
students_two_teams | ||
``` | ||
### Output | ||
The code generated by `CodeInterpreter` follows the plan exactly and runs successfully, arriving at the correct answer: **14** | ||
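
The result can also be checked by hand. The derivation below is a sketch that uses notation introduced here for illustration (it does not appear in the original): $M$, $B$, $S$ are the sizes of the math, basketball, and soccer teams, and $e_1$, $e_2$, $e_3$ are the numbers of students on exactly one, two, and three teams, with $e_3 = 8$ given.

```latex
\begin{aligned}
B &= \tfrac{60}{2} = 30, \qquad M = \tfrac{4}{3}B = 40, \qquad S = \tfrac{2}{3}B = 20, \\
e_1 + e_2 + e_3 &= 60 && \text{(every student is on at least one team)} \\
e_1 + 2e_2 + 3e_3 &= M + B + S = 90 && \text{(each student is counted once per team)} \\
e_2 + 2e_3 &= 90 - 60 = 30 && \text{(subtract the first equation from the second)} \\
e_2 &= 30 - 2 \cdot 8 = 14 .
\end{aligned}
```

The last line is the same expression that the task 3 code evaluates: the sum of the team sizes, minus the total number of students, minus twice the number of students on all three teams.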
|
83 changes: 83 additions & 0 deletions
src/zh/guide/use_cases/agent/code_interpreter/data_visualization.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
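# Data Visualization | ||
## Overview | ||
Data visualization turns data into a form that is understandable and easy to analyze by means of charts, graphs, and other visual elements. It helps us discover patterns, trends, and correlations in the data and provides insight. Through data visualization, we can better understand what the data means, communicate and explain the results, and support data-driven decision making and communication. | ||
## Example | ||
### Task | ||
Use `CodeInterpreter` to perform a simple data analysis of the sklearn Iris dataset and plot a visualization. | ||
### Code | ||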
```python | ||
import asyncio | ||
from metagpt.roles.code_interpreter import CodeInterpreter | ||
|
||
async def main(requirement: str = ""): | ||
code_interpreter = CodeInterpreter(use_tools=False, goal=requirement) | ||
await code_interpreter.run(requirement) | ||
|
||
if __name__ == "__main__": | ||
requirement = ( | ||
"Run data analysis on sklearn Iris dataset, include a plot." | ||
) | ||
asyncio.run(main(requirement)) | ||
``` | ||
### Execution process | ||
1. `CodeInterpreter` proposes the following tasks: | ||
```json | ||
[ | ||
{ | ||
"task_id": "1", | ||
"dependent_task_ids": [], | ||
"instruction": "Load the Iris dataset from sklearn." | ||
}, | ||
{ | ||
"task_id": "2", | ||
"dependent_task_ids": ["1"], | ||
"instruction": "Perform exploratory data analysis on the Iris dataset." | ||
}, | ||
{ | ||
"task_id": "3", | ||
"dependent_task_ids": ["2"], | ||
"instruction": "Create a plot visualizing the Iris dataset features." | ||
} | ||
] | ||
``` | ||
`CodeInterpreter` breaks the job down into reasonable tasks and runs them in order: loading the data, analyzing the data, and plotting the chart. | ||
|
||
2. `CodeInterpreter` writes the following code: | ||
```python | ||
# ----------------------------------task1------------------------------------ | ||
from sklearn.datasets import load_iris | ||
iris_data = load_iris() | ||
iris_data.keys() | ||
!pip install scikit-learn | ||
from sklearn.datasets import load_iris | ||
iris_data = load_iris() | ||
iris_data.keys() | ||
# ----------------------------------task2------------------------------------ | ||
import pandas as pd | ||
|
||
# Create a DataFrame from the iris dataset | ||
iris_df = pd.DataFrame(iris_data['data'], columns=iris_data['feature_names']) | ||
iris_df['species'] = pd.Categorical.from_codes(iris_data['target'], iris_data['target_names']) | ||
|
||
# Summary statistics | ||
summary_statistics = iris_df.describe() | ||
|
||
# Check for missing values | ||
missing_values = iris_df.isnull().sum() | ||
|
||
(summary_statistics, missing_values) | ||
# ----------------------------------task3------------------------------------ | ||
import matplotlib.pyplot as plt | ||
import seaborn as sns | ||
|
||
# Use seaborn's pairplot to visualize the dataset features | ||
sns.set(style='whitegrid', context='notebook') | ||
iris_pairplot = sns.pairplot(iris_df, hue='species', height=2.5) | ||
plt.show() | ||
``` | ||
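While completing task 1, the first execution failed because `scikit-learn` was not installed in the environment, but `CodeInterpreter` analyzed the error and resolved it by installing `scikit-learn`. In task 3, `CodeInterpreter` uses the `pairplot` function from `seaborn` to draw a scatterplot matrix, which visualizes the relationships between the features in the dataset and distinguishes the species by color. Finally, `plt.show()` displays the figure. | ||
### Output | ||
Below is the figure drawn when `CodeInterpreter` ran the code. The code executed successfully and produced a clear visualization that helps us analyze the features of the dataset. | ||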
<div align=center> | ||
<img src="../../../../../public/image/guide/use_cases/CodeInterpreter/output.png" width="1000" height="1000"> | ||
</div> |