diff --git a/docs/core-concepts/Tools.md b/docs/core-concepts/Tools.md
index f3564d83b2..1fa4a6fe15 100644
--- a/docs/core-concepts/Tools.md
+++ b/docs/core-concepts/Tools.md
@@ -105,6 +105,7 @@ Here is a list of the available tools and their descriptions:
 | **CodeInterpreterTool** | A tool for interpreting python code. |
 | **ComposioTool** | Enables use of Composio tools. |
 | **CSVSearchTool** | A RAG tool designed for searching within CSV files, tailored to handle structured data. |
+| **DALL-E Tool** | A tool for generating images using the DALL-E API. |
 | **DirectorySearchTool** | A RAG tool for searching within directories, useful for navigating through file systems. |
 | **DOCXSearchTool** | A RAG tool aimed at searching within DOCX documents, ideal for processing Word files. |
 | **DirectoryReadTool** | Facilitates reading and processing of directory structures and their contents. |
@@ -121,6 +122,7 @@ Here is a list of the available tools and their descriptions:
 | **MDXSearchTool** | A RAG tool tailored for searching within Markdown (MDX) files, useful for documentation. |
 | **PDFSearchTool** | A RAG tool aimed at searching within PDF documents, ideal for processing scanned documents. |
 | **PGSearchTool** | A RAG tool optimized for searching within PostgreSQL databases, suitable for database queries. |
+| **Vision Tool** | A tool for extracting text from images. |
 | **RagTool** | A general-purpose RAG tool capable of handling various data sources and types. |
 | **ScrapeElementFromWebsiteTool** | Enables scraping specific elements from websites, useful for targeted data extraction. |
 | **ScrapeWebsiteTool** | Facilitates scraping entire websites, ideal for comprehensive data collection. |
diff --git a/docs/tools/DALL-ETool.md b/docs/tools/DALL-ETool.md
new file mode 100644
index 0000000000..a315c7c101
--- /dev/null
+++ b/docs/tools/DALL-ETool.md
@@ -0,0 +1,41 @@
+# DALL-E Tool
+
+## Description
+This tool gives the Agent the ability to generate images using the DALL-E model. DALL-E is a transformer-based model that generates images from textual descriptions. With this tool, the Agent can generate images from the text input provided by the user.
+
+## Installation
+Install the crewai_tools package
+```shell
+pip install 'crewai[tools]'
+```
+
+## Example
+
+Remember that when using this tool, the text must be generated by the Agent itself. The text must be a description of the image you want to generate.
+
+```python
+from crewai_tools import DallETool
+
+Agent(
+    ...
+    tools=[DallETool()],
+)
+```
+
+If needed, you can also tweak the parameters of the DALL-E model by passing them as arguments to the `DallETool` class. For example:
+
+```python
+from crewai_tools import DallETool
+
+dalle_tool = DallETool(model="dall-e-3",
+                       size="1024x1024",
+                       quality="standard",
+                       n=1)
+
+Agent(
+    ...
+    tools=[dalle_tool]
+)
+```
+
+The parameters are based on the `client.images.generate` method from the OpenAI API. For more information on the parameters, please refer to the [OpenAI API documentation](https://platform.openai.com/docs/guides/images/introduction?lang=python).
diff --git a/docs/tools/VisionTool.md b/docs/tools/VisionTool.md
new file mode 100644
index 0000000000..5ed7f7d3c9
--- /dev/null
+++ b/docs/tools/VisionTool.md
@@ -0,0 +1,30 @@
+# Vision Tool
+
+## Description
+
+This tool is used to extract text from images. When passed to the agent, it extracts the text from the image and uses it to generate a response, report, or any other output. The URL or the path of the image should be passed to the Agent.
+
+
+## Installation
+Install the crewai_tools package
+```shell
+pip install 'crewai[tools]'
+```
+
+## Usage
+
+To use the VisionTool, set the OpenAI API key in the `OPENAI_API_KEY` environment variable.
+
+```python
+from crewai_tools import VisionTool
+
+vision_tool = VisionTool()
+
+@agent
+def researcher(self) -> Agent:
+    return Agent(
+        config=self.agents_config["researcher"],
+        allow_delegation=False,
+        tools=[vision_tool]
+    )
+```
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index 22158a0a4c..207b68b228 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -162,6 +162,7 @@ nav:
     - File Read: 'tools/FileReadTool.md'
     - Selenium Scraper: 'tools/SeleniumScrapingTool.md'
     - Directory RAG Search: 'tools/DirectorySearchTool.md'
+    - DALL-E Tool: 'tools/DALL-ETool.md'
     - PDF RAG Search: 'tools/PDFSearchTool.md'
     - TXT RAG Search: 'tools/TXTSearchTool.md'
     - CSV RAG Search: 'tools/CSVSearchTool.md'
@@ -170,6 +171,7 @@ nav:
     - Docx Rag Search: 'tools/DOCXSearchTool.md'
     - MDX RAG Search: 'tools/MDXSearchTool.md'
     - PG RAG Search: 'tools/PGSearchTool.md'
+    - Vision Tool: 'tools/VisionTool.md'
     - Website RAG Search: 'tools/WebsiteSearchTool.md'
     - Github RAG Search: 'tools/GitHubSearchTool.md'
     - Code Docs RAG Search: 'tools/CodeDocsSearchTool.md'