
Commit

Merge branch 'main' into devin/1735621211-fix-llm-parameter-case-normalization
theCyberTech authored Jan 5, 2025
2 parents 4c3253e + 440883e commit 56fb691
Showing 55 changed files with 2,681 additions and 1,049 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -21,3 +21,4 @@ crew_tasks_output.json
.mypy_cache
.ruff_cache
.venv
agentops.log
2 changes: 1 addition & 1 deletion docs/concepts/flows.mdx
@@ -138,7 +138,7 @@ print("---- Final Output ----")
print(final_output)
````

``` text Output
```text Output
---- Final Output ----
Second method received: Output from first_method
````
228 changes: 209 additions & 19 deletions docs/concepts/knowledge.mdx
@@ -4,8 +4,6 @@ description: What is knowledge in CrewAI and how to use it.
icon: book
---

# Using Knowledge in CrewAI

## What is Knowledge?

Knowledge in CrewAI is a powerful system that allows AI agents to access and utilize external information sources during their tasks.
@@ -36,7 +34,20 @@ CrewAI supports various types of knowledge sources out of the box:
</Card>
</CardGroup>

## Supported Knowledge Parameters

| Parameter | Type | Required | Description |
| :--------------------------- | :---------------------------------- | :------- | :---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `sources` | **List[BaseKnowledgeSource]** | Yes | List of knowledge sources that provide content to be stored and queried. Can include PDF, CSV, Excel, JSON, text files, or string content. |
| `collection_name` | **str** | No | Name of the collection where the knowledge will be stored. Used to identify different sets of knowledge. Defaults to "knowledge" if not provided. |
| `storage` | **Optional[KnowledgeStorage]** | No | Custom storage configuration for managing how the knowledge is stored and retrieved. If not provided, a default storage will be created. |
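
These parameters apply when a knowledge store is constructed directly rather than attached through the agent- or crew-level `knowledge_sources` shown below. A minimal sketch, assuming a `Knowledge` class importable from `crewai.knowledge.knowledge` that accepts the parameters in the table (the import path and constructor signature are assumptions, not confirmed by this page):

```python Code
from crewai.knowledge.knowledge import Knowledge  # assumed import path
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource

# Build a named knowledge collection from a string source;
# `storage` is omitted, so a default storage would be created.
product_knowledge = Knowledge(
    collection_name="product_docs",
    sources=[
        StringKnowledgeSource(content="Our flagship product ships every Tuesday."),
    ],
)
```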

## Quickstart Example

<Tip>
For file-based knowledge sources, make sure to place your files in a `knowledge` directory at the root of your project.
Also, use relative paths from the `knowledge` directory when creating the source.
</Tip>
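
For reference, the layout the tip assumes looks like this (file names are placeholders):

```text
your_project/
├── knowledge/
│   ├── report.pdf    # referenced as file_paths=["report.pdf"]
│   └── notes.txt     # referenced as file_paths=["notes.txt"]
└── src/
    └── main.py
```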

Here's an example using string-based knowledge:

@@ -80,7 +91,8 @@ result = crew.kickoff(inputs={"question": "What city does John live in and how o
```


Here's another example using `CrewDoclingSource`, which is quite versatile and can handle multiple file formats, including TXT, PDF, DOCX, HTML, and more.

```python Code
from crewai import LLM, Agent, Crew, Process, Task
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
@@ -128,39 +140,217 @@ result = crew.kickoff(
)
```

## More Examples

Here are examples of how to use different types of knowledge sources:

### Text File Knowledge Source
```python
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource

# Create a text file knowledge source
text_source = CrewDoclingSource(
    file_paths=["document.txt", "another.txt"]
)

# Create crew with text file source on agents or crew level
agent = Agent(
    ...
    knowledge_sources=[text_source]
)

crew = Crew(
    ...
    knowledge_sources=[text_source]
)
```

### PDF Knowledge Source
```python
from crewai.knowledge.source.pdf_knowledge_source import PDFKnowledgeSource

# Create a PDF knowledge source
pdf_source = PDFKnowledgeSource(
    file_paths=["document.pdf", "another.pdf"]
)

# Create crew with PDF knowledge source on agents or crew level
agent = Agent(
    ...
    knowledge_sources=[pdf_source]
)

crew = Crew(
    ...
    knowledge_sources=[pdf_source]
)
```

### CSV Knowledge Source
```python
from crewai.knowledge.source.csv_knowledge_source import CSVKnowledgeSource

# Create a CSV knowledge source
csv_source = CSVKnowledgeSource(
    file_paths=["data.csv"]
)

# Create crew with CSV knowledge source on agents or crew level
agent = Agent(
    ...
    knowledge_sources=[csv_source]
)

crew = Crew(
    ...
    knowledge_sources=[csv_source]
)
```

### Excel Knowledge Source
```python
from crewai.knowledge.source.excel_knowledge_source import ExcelKnowledgeSource

# Create an Excel knowledge source
excel_source = ExcelKnowledgeSource(
    file_paths=["spreadsheet.xlsx"]
)

# Create crew with Excel knowledge source on agents or crew level
agent = Agent(
    ...
    knowledge_sources=[excel_source]
)

crew = Crew(
    ...
    knowledge_sources=[excel_source]
)
```

### JSON Knowledge Source
```python
from crewai.knowledge.source.json_knowledge_source import JSONKnowledgeSource

# Create a JSON knowledge source
json_source = JSONKnowledgeSource(
    file_paths=["data.json"]
)

# Create crew with JSON knowledge source on agents or crew level
agent = Agent(
    ...
    knowledge_sources=[json_source]
)

crew = Crew(
    ...
    knowledge_sources=[json_source]
)
```
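
### Combining Multiple Sources

Because `knowledge_sources` is a list, the source types above can be combined on a single agent or crew. A brief sketch (file names are placeholders):

```python
from crewai.knowledge.source.pdf_knowledge_source import PDFKnowledgeSource
from crewai.knowledge.source.csv_knowledge_source import CSVKnowledgeSource

# Mix source types in one knowledge list
pdf_source = PDFKnowledgeSource(file_paths=["manual.pdf"])
csv_source = CSVKnowledgeSource(file_paths=["metrics.csv"])

crew = Crew(
    ...
    knowledge_sources=[pdf_source, csv_source]
)
```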

## Knowledge Configuration

### Chunking Configuration

Knowledge sources automatically chunk content for better processing.
You can configure chunking behavior in your knowledge sources:

```python
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource

source = StringKnowledgeSource(
    content="Your content here",
    chunk_size=4000,      # Maximum size of each chunk (default: 4000)
    chunk_overlap=200     # Overlap between chunks (default: 200)
)
```

The chunking configuration helps in:
- Breaking down large documents into manageable pieces
- Maintaining context through chunk overlap
- Optimizing retrieval accuracy
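
To build intuition for how `chunk_size` and `chunk_overlap` interact, here is a standalone illustration of overlap-based character chunking. This is not CrewAI's internal splitter, only a simplified model of the idea:

```python
def simple_chunk(text: str, chunk_size: int = 4000, chunk_overlap: int = 200) -> list[str]:
    """Illustrative chunker: each chunk starts chunk_size - chunk_overlap
    characters after the previous one, so consecutive chunks share an overlap."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = simple_chunk("A" * 10_000)
print([len(c) for c in chunks])  # [4000, 4000, 2400]
```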

### Embeddings Configuration

You can also configure the embedder for the knowledge store.
This is useful if you want to use a different embedder for the knowledge store than the one used for the agents.
The `embedder` parameter supports various embedding model providers, including:
- `openai`: OpenAI's embedding models
- `google`: Google's text embedding models
- `azure`: Azure OpenAI embeddings
- `ollama`: Local embeddings with Ollama
- `vertexai`: Google Cloud VertexAI embeddings
- `cohere`: Cohere's embedding models
- `bedrock`: AWS Bedrock embeddings
- `huggingface`: Hugging Face models
- `watson`: IBM Watson embeddings

Here's an example of how to configure the embedder for the knowledge store using Google's `text-embedding-004` model:
<CodeGroup>
```python Example
from crewai import Agent, Task, Crew, Process, LLM
from crewai.knowledge.source.string_knowledge_source import StringKnowledgeSource
import os

# Get the GEMINI API key
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")

# Create a knowledge source
content = "Users name is John. He is 30 years old and lives in San Francisco."
string_source = StringKnowledgeSource(
    content=content,
)

# Create an LLM with a temperature of 0 to ensure deterministic outputs
gemini_llm = LLM(
    model="gemini/gemini-1.5-pro-002",
    api_key=GEMINI_API_KEY,
    temperature=0,
)

# Create an agent with the knowledge store
agent = Agent(
    role="About User",
    goal="You know everything about the user.",
    backstory="""You are a master at understanding people and their preferences.""",
    verbose=True,
    allow_delegation=False,
    llm=gemini_llm,
)

task = Task(
    description="Answer the following questions about the user: {question}",
    expected_output="An answer to the question.",
    agent=agent,
)

crew = Crew(
    agents=[agent],
    tasks=[task],
    verbose=True,
    process=Process.sequential,
    knowledge_sources=[string_source],
    embedder={
        "provider": "google",
        "config": {
            "model": "models/text-embedding-004",
            "api_key": GEMINI_API_KEY,
        }
    }
)

result = crew.kickoff(inputs={"question": "What city does John live in and how old is he?"})
```
```text Output
# Agent: About User
## Task: Answer the following questions about the user: What city does John live in and how old is he?
# Agent: About User
## Final Answer:
John is 30 years old and lives in San Francisco.
```
</CodeGroup>
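
The same pattern applies to the other providers listed above; only the `provider` key and its provider-specific `config` entries change. For example, a local Ollama embedder might be configured along these lines (the model name and config keys here are assumptions, not taken from this page):

```python Code
crew = Crew(
    ...
    knowledge_sources=[string_source],
    embedder={
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",  # assumed local embedding model name
        },
    },
)
```
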
## Clearing Knowledge

If you need to clear the knowledge stored in CrewAI, you can use the `crewai reset-memories` command with the `--knowledge` option.

