Added a graphrag guide #4978

Merged (4 commits) on Feb 18, 2025
2 changes: 1 addition & 1 deletion README.md
@@ -173,7 +173,7 @@ releases! 🌟

3. Start up the server using the pre-built Docker images:

> The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download an RAGFlow edition different from `v0.16.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0` for the full edition `v0.16.0`.
> The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.16.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0` for the full edition `v0.16.0`.
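
For instance, a minimal sketch of switching to the full edition before startup, assuming `RAGFLOW_IMAGE` sits on its own line in **docker/.env**:

```bash
# Point RAGFLOW_IMAGE at the full edition instead of the slim one.
# (sed shown for brevity; editing docker/.env by hand works just as well.)
sed -i 's|^RAGFLOW_IMAGE=.*|RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0|' docker/.env

# Verify the change took effect before starting the server.
grep '^RAGFLOW_IMAGE=' docker/.env
```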

```bash
$ cd ragflow
```
2 changes: 1 addition & 1 deletion docker/docker-compose-gpu.yml
@@ -1,5 +1,5 @@
# The RAGFlow team does not actively maintain docker-compose-gpu.yml, so use it at your own risk.
# However, you are welcome to file a pull request to improve it.
# Pull requests to improve it are welcome.
include:
- ./docker-compose-base.yml

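A minimal sketch of how this file might be used, assuming it is launched the same way as the standard compose file:

```bash
# Illustrative: start RAGFlow with the GPU-enabled compose file.
# Assumes the usual repository layout; use at your own risk, as noted above.
cd ragflow/docker
docker compose -f docker-compose-gpu.yml up -d
```
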
2 changes: 1 addition & 1 deletion docs/guides/accelerate_question_answering.mdx
@@ -15,7 +15,7 @@ Please note that some of your settings may consume a significant amount of time.
## 1. Accelerate document indexing

- Use GPU to reduce embedding time.
- On the configuration page of your knowledge base, toggle off **Use RAPTOR to enhance retrieval**.
- On the configuration page of your knowledge base, switch off **Use RAPTOR to enhance retrieval**.
- The **Knowledge Graph** chunk method (GraphRAG) is time-consuming.
- Disable **Auto-keyword** and **Auto-question** on the configuration page of your knowledge base, as both depend on the LLM.

76 changes: 76 additions & 0 deletions docs/guides/configure_knowledge_base/construct_knowledge_graph.md
@@ -0,0 +1,76 @@
---
sidebar_position: 2
slug: /construct_knowledge_graph
---

# Construct knowledge graph

Generate a knowledge graph for your knowledge base.

---

To enhance multi-hop question-answering, RAGFlow adds a knowledge graph construction step between data extraction and indexing, as illustrated below. This step creates additional chunks from existing ones generated by your specified chunk method.

![Image](https://github.com/user-attachments/assets/edf0528d-cb46-46fc-aef4-edb98996949b)

As of v0.16.0, RAGFlow supports constructing a knowledge graph on a knowledge base, allowing you to build a *unified* graph across multiple files within it. When a newly uploaded file is parsed, the generated graph updates automatically.

:::danger WARNING
Constructing a knowledge graph requires significant memory, computational resources, and tokens.
:::

## Scenarios

Knowledge graphs are especially useful for multi-hop question-answering involving *nested* logic. They outperform traditional extraction approaches when you are performing question answering on books or works with complex entities and relationships.

## Prerequisites

The system's default chat model is used to generate the knowledge graph. Before proceeding, ensure that you have a chat model properly configured:

![Image](https://github.com/user-attachments/assets/6bc34279-68c3-4d99-8d20-b7bd1dafc1c1)

## Configurations

### Entity types (*Required*)

The types of the entities to extract from your knowledge base. The default types are: **organization**, **person**, **event**, and **category**. Add or remove types to suit your specific knowledge base.

### Method

The method to use to construct the knowledge graph:

- **General**: Use prompts provided by [GraphRAG](https://github.com/microsoft/graphrag) to extract entities and relationships.
- **Light**: (Default) Use prompts provided by [LightRAG](https://github.com/HKUDS/LightRAG) to extract entities and relationships. This option consumes fewer tokens, less memory, and fewer computational resources.

### Entity resolution

Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more accurate graph.

- (Default) Disable entity resolution. This option consumes fewer tokens.
- Enable entity resolution.

### Community report generation

In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See [here](https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/) for more information. This option controls whether community reports are generated:

- Generate community reports.
- (Default) Do not generate community reports. This option consumes fewer tokens.

## Procedure

1. On the **Configuration** page of your knowledge base, switch on **Extract knowledge graph** or adjust its settings as needed, and click **Save** to confirm your changes.

- *The default GraphRAG configurations for your knowledge base are now set, and files uploaded from this point onward will automatically use these settings during parsing.*
- *Files parsed before this update will retain their original knowledge graph settings.*

2. The knowledge graph of your knowledge base does *not* update automatically *until* a newly uploaded file is parsed.

_A **Knowledge Graph** entry appears under **Configuration** once a knowledge graph is created._

3. Click **Knowledge Graph** to view the details of the generated graph.

## Frequently asked questions

### Can I have different knowledge graph settings for different files in my knowledge base?

Yes, you can: each file is parsed with the knowledge graph settings in effect at the time. Still, just one graph is generated per knowledge base; the smaller graphs of your files will be *combined* into one big, unified graph at the end of the graph extraction process.
6 changes: 3 additions & 3 deletions docs/guides/configure_knowledge_base/set_metadata.md
@@ -1,5 +1,5 @@
---
sidebar_position: 0
sidebar_position: 1
slug: /set_metada
---

@@ -9,13 +9,13 @@ Add metadata to an uploaded file

---

On the **Dataset** page of your knowledge base, you can add metadata to any uploaded file. This approach enables you to 'tag' additional information like URL, author, and date, to an existing file or dataset. In an AI-powered chat, such information will be sent to the LLM with the retrieved chunks for content generation.
On the **Dataset** page of your knowledge base, you can add metadata to any uploaded file. This approach enables you to 'tag' additional information like URL, author, date, and more to an existing file or dataset. In an AI-powered chat, such information will be sent to the LLM with the retrieved chunks for content generation.

For example, if you have a dataset of HTML files and want the LLM to cite the source URL when responding to your query, add a `"url"` parameter to each file's metadata.
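
To illustrate the shape only — the field names below are hypothetical examples rather than a fixed schema — the snippet prints the kind of JSON you might paste as a file's metadata:

```bash
# Hypothetical metadata for one HTML file; only the JSON shape matters here.
cat <<'EOF'
{
  "url": "https://example.com/pricing.html",
  "author": "Jane Doe",
  "date": "2024-11-30"
}
EOF
```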

![Image](https://github.com/user-attachments/assets/78cb5035-e96c-43f9-82d7-8fef1b68c843)

:::note TIP
:::tip NOTE
Ensure that your metadata is in JSON format; otherwise, your updates will not be applied.
:::

2 changes: 1 addition & 1 deletion docs/quickstart.mdx
@@ -185,7 +185,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
3. Use the pre-built Docker images and start up the server:

:::tip NOTE
The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download an RAGFlow edition different from `v0.15.1-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.15.1` for the full edition `v0.15.1`.
The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.16.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0` for the full edition `v0.16.0`.
:::
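
As a quick illustrative check, assuming the default repository layout, you can confirm which edition will be pulled before starting the server:

```bash
# Should print infiniflow/ragflow:v0.16.0 after switching to the full edition.
grep '^RAGFLOW_IMAGE=' docker/.env
```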

2 changes: 1 addition & 1 deletion docs/references/faq.md
@@ -87,7 +87,7 @@ Yes, we support enhancing user queries based on existing context of an ongoing c

1. On the **Chat** page, hover over the desired assistant and select **Edit**.
2. In the **Chat Configuration** popup, click the **Prompt Engine** tab.
3. Toggle on **Multi-turn optimization** to enable this feature.
3. Switch on **Multi-turn optimization** to enable this feature.

---

3 changes: 2 additions & 1 deletion docs/release_notes.md
@@ -19,6 +19,7 @@ Released on February 6, 2025.
- New UI language: Portuguese.
- Allows setting metadata for a specific file in a knowledge base to support AI-powered chats.
- Upgrades RAGFlow's document engine [Infinity](https://github.com/infiniflow/infinity) to v0.6.0.dev3.
- Supports GPU acceleration for DeepDoc (see [docker-compose-gpu.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose-gpu.yml)).
- Supports creating and referencing a **Tag** knowledge base as a key milestone towards bridging the semantic gap between query and response.

:::danger IMPORTANT
@@ -96,7 +97,7 @@ Released on December 18, 2024.

### Improvements

- Upgrades the Document Layout Analysis model in Deepdoc.
- Upgrades the Document Layout Analysis model in DeepDoc.
- Significantly enhances the retrieval performance when using [Infinity](https://github.com/infiniflow/infinity) as document engine.

### Related APIs
Expand Down