
Commit

Merge branch 'main' into cdifonzo/basic-documentation
christinedif authored Jun 21, 2024
2 parents 2d86134 + 6ab1a0d commit 0b57d60
Showing 24 changed files with 832 additions and 606 deletions.
23 changes: 10 additions & 13 deletions TRANSPARENCY.md
@@ -7,28 +7,25 @@ GraphRAG is an AI-based content interpretation and search capability. Using LLMs
GraphRAG is able to connect information across large volumes of text and use these connections to answer questions that are difficult or impossible to answer using keyword and vector-based search mechanisms. This lets a system using GraphRAG answer questions where the answers span many documents, as well as thematic questions such as “what are the top themes in this dataset?”

## What are GraphRAG’s intended use(s)?
GraphRAG is intended to support critical information discovery and analysis use cases where the information required to arrive at a useful insight spans many documents, is noisy, is mixed with mis and/or dis-information, or when the questions users aim to answer are more abstract or thematic than the underlying data can directly answer.

GraphRAG is designed to be used in settings where users are already trained on responsible analytic approaches and critical reasoning is expected. GraphRAG is capable of providing high degrees of insight on complex information topics, however human analysis by a domain expert of the answers is needed in order to verify and augment GraphRAG’s generated responses.

GraphRAG is intended to be deployed and used with a domain specific corpus of text data. GraphRAG itself does not collect user data, but users are encouraged to verify data privacy policies of the chosen LLM used to configure GraphRAG.
- GraphRAG is intended to support critical information discovery and analysis use cases where the information required to arrive at a useful insight spans many documents, is noisy, is mixed with misinformation and/or disinformation, or where the questions users aim to answer are more abstract or thematic than the underlying data can directly answer.
- GraphRAG is designed to be used in settings where users are already trained on responsible analytic approaches and critical reasoning is expected. GraphRAG is capable of providing high degrees of insight on complex information topics; however, human analysis of the answers by a domain expert is needed to verify and augment GraphRAG’s generated responses.
- GraphRAG is intended to be deployed and used with a domain-specific corpus of text data. GraphRAG itself does not collect user data, but users are encouraged to verify the data privacy policies of the chosen LLM used to configure GraphRAG.

## How was GraphRAG evaluated? What metrics are used to measure performance?

GraphRAG has been evaluated in multiple ways. The primary concerns are 1) accurate representation of the dataset, 2) providing transparency and groundedness of responses, 3) resilience to prompt and data corpus injection attacks, and 4) low hallucination rates. Details on how each of these concerns has been evaluated are outlined below.
1. Accurate representation of the dataset has been tested by both manual inspection and automated testing against a “gold answer” that is created from randomly selected subsets of a test corpus.
2. GraphRAG has been tested against datasets with known confusors and noise in multiple domains. These tests include both automated evaluation of answer detail (as compared to vector search approaches) and manual inspection using questions that are known to be difficult or impossible for other search systems to answer.
3. Transparency and groundedness of responses are tested via automated answer-coverage evaluation and human inspection of the underlying context returned.
4. We test both user prompt injection attacks (“jailbreaks”) and cross prompt injection attacks (“data attacks”) using manual and semi-automated techniques.
5. Hallucination rates are evaluated using claim coverage metrics, manual inspection of answers and sources, and adversarial attacks that attempt to force hallucinations using exceptionally challenging datasets.

## What are the limitations of GraphRAG? How can users minimize the impact of GraphRAG’s limitations when using the system?
GraphRAG depends on a well-constructed indexing examples. For general applications (e.g. content oriented around people, places, organizations, things, etc.) we provide example indexing prompts. For unique datasets effective indexing can depend on proper identification of domain-specific concepts.

Indexing is a relatively expensive operation; a best practice to mitigate indexing is to create a small test dataset in the target domain to ensure indexer performance prior to large indexing operations.
- GraphRAG depends on well-constructed indexing examples. For general applications (e.g., content oriented around people, places, organizations, and things) we provide example indexing prompts. For unique datasets, effective indexing can depend on proper identification of domain-specific concepts.
- Indexing is a relatively expensive operation; a best practice to mitigate indexing cost is to create a small test dataset in the target domain and confirm indexer performance before running large indexing operations (see the sketch after this list).
- GraphRAG is designed to accept well-formatted UTF-8 text only. Input data that does not conform to this specification may cause indexing to fail or behave unreliably.
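
As an illustration of that best practice, the following sketch (not part of this repository) samples a small, UTF-8-validated subset of a larger corpus into a separate directory that can be indexed first. The paths, file pattern, and sample size are assumptions for the example, not GraphRAG defaults.

```python
# Minimal sketch, assuming a directory of UTF-8 .txt documents; names below are illustrative.
import random
import shutil
from pathlib import Path

CORPUS_DIR = Path("data/full_corpus")  # assumed location of the full domain corpus
TEST_DIR = Path("data/test_corpus")    # small subset used to validate indexing behavior
SAMPLE_SIZE = 25                       # arbitrary; pick enough documents to exercise the domain


def is_valid_utf8(path: Path) -> bool:
    """Return True only if the file decodes cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def build_test_corpus() -> None:
    """Copy a random, UTF-8-clean sample of documents into the test directory."""
    TEST_DIR.mkdir(parents=True, exist_ok=True)
    candidates = [p for p in CORPUS_DIR.glob("*.txt") if is_valid_utf8(p)]
    for doc in random.sample(candidates, min(SAMPLE_SIZE, len(candidates))):
        shutil.copy(doc, TEST_DIR / doc.name)


if __name__ == "__main__":
    build_test_corpus()
```

Indexing this small corpus first makes it cheap to iterate on prompts and configuration before committing to a full-scale run.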

## What operational factors and settings allow for effective and responsible use of GraphRAG?
GraphRAG is designed for use by users with domain sophistication and experience working through difficult information challenges. While the approach is generally robust to injection attacks and identifying conflicting sources of information, the system is designed for trusted users. Proper human analysis of responses is important to generate reliable insights, and the provenance of information should be traced to ensure human agreement with the inferences made as part of the answer generation.

GraphRAG yields the most effective results on natural language text data that is collectively focused on an overall topic or theme, and that is entity rich – entities being people, places, things, or objects that can be uniquely identified.

While GraphRAG has been evaluated for its resilience to prompt and data corpus injection attacks, and has been probed for specific types of harms, the LLM that the user configures with GraphRAG may produce inappropriate or offensive content, which may make it inappropriate to deploy for sensitive contexts without additional mitigations that are specific to the use case and model. Developers should assess outputs for their context and use available safety classifiers, model specific safety filters and features (such as [https://azure.microsoft.com/en-us/products/ai-services/ai-content-safety](https://azure.microsoft.com/en-us/products/ai-services/ai-content-safety)), or custom solutions appropriate for their use case.
- GraphRAG is designed for users with domain sophistication and experience working through difficult information challenges. While the approach is generally robust to injection attacks and capable of identifying conflicting sources of information, the system is designed for trusted users. Proper human analysis of responses is important to generate reliable insights, and the provenance of information should be traced to ensure human agreement with the inferences made as part of the answer generation.
- GraphRAG yields the most effective results on natural language text data that is collectively focused on an overall topic or theme and that is entity-rich (entities being people, places, things, or objects that can be uniquely identified).
- GraphRAG has been evaluated for its resilience to prompt and data corpus injection attacks and has been probed for specific types of harms. However, the LLM that the user configures with GraphRAG may produce inappropriate or offensive content, which may make it unsuitable to deploy in sensitive contexts without additional mitigations that are specific to the use case and model. Developers should assess outputs for their context and use available safety classifiers, model-specific safety filters and features (such as [Azure AI Content Safety](https://azure.microsoft.com/en-us/products/ai-services/ai-content-safety)), or custom solutions appropriate for their use case. The use of content safety filters is recommended to prevent cross prompt injection attacks (XPIA) and user prompt injection attacks (UPIA), as well as to limit harmful content generation by malicious users; a minimal filtering sketch follows this list. Discretion is advised when modifying or removing filters for applications that require it.
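
As a hedged illustration of that recommendation, the sketch below (not part of this diff) screens a generated answer with the Azure AI Content Safety text API before returning it to a user. It assumes the azure-ai-contentsafety Python package; the endpoint, key, severity threshold, and response field names reflect the 1.0.0 SDK and may differ in other versions.

```python
# Minimal sketch, assuming `pip install azure-ai-contentsafety`.
# The endpoint, key, and severity threshold are placeholders, not GraphRAG settings.
import os

from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

SEVERITY_THRESHOLD = 2  # severity buckets are 0, 2, 4, 6; tune per use case

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
)


def is_answer_safe(answer: str) -> bool:
    """Return False if any harm category meets or exceeds the severity threshold."""
    result = client.analyze_text(AnalyzeTextOptions(text=answer))
    return all(
        (item.severity or 0) < SEVERITY_THRESHOLD
        for item in result.categories_analysis
    )
```

The same check can be applied to user queries before they reach the LLM, which is one way to blunt UPIA-style inputs alongside the mitigations described above.
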
Binary file modified backend/graphrag-wheel/graphrag-0.0.1-py3-none-any.whl
Binary file not shown.
3 changes: 1 addition & 2 deletions backend/graphrag-wheel/note.txt
@@ -2,5 +2,4 @@ This graphrag wheel file was built from the following repo

https://github.com/microsoft/graphrag

on commit hash 6afec2ae421eabf468e1da8595e1e23d4bda5ecb

on commit hash b860d08a907e834166edf03e617d9cfeac946a64
4 changes: 0 additions & 4 deletions backend/run-indexing-job.py
@@ -9,14 +9,10 @@

parser = argparse.ArgumentParser(description="Kickoff indexing job.")
parser.add_argument("-i", "--index-name", required=True)
parser.add_argument("-s", "--storage-name", required=True)
parser.add_argument("-e", "--entity-config", required=False)
args = parser.parse_args()

asyncio.run(
    _start_indexing_pipeline(
        index_name=args.index_name,
        storage_name=args.storage_name,
        entity_config_name=args.entity_config,
    )
)
4 changes: 2 additions & 2 deletions backend/src/aks-batch-job-template.yaml
@@ -8,8 +8,8 @@ kind: Job
metadata:
  name: PLACEHOLDER
spec:
  ttlSecondsAfterFinished: 0
  backoffLimit: 30
  ttlSecondsAfterFinished: 120
  backoffLimit: 6
  template:
    metadata:
      labels:
2 changes: 1 addition & 1 deletion backend/src/api/experimental.py
@@ -131,7 +131,7 @@ def stream_response(report_df, query, end_callback=(lambda x: x), timeout=300):
this_directory = os.path.dirname(
    os.path.abspath(inspect.getfile(inspect.currentframe()))
)
data = yaml.safe_load(open(f"{this_directory}/pipeline_settings.yaml"))
data = yaml.safe_load(open(f"{this_directory}/pipeline-settings.yaml"))
# layer the custom settings on top of the default configuration settings of graphrag
parameters = create_graphrag_config(data, ".")
