This directory contains the source code for Chat Copilot's backend web API service. The front end web application component can be found in the webapp/ directory.
To configure and run either the full Chat Copilot application or only the backend API, please view the main instructions.
The following material is under development and may not be complete or accurate.
- build (CopilotChatWebApi)
- run (CopilotChatWebApi)
- [optional] watch (CopilotChatWebApi)
- Open the solution file (CopilotChat.sln) in Visual Studio 2022 or newer.
- In Solution Explorer, right-click on CopilotChatWebApi and select Set as Startup Project.
- Start debugging by pressing F5 or selecting the menu item Debug -> Start Debugging.
- (Optional) To enable support for uploading image file formats such as png, jpg and tiff, there are two options for the SemanticMemory:ImageOcrType section of ./appsettings.json: the Tesseract open source library and Azure Form Recognizer.
  - Tesseract: we have included the Tesseract NuGet package.
    - You will need to obtain one or more tessdata language data files such as eng.traineddata and add them to your ./data directory or the location specified in the SemanticMemory:Services:Tesseract:FilePath setting in ./appsettings.json.
    - Set the Copy to Output Directory value to Copy if newer.
  - Azure Form Recognizer: we have included the Azure.AI.FormRecognizer NuGet package.
    - You will need to obtain an Azure Form Recognizer resource and add the SemanticMemory:Services:AzureFormRecognizer:Endpoint and SemanticMemory:Services:AzureFormRecognizer:Key values to the ./appsettings.json file.
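The colon-separated setting names above map to nested sections in appsettings.json. A minimal sketch of how the OCR settings might look is shown here; the ImageOcrType value and the endpoint and key placeholders are assumptions for illustration, so check the shipped appsettings.json for the exact option names.

```json
{
  "SemanticMemory": {
    // Assumed value for illustration; the shipped appsettings.json lists the accepted options.
    "ImageOcrType": "Tesseract",
    "Services": {
      "Tesseract": {
        // Location of the tessdata files such as eng.traineddata.
        "FilePath": "./data"
      },
      "AzureFormRecognizer": {
        // Placeholders; only needed when using Azure Form Recognizer.
        "Endpoint": "<your Form Recognizer endpoint>",
        "Key": "<your Form Recognizer key>"
      }
    }
  }
}
```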
Running Memory Service
The memory service handles the creation and querying of kernel memory, including cognitive memory and documents.
The memory creation pipeline can run inside the webapi process. In that case, memory creation is synchronous.
No additional configuration is needed.
You can choose either Volatile or TextFile as the SimpleVectorDb implementation.
Alternatively, the memory creation pipeline steps can run in separate processes. In that case, memory creation is asynchronous, which scales better if you have many chat sessions active at the same time or large documents that take minutes to process.
- In ./webapi/appsettings.json, set SemanticMemory:DataIngestion:OrchestrationType to Distributed.
- In ../memorypipeline/appsettings.json, set SemanticMemory:DataIngestion:OrchestrationType to Distributed.
- Make sure the following settings in ./webapi/appsettings.json and ../memorypipeline/appsettings.json point to the same locations on your machine so that both processes can access the data:
  - SemanticMemory:Services:SimpleFileStorage:Directory
  - SemanticMemory:Services:SimpleQueues:Directory
  - SemanticMemory:Services:SimpleVectorDb:Directory

  Do not configure SimpleVectorDb to use Volatile. Volatile storage cannot be shared across processes.
- You need to run both the webapi and the memorypipeline.
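As a rough sketch, the distributed settings listed above could look like this in both ./webapi/appsettings.json and ../memorypipeline/appsettings.json. The directory paths are hypothetical placeholders; any location works as long as both files point to the same one. The option that selects a non-Volatile SimpleVectorDb implementation is not shown because its key name is not given here.

```json
{
  "SemanticMemory": {
    "DataIngestion": {
      "OrchestrationType": "Distributed"
    },
    "Services": {
      // Hypothetical paths; use the same values in both appsettings.json files.
      "SimpleFileStorage": { "Directory": "/shared/chat-copilot/files" },
      "SimpleQueues": { "Directory": "/shared/chat-copilot/queues" },
      "SimpleVectorDb": { "Directory": "/shared/chat-copilot/vectors" }
    }
  }
}
```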
(Optional) Use hosted resources: Azure Storage Account, Azure Cognitive Search
- In ./webapi/appsettings.json and ../memorypipeline/appsettings.json, set SemanticMemory:ContentStorageType to AzureBlobs.
- In ./webapi/appsettings.json and ../memorypipeline/appsettings.json, set SemanticMemory:DataIngestion:DistributedOrchestration:QueueType to AzureQueue.
- In ./webapi/appsettings.json and ../memorypipeline/appsettings.json, set SemanticMemory:DataIngestion:VectorDbTypes:0 to AzureCognitiveSearch.
- In ./webapi/appsettings.json and ../memorypipeline/appsettings.json, set SemanticMemory:Retrieval:VectorDbType to AzureCognitiveSearch.
- Run the following to set up the authentication to the resources:
dotnet user-secrets set SemanticMemory:Services:AzureBlobs:Auth ConnectionString
dotnet user-secrets set SemanticMemory:Services:AzureBlobs:ConnectionString [your secret]
dotnet user-secrets set SemanticMemory:Services:AzureQueue:Auth ConnectionString # Only needed when running distributed processing
dotnet user-secrets set SemanticMemory:Services:AzureQueue:ConnectionString [your secret] # Only needed when running distributed processing
dotnet user-secrets set SemanticMemory:Services:AzureCognitiveSearch:Endpoint [your secret]
dotnet user-secrets set SemanticMemory:Services:AzureCognitiveSearch:APIKey [your secret]
- For more information and other options, please refer to the memorypipeline.
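Taken together, the four hosted-resource settings above correspond to a fragment like the following in each appsettings.json. This is a sketch assembled from the setting names in this section, not a copy of the shipped file; note that VectorDbTypes:0 refers to the first element of an array. Secrets such as connection strings and API keys stay in user-secrets and are not shown.

```json
{
  "SemanticMemory": {
    "ContentStorageType": "AzureBlobs",
    "DataIngestion": {
      "DistributedOrchestration": {
        "QueueType": "AzureQueue"
      },
      // VectorDbTypes:0 is the first entry of this array.
      "VectorDbTypes": [ "AzureCognitiveSearch" ]
    },
    "Retrieval": {
      "VectorDbType": "AzureCognitiveSearch"
    }
  }
}
```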
You can also add OpenAI plugins that will be managed by the webapi (as opposed to being managed by the webapp). Soon, all OpenAI plugins will be managed by the webapi.
By default, a third party OpenAI plugin called Klarna Shopping is already added.
Please refer to here for more details.
If you want to use SequentialPlanner (multi-step) instead of ActionPlanner (single-step), we recommend using gpt-4 or gpt-3.5-turbo as the planner model. SequentialPlanner works best with gpt-4. Using gpt-3.5-turbo will require using a relevancy filter.
To enable sequential planner,
- In ./webapi/appsettings.json, set "Type": "Sequential" under the Planner section.
- Then, set your preferred Planner model (gpt-4 or gpt-3.5-turbo) under the AIService configuration section.
- If using gpt-4, no other changes are required.
- If using gpt-3.5-turbo: change CopilotChatPlanner.cs to initialize SequentialPlanner with a RelevancyThreshold\*.
  - Add using statement to top of file:
using Microsoft.SemanticKernel.Planning.Sequential;
  - The CreatePlanAsync method should return the following line if this._plannerOptions?.Type == "Sequential" is true:
return new SequentialPlanner(this.Kernel, new SequentialPlannerConfig { RelevancyThreshold = 0.75 }).CreatePlanAsync(goal);

  \* The RelevancyThreshold is a number from 0 to 1 that represents how similar a goal is to a function's name/description/inputs. You want to tune that value when using SequentialPlanner to help keep things scoped while not missing out on things that are relevant or including too many things that really aren't. 0.75 is an arbitrary threshold and we recommend developers play around with this number to see what best fits their scenarios.
- Restart the webapi.
- Chat Copilot should be now running locally with SequentialPlanner.
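For reference, the first step above amounts to a fragment like this in ./webapi/appsettings.json. It is a minimal sketch showing only the Type option; any other options already present in the Planner section are omitted, and the model setting under AIService is not shown because its key name is not given here.

```json
{
  "Planner": {
    // Switches from the single-step ActionPlanner to the multi-step SequentialPlanner.
    "Type": "Sequential"
  }
}
```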
Azure Cosmos DB can be used as a persistent chat store for Chat Copilot. Chat stores are used for storing chat sessions, participants, and messages.
In an effort to optimize performance, each container must be created with a specific partition key:
Store | ContainerName | PartitionKey |
---|---|---|
Chat Sessions | chatsessions | /id (default) |
Chat Messages | chatmessages | /chatId |
Chat Memory Sources | chatmemorysources | /chatId |
Chat Participants | chatparticipants | /userId |
For existing customers using Cosmos DB before Release 0.3, we recommend removing the existing Cosmos DB containers and redeploying to benefit from the performance improvement of the new partition schema. To preserve existing chats, containers can be migrated as described here.
By default, the service uses an in-memory volatile memory store that, when the service stops or restarts, forgets all memories. Qdrant is a persistent scalable vector search engine that can be deployed locally in a container or at-scale in the cloud.
To enable the Qdrant memory store, you must first deploy Qdrant locally and then configure the Chat Copilot API service to use it.
Before you get started, make sure you have the following additional requirements in place:
- Docker Desktop for hosting the Qdrant vector search engine.
- Open a terminal and use Docker to pull down the container image.
docker pull qdrant/qdrant
- Change directory to this repo and create a ./data/qdrant directory to use as persistent storage. Then start the Qdrant container on port 6333 using the ./data/qdrant folder as the persistent storage location.
mkdir ./data/qdrant
docker run --name copilotchat -p 6333:6333 -v "$(pwd)/data/qdrant:/qdrant/storage" qdrant/qdrant
To stop the container, run the following in another terminal window:
docker container stop copilotchat; docker container rm copilotchat;
Azure Cognitive Search can be used as a persistent memory store for Chat Copilot. The service uses its vector search capabilities.
Enabling telemetry on CopilotChatApi allows you to capture data about requests to and from the API, allowing you to monitor the deployment and monitor how the application is being used.
To use Application Insights, first create an instance in your Azure subscription that you can use for this purpose.
On the resource overview page, use the copy button in the top right to copy the Connection String, then paste it into the APPLICATIONINSIGHTS_CONNECTION_STRING setting, either as an appsettings value or as a secret.
In addition, there are some custom events, such as PluginFunction, that can tell you how users are using the service.
To access these custom events the suggested method is to use Azure Data Explorer (ADX). To access data from Application Insights in ADX, create a new dashboard and add a new Data Source (use the ellipsis dropdown in the top right).
In the Cluster URI use the following link: https://ade.applicationinsights.io/subscriptions/<Your subscription Id>. The subscription id is shown on the resource page for your Application Insights instance. You can then select the Database for the Application Insights resource.
For more info see Query data in Azure Monitor using Azure Data Explorer.
CopilotChat specific events are in a table called customEvents.
For example, to see the most recent 100 plugin function invocations:
customEvents
| where timestamp between (_startTime .. _endTime)
| where name == "PluginFunction"
| extend plugin = tostring(customDimensions.pluginName)
| extend function = tostring(customDimensions.functionName)
| extend success = tobool(customDimensions.success)
| extend userId = tostring(customDimensions.userId)
| extend environment = tostring(customDimensions.AspNetCoreEnvironment)
| extend pluginFunction = strcat(plugin, '/', function)
| project timestamp, pluginFunction, success, userId, environment
| order by timestamp desc
| limit 100
To report the success rate of plugin functions per environment, first add a parameter to the dashboard to filter by environment.
To list the available environments, set the parameter's Source to Query and use this query:
customEvents
| where timestamp between (['_startTime'] .. ['_endTime']) // Time range filtering
| extend environment = tostring(customDimensions.AspNetCoreEnvironment)
| distinct environment
Name the variable _environment, select Multiple Selection and tick Add empty "Select all" value. Finally, choose Select all as the Default value.
You can then query the success rate with this query:
customEvents
| where timestamp between (_startTime .. _endTime)
| where name == "PluginFunction"
| extend plugin = tostring(customDimensions.pluginName)
| extend function = tostring(customDimensions.functionName)
| extend success = tobool(customDimensions.success)
| extend environment = tostring(customDimensions.AspNetCoreEnvironment)
| extend pluginFunction = strcat(plugin, '/', function)
| summarize Total=count(), Success=countif(success) by pluginFunction, environment
| project pluginFunction, SuccessPercentage = 100.0 * Success/Total, environment
| order by SuccessPercentage asc
You may wish to use the Visual tab to turn on conditional formatting to highlight low success rates or render it as a chart.
Finally you could render this data over time with a query like this:
customEvents
| where timestamp between (_startTime .. _endTime)
| where name == "PluginFunction"
| extend plugin = tostring(customDimensions.pluginName)
| extend function = tostring(customDimensions.functionName)
| extend success = tobool(customDimensions.success)
| extend environment = tostring(customDimensions.AspNetCoreEnvironment)
| extend pluginFunction = strcat(plugin, '/', function)
| summarize Total=count(), Success=countif(success) by pluginFunction, environment, bin(timestamp,1m)
| project pluginFunction, SuccessPercentage = 100.0 * Success/Total, environment, timestamp
| order by timestamp asc
Then use a Time chart on the Visual tab.
Though plugins can contain both semantic and native functions, Chat Copilot currently only supports plugins of isolated types due to import limitations, so you must separate your plugins into respective folders for each.
If you wish to load custom plugins into the kernel or planner:
- Create two new folders under the ./Plugins directory named ./SemanticPlugins and ./NativePlugins. There, you can add your custom plugins.
- Then, uncomment the respective options in appsettings.json:
"Service": {
  // "TimeoutLimitInS": "120"
  "SemanticPluginsDirectory": "./Plugins/SemanticPlugins",
  "NativePluginsDirectory": "./Plugins/NativePlugins"
  // "KeyVault": ""
  // "InMaintenance": true
},
- By default, custom plugins are only loaded into planner's kernel for discovery at runtime. If you want to load the plugins into the core chat Kernel, you'll have to add the plugin registration into the AddSemanticKernelServices method of SemanticKernelExtensions.cs. Uncomment the line with services.AddKernelSetupHook and pass in the RegisterPluginsAsync hook:
internal static IServiceCollection AddSemanticKernelServices(this IServiceCollection services)
{
    ...
    // Add any additional setup needed for the kernel.
    // Uncomment the following line and pass in your custom hook.
    services.AddKernelSetupHook(RegisterPluginsAsync);
    return services;
}
If you want to deploy your custom plugins with the webapi, additional configuration is required. You have the following options:
- [Recommended] Create custom setup hooks to import your plugins into the kernel and planner.

  The default RegisterPluginsAsync function uses reflection to import native functions from your custom plugin files. C# reflection is a powerful but slow mechanism that dynamically inspects and invokes types and methods at runtime. It works well for loading a few plugin files, but it can degrade performance and increase memory usage if you have many plugins or complex types. Therefore, we recommend creating your own import function to load your custom plugins manually. This way, you can avoid reflection overhead and have more control over how and when your plugins are loaded.

  Create a function to load your custom plugins at build and pass that function as a hook to AddKernelSetupHook or AddPlannerSetupHook in SemanticKernelExtensions.cs. See the next two sections for details on how to do this. This bypasses the need to load the plugins at runtime, and consequently, there's no need to ship the source files for your custom plugins. Remember to comment out the NativePluginsDirectory or SemanticPluginsDirectory options in appsettings.json to prevent any potential pathing errors.

Alternatively,

- If you want to use local files for custom plugins and don't mind exposing your source code, you need to make sure that the files are copied to the output directory when you publish or run the app. The deployed app expects to find the files in a subdirectory specified by the NativePluginsDirectory or SemanticPluginsDirectory option, which is relative to the assembly location by default.

  To copy the files to the output directory, mark the files and the subdirectory as Copy to Output Directory in the project file or the file properties. For example, if your files are in subdirectories called Plugins\NativePlugins and Plugins\SemanticPlugins, you can uncomment the following lines in the CopilotChatWebApi.csproj file:
<Content Include="Plugins\NativePlugins\*.*">
  <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
<Content Include="Plugins\SemanticPlugins\*.*">
  <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
- Change the respective directory option to use an absolute path or a different base path, but make sure that the files are accessible from that location.
Chat Copilot's Semantic Kernel can be customized with additional plugins or settings by using a custom hook that performs any complementary setup of the kernel. A custom hook is a delegate that takes an IServiceProvider and an IKernel as parameters and performs any desired actions on the kernel, such as registering additional plugins, setting kernel options, adding dependency injections, importing data, etc. To use a custom hook, you can pass it as an argument to the AddKernelSetupHook call in the AddSemanticKernelServices method of SemanticKernelExtensions.cs.
For example, the following code snippet shows how to create a custom hook that registers a plugin called MyPlugin and passes it to AddKernelSetupHook:
// Define a custom hook that registers MyPlugin with the kernel
private static Task MyCustomSetupHook(IServiceProvider sp, IKernel kernel)
{
    // Import your plugin into the kernel with the name "MyPlugin"
    kernel.ImportFunctions(new MyPlugin(), nameof(MyPlugin));

    // Perform any other setup actions on the kernel
    // ...

    // The hook returns a Task, so signal completion when no async work is needed
    return Task.CompletedTask;
}
Then in the AddSemanticKernelServices method of SemanticKernelExtensions.cs, pass your hook into the services.AddKernelSetupHook call:
internal static IServiceCollection AddSemanticKernelServices(this IServiceCollection services)
{
    ...
    // Add any additional setup needed for the kernel.
    // Uncomment the following line and pass in your custom hook.
    services.AddKernelSetupHook(MyCustomSetupHook);
    return services;
}
The planner uses a separate kernel instance that can be configured with plugins that are specific to the planning process. Note that these plugins will be persistent across all chat requests.
To customize the planner's kernel, you can use a custom hook that registers plugins at build time. A custom hook is a delegate that takes an IServiceProvider and an IKernel as parameters and performs any desired actions on the kernel. By default, the planner will register plugins using SemanticKernelExtensions.RegisterPluginsAsync to load files from the Service.SemanticPluginsDirectory and Service.NativePluginsDirectory option values in appsettings.json.
To use a custom hook, you can pass it as an argument to the AddPlannerSetupHook call in the AddPlannerServices method of SemanticKernelExtensions.cs, which will invoke the hook after the planner's kernel is created. See section above for an example of a custom hook function.
Note: This will override the call to RegisterPluginsAsync.
Then in the AddPlannerServices method of SemanticKernelExtensions.cs, pass your hook into the services.AddPlannerSetupHook call:
internal static IServiceCollection AddPlannerServices(this IServiceCollection services)
{
    ...
    // Register any custom plugins with the planner's kernel.
    services.AddPlannerSetupHook(MyCustomSetupHook);
    return services;
}