StormHub/TinyToolBox.AI

Lightweight AI application evaluation templates

The LLM-as-a-Judge approach uses large language models (LLMs) to evaluate AI-generated text against predefined criteria. The repository provides evaluation templates for the following scenarios; a sketch of the general shape of such a judge prompt follows the list.

  • Battle
  • ClosedQA
  • Humor
  • Factuality
  • Moderation
  • Security
  • Summarization
  • SQL
  • Translation
  • Fine-tuned binary classifiers
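
For illustration only, a judge prompt of this kind can be expressed as a Semantic Kernel prompt template. The wording below is a hypothetical sketch of a factuality-style judge, not the exact template shipped in this repository; the {{$input}}, {{$output}} and {{$expected}} variables mirror the parameter names used in the example further down.

// Hypothetical judge prompt (illustrative wording only); the variables use
// Semantic Kernel's {{$...}} template syntax.
const string factualityJudgePrompt =
    """
    You are grading an AI-generated answer for factual accuracy.
    Question: {{$input}}
    Submitted answer: {{$output}}
    Expected answer: {{$expected}}
    State whether the submitted answer matches the expected answer
    and give a score between 0 and 1.
    """;

// Turn the prompt into an invokable function (assumes a configured 'kernel').
var judge = kernel.CreateFunctionFromPrompt(factualityJudgePrompt);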

Example: a quick test of AI application output

Note that the names and parameters in the JSON payload must match the individual prompt templates listed above.

// Set up a Semantic Kernel instance with a ChatCompletion service first,
// then create PromptExecutionSettings and set 'Temperature'.
const string isThisFunny = "I am a brown fox";

// Each top-level key selects an evaluation template; the nested properties
// supply the parameters that template expects.
var json =
    $$"""
    {
        "humor" : {
            "output" : "{{isThisFunny}}"
        },
        "factuality" : {
            "input" : "What color was Cotton?",
            "output": "white",
            "expected": "white"
        }
    }
    """;

// Results are keyed by evaluator name; Item1 carries the result, Item2 the score.
await foreach (var result in kernel.Run(json, executionSettings: executionSettings))
{
    Console.WriteLine($"[{result.Key}]: result: {result.Value?.Item1}, score: {result.Value?.Item2}");
}
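
The kernel and execution settings used above are created beforehand, as the first two comments note. A minimal setup sketch, assuming the OpenAI connector (the model id and API key below are placeholders):

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Build a kernel backed by a chat-completion service.
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId: "gpt-4o-mini", apiKey: "<your-api-key>")
    .Build();

// A low temperature keeps the judge's verdicts close to deterministic.
var executionSettings = new OpenAIPromptExecutionSettings { Temperature = 0 };

Any other chat-completion connector supported by Semantic Kernel (for example Azure OpenAI) can be used in the same way.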

A complete example is available here.
