NomosArtificial / static-eval Public

Static Evaluation of the Legal Reasoning and Legal Knowledge of Large Language Models

This repo contains LegalBench's training set along with scripts for evaluating LLMs on that data.

The functions needed for eval can be found here.

Notebooks and Scripts

Example notebook that runs the evaluation on the training set, using the OpenAI APIs as well as Modal and Baseten for inference on open-source LLMs: https://github.com/NomosArtificial/static-eval/blob/main/static-eval/eval_notebook.ipynb
Example script for hosting a model on Modal.com for inference (follows https://modal.com/docs/guide/ex/falcon_gptq): https://github.com/NomosArtificial/static-eval/blob/main/inference_scripts/modal_falcon7B.py
Tutorial for deploying Falcon on Baseten: https://www.baseten.co/blog/deploy-falcon-40b-on-baseten
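
The general shape of a static evaluation run can be sketched as below. This is a minimal illustration, not the code from this repo: the names `evaluate`, `normalize`, and `dummy_generate` are hypothetical, the toy examples are made up, and exact-match accuracy is assumed as the metric purely for simplicity. The `generate` callable stands in for whichever backend is used (OpenAI, Modal, or Baseten).

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and trim so that 'Yes ' and 'yes' compare equal."""
    return text.strip().lower()


def evaluate(
    examples: Iterable[Tuple[str, str]],
    generate: Callable[[str], str],
) -> float:
    """Return exact-match accuracy of `generate` over (prompt, label) pairs.

    Hypothetical helper: the metric and signature are assumptions,
    not the repo's actual evaluation functions.
    """
    total = 0
    correct = 0
    for prompt, label in examples:
        prediction = generate(prompt)
        correct += normalize(prediction) == normalize(label)
        total += 1
    if total == 0:
        raise ValueError("no examples to evaluate")
    return correct / total


def dummy_generate(prompt: str) -> str:
    """Stand-in for a real inference backend; answers by keyword only."""
    return "Yes" if "contract" in prompt.lower() else "No"


# Made-up toy data, only to exercise the loop.
toy_examples = [
    ("Is this clause part of a contract?", "Yes"),
    ("Is this sentence hearsay?", "No"),
    ("Does the contract allow assignment?", "No"),
]

accuracy = evaluate(toy_examples, dummy_generate)
print(f"exact-match accuracy: {accuracy:.3f}")
```

Keeping the model behind a plain `prompt -> text` callable means the same loop can be reused across hosted APIs and self-hosted models; only the `generate` function changes.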
