# SciSafeEval

## Usage

Clone this repository (the `--recursive` flag also fetches any Git submodules):

```bash
git clone --recursive https://github.com/DavidLee528/SciSafeEval.git
```

## Code Structure

The SciSafeEval code is organized into three primary components:

- **Generator**: interfaces with the target large language models (LLMs) to generate outputs (`code/generator.py`).
- **Probe**: uses the SciSafeEval dataset to probe the LLMs (`code/probe.py`).
- **Detector**: evaluates whether an attack succeeded based on the output from the target LLM (`code/detector.py`).

These components are integrated in `main.py`, the central script that coordinates their interaction; a minimal sketch of that flow follows.
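The actual implementations live in `code/generator.py`, `code/probe.py`, and `code/detector.py`; the sketch below only illustrates how a probe → generate → detect pipeline of this shape might be wired together. All class names, method names, and the dataset path here are hypothetical, not the real SciSafeEval API.

```python
# Hypothetical sketch of the probe -> generate -> detect pipeline.
# Names and paths are illustrative, not the actual SciSafeEval code.

from typing import List


class Probe:
    """Loads SciSafeEval prompts used to probe the target LLM."""

    def __init__(self, dataset_path: str) -> None:
        self.dataset_path = dataset_path

    def prompts(self) -> List[str]:
        # The real code would read prompts from the SciSafeEval dataset;
        # a stub prompt stands in here for illustration.
        return ["<scientific-task prompt from the dataset>"]


class Generator:
    """Interfaces with the target LLM to generate outputs."""

    def generate(self, prompt: str) -> str:
        # Placeholder for a call to the target model's API.
        return "<model output>"


class Detector:
    """Evaluates whether an attack succeeded, given the model output."""

    def is_attack_successful(self, output: str) -> bool:
        # Placeholder success criterion: treat any non-refusal as a success.
        return "refuse" not in output.lower()


def main() -> None:
    probe = Probe("data/scisafeeval.jsonl")  # hypothetical path
    generator = Generator()
    detector = Detector()

    prompts = probe.prompts()
    successes = sum(
        detector.is_attack_successful(generator.generate(p)) for p in prompts
    )
    print(f"Attack success rate: {successes / len(prompts):.2%}")


if __name__ == "__main__":
    main()
```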

## Citation

```bibtex
@article{li2024scisafeeval,
  title={SciSafeEval: A Comprehensive Benchmark for Safety Alignment of Large Language Models in Scientific Tasks},
  author={Li, Tianhao and Lu, Jingyu and Chu, Chuangxin and Zeng, Tianyu and Zheng, Yujia and Li, Mei and Huang, Haotian and Wu, Bin and Liu, Zuoxian and Ma, Kai and others},
  journal={arXiv preprint arXiv:2410.03769},
  year={2024}
}
```