# KnowUnDo

**To Forget or Not? Towards Practical Knowledge Unlearning for LLMs**

📃 [arXiv](https://arxiv.org/abs/2407.01920) • 🤗 [Dataset](https://huggingface.co/datasets/zjunlp/KnowUnDo)

## 🔔 Overview

We provide KnowUnDo, a benchmark covering copyrighted content and user privacy domains, to evaluate whether the unlearning process inadvertently erases essential knowledge. You can access KnowUnDo directly on [Hugging Face](https://huggingface.co/datasets/zjunlp/KnowUnDo).

To address this, we propose a simple yet effective method, MemFlex, which utilizes gradient information to precisely target and unlearn sensitive parameters.
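Conceptually, MemFlex first localizes a sensitive parameter region and then restricts the unlearning update to it. Below is a minimal PyTorch sketch of that idea; it is illustrative only, and the `sensitive_mask` format is a hypothetical stand-in for the repository's localized knowledge region.

```python
import torch

def selective_unlearn_step(model, forget_batch, sensitive_mask, lr=1e-5):
    """One gradient-ascent step restricted to the localized sensitive region.

    `sensitive_mask` maps parameter names to boolean tensors marking the
    sensitive region (hypothetical format, for illustration only).
    """
    # assumes an HF-style model whose forward returns a .loss when
    # `labels` are included in the batch
    loss = model(**forget_batch).loss
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None:
                continue
            mask = sensitive_mask.get(name)
            if mask is not None:
                # ascend the loss (i.e., forget) only inside the masked region
                param += lr * param.grad * mask
            param.grad = None
    return loss.item()
```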

## 📊 Load Datasets

You can load the datasets as follows.

```python
from datasets import load_dataset

dataset = load_dataset("zjunlp/KnowUnDo", name='copyright', split='unlearn')
```
- Available configuration names and corresponding splits:
  - `copyright`: `unlearn`, `retention`
  - `privacy`: `unlearn`, `retention`
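For instance, you can load all four subsets in one loop and check their sizes:

```python
from datasets import load_dataset

# iterate over every configuration/split pair listed above
for name in ("copyright", "privacy"):
    for split in ("unlearn", "retention"):
        ds = load_dataset("zjunlp/KnowUnDo", name=name, split=split)
        print(f"{name}/{split}: {len(ds)} examples")
```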

## 🚀 How to Run

### Environment Setup

```bash
git clone https://github.com/zjunlp/KnowUnDo.git
cd KnowUnDo
conda create -n KnowUnDo python=3.10

conda activate KnowUnDo
pip install -e .
pip install -r requirements.txt

cd llm_unlearn/apex
pip install -v --no-cache-dir ./
```
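After installing, you can confirm the environment is usable (a minimal sanity check, assuming a CUDA-enabled PyTorch build):

```python
import torch

# verify PyTorch sees a GPU before launching the training scripts
print(torch.__version__, torch.cuda.is_available())

import apex  # should import cleanly after the apex build step above
```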

### Download Large Language Models (LLMs)

```bash
# directory: KnowUnDo
mkdir models
cd models
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
git clone https://huggingface.co/Qwen/Qwen1.5-7B-Chat
```
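If you prefer not to use git-lfs, the same checkpoints can be fetched with `huggingface_hub` (a sketch; gated models such as Llama-2 additionally require an accepted license and a logged-in HF token):

```python
from huggingface_hub import snapshot_download

# download both chat models into the models/ directory created above
for repo in ("meta-llama/Llama-2-7b-chat-hf", "Qwen/Qwen1.5-7B-Chat"):
    snapshot_download(repo_id=repo, local_dir=f"models/{repo.split('/')[-1]}")
```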

### Pretrain LLMs in Our Setting

```bash
# directory: pretrain
bash run_finetune_lora.sh
```
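For orientation, the script fine-tunes with LoRA adapters; a minimal sketch of such a setup with `peft` is below (the hyperparameters are illustrative, not the script's actual values):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("models/Llama-2-7b-chat-hf")
lora_config = LoraConfig(
    r=8,                                  # illustrative rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```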

### Knowledge Localization (Optional)

We have released the localized knowledge regions. You can also perform the localization yourself as follows.

```bash
# directory: pretrain
bash run_localization.sh
```
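For intuition, gradient-based localization can be sketched as follows: accumulate per-parameter gradient magnitudes over the unlearn split and keep the top fraction as the sensitive region. This is an illustrative sketch, not the script's exact procedure, and it materializes one score tensor per parameter:

```python
import torch

def localize(model, forget_loader, keep_ratio=0.01):
    # accumulate absolute gradients of the loss on the forget data
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for batch in forget_loader:
        model.zero_grad()
        model(**batch).loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.abs()
    # threshold so that only the top `keep_ratio` of entries are kept
    flat = torch.cat([s.flatten() for s in scores.values()])
    threshold = flat.kthvalue(int((1 - keep_ratio) * flat.numel())).values
    return {n: s > threshold for n, s in scores.items()}
```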

### Prepare Tokenized Datasets

```bash
# directory: llm_unlearn
cd utils
bash tokenize_datasets.sh
```

- `--val` tokenizes the `val` split of the dataset.
- `--prompt` concatenates `direct_prompt` before the question in the datasets (sketched below).
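Conceptually, the `--prompt` option amounts to prepending `direct_prompt` to each question before tokenization; a rough sketch (the field names `question` and `answer` are hypothetical):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("models/Llama-2-7b-chat-hf")

def tokenize_example(example, direct_prompt=""):
    # prepend the prompt, then tokenize question + answer as one sequence
    text = direct_prompt + example["question"] + example["answer"]
    return tokenizer(text, truncation=True, max_length=512)
```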

### Unlearning Experiments

```bash
# directory: llm_unlearn
bash run_baselines_lora.sh
bash run_ours_lora.sh
```

- Available methods with corresponding arguments:
  - `--unlearn_method gradient_ascent`
  - `--unlearn_method random_label --completely_random True` (named Fine-tuning with Random Labels in the paper)
  - `--unlearn_method random_label --top_k 1 --rm_groundtruth True` (named Unlearning with Adversarial Samples in the paper)
  - `--unlearn_method ascent_plus_descent` (see the sketch after this list)
  - `--unlearn_method ascent_plus_kl_divergence`
  - `--unlearn_method ascent_plus_descent --general True`
  - `--unlearn_method ascent_plus_kl_divergence --general True`
  - `--unlearn_method memflex` (the strong baseline proposed by us)
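For intuition, the `ascent_plus_descent` objective combines gradient ascent on the forget data with ordinary descent on the retain data; a sketch of that loss (illustrative, not the repository's exact formulation):

```python
def ascent_plus_descent_loss(model, forget_batch, retain_batch):
    # maximize loss on the data to forget, minimize it on the data to retain
    forget_loss = model(**forget_batch).loss
    retain_loss = model(**retain_batch).loss
    return -forget_loss + retain_loss
```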

### Eval Unlearned Model

You can evaluate multiple unlearned models in a single run of our script.

```bash
# directory: llm_unlearn
bash run_eval_baselines_lora.sh
```

- `--direct_prompt=True` concatenates `direct_prompt` before the question in the datasets.
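As a rough illustration of scoring model generations against references (the metric choice here is an assumption; the script computes the paper's actual metrics):

```python
import evaluate

rouge = evaluate.load("rouge")

def rouge_l(predictions, references):
    # higher ROUGE-L on the retention split suggests knowledge was preserved
    return rouge.compute(predictions=predictions, references=references)["rougeL"]
```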

## 🎉 Acknowledgement

We would like to express our sincere gratitude to the excellent works Unlearning LLM, TOFU, LLaMA, and Qwen.

## 📖 Citation

If you use or extend our work, please cite the paper as follows:

```bibtex
@article{tian2024forget,
  title={To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models},
  author={Tian, Bozhong and Liang, Xiaozhuan and Cheng, Siyuan and Liu, Qingbin and Wang, Mengru and Sui, Dianbo and Chen, Xi and Chen, Huajun and Zhang, Ningyu},
  journal={arXiv preprint arXiv:2407.01920},
  year={2024}
}
```