# KnowUnDo

**To Forget or Not? Towards Practical Knowledge Unlearning for LLMs**

📃 [arXiv](https://arxiv.org/abs/2407.01920) • 🤗 [Dataset](https://huggingface.co/datasets/zjunlp/KnowUnDo)

## 🔔 Overview

We provide KnowUnDo, a benchmark covering copyrighted content and user privacy domains, to evaluate whether the unlearning process inadvertently erases essential knowledge. You can access KnowUnDo directly on [Hugging Face](https://huggingface.co/datasets/zjunlp/KnowUnDo).

To address this, we propose a simple yet effective method, MemFlex, which utilizes gradient information to precisely target and unlearn sensitive parameters.
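Conceptually, MemFlex first localizes a sensitive parameter region and then restricts the unlearning update to it. Below is a minimal PyTorch sketch of that idea; it is illustrative only, and the `sensitive_mask` format is a hypothetical stand-in for the repository's localized knowledge region.

```python
import torch

def selective_unlearn_step(model, forget_batch, sensitive_mask, lr=1e-5):
    """One gradient-ascent step restricted to the localized sensitive region.

    `sensitive_mask` maps parameter names to boolean tensors marking the
    sensitive region (hypothetical format, for illustration only).
    """
    # assumes an HF-style model whose forward returns a .loss when
    # `labels` are included in the batch
    loss = model(**forget_batch).loss
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is None:
                continue
            mask = sensitive_mask.get(name)
            if mask is not None:
                # ascend the loss (i.e., forget) only inside the masked region
                param += lr * param.grad * mask
            param.grad = None
    return loss.item()
```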

## 📊 Load Datasets

You can load the datasets as follows.

```python
from datasets import load_dataset

dataset = load_dataset("zjunlp/KnowUnDo", name='copyright', split='unlearn')
```
- Available configuration names and corresponding splits:
  - `copyright`: `unlearn`, `retention`
  - `privacy`: `unlearn`, `retention`
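For instance, you can load all four subsets in one loop and check their sizes:

```python
from datasets import load_dataset

# iterate over every configuration/split pair listed above
for name in ("copyright", "privacy"):
    for split in ("unlearn", "retention"):
        ds = load_dataset("zjunlp/KnowUnDo", name=name, split=split)
        print(f"{name}/{split}: {len(ds)} examples")
```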

## 🚀 How to Run

### Environment Setup

```bash
git clone https://github.com/zjunlp/KnowUnDo.git
cd KnowUnDo
conda create -n KnowUnDo python=3.10

conda activate KnowUnDo
pip install -e .
pip install -r requirements.txt

cd llm_unlearn/apex
pip install -v --no-cache-dir ./
```
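After installing, you can confirm the environment is usable (a minimal sanity check, assuming a CUDA-enabled PyTorch build):

```python
import torch

# verify PyTorch sees a GPU before launching the training scripts
print(torch.__version__, torch.cuda.is_available())

import apex  # should import cleanly after the apex build step above
```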

### Download Large Language Models (LLMs)

```bash
# directory: KnowUnDo
mkdir models
cd models
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
git clone https://huggingface.co/Qwen/Qwen1.5-7B-Chat
```
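If you prefer not to use git-lfs, the same checkpoints can be fetched with `huggingface_hub` (a sketch; gated models such as Llama-2 additionally require an accepted license and a logged-in HF token):

```python
from huggingface_hub import snapshot_download

# download both chat models into the models/ directory created above
for repo in ("meta-llama/Llama-2-7b-chat-hf", "Qwen/Qwen1.5-7B-Chat"):
    snapshot_download(repo_id=repo, local_dir=f"models/{repo.split('/')[-1]}")
```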

### Pretrain LLMs in Our Setting

```bash
# directory: pretrain
bash run_finetune_lora.sh
```
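For orientation, the script fine-tunes with LoRA adapters; a minimal sketch of such a setup with `peft` is below (the hyperparameters are illustrative, not the script's actual values):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("models/Llama-2-7b-chat-hf")
lora_config = LoraConfig(
    r=8,                                  # illustrative rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```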

### Knowledge Localization (Optional)

We have released the localized knowledge regions. You can also perform the localization yourself as follows.

```bash
# directory: pretrain
bash run_localization.sh
```
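For intuition, gradient-based localization can be sketched as follows: accumulate per-parameter gradient magnitudes over the unlearn split and keep the top fraction as the sensitive region. This is an illustrative sketch, not the script's exact procedure, and it materializes one score tensor per parameter:

```python
import torch

def localize(model, forget_loader, keep_ratio=0.01):
    # accumulate absolute gradients of the loss on the forget data
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for batch in forget_loader:
        model.zero_grad()
        model(**batch).loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.abs()
    # threshold so that only the top `keep_ratio` of entries are kept
    flat = torch.cat([s.flatten() for s in scores.values()])
    threshold = flat.kthvalue(int((1 - keep_ratio) * flat.numel())).values
    return {n: s > threshold for n, s in scores.items()}
```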

### Prepare Tokenized Datasets

```bash
# directory: llm_unlearn
cd utils
bash tokenize_datasets.sh
```

- `--val` tokenizes the `val` split of the dataset.
- `--prompt` concatenates `direct_prompt` before the question in the datasets (sketched below).
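Conceptually, the `--prompt` option amounts to prepending `direct_prompt` to each question before tokenization; a rough sketch (the field names `question` and `answer` are hypothetical):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("models/Llama-2-7b-chat-hf")

def tokenize_example(example, direct_prompt=""):
    # prepend the prompt, then tokenize question + answer as one sequence
    text = direct_prompt + example["question"] + example["answer"]
    return tokenizer(text, truncation=True, max_length=512)
```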

### Unlearning Experiments

```bash
# directory: llm_unlearn
bash run_baselines_lora.sh
bash run_ours_lora.sh
```

- Available methods with corresponding arguments:
  - `--unlearn_method gradient_ascent`
  - `--unlearn_method random_label --completely_random True` (named Fine-tuning with Random Labels in the paper)
  - `--unlearn_method random_label --top_k 1 --rm_groundtruth True` (named Unlearning with Adversarial Samples in the paper)
  - `--unlearn_method ascent_plus_descent` (see the sketch after this list)
  - `--unlearn_method ascent_plus_kl_divergence`
  - `--unlearn_method ascent_plus_descent --general True`
  - `--unlearn_method ascent_plus_kl_divergence --general True`
  - `--unlearn_method memflex` (the strong baseline proposed by us)
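For intuition, the `ascent_plus_descent` objective combines gradient ascent on the forget data with ordinary descent on the retain data; a sketch of that loss (illustrative, not the repository's exact formulation):

```python
def ascent_plus_descent_loss(model, forget_batch, retain_batch):
    # maximize loss on the data to forget, minimize it on the data to retain
    forget_loss = model(**forget_batch).loss
    retain_loss = model(**retain_batch).loss
    return -forget_loss + retain_loss
```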

### Eval Unlearned Model

You can evaluate multiple unlearned models in a single run of our script.

```bash
# directory: llm_unlearn
bash run_eval_baselines_lora.sh
```

- `--direct_prompt=True` concatenates `direct_prompt` before the question in the datasets.
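As a rough illustration of scoring model generations against references (the metric choice here is an assumption; the script computes the paper's actual metrics):

```python
import evaluate

rouge = evaluate.load("rouge")

def rouge_l(predictions, references):
    # higher ROUGE-L on the retention split suggests knowledge was preserved
    return rouge.compute(predictions=predictions, references=references)["rougeL"]
```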

## 🎉 Acknowledgement

We would like to express our sincere gratitude to the excellent works Unlearning LLM, TOFU, LLaMA, and Qwen.

## 📖 Citation

If you use or extend our work, please cite the paper as follows:

```bibtex
@article{tian2024forget,
  title={To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models},
  author={Tian, Bozhong and Liang, Xiaozhuan and Cheng, Siyuan and Liu, Qingbin and Wang, Mengru and Sui, Dianbo and Chen, Xi and Chen, Huajun and Zhang, Ningyu},
  journal={arXiv preprint arXiv:2407.01920},
  year={2024}
}
```