Code and evaluation data for my Master's thesis on flavor detection and bias mitigation.

ppommer/master-thesis

Towards Fairness in NLP: Neural Methods for Flavor Detection and Bias Mitigation

This repository contains the source code for my Master's thesis, "Towards Fairness in NLP: Neural Methods for Flavor Detection and Bias Mitigation".

Evaluation

This folder contains the evaluation results described in Chapter 5: Results and Evaluation. The human evaluation forms and the analysis sheet are located here. The Perspective API evaluation results and scripts are located here. Running the scripts requires a Perspective API key.
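For orientation, a request to the Perspective API can be sketched as follows. This is a minimal illustration and not the evaluation script from this repository: the function names (`build_request`, `extract_scores`, `score_text`) are hypothetical, and requesting only the `TOXICITY` attribute is an assumption, since the attributes used in the thesis are not listed here.

```python
import json
import urllib.request

# Public Perspective API endpoint; the API key is passed as a query parameter.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text, attributes=("TOXICITY",), language="en"):
    """Build the JSON body for a comments:analyze call.

    The default attribute (TOXICITY) is an assumption made for this sketch.
    """
    return {
        "comment": {"text": text},
        "languages": [language],
        "requestedAttributes": {name: {} for name in attributes},
    }


def extract_scores(response):
    """Map each returned attribute name to its summary score value."""
    return {
        name: data["summaryScore"]["value"]
        for name, data in response.get("attributeScores", {}).items()
    }


def score_text(text, api_key):
    """Send one text to the API and return its scores.

    Requires network access and a valid API key.
    """
    body = json.dumps(build_request(text)).encode("utf-8")
    request = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_scores(json.load(reply))
```

Keeping request construction and response parsing separate from the network call makes both easy to check offline; only `score_text` needs the API key.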

Neutralizing Bias

This folder contains the MODULAR and CONCURRENT models described in Chapter 4: Experimental Setup, based on the paper "Automatically Neutralizing Subjective Bias in Text". You can find the original repository here.

Requirements

  1. Set up your environment:
$ virtualenv -p python3 .venv-nb
$ source .venv-nb/bin/activate
$ pip install -r req-nb.txt
$ python
>>> import nltk; nltk.download("punkt")
  2. Download the Wiki Neutrality Corpus (WNC) data here and extract it to the data folder.

  3. Download the MODULAR checkpoint here and save it to the model folder, or train your own model using this script. Please contact me if you need a checkpoint for CONCURRENT, or train your own CONCURRENT model using this script.

  4. Use the model interface or the inference scripts for inference.

Style Transfer Paraphrase

This folder contains STRAP, described in Chapter 4: Experimental Setup and based on the paper "Reformulating Unsupervised Style Transfer as Paraphrase Generation". You can find the original repository here.

Requirements

  1. Set up your environment:
$ virtualenv -p python3 .venv-stp
$ source .venv-stp/bin/activate
$ pip install transformers
$ pip install torch==1.6.0+cu92 torchvision==0.7.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html
$ pip install -r req-stp.txt
  2. Please contact me if you need the preprocessed WNC data.

  3. You can download the Diverse Paraphraser (paraphraser_gpt2_large) here. Save it to the models folder. Please contact me if you need a checkpoint for the Inverse Paraphraser trained on WNC. If you want to train it yourself, please follow the steps described here.

  4. You can use the command line-based interface to interact with the model. It is documented here.

  5. Evaluation requires the SIM model (Wieting et al., 2019). You can download it here and save it to this folder.

Playground

Requirements

  1. Set up your environment:
$ virtualenv -p python3.10 .venv-pg
$ source .venv-pg/bin/activate
$ pip install -r req-pg.txt
  2. Download pretrained GloVe word embeddings and extract the .zip file. Save glove.6B.50d.txt to the data folder.

  3. Follow the descriptions in the notebooks.
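
Once glove.6B.50d.txt is in place, the embeddings can be read with a few lines of plain Python. The sketch below is not taken from the notebooks; `load_glove` and `cosine_similarity` are hypothetical helper names, and the only thing relied on is the GloVe text format (one token per line, followed by its space-separated vector components).

```python
import math


def load_glove(path, vocabulary=None):
    """Read GloVe vectors from a text file into a dict of word -> list of floats.

    If `vocabulary` is given, only words in that collection are kept,
    which saves memory when just a few words are needed.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if vocabulary is not None and word not in vocabulary:
                continue
            embeddings[word] = [float(value) for value in values]
    return embeddings


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

A typical call, assuming the file was saved as described in step 2, would be `vectors = load_glove("data/glove.6B.50d.txt", vocabulary={"he", "she"})` followed by `cosine_similarity(vectors["he"], vectors["she"])`.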
