Article Rewriter

Create an environment with dependencies specified in environment.yml:

conda env create -f environment.yml

Activate the new environment:

conda activate article-rewriter

Note: all Medium articles inside data/ were collected using a custom Chrome extension.

Add the raw data you want to annotate to data/summarizer/markdown_data.csv and data/paraphraser/markdown_data.csv.

Prepare data:

python src/prepare_data_paraphraser.py
python src/prepare_data_summarizer.py

In data/summarizer/to_annotate_data.csv and data/paraphraser/to_annotate_data.csv, add the markdown back to the text in the target_without_markdown column and put the result in a new column, target_with_markdown.

Save the annotated files as data/summarizer/annotated.csv and data/paraphraser/annotated.csv.
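
Before fine-tuning, it can help to confirm that the annotated files have the expected columns. The following is a minimal sketch, not part of this repository: the helper name check_annotated is hypothetical, and it assumes only the two column names mentioned above and UTF-8 encoded CSV files.

```python
import csv
from pathlib import Path

# Column names taken from the annotation step above.
REQUIRED_COLUMNS = ("target_without_markdown", "target_with_markdown")


def check_annotated(path):
    """Return a list of problems found in an annotated CSV (empty if none).

    Hypothetical helper for illustration; it is not one of the scripts in src/.
    """
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in REQUIRED_COLUMNS if name not in header]
        if missing:
            return [f"missing column: {name}" for name in missing]
        # Records are counted from 1, not including the header row.
        for record_number, row in enumerate(reader, start=1):
            if not (row["target_with_markdown"] or "").strip():
                problems.append(
                    f"record {record_number}: target_with_markdown is empty"
                )
    return problems


if __name__ == "__main__":
    for task in ("summarizer", "paraphraser"):
        path = Path("data") / task / "annotated.csv"
        if not path.exists():
            print(f"{path}: not found")
            continue
        issues = check_annotated(path)
        print(f"{path}: {'ok' if not issues else issues}")
```

Run from the repository root, this prints one line per task and lists any records that still lack annotated markdown.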

Fine-tune models:

python src/finetune_paraphraser.py
python src/finetune_summarizer.py

The resulting models are stored under models/summarizer and models/paraphraser.

Evaluate models:

python src/evaluate_paraphraser.py
python src/evaluate_summarizer.py

Rewrite articles (summarize then paraphrase and add non-paragraph parts back):

python src/rewrite.py
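
The flow described above (summarize, then paraphrase, then add the non-paragraph parts back) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in src/rewrite.py: the function names, the blank-line block splitting, the prefix list used to detect non-paragraph blocks, and the per-paragraph processing are all hypothetical, and summarize and paraphrase stand in for whatever the fine-tuned models expose. Fenced code blocks that contain blank lines are not handled by this simple splitter.

```python
import re

# Assumed markers for blocks that are not plain paragraphs
# (headings, code fences, images, list items, block quotes).
NON_PARAGRAPH_PREFIXES = ("#", "```", "![", "- ", "* ", "> ")


def is_paragraph(block):
    """Treat a block as a paragraph unless it starts with a known marker."""
    return not block.lstrip().startswith(NON_PARAGRAPH_PREFIXES)


def rewrite_article(markdown_text, summarize, paraphrase):
    """Summarize then paraphrase each paragraph; keep other blocks unchanged.

    `summarize` and `paraphrase` are caller-supplied functions mapping
    a string to a string (placeholders for the fine-tuned models).
    """
    # Blocks are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", markdown_text.strip())
    rewritten = []
    for block in blocks:
        if is_paragraph(block):
            rewritten.append(paraphrase(summarize(block)))
        else:
            # Non-paragraph parts go back in their original position.
            rewritten.append(block)
    return "\n\n".join(rewritten)
```

With toy stand-ins such as summarize = lambda text: text.split(".")[0] + "." and paraphrase = str.upper, headings and images pass through untouched while each paragraph is shortened and rewritten.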

Deactivate the environment when you are done:

conda deactivate