General Information

Train and run the Seq2Tag model for Grammatical Error Correction (GEC) in Ukrainian.

Installation

Currently, there is no PyPI package for this project, but I hope to add it soon!

First, install Poetry. Then run poetry install from the root of the project. This installs all required dependencies.

Training

At the moment, there is no CLI command to train the model.
However, you can do it directly from code:

from ua_gec import Corpus
from gec.seq2tag import Seq2TagManager

# Load the UA-GEC corpus. Any custom list of documents compatible with the
# UA-GEC Python package annotation can be passed instead.
corpus = Corpus(partition="all", annotation_layer="gec-only")
seq2tag = Seq2TagManager(corpus=corpus, min_error_occurrence=3)
seq2tag.train()
seq2tag.push()  # log in to your Hugging Face account first

Accuracy & Performance

The model was trained only on the GEC part of the UA-GEC dataset.
It reaches an F0.5 score of 0.6707 on the UNLP 2023 Shared Task in Grammatical Error Correction for Ukrainian. The model is not intended for production use; it serves as a foundation for training larger models using synthetic data.
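
For readers unfamiliar with the metric, F0.5 is the weighted harmonic mean of precision and recall with beta = 0.5, which weights precision more heavily than recall. The sketch below only illustrates the standard formula. The function name f_beta and the input values are made up for illustration, and this is not the shared task's official scorer.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall (standard F-beta)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both precision and recall are zero.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


if __name__ == "__main__":
    # Illustrative values only; these are NOT results of this model.
    precision, recall = 0.8, 0.4
    print(f"F0.5 = {f_beta(precision, recall):.4f}")
    print(f"F1   = {f_beta(precision, recall, beta=1.0):.4f}")
```

With beta below 1, a system that makes few wrong corrections is rewarded more than one that catches many errors but also introduces false edits, which is the usual preference in GEC evaluation.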

Because the model predicts a transformation tag for each token instead of rewriting the text, inference is fast. Correcting the UA-GEC test dataset (1509 documents) with 3 stages takes only ~82 seconds on a single GPU, or roughly 18 documents per second.
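
To make the tagging idea concrete, here is a minimal sketch of how per-token tags can be turned back into corrected text, and how several passes can be chained. Everything in it is an assumption for illustration: the tag names ($KEEP, $DELETE, $REPLACE_x, $APPEND_x) follow the style of GECToR-like taggers and may not match this project's vocabulary, "stages" is assumed to mean repeated tagging passes, and toy_predict is a rule-based stand-in for the real model.

```python
from typing import Callable, List

# Hypothetical tag names; the project's actual tag vocabulary may differ.
KEEP = "$KEEP"
DELETE = "$DELETE"
REPLACE_PREFIX = "$REPLACE_"
APPEND_PREFIX = "$APPEND_"


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one tag per token and return the edited token list."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    output: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == KEEP:
            output.append(token)
        elif tag == DELETE:
            continue  # drop the token
        elif tag.startswith(REPLACE_PREFIX):
            output.append(tag[len(REPLACE_PREFIX):])
        elif tag.startswith(APPEND_PREFIX):
            output.append(token)
            output.append(tag[len(APPEND_PREFIX):])
        else:
            raise ValueError(f"unknown tag: {tag}")
    return output


def correct(
    tokens: List[str],
    predict: Callable[[List[str]], List[str]],
    max_stages: int = 3,
) -> List[str]:
    """Run up to max_stages tagging passes, stopping early if nothing changes."""
    for _ in range(max_stages):
        tags = predict(tokens)
        if all(tag == KEEP for tag in tags):
            break
        tokens = apply_tags(tokens, tags)
    return tokens


def toy_predict(tokens: List[str]) -> List[str]:
    """Rule-based stand-in for a trained tagger (illustration only)."""
    tags = [KEEP] * len(tokens)
    for i, token in enumerate(tokens):
        if i > 0 and token == tokens[i - 1]:
            tags[i] = DELETE  # repeated word
        elif token == "has" and i > 0 and tokens[i - 1] == "I":
            tags[i] = REPLACE_PREFIX + "have"  # agreement error
    return tags


if __name__ == "__main__":
    print(correct(["I", "has", "a", "a", "cat"], toy_predict))
```

Each pass is a single classification over the input tokens rather than token-by-token generation, which is why this family of models tends to be fast at inference time.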

Interface

We use Gradio to interact with the model. The interface expects a Seq2Tag model to explain predictions.

To start the web interface, please run poetry run gradio interface.py.

BonySmoke/grammar-tag
