A dataset of words with continuous formality annotations

ee-2/I-ForGer

I-ForGer

I-ForGer is a German-language lexicon whose words are ordered along the informal-formal dimension. The lexicon is provided in CSV format and assigns each word a continuous formality score between -1 (most informal) and +1 (most formal).
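A minimal sketch of how such a CSV lexicon might be read and sorted by formality. The column names `word` and `score` and the sample rows are assumptions for illustration only; check the header of the actual file and adjust the parameters accordingly.

```python
import csv
import io


def load_lexicon(csv_text, word_col="word", score_col="score"):
    """Parse lexicon CSV text into a {word: formality score} dict.

    The default column names are assumptions, not the documented header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lexicon = {}
    for row in reader:
        score = float(row[score_col])
        # Scores are expected to lie between -1 (informal) and +1 (formal).
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {row[word_col]!r}: {score}")
        lexicon[row[word_col]] = score
    return lexicon


# Placeholder rows with invented scores, NOT taken from I-ForGer.
SAMPLE = (
    "word,score\n"
    "beispiel_formal,0.8\n"
    "beispiel_neutral,0.0\n"
    "beispiel_informell,-0.6\n"
)

lexicon = load_lexicon(SAMPLE)
# Order the words from most formal to most informal.
most_formal_first = sorted(lexicon, key=lexicon.get, reverse=True)
print(most_formal_first)
```

To read the real file, pass its contents to `load_lexicon`, for example via `open(path, encoding="utf-8").read()`, where `path` points to the downloaded CSV.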

The 3,000 words were selected using lexicographic resources and sentence-based similarity computations. Their formality was assessed by crowdworkers, all native speakers of German, using Best-Worst Scaling.
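For readers unfamiliar with Best-Worst Scaling, the sketch below shows the commonly used counting procedure: each item's score is the share of times it was chosen as "best" (here: most formal) minus the share of times it was chosen as "worst" (most informal), which yields values between -1 and +1. Whether the authors aggregated the judgments in exactly this way is an assumption; the cited paper is the authoritative description. The item names and judgments are placeholders.

```python
from collections import Counter


def bws_scores(annotations):
    """Turn Best-Worst Scaling judgments into scores in [-1, 1].

    annotations: iterable of (items, best, worst), where `items` is the
    tuple shown to the annotator and `best`/`worst` are the chosen items.
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, chosen_best, chosen_worst in annotations:
        seen.update(items)       # how often each item was presented
        best[chosen_best] += 1   # times picked as most formal
        worst[chosen_worst] += 1 # times picked as most informal
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}


# Placeholder judgments over four hypothetical items, not real annotation data.
judgments = [
    (("A", "B", "C", "D"), "A", "D"),
    (("A", "B", "C", "D"), "A", "C"),
    (("A", "B", "C", "D"), "B", "D"),
]

scores = bws_scores(judgments)
for item, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {score:+.2f}")
```

An item that is always picked as most formal ends up at +1, and one that is always picked as most informal ends up at -1, which matches the score range of the lexicon.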

When using I-ForGer, please cite:

@inproceedings{Eder21,
    title = "Acquiring a Formality-Informed Lexical Resource for Style Analysis",
    author = "Eder, Elisabeth and
              Krieg-Holz, Ulrike and
              Hahn, Udo",
    booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
    month = apr,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.eacl-main.174",
    doi = "10.18653/v1/2021.eacl-main.174",
    pages = "2028--2041"
}
