---
output: github_document
title: The digitised and annotated Holle List of the barrier island languages, off the west coast of Sumatra, Indonesia
author: '[Gede Primahadi Wijaya Rajeg](https://www.ling-phil.ox.ac.uk/people/gede-rajeg) <a itemprop="sameAs" content="https://orcid.org/0000-0002-2047-8621" href="https://orcid.org/0000-0002-2047-8621" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon"></a><br>University of Oxford, UK and Universitas Udayana, Indonesia'
link-citations: true
bibliography: references.bib
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
<!-- badges: start -->
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip)
<!-- badges: end -->
## Overview
This repository hosts the work-in-progress digitised and curated dataset of the Holle List vocabularies of languages spoken off the west coast of Sumatra, Indonesia [@holle1987; @holleli1980]. The goal is to enable computational matching of the vocabularies of these languages with the [main, digitised Holle List](https://engganolang.github.io/digitised-holle-list/) [@rajeg2023]. The matching will supply these vocabularies with the English, Dutch, and Indonesian translations carried over from the main Holle List.
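As a rough illustration of this kind of matching, the sketch below joins a regional word list to the main list with base R's `merge()`. It assumes both tables share an item-number column; the column names (`holle_id`, `form`, `english`, `dutch`, `indonesian`) and all rows are hypothetical placeholders invented for the example, not the repository's actual schema or data.

```r
# Hypothetical excerpt of the main list: item number plus translations.
# Column names and rows are illustrative assumptions, not real data.
main_list <- data.frame(
  holle_id   = c("1", "2", "3"),
  english    = c("body", "head", "hair"),
  dutch      = c("lichaam", "hoofd", "haar"),
  indonesian = c("tubuh", "kepala", "rambut"),
  stringsAsFactors = FALSE
)

# Hypothetical regional word list keyed by the same item number.
# The forms are placeholders, not attested words.
regional <- data.frame(
  holle_id = c("1", "3", "9"),
  form     = c("form_a", "form_b", "form_c"),
  stringsAsFactors = FALSE
)

# Left join: keep every regional entry and carry over the translations
# wherever the item number also exists in the main list.
matched <- merge(regional, main_list, by = "holle_id", all.x = TRUE)

# Entries without a counterpart in the main list get NA translations,
# which makes them easy to list for manual checking.
unmatched <- matched[is.na(matched$english), ]
```

Keeping unmatched entries (`all.x = TRUE`) rather than dropping them means that items needing manual review remain visible in the output.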
### Notes
- The following files in the `plaintexts` directory are complete:
    - `lekon.txt` (✅)
    - `simalur.txt` (✅)
    - `tapah.txt` (✅)
    - `mentawai1933.txt` (✅)

    These files will be supplemented with additional data from other languages (see the next note).
- The lists of [Mentawai]{style="color:crimson"} (not the "1933" one), [Nias (1905 and 1911)]{style="color:crimson"}, [Salang and Sigule]{style="color:crimson"}, [Sigulei and Salang]{style="color:crimson"}, and [Seumalur]{style="color:crimson"} were digitised by my students in the English Lexicology and Lexicography course of the Bachelor of English Literature programme, Faculty of Humanities, Udayana University. This was part of a class project to introduce the students to the WeSay app.
- UPDATE: The `.sfm` plain-text files from the WeSay projects still need further processing (in particular, matching the main word list with the notes).
### References