about (#1): added short description and Role of Etica.AI

EticaAI · Dec 2, 2020 · d675777
1 parent 470a109

Showing 2 changed files with 47 additions and 26 deletions.
diff --git a/README.en.md b/README.en.md
@@ -1,14 +1,27 @@
-# EticaAI Linguistic Datasets PT
-** [public draft Linguistic data sets for Portuguese
-with flexible licenses. **
-
-(Note: this document has sample text in Portuguese to test the
-`bin / translate-readme` tool.)
-
-The aim of this project, inspired by the spirit of [FOSS] (https://pt.wikipedia.org/wiki/Software_livre_e_de_c%C3%B3digo_aberto),
-is to list sources of knowledge representations that, depending on the language
-and local cultures, cannot be imported. They require special attention,
-multidisciplinary character, and that ideally _ should already be ready and
-acceptably validated_: when they do not exist, at best, they can
-force them to be made by a non-specialist and impair quality, and at worst, until
-prevent the production of innovative technologies.
+# Linguistic data sets in Portuguese via cooperation with communities
+**[work in progress] Permanent project to coordinate the creation and updating
+of linguistic data sets (such as those that can be used to detect
+discrimination and hate speech), preferably validated by
+representatives of the affected groups or by subject matter experts. Dedicated to
+the public domain.**
+
+## Role of Etica.AI
+
+Unlike [EticaAI/linguistic-datasets-portuguese](https://github.com/EticaAI/linguistic-datasets-portuguese)
+(which is a list of different linguistic data sets
+in Portuguese from various sources), this repository contains
+references to the data sets themselves, with Etica.AI serving as the
+organization that enables collaboration on an ongoing basis.
+
+Linguistic datasets in Portuguese are rare and not very complete and, when they exist,
+they are often under a restricted-use license. Our
+work here, which goes as far as permitting commercial use, has the potential to help
+with automation (such as the detection of verbal attacks).
+
+## Role of people in the community
+
+(...)
+
+## Working files
+- HXL-CPLP-Publico
+  - <https://drive.google.com/drive/u/1/folders/1VLm29IBV6iOnfagRKKD8cLntDAjIjL0z>
diff --git a/README.md b/README.md
@@ -1,18 +1,26 @@
-# EticaAI Linguistic Datasets PT
-**[rascunho público Conjuntos de dados linguísticos para língua portuguesa
-com licença flexíveis.**
+# Conjuntos de dados linguísticos em português via cooperação com comunidades
+**[trabalho em progresso] Projeto permanente para coordenar a criação e atualização
+de conjuntos de dados linguísticos (como os que podem ser usados para detectar
+discriminação e discursos de ódio) preferencialmente validados por pessoas
+representantes dos grupos afetados ou de especialistas do assunto. Dedicado ao
+domínio público.**
 
-(Nota: este documento está com texto de exempl em português para testar a
-ferramenta `bin/translate-readme`.)
+## Papel da Etica.AI
 
-O objetivo desse projeto, inspirado pelo espírito de [FOSS](https://pt.wikipedia.org/wiki/Software_livre_e_de_c%C3%B3digo_aberto),
-é listar fontes de representações de conhecimento que, ao depender da língua
-e das culturas locais, não podem ser importadas. Requerem atenção especial, de
-caráter multidisciplinar, e que idealmente _já deveria estar prontas e
-aceitavelmente validadas_: quando não existem, na melhor das hipóteses, podem
-forçar serem feitas por não especialista e prejudicar qualidade, e na pior, até
-impedir a produção de tecnologias inovadoras.
+Diferente do [EticaAI/linguistic-datasets-portuguese](https://github.com/EticaAI/linguistic-datasets-portuguese)
+(que é uma lista para diferentes conjuntos de dados
+linguísticos em português de diversas fontes) este repositório contém
+referência para os próprios conjuntos de dados onde Etica.AI serve como
+organização para permitir colaboração de forma permanente.
 
+Datasets linguísticos em português são raros, pouco completos e, quando existem,
+frequentemente estão em licença de uso restrito. A importância do nosso
+trabalho aqui, de até mesmo liberar uso comercial, tem potencial para ajudar
+em automações (como detecção de ataques verbais).
+
+## Papel de pessoas da comunidade
+
+(...)
 
 ## Arquivos de trabalho
 - HXL-CPLP-Publico