Topics (subject to change per instructor/class whim; will not be presented in this order):

* Linguistic Stack (graphemes/phones - words - syntax - semantics - pragmatics - discourse)
* Tools:
  * Corpora, Corpus statistics, Data cleaning and munging
  * Annotation and crowdwork
  * Evaluation
  * Models/approaches: rule-based, automata/grammars, perceptron, logistic regression, neural network models
  * Effective written and oral communication
* Components/Tasks/Subtasks:
  * Language Models
  * Syntax: POS tags, constituency tree, dependency tree, parsing
  * Semantics: lexical, formal, inference tasks
  * Information Extraction: Named Entities, Relations, Events
  * Generation: Machine Translation, Summarization, Dialogue, Creative Generation