Type

A type is a distinct word form in a corpus. Take the sentence “The apple boys like boys who like apples”: the two “boys” tokens belong to the same type. Because a type can occur more than once, every type has a frequency, namely the number of tokens that belong to it. Note that “apple” and “apples” are not the same type!
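
The token and type counts for the example sentence can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes tokens are separated by whitespace and that matching is case-sensitive, and the helper name `type_frequencies` is made up for illustration.

```python
from collections import Counter


def type_frequencies(text):
    """Count how often each distinct form (type) occurs in the text.

    Assumption: tokens are whitespace-separated and compared
    case-sensitively, so "apple" and "apples" stay separate types.
    """
    tokens = text.split()
    return Counter(tokens)


sentence = "The apple boys like boys who like apples"
tokens = sentence.split()
freqs = type_frequencies(sentence)

print("tokens:", len(tokens))          # every running word counts
print("types:", len(freqs))            # only distinct forms count
print("boys:", freqs["boys"])          # frequency of the type "boys"
print("apple:", freqs["apple"], "apples:", freqs["apples"])
```

Under these assumptions the sentence has 8 tokens but only 6 types: “boys” and “like” each have frequency 2, while “apple” and “apples” are counted as separate types with frequency 1 each.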

Types are typically unambiguous: more than 80% of the words in the English language are unambiguous. But the types that are ambiguous occur frequently, so almost two thirds of the tokens in a corpus are ambiguous.