With dependency parsing, you examine the dependencies between the words of a sentence in order to determine its grammatical structure. You parse an input sequence by starting at the root of the input, and then you jump from token to token based on binary grammatical relationships (dependencies).
If you have jumped to every part of the sentence, then you have parsed the sentence. The parse you get is a **directed graph**.
Walking through the figure above: you start at the root, which points to the main verb of the sentence. Then you go to the subject of that verb, then to the argument of the verb, and then to the modifiers of that argument, for instance "morning".
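To make the walkthrough concrete, here is a minimal sketch using spaCy's pretrained parser (assuming spacy and its en_core_web_sm model are installed; the example sentence is an assumption, not the original figure):

```python
# Minimal sketch: print each token's head and typed dependency relation.
# Assumes spaCy and the en_core_web_sm model are installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("I prefer the morning flight")

for token in doc:
    # Every token points to its head via a typed binary relation.
    print(f"{token.text:10} --{token.dep_}--> {token.head.text}")

# The main verb is the root: spaCy marks it with the ROOT relation.
root = next(tok for tok in doc if tok.dep_ == "ROOT")
print("root:", root.text)  # prefer
```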
So there is no constituent hierarchy: you just describe the sentence based on the relationships that exist between the tokens. These relations are mostly based on structure rather than meaning, though some can also be based on meaning.
Dependency parsing explicitly and directly encodes information about relationships across tokens that is often quite buried and hard to see in constituency-based trees, such as those produced by a context-free grammar.
Dependency parsing also deals very well with morphologically complex languages that have free word order (unlike English), without needing a highly specific grammar. For instance, an affix could mark the subject. For these types of languages, chunking or constituency-based parsing works worse. So a dependency parse is essentially a set of words connected together by binary grammatical relationships.
The main content words in the sentence are called heads, and the relations are anchored at the heads.
Relations express syntactic roles like subject and object, typically relative to the verb.
There are also modifiers of heads: in "blue sky", blue is a modifier of the head sky.
After you have analysed a sentence, you can store it as a graph, and you can store a graph as a set of two-item tuples (head, dependent).
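For instance, a minimal sketch of this representation (the sentence, indices, and relation labels are illustrative):

```python
# Tokens are numbered from 1; index 0 stands for an artificial root node.
sentence = ["I", "prefer", "the", "morning", "flight"]

# Unlabelled parse: a set of (head, dependent) pairs.
arcs = {(0, 2), (2, 1), (2, 5), (5, 3), (5, 4)}

# Adding the relation label turns each pair into a typed dependency.
typed_arcs = {
    (0, 2, "root"),
    (2, 1, "nsubj"),
    (2, 5, "obj"),
    (5, 3, "det"),
    (5, 4, "compound"),
}
```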
We call this structure a dependency tree: a directed graph where
- There is one root node with no incoming arc.
- Every other vertex has exactly one incoming arc.
- There is exactly one path from the root to each other vertex.
These conditions are necessary for the structure to be the kind of dependency tree we care about: every token gets exactly one head and there are no cycles.
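A small sketch of checking the three conditions (the function and arc encoding are assumptions; vertex 0 plays the role of the root):

```python
def is_dependency_tree(n, arcs):
    """Check whether (head, dependent) pairs over vertices 0..n form a tree."""
    heads = {}
    for head, dep in arcs:
        if dep in heads:                 # two incoming arcs: not a tree
            return False
        heads[dep] = head
    if 0 in heads:                       # the root must have no incoming arc
        return False
    if set(heads) != set(range(1, n + 1)):   # every other vertex needs one
        return False
    # Unique path from the root: following head pointers from any vertex
    # must reach the root without revisiting a vertex (i.e. no cycles).
    for v in range(1, n + 1):
        seen = set()
        while v != 0:
            if v in seen:
                return False
            seen.add(v)
            v = heads[v]
    return True

print(is_dependency_tree(5, {(0, 2), (2, 1), (2, 5), (5, 3), (5, 4)}))  # True
```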
Dependency parsers are trained on special treebanks called dependency treebanks: corpora in which linguists have annotated typed, binary, directed dependency relations.
If you don't have a dependency treebank, you can also use deterministic approaches to derive dependencies from a constituency-based treebank, but this does not work very well. A trained linguist can do better.
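In practice, dependency treebanks such as Universal Dependencies are distributed in the tab-separated CoNLL-U format, where each token line records its head index and relation. A minimal reading sketch (the snippet is hand-written for illustration, not copied from a real treebank):

```python
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
conllu = (
    "1\tI\tI\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tprefer\tprefer\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tthe\tthe\tDET\t_\t_\t5\tdet\t_\t_\n"
    "4\tmorning\tmorning\tNOUN\t_\t_\t5\tcompound\t_\t_\n"
    "5\tflight\tflight\tNOUN\t_\t_\t2\tobj\t_\t_\n"
)

arcs = []
for line in conllu.splitlines():
    cols = line.split("\t")
    # Store each dependency as a (head, dependent, label) triple.
    arcs.append((int(cols[6]), int(cols[0]), cols[7]))

print(arcs)  # [(2, 1, 'nsubj'), (0, 2, 'root'), (5, 3, 'det'), ...]
```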
Label accuracy (LS) checks only the relation label: a dependency counts as correct if it carries the right label, even if it attaches to the wrong head.
The unlabelled attachment score (UAS) does not check the label: a dependency counts as correct as long as it comes from the correct head, even if the label is wrong.
The labelled attachment score (LAS) combines the two criteria: it scores how many estimated dependencies have the same head and dependent as the gold standard, and also the correct relation type.
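A minimal sketch of computing all three scores on one sentence (the gold and predicted arcs are made up for illustration; each maps a dependent to its (head, label) pair):

```python
gold = {1: (2, "nsubj"), 2: (0, "root"), 3: (5, "det"),
        4: (5, "compound"), 5: (2, "obj")}
pred = {1: (2, "nsubj"), 2: (0, "root"), 3: (5, "det"),
        4: (5, "amod"), 5: (2, "obj")}      # one label error, heads all correct

n = len(gold)
uas = sum(pred[d][0] == head for d, (head, _) in gold.items()) / n   # head only
ls  = sum(pred[d][1] == lab for d, (_, lab) in gold.items()) / n     # label only
las = sum(pred[d] == arc for d, arc in gold.items()) / n             # both

print(f"UAS={uas:.2f}  LS={ls:.2f}  LAS={las:.2f}")  # UAS=1.00 LS=0.80 LAS=0.80
```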