What is Coreference Resolution
- Identify all mentions that refer to the same real-world entity.
- Applications
- Mention: span of text referring to some entity
For detection: use other NLP systems

| Kinds of mentions | Detection Methods |
| --- | --- |
| Pronouns | part-of-speech tagger |
| Named entities | NER system |
| Noun phrases | parser |
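As a rough illustration of this table, the sketch below collects pronoun and named-entity mentions from tokens that already carry POS and NER tags. The tags are hand-written stand-ins for the output of a real tagger and NER system, `detect_mentions` is a hypothetical helper, and noun-phrase detection (which needs a parser) is left out.

```python
from typing import List, Tuple

# Each token is (text, POS tag, NER tag). The tags are written by hand here
# and stand in for the output of a real POS tagger and NER system.
Token = Tuple[str, str, str]

PRONOUN_TAGS = {"PRP", "PRP$"}  # Penn Treebank pronoun tags


def detect_mentions(tokens: List[Token]) -> List[Tuple[int, int, str]]:
    """Return (start, end, kind) spans over token indices, end exclusive."""
    mentions = []
    i = 0
    while i < len(tokens):
        _, pos, ner = tokens[i]
        if ner != "O":
            # Extend the span over consecutive tokens with the same NER label.
            j = i + 1
            while j < len(tokens) and tokens[j][2] == ner:
                j += 1
            mentions.append((i, j, "named_entity"))
            i = j
        elif pos in PRONOUN_TAGS:
            mentions.append((i, i + 1, "pronoun"))
            i += 1
        else:
            i += 1
    return mentions


sentence = [
    ("Barack", "NNP", "PERSON"),
    ("Obama", "NNP", "PERSON"),
    ("said", "VBD", "O"),
    ("he", "PRP", "O"),
    ("would", "MD", "O"),
    ("visit", "VB", "O"),
    ("Paris", "NNP", "GPE"),
]
mentions = detect_mentions(sentence)
```

Here `mentions` holds the spans for "Barack Obama", "he", and "Paris". Running every detector and keeping all of its output is what tends to over-generate mentions.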
Problems
- bad mentions
- train a classifier to filter out spurious mentions
- keep all mentions as candidate mentions and discard all singleton mentions
- avoid a pipelined system
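The "keep every candidate, then discard singletons" option can be pictured as a post-processing step that runs after clustering. This is a minimal sketch; the cluster dictionary and `drop_singletons` are hypothetical names used only for illustration.

```python
from typing import Dict, List


def drop_singletons(clusters: Dict[int, List[str]]) -> Dict[int, List[str]]:
    """Keep only clusters containing at least two mentions."""
    return {cid: ms for cid, ms in clusters.items() if len(ms) > 1}


# Hypothetical output of a coreference system run over all candidate mentions.
clusters = {0: ["Barack Obama", "he"], 1: ["Paris"], 2: ["the table"]}
kept = drop_singletons(clusters)
```

Spurious candidates that never link to anything end up alone in a cluster, so this step removes them without a separate mention classifier.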
Coreference: when 2 mentions refer to the same entity in the world.
- Anaphora: when a term (anaphor) refers to another term (antecedent).
- not all noun phrases have reference
- not all anaphoric relations are coreferential
Cataphora: the antecedent comes after the anaphor
Coreference Model
Rule-based
Hobbs’ naive algorithm
Winograd Schema
Mention Pair
Core idea
Train a binary classifier that assigns every pair of mentions a probability of being coreferent.
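A minimal sketch of the mention-pair idea, assuming a logistic scorer over three hand-picked features. The `Mention` fields, the feature set, and the weights are all hypothetical; a real system would learn the weights from labeled pairs. Grouping pairs above a threshold by transitive closure is one common way to turn pair probabilities into clusters, not necessarily the one these notes intend.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List


@dataclass
class Mention:
    text: str
    head: str    # lower-cased head word
    gender: str  # "m", "f", or "n"
    sent: int    # sentence index


# Hypothetical hand-set weights; a trained classifier would learn these.
WEIGHTS = {"same_head": 4.0, "gender_match": 2.5, "sentence_gap": -0.8}
BIAS = -2.0


def pair_features(a: Mention, b: Mention) -> Dict[str, float]:
    return {
        "same_head": float(a.head == b.head),
        "gender_match": float(a.gender == b.gender),
        "sentence_gap": float(abs(b.sent - a.sent)),
    }


def coref_probability(a: Mention, b: Mention) -> float:
    """Logistic score for the pair (a, b) being coreferent."""
    feats = pair_features(a, b)
    score = BIAS + sum(WEIGHTS[k] * v for k, v in feats.items())
    return 1.0 / (1.0 + math.exp(-score))


def cluster_by_threshold(mentions: List[Mention], threshold: float = 0.5) -> List[List[int]]:
    """Link every pair above the threshold, then take the transitive closure."""
    parent = list(range(len(mentions)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if coref_probability(mentions[i], mentions[j]) > threshold:
            parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(mentions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


mentions = [
    Mention("Obama", "obama", "m", 0),
    Mention("he", "he", "m", 0),
    Mention("Paris", "paris", "n", 0),
    Mention("Obama", "obama", "m", 1),
]
clusters = cluster_by_threshold(mentions)
```

With these toy weights, the two "Obama" mentions and "he" end up in one cluster and "Paris" stays alone. Each decision looks at a single pair in isolation, and one wrong link can merge two clusters through the transitive closure.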
Training and Test
Training | Test |
---|---|
Disadvantage
- many mentions only have one clear antecedent, but the model is asked to predict every coreferent pair
Compute Probabilities
Non-neural statistical classifier
- features
Simple neural network
End-to-end Model
Cluster-Based Method
Core idea
- start with each mention in its own singleton cluster
- merge a pair of clusters at each step
- use a model to score which cluster merges are good
- a mention-pair decision is difficult, but a cluster-pair decision is easier
- more information is available: a candidate can be compared against multiple mentions in the cluster
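The merge loop can be sketched as greedy agglomerative clustering. This is only an illustration under stated assumptions: the cluster-pair score is taken to be the average of pairwise mention scores, the `SCORES` table is invented toy data standing in for a learned model, and the stopping threshold is arbitrary.

```python
from itertools import combinations
from typing import Callable, List


def agglomerative_coref(
    mentions: List[str],
    pair_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    # Start with each mention in its own singleton cluster.
    clusters = [[m] for m in mentions]

    def cluster_score(a: List[str], b: List[str]) -> float:
        # Assumed scoring rule: average over all cross-cluster mention pairs.
        scores = [pair_score(x, y) for x in a for y in b]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        # Find the best-scoring pair of clusters at this step.
        best_score, (i, j) = max(
            (cluster_score(clusters[i], clusters[j]), (i, j))
            for i, j in combinations(range(len(clusters)), 2)
        )
        if best_score <= threshold:
            break  # no remaining merge looks good enough
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Invented toy pair scores; a real system would use a trained scorer.
SCORES = {
    frozenset(("Google", "the company")): 0.8,
    frozenset(("Google", "it")): 0.3,
    frozenset(("the company", "it")): 0.9,
    frozenset(("Google", "Paris")): 0.1,
    frozenset(("the company", "Paris")): 0.1,
    frozenset(("it", "Paris")): 0.2,
}


def toy_score(a: str, b: str) -> float:
    return SCORES.get(frozenset((a, b)), 0.0)


result = agglomerative_coref(["Google", "the company", "it", "Paris"], toy_score)
```

In this toy data the pair "Google" / "it" scores only 0.3 on its own, yet "Google" still joins the cluster because it is compared against both "the company" and "it" together. That is the extra information a cluster-pair decision has over a single mention-pair decision.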