What is Coreference Resolution

  • Identify all mentions that refer to the same real-world entity.
  • Applications
    • full text understanding
    • machine translation (languages differ in how they mark gender, number, dropped pronouns, etc.)
    • dialogue systems

Coreference Resolution Steps

  1. Detect the mentions (easy)
  2. Cluster the mentions (hard)

Mention Detection

  • Mention: span of text referring to some entity
  • For detection: use other NLP systems

    | Kinds of mentions | Detection methods |
    | --- | --- |
    | Pronouns | part-of-speech tagger |
    | Named entities | NER system |
    | Noun phrases | parser |

  • Problems

    • bad mentions
      • train a classifier to filter out spurious mentions
      • or keep all mentions as candidate mentions and discard all singleton mentions after clustering
    • avoiding a pipelined system
      • train a classifier specifically for mention detection
      • or jointly do mention detection and coreference resolution end-to-end instead of in 2 steps
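
The detection step described above can be sketched as follows. This is a minimal illustration, assuming the outputs of a part-of-speech tagger, an NER system, and a parser are already available as plain Python data. The function names, the `(start, end)` span format, and the Penn Treebank pronoun tags are assumptions made for the example, not part of the original notes.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def detect_mentions(pos_tags: List[str],
                    ner_spans: List[Span],
                    np_spans: List[Span]) -> List[Span]:
    """Collect candidate mentions from three upstream systems (hypothetical helper)."""
    candidates = set()
    # Pronouns: single tokens tagged as pronouns (Penn Treebank tags assumed).
    for i, tag in enumerate(pos_tags):
        if tag in {"PRP", "PRP$"}:
            candidates.add((i, i + 1))
    # Named entities and noun phrases arrive as spans; the set removes duplicates.
    candidates.update(ner_spans)
    candidates.update(np_spans)
    return sorted(candidates)


def drop_singletons(clusters: List[List[Span]]) -> List[List[Span]]:
    """Discard clusters with a single mention once clustering is done."""
    return [c for c in clusters if len(c) > 1]


# Tokens: Barack Obama said he would visit the city
pos_tags = ["NNP", "NNP", "VBD", "PRP", "MD", "VB", "DT", "NN"]
ner_spans = [(0, 2)]          # "Barack Obama"
np_spans = [(0, 2), (6, 8)]   # "Barack Obama", "the city"
mentions = detect_mentions(pos_tags, ner_spans, np_spans)
```

Keeping every candidate and filtering singletons afterwards trades extra work in the clustering step for not needing a separate spurious-mention classifier.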

Coreference and Anaphora

  • Coreference: when 2 mentions refer to the same entity in the world.

  • Anaphora: when a term (anaphor) refers to another term (antecedent).
    • not all noun phrases have reference
    • not all anaphoric relations are coreferential
  • Cataphora: the antecedent comes after the anaphor

Coreference Model

Rule-based

Hobbs’ naive algorithm

[Figure: Hobbs’ naive algorithm]

Winograd Schema

[Figure: Winograd Schema example]

Mention Pair

Core idea

  • Train a binary classifier that assigns every pair of mentions a probability of being coreferent, written here as p(m_i, m_j)


Training and Test

[Figures: training and test]
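
Since the figures are not available here, the following is only a sketch of one common way a mention-pair model can be trained and applied, assuming the pair probabilities p(m_i, m_j) already come from some classifier. Training is shown as a plain binary cross-entropy over all pairs; at test time, pairs above a threshold are linked and the transitive closure gives the clusters. The threshold of 0.5, the function names, and the example probabilities are assumptions.

```python
import math
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (antecedent index j, mention index i), with j < i


def pair_loss(probs: Dict[Pair, float], gold: Dict[Pair, int]) -> float:
    """Binary cross-entropy summed over all mention pairs (probabilities strictly in (0, 1))."""
    loss = 0.0
    for pair, p in probs.items():
        y = gold[pair]  # 1 if the pair is coreferent, else 0
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return loss


def cluster_by_threshold(n_mentions: int, probs: Dict[Pair, float],
                         threshold: float = 0.5) -> List[List[int]]:
    """Link pairs above the threshold, then take the transitive closure via union-find."""
    parent = list(range(n_mentions))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (j, i), p in probs.items():
        if p > threshold:
            parent[find(i)] = find(j)

    clusters: Dict[int, List[int]] = {}
    for m in range(n_mentions):
        clusters.setdefault(find(m), []).append(m)
    return sorted(clusters.values())


# Hypothetical pair probabilities for four mentions.
probs = {(0, 1): 0.9, (0, 2): 0.1, (1, 2): 0.2,
         (0, 3): 0.3, (1, 3): 0.8, (2, 3): 0.1}
clusters = cluster_by_threshold(4, probs)
```

In this toy input, mentions 0 and 3 end up in the same cluster even though their own pair probability is low, because both are linked to mention 1. That is the effect of the transitive closure, and it also means a single wrong link can merge two clusters.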

Disadvantage

  • many mentions have only one clear antecedent
    • but the mention-pair model is asked to predict all of their coreferent antecedents
    • solution: train the model to predict only one antecedent for each mention

Mention Ranking

Core idea

[Figure: mention ranking]

Training and Test

[Figures: training and test]
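
As a rough sketch of predicting a single antecedent per mention, the example below assumes each mention has a raw score against every earlier candidate plus a dummy "no antecedent" candidate at index 0. The dummy candidate, the loss shown, and all names are assumptions for illustration, since the original figures are not available. A softmax turns the scores into a distribution, and the highest-probability candidate is chosen.

```python
import math
from typing import List, Optional, Set


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_antecedent(scores: List[float]) -> Optional[int]:
    """scores[0] is the dummy 'no antecedent'; scores[k] scores candidate k - 1.

    Returns the index of the chosen antecedent, or None if the dummy wins.
    """
    probs = softmax(scores)
    k = max(range(len(probs)), key=probs.__getitem__)
    return None if k == 0 else k - 1


def ranking_loss(scores: List[float], gold: Set[int]) -> float:
    """Negative log of the probability mass on any gold position in `scores`."""
    probs = softmax(scores)
    return -math.log(sum(probs[k] for k in gold))


# Hypothetical scores for one mention: dummy, then three earlier mentions.
scores = [0.0, 2.0, -1.0, 0.5]
choice = best_antecedent(scores)
```

Summing the probability over all gold antecedents in the loss means the model only has to rank one correct antecedent highly, not all of them.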

Compute Probabilities

Non-neural statistical classifier

  • features

[Figure: features]

Simple neural network

[Figure: simple neural network]

End-to-end Model

[Figures: end-to-end model]

Cluster-Based Method

Core idea

  • start with each mention in its own singleton cluster
  • merge a pair of clusters at each step
    • use a model to score which cluster merges are good
  • a mention-pair decision is difficult, but a cluster-pair decision is easier
    • more information is available: a cluster can be compared against several mentions of another cluster at once
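
The merge loop above can be sketched as a greedy agglomerative procedure. This is an illustration only: the scoring function here is a toy average over a hypothetical pairwise similarity table standing in for a learned cluster-pair model, and the stopping threshold of 0.0 is an assumption.

```python
from itertools import combinations
from typing import Callable, List

Cluster = List[int]


def agglomerative_coref(n_mentions: int,
                        score: Callable[[Cluster, Cluster], float],
                        threshold: float = 0.0) -> List[Cluster]:
    """Greedily merge the best-scoring cluster pair until no merge beats the threshold."""
    clusters = [[m] for m in range(n_mentions)]  # each mention starts alone
    while len(clusters) > 1:
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda ab: score(clusters[ab[0]], clusters[ab[1]]))
        if score(clusters[a], clusters[b]) <= threshold:
            break  # no remaining merge looks good
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(clusters)


# Hypothetical pairwise similarities between four mentions.
sim = {(0, 1): 0.9, (0, 2): -0.5, (1, 2): -0.4,
       (0, 3): 0.2, (1, 3): 0.6, (2, 3): -0.8}


def avg_link(c1: Cluster, c2: Cluster) -> float:
    """Toy cluster-pair score: mean similarity over all cross-cluster mention pairs."""
    total = sum(sim[(min(a, b), max(a, b))] for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


result = agglomerative_coref(4, avg_link)
```

Mention 3 is only weakly similar to mention 0 on its own, but it is merged because the score is computed against the whole cluster [0, 1], which is the extra information a cluster-pair decision has over a mention-pair decision.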

[Figure: cluster-based method]

Clustering Model Architecture

[Figure: clustering model architecture]

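  • Training method

    • each cluster merge depends on the merges made before it, so standard supervised learning cannot be applied directly
    • train the model with reinforcement learning instead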

Coreference Evaluation

  • many different metrics: MUC, CEAF, LEA, B-CUBED, BLANC

  • results are often reported as the average over a few different metrics
  • example: B-CUBED

[Figure: B-CUBED example]
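
For reference, B-CUBED can be computed per mention: precision is the fraction of a mention's predicted cluster that is also in its gold cluster, recall is the fraction of its gold cluster that was recovered, and both are averaged over all mentions. The sketch below assumes the gold and predicted clusterings cover exactly the same mentions. The function name and the example clusters are made up for illustration.

```python
from typing import List, Tuple


def b_cubed(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) averaged over mentions."""
    gold_of = {m: set(c) for c in gold for m in c}  # mention -> its gold cluster
    pred_of = {m: set(c) for c in pred for m in c}  # mention -> its predicted cluster
    mentions = list(gold_of)
    precision = recall = 0.0
    for m in mentions:
        overlap = len(gold_of[m] & pred_of[m])
        precision += overlap / len(pred_of[m])
        recall += overlap / len(gold_of[m])
    precision /= len(mentions)
    recall /= len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: mention "c" is placed in the wrong cluster.
gold = [["a", "b", "c"], ["d", "e"]]
pred = [["a", "b"], ["c", "d", "e"]]
p, r, f1 = b_cubed(gold, pred)
```

In this example precision and recall both come out to 11/15, because the single misplaced mention lowers the per-mention scores of every cluster it touches.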