Semantics

  • Denotational Semantics: The concept of representing an idea as a symbol (a word or a one-hot vector).
    • sparse
    • cannot capture similarity
    • a “localist” representation
  • Distributional Semantics: The concept of representing the meaning of a word based on the context in which it usually appears.

    • A word’s meaning is given by the words that frequently appear close-by
    • context: when a word appears in a text, its context is the set of words that appear nearby (within a fixed-size window)
    • dense
    • can better capture similarity
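
    The fixed-size window above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `context_windows`, the `window` parameter, and the sample sentence are all assumptions made for the example.

```python
def context_windows(tokens, window=2):
    """Yield (center, context) pairs.

    The context of a token is the list of up to `window` tokens on each
    side of it; fewer are returned near the start or end of the text.
    """
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield center, left + right


# Hypothetical sample text, split on whitespace for simplicity.
tokens = "the quick brown fox jumps".split()
pairs = list(context_windows(tokens, window=1))

for center, context in pairs:
    print(center, context)
# the ['quick']
# quick ['the', 'brown']
# brown ['quick', 'fox']
# fox ['brown', 'jumps']
# jumps ['fox']
```

    Counting or predicting these (center, context) pairs over a large corpus is what lets a model place words with similar contexts near each other.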

      One-hot Vector

      Represent every word as an $\mathbb{R}^{|V| \times 1}$ vector with all 0s and a single 1 at the index of that word in the sorted English vocabulary.
  • $|V|$: the size of our vocabulary

  • denotational semantics
  • examples:

    [Figure: example one-hot vectors; image not recovered]

  • problem:

    • does not directly give us any notion of similarity
    • $|V|$ is large, so the vectors are very high-dimensional
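
    Both problems can be seen in a small sketch. The three-word vocabulary and the helper names `one_hot` and `dot` are assumptions chosen for illustration; a real vocabulary would have many thousands of entries.

```python
def one_hot(word, vocab):
    """Return a list of len(vocab) zeros with a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy vocabulary (an assumption), sorted so every word has a fixed index.
vocab = sorted(["hotel", "motel", "cat"])   # ['cat', 'hotel', 'motel']

hotel = one_hot("hotel", vocab)             # [0, 1, 0]
motel = one_hot("motel", vocab)             # [0, 0, 1]
cat = one_hot("cat", vocab)                 # [1, 0, 0]

# Any two different words are orthogonal, so related and unrelated
# pairs look exactly the same.
print(dot(hotel, motel))  # 0
print(dot(hotel, cat))    # 0
print(len(hotel))         # 3, i.e. |V|; the length grows with the vocabulary
```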

      Word Vectors

      Learn to encode similarity in the vectors themselves.
  • build a dense vector for each word

  • the vector is chosen so that it is similar to the vectors of words that appear in similar contexts
  • are sometimes called word embeddings or word representations
  • are a distributed representation
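
    As a rough sketch of what "similarity encoded in the vectors" means, cosine similarity can be computed between dense vectors. The three-dimensional values below are invented for illustration only; learned embeddings typically have hundreds of dimensions and are estimated from a corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up dense vectors (assumptions, not trained embeddings).
vectors = {
    "hotel": [0.8, 0.1, 0.6],
    "motel": [0.7, 0.2, 0.7],
    "cat": [-0.5, 0.9, 0.0],
}

sim_related = cosine(vectors["hotel"], vectors["motel"])
sim_unrelated = cosine(vectors["hotel"], vectors["cat"])

print(sim_related > sim_unrelated)  # True
```

    Unlike one-hot vectors, where every pair of distinct words has a dot product of 0, dense vectors can place "hotel" closer to "motel" than to "cat".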

    Visualization of word vectors

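  • Word vectors are high-dimensional, so they cannot be visualized directly.

  • They can be projected into a low-dimensional space (for example, two dimensions) and then visualized.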
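
    One common way to do such a projection is principal component analysis (PCA); the notes do not say which method is intended, so this choice is an assumption. The sketch below uses only the standard library and finds the two leading directions by power iteration with deflation. The four-dimensional vectors are invented for illustration, and the code does not handle degenerate data (for example, points that all lie on one line).

```python
import math


def center(rows):
    """Subtract the per-dimension mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iterations=200):
    """Approximate the dominant eigenvector by power iteration."""
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]          # fixed, non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                           # nothing left to find
            break
        v = [x / norm for x in w]
    return v


def project_2d(rows):
    """Project rows onto their two leading principal directions."""
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    axes = []
    for _ in range(2):
        v = top_eigenvector(cov)
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                  for i in range(d))
        axes.append(v)
        # Deflation: remove the found direction before the next pass.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(r[j] * a[j] for j in range(d)) for a in axes] for r in x]


# Made-up 4-dimensional vectors (assumptions, not trained embeddings).
words = ["hotel", "motel", "cat", "dog"]
vectors = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.2, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.2, 0.1, 0.8, 0.9],
]

points = dict(zip(words, project_2d(vectors)))
for word, (px, py) in points.items():
    print(f"{word}: ({px:.2f}, {py:.2f})")
```

    The resulting (x, y) pairs can be passed to any plotting tool. With these toy values, "hotel" and "motel" land on one side of the first axis and "cat" and "dog" on the other.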