Constituency Parsing

Constituent

constituent: a single word or a phrase that functions as a single unit within a hierarchical structure.
phrase: a sequence of two or more words built around a head lexical item and working as a unit within a sentence.
To be a phrase, a group of words should:
- come together to play a specific role in the sentence
- be movable together or replaceable as a whole
People interpret the meaning of larger text units by semantic composition of smaller elements.

Constituency Parse Tree
Non-terminals: types of phrases
Terminals: the exact words

What we want:
- constituency parsing of sentences
- to learn both the structure and the representation

Recursive vs recurrent neural networks

Recursive neural networks:
- require a tree structure

Recurrent neural networks:
- cannot capture phrases without prefix context
- often capture too much of the last words in the final vector


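Problem: a single weight matrix cannot handle overly complex compositions
- there is no interaction between the two inputs
- the same parameter matrix is used for every syntactic category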
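
To make the two limitations concrete, here is a rough sketch of a single-matrix composition. The notes do not spell out the formula, so the form p = tanh(W [left; right] + b) is assumed here, and the toy embeddings, weights, and the names `compose` and `encode` are made up for illustration.

```python
import math


def compose(W, b, left, right):
    """Parent vector p = tanh(W [left; right] + b), with one shared W."""
    x = left + right  # list concatenation, i.e. [left; right]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def encode(node, embed, W, b):
    """Recursively encode a binary tree given as a word or a (left, right) pair."""
    if isinstance(node, str):
        return embed[node]
    left, right = node
    return compose(W, b, encode(left, embed, W, b), encode(right, embed, W, b))


# Toy 2-dimensional embeddings and a 2 x 4 weight matrix (illustrative values).
embed = {"the": [0.1, 0.2], "cat": [0.3, -0.1], "sat": [0.0, 0.5]}
W = [[0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5]]
b = [0.0, 0.0]

vec = encode((("the", "cat"), "sat"), embed, W, b)
print(vec)
```

In this sketch the two children only meet through a weighted sum inside `compose`, so neither input can modulate the other, and the same `W` is applied at every node whatever kind of phrase is being built.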
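
Improvement: Syntactically-Untied RNN (SU-RNN)
A different parameter matrix is used for each syntactic category.
The parameter matrices are initialized to identity blocks, so that the initial composition simply averages the two inputs.
- the model can learn which of the two input vectors is more important
- it can learn which rotations or scalings of the vectors improve performance
Visualizing the learned parameter matrices shows what the model has learned: it picks up soft head words.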
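
The averaging initialization can be checked with a small sketch. Several assumptions are made here: matrices are keyed by the pair of child categories, each matrix starts as 0.5 * [I, I], and the bias and nonlinearity are left out so that the averaging is visible. The names `init_weight` and `compose` and the category labels are illustrative only.

```python
import random


def init_weight(dim, noise=0.0, rng=None):
    """Return a dim x 2*dim matrix 0.5 * [I, I] plus optional uniform noise."""
    rng = rng or random.Random(0)
    W = []
    for i in range(dim):
        row = []
        for j in range(2 * dim):
            base = 0.5 if j % dim == i else 0.0  # identity in both blocks
            row.append(base + rng.uniform(-noise, noise))
        W.append(row)
    return W


def compose(weights, left_cat, right_cat, left, right):
    """Linear part of the composition, using the matrix for this category pair."""
    W = weights[(left_cat, right_cat)]
    x = left + right  # list concatenation, i.e. [left; right]
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


# One matrix per (left category, right category) pair (assumed keying).
weights = {("DT", "NN"): init_weight(2), ("VBD", "NP"): init_weight(2)}

p = compose(weights, "DT", "NN", [0.2, 0.4], [0.6, 0.0])
print(p)  # with noise=0.0 this is the element-wise average of the two inputs
```

Training would then move each matrix away from this starting point. If one of the two identity blocks ends up with larger weights than the other, the corresponding child dominates the parent vector, which is one way to read the soft head words seen in the visualizations.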