隐马尔可夫模型是一种概率图模型。我们知道,机器学习模型可以从频率派和贝叶斯派两个方向考虑,在频率派的方法中的核心是优化问题,而在贝叶斯派的方法中,核心是积分问题,也发展出来了一系列的积分方法如变分推断,MCMC 等。概率图模型最基本的模型可以分为有向图(贝叶斯网络)和无向图(马尔可夫随机场)两个方面,例如 GMM,在这些基本的模型上,如果样本之间存在关联,可以认为样本中附带了时序信息,从而样本之间不独立全同分布的,这种模型就叫做动态模型,隐变量随着时间发生变化,于是观测变量也发生变化:

11.隐马尔可夫模型 - 图1
根据状态变量的特点,可以分为:

  1. HMM,状态变量(隐变量)是离散的
  2. Kalman 滤波,状态变量是连续的,线性的
  3. 粒子滤波,状态变量是连续,非线性的

HMM

HMM 用概率图表示为:

11.隐马尔可夫模型 - 图2)%0Aend%0Asubgraph%20three%0A%09t3—%3Ex3((x3))%0Aend%0Asubgraph%20two%0A%09t2—%3Ex2((x2))%0Aend%0Asubgraph%20one%0A%09t1—%3Ex1((x1))%0Aend%0A%0At2—%3Et3%3B%0At3—%3Et4%3B#code=graph%20TD%3B%0At1—%3Et2%3B%0Asubgraph%20four%0A%09t4—%3Ex4%28%28x4%29%29%0Aend%0Asubgraph%20three%0A%09t3—%3Ex3%28%28x3%29%29%0Aend%0Asubgraph%20two%0A%09t2—%3Ex2%28%28x2%29%29%0Aend%0Asubgraph%20one%0A%09t1—%3Ex1%28%28x1%29%29%0Aend%0A%0At2—%3Et3%3B%0At3—%3Et4%3B&id=b76f3bfa&type=mermaid#crop=0&crop=0&crop=1&crop=1&id=jGYAn&originHeight=619&originWidth=548&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
上图表示了四个时刻的隐变量变化。用参数 11.隐马尔可夫模型 - 图3#card=math&code=%5Clambda%3D%28%5Cpi%2CA%2CB%29&height=18&width=84#crop=0&crop=0&crop=1&crop=1&id=UZ3CR&originHeight=26&originWidth=119&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=) 来表示,其中 11.隐马尔可夫模型 - 图4 是开始的概率分布,11.隐马尔可夫模型 - 图5 为状态转移矩阵,11.隐马尔可夫模型 - 图6 为发射矩阵。

下面使用 11.隐马尔可夫模型 - 图7 来表示观测变量,11.隐马尔可夫模型 - 图8 为观测序列,11.隐马尔可夫模型 - 图9 表示观测的值域,11.隐马尔可夫模型 - 图10 表示状态变量,11.隐马尔可夫模型 - 图11 为状态序列,11.隐马尔可夫模型 - 图12 表示状态变量的值域。定义 11.隐马尔可夫模型 - 图13)#card=math&code=A%3D%28a%7Bij%7D%3Dp%28i%7Bt%2B1%7D%3Dq_j%7Ci_t%3Dq_i%29%29&height=19&width=201#crop=0&crop=0&crop=1&crop=1&id=rypfn&originHeight=27&originWidth=283&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=) 表示状态转移矩阵,11.隐马尔可夫模型 - 图14%3Dp(o_t%3Dv_k%7Ci_t%3Dq_j))#card=math&code=B%3D%28b_j%28k%29%3Dp%28o_t%3Dv_k%7Ci_t%3Dq_j%29%29&height=19&width=207#crop=0&crop=0&crop=1&crop=1&id=XDkCV&originHeight=27&originWidth=290&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=) 表示发射矩阵。

在 HMM 中,有两个基本假设:

  1. 齐次 Markov 假设(未来只依赖于当前):11.隐马尔可夫模型 - 图15%3Dp(i%7Bt%2B1%7D%7Ci_t)%0A#card=math&code=p%28i%7Bt%2B1%7D%7Cit%2Ci%7Bt-1%7D%2C%5Ccdots%2Ci1%2Co_t%2Co%7Bt-1%7D%2C%5Ccdots%2Co1%29%3Dp%28i%7Bt%2B1%7D%7Ci_t%29%0A&height=18&width=311#crop=0&crop=0&crop=1&crop=1&height=24&id=J7LzR&originHeight=26&originWidth=435&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=&width=402)
  2. 观测独立假设:11.隐马尔可夫模型 - 图16%3Dp(ot%7Ci_t)%0A#card=math&code=p%28o_t%7Ci_t%2Ci%7Bt-1%7D%2C%5Ccdots%2Ci1%2Co%7Bt-1%7D%2C%5Ccdots%2Co_1%29%3Dp%28o_t%7Ci_t%29%0A&height=18&width=269#crop=0&crop=0&crop=1&crop=1&id=vDnn3&originHeight=26&originWidth=377&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

HMM 要解决三个问题:

  1. Evaluation:11.隐马尔可夫模型 - 图17#card=math&code=p%28O%7C%5Clambda%29&height=18&width=44#crop=0&crop=0&crop=1&crop=1&id=RH9du&originHeight=26&originWidth=62&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),Forward-Backward 算法
  2. Learning:11.隐马尔可夫模型 - 图18#card=math&code=%5Clambda%3D%5Cmathop%7Bargmax%7D%5Climits_%7B%5Clambda%7Dp%28O%7C%5Clambda%29&height=28&width=126#crop=0&crop=0&crop=1&crop=1&id=MNZ98&originHeight=41&originWidth=176&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),EM 算法(Baum-Welch)
  3. Decoding:11.隐马尔可夫模型 - 图19#card=math&code=I%3D%5Cmathop%7Bargmax%7D%5Climits_%7BI%7Dp%28I%7CO%2C%5Clambda%29&height=28&width=139#crop=0&crop=0&crop=1&crop=1&id=goDUN&originHeight=41&originWidth=194&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),Vierbi 算法
    1. 预测问题:11.隐马尔可夫模型 - 图20#card=math&code=p%28i_%7Bt%2B1%7D%7Co_1%2Co_2%2C%5Ccdots%2Co_t%29&height=18&width=128#crop=0&crop=0&crop=1&crop=1&id=jNF4z&originHeight=26&originWidth=180&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
    2. 滤波问题:11.隐马尔可夫模型 - 图21#card=math&code=p%28i_t%7Co_1%2Co_2%2C%5Ccdots%2Co_t%29&height=18&width=115#crop=0&crop=0&crop=1&crop=1&id=YkK1Z&originHeight=26&originWidth=161&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

Evaluation

11.隐马尔可夫模型 - 图22%3D%5Csum%5Climits%7BI%7Dp(I%2CO%7C%5Clambda)%3D%5Csum%5Climits%7BI%7Dp(O%7CI%2C%5Clambda)p(I%7C%5Clambda)%0A#card=math&code=p%28O%7C%5Clambda%29%3D%5Csum%5Climits%7BI%7Dp%28I%2CO%7C%5Clambda%29%3D%5Csum%5Climits%7BI%7Dp%28O%7CI%2C%5Clambda%29p%28I%7C%5Clambda%29%0A&height=35&width=287#crop=0&crop=0&crop=1&crop=1&id=CGTwf&originHeight=50&originWidth=401&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

11.隐马尔可夫模型 - 图23%3Dp(i1%2Ci_2%2C%5Ccdots%2Ci_t%7C%5Clambda)%3Dp(i_t%7Ci_1%2Ci_2%2C%5Ccdots%2Ci%7Bt-1%7D%2C%5Clambda)p(i1%2Ci_2%2C%5Ccdots%2Ci%7Bt-1%7D%7C%5Clambda)%0A#card=math&code=p%28I%7C%5Clambda%29%3Dp%28i1%2Ci_2%2C%5Ccdots%2Ci_t%7C%5Clambda%29%3Dp%28i_t%7Ci_1%2Ci_2%2C%5Ccdots%2Ci%7Bt-1%7D%2C%5Clambda%29p%28i1%2Ci_2%2C%5Ccdots%2Ci%7Bt-1%7D%7C%5Clambda%29%0A&height=18&width=443#crop=0&crop=0&crop=1&crop=1&id=aF0Rp&originHeight=26&originWidth=620&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

根据齐次 Markov 假设:

11.隐马尔可夫模型 - 图24%3Dp(it%7Ci%7Bt-1%7D)%3Da%7Bi%7Bt-1%7Dit%7D%0A#card=math&code=p%28i_t%7Ci_1%2Ci_2%2C%5Ccdots%2Ci%7Bt-1%7D%2C%5Clambda%29%3Dp%28it%7Ci%7Bt-1%7D%29%3Da%7Bi%7Bt-1%7Di_t%7D%0A&height=19&width=271#crop=0&crop=0&crop=1&crop=1&id=JP52c&originHeight=27&originWidth=379&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

所以:

11.隐马尔可夫模型 - 图25%3D%5Cpi1%5Cprod%5Climits%7Bt%3D2%7D%5ETa%7Bi%7Bt-1%7D%2Cit%7D%0A#card=math&code=p%28I%7C%5Clambda%29%3D%5Cpi_1%5Cprod%5Climits%7Bt%3D2%7D%5ETa%7Bi%7Bt-1%7D%2Ci_t%7D%0A&height=47&width=138#crop=0&crop=0&crop=1&crop=1&id=cke18&originHeight=66&originWidth=194&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

又由于:

11.隐马尔可夫模型 - 图26%3D%5Cprod%5Climits%7Bt%3D1%7D%5ETb%7Bit%7D(o_t)%0A#card=math&code=p%28O%7CI%2C%5Clambda%29%3D%5Cprod%5Climits%7Bt%3D1%7D%5ETb_%7Bi_t%7D%28o_t%29%0A&height=47&width=140#crop=0&crop=0&crop=1&crop=1&id=csgnp&originHeight=66&originWidth=195&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

于是:

11.隐马尔可夫模型 - 图27%3D%5Csum%5Climits%7BI%7D%5Cpi%7Bi1%7D%5Cprod%5Climits%7Bt%3D2%7D%5ETa%7Bi%7Bt-1%7D%2Cit%7D%5Cprod%5Climits%7Bt%3D1%7D%5ETb%7Bi_t%7D(o_t)%0A#card=math&code=p%28O%7C%5Clambda%29%3D%5Csum%5Climits%7BI%7D%5Cpi%7Bi_1%7D%5Cprod%5Climits%7Bt%3D2%7D%5ETa%7Bi%7Bt-1%7D%2Cit%7D%5Cprod%5Climits%7Bt%3D1%7D%5ETb_%7Bi_t%7D%28o_t%29%0A&height=47&width=234#crop=0&crop=0&crop=1&crop=1&id=psVfl&originHeight=66&originWidth=327&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

我们看到,上面的式子中的求和符号是对所有的观测变量求和,于是复杂度为 11.隐马尔可夫模型 - 图28#card=math&code=O%28N%5ET%29&height=20&width=45#crop=0&crop=0&crop=1&crop=1&id=T6nvf&originHeight=29&originWidth=64&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)。

下面,记 11.隐马尔可夫模型 - 图29%3Dp(o_1%2Co_2%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%7C%5Clambda)#card=math&code=%5Calpha_t%28i%29%3Dp%28o_1%2Co_2%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%7C%5Clambda%29&height=18&width=213#crop=0&crop=0&crop=1&crop=1&id=H5N9J&originHeight=26&originWidth=298&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),所以,11.隐马尔可夫模型 - 图30%3Dp(O%2Ci_T%3Dq_i%7C%5Clambda)#card=math&code=%5Calpha_T%28i%29%3Dp%28O%2Ci_T%3Dq_i%7C%5Clambda%29&height=18&width=151#crop=0&crop=0&crop=1&crop=1&id=SYQSu&originHeight=26&originWidth=212&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)。我们看到:

11.隐马尔可夫模型 - 图31%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(O%2Ci_T%3Dq_i%7C%5Clambda)%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5CalphaT(i)%0A#card=math&code=p%28O%7C%5Clambda%29%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28O%2CiT%3Dq_i%7C%5Clambda%29%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Calpha_T%28i%29%0A&height=47&width=263#crop=0&crop=0&crop=1&crop=1&id=jpX8g&originHeight=66&originWidth=368&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

11.隐马尔可夫模型 - 图32#card=math&code=%5Calpha_%7Bt%2B1%7D%28j%29&height=18&width=45#crop=0&crop=0&crop=1&crop=1&id=eKuAH&originHeight=26&originWidth=65&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):

11.隐马尔可夫模型 - 图33%26%3Dp(o1%2Co_2%2C%5Ccdots%2Co%7Bt%2B1%7D%2Ci%7Bt%2B1%7D%3Dq_j%7C%5Clambda)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o1%2Co_2%2C%5Ccdots%2Co%7Bt%2B1%7D%2Ci%7Bt%2B1%7D%3Dq_j%2Ci_t%3Dq_i%7C%5Clambda)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o%7Bt%2B1%7D%7Co_1%2Co_2%2C%5Ccdots%2Ci%7Bt%2B1%7D%3Dqj%2Ci_t%3Dq_i%7C%5Clambda)p(o_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%2Ci%7Bt%2B1%7D%3Dqj%7C%5Clambda)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%5Calpha%7Bt%2B1%7D%28j%29%26%3Dp%28o1%2Co_2%2C%5Ccdots%2Co%7Bt%2B1%7D%2Ci%7Bt%2B1%7D%3Dq_j%7C%5Clambda%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o1%2Co_2%2C%5Ccdots%2Co%7Bt%2B1%7D%2Ci%7Bt%2B1%7D%3Dq_j%2Ci_t%3Dq_i%7C%5Clambda%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o%7Bt%2B1%7D%7Co_1%2Co_2%2C%5Ccdots%2Ci%7Bt%2B1%7D%3Dqj%2Ci_t%3Dq_i%7C%5Clambda%29p%28o_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%2Ci%7Bt%2B1%7D%3Dq_j%7C%5Clambda%29%0A%5Cend%7Balign%7D%0A&height=116&width=535#crop=0&crop=0&crop=1&crop=1&id=YIjWm&originHeight=164&originWidth=748&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

利用观测独立假设:

11.隐马尔可夫模型 - 图34%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o%7Bt%2B1%7D%7Ci%7Bt%2B1%7D%3Dq_j)p(o_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%2Ci%7Bt%2B1%7D%3Dqj%7C%5Clambda)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o%7Bt%2B1%7D%7Ci%7Bt%2B1%7D%3Dqj)p(i%7Bt%2B1%7D%3Dqj%7Co_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%2C%5Clambda)p(o_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%7C%5Clambda)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENb%7Bj%7D(o_t)a%7Bij%7D%5Calphat(i)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%5Calpha%7Bt%2B1%7D%28j%29%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o%7Bt%2B1%7D%7Ci%7Bt%2B1%7D%3Dq_j%29p%28o_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%2Ci%7Bt%2B1%7D%3Dqj%7C%5Clambda%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o%7Bt%2B1%7D%7Ci%7Bt%2B1%7D%3Dqj%29p%28i%7Bt%2B1%7D%3Dqj%7Co_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%2C%5Clambda%29p%28o_1%2C%5Ccdots%2Co_t%2Ci_t%3Dq_i%7C%5Clambda%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENb%7Bj%7D%28o_t%29a%7Bij%7D%5Calpha_t%28i%29%0A%5Cend%7Balign%7D%0A&height=142&width=546#crop=0&crop=0&crop=1&crop=1&id=Ms0uH&originHeight=200&originWidth=763&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

上面利用了齐次 Markov 假设得到了一个递推公式,这个算法叫做前向算法。

还有一种算法叫做后向算法,定义 11.隐马尔可夫模型 - 图35%3Dp(o%7Bt%2B1%7D%2Co%7Bt%2B1%7D%2C%5Ccdots%EF%BC%8CoT%7Ci_t%3Di%2C%5Clambda)#card=math&code=%5Cbeta_t%28i%29%3Dp%28o%7Bt%2B1%7D%2Co_%7Bt%2B1%7D%2C%5Ccdots%EF%BC%8Co_T%7Ci_t%3Di%2C%5Clambda%29&height=21&width=240#crop=0&crop=0&crop=1&crop=1&id=brlAa&originHeight=30&originWidth=336&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):

11.隐马尔可夫模型 - 图36%26%3Dp(o1%2C%5Ccdots%2Co_T%7C%5Clambda)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o1%2Co_2%2C%5Ccdots%2Co_T%2Ci_1%3Dq_i%7C%5Clambda)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o1%2Co_2%2C%5Ccdots%2Co_T%7Ci_1%3Dq_i%2C%5Clambda)%5Cpi_i%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp(o1%7Co_2%2C%5Ccdots%2Co_T%2Ci_1%3Dq_i%2C%5Clambda)p(o_2%2C%5Ccdots%2Co_T%7Ci_1%3Dq_i%2C%5Clambda)%5Cpi_i%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENbi(o_1)%5Cpi_i%5Cbeta_1(i)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7Dp%28O%7C%5Clambda%29%26%3Dp%28o_1%2C%5Ccdots%2Co_T%7C%5Clambda%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o1%2Co_2%2C%5Ccdots%2Co_T%2Ci_1%3Dq_i%7C%5Clambda%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o1%2Co_2%2C%5Ccdots%2Co_T%7Ci_1%3Dq_i%2C%5Clambda%29%5Cpi_i%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENp%28o1%7Co_2%2C%5Ccdots%2Co_T%2Ci_1%3Dq_i%2C%5Clambda%29p%28o_2%2C%5Ccdots%2Co_T%7Ci_1%3Dq_i%2C%5Clambda%29%5Cpi_i%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bi%3D1%7D%5ENb_i%28o_1%29%5Cpi_i%5Cbeta_1%28i%29%0A%5Cend%7Balign%7D%0A&height=211&width=418#crop=0&crop=0&crop=1&crop=1&id=hPyBH&originHeight=296&originWidth=585&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

对于这个 11.隐马尔可夫模型 - 图37#card=math&code=%5Cbeta_1%28i%29&height=18&width=32#crop=0&crop=0&crop=1&crop=1&id=EOdy7&originHeight=26&originWidth=45&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=):

11.隐马尔可夫模型 - 图38%26%3Dp(o%7Bt%2B1%7D%2C%5Ccdots%2Co_T%7Ci_t%3Dq_i)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp(o%7Bt%2B1%7D%2Co%7Bt%2B2%7D%2C%5Ccdots%2CoT%2Ci%7Bt%2B1%7D%3Dqj%7Ci_t%3Dq_i)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp(o%7Bt%2B1%7D%2C%5Ccdots%2Co_T%7Ci%7Bt%2B1%7D%3Dqj%2Ci_t%3Dq_i)p(i%7Bt%2B1%7D%3Dqj%7Ci_t%3Dq_i)%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp(o%7Bt%2B1%7D%2C%5Ccdots%2Co_T%7Ci%7Bt%2B1%7D%3Dqj)a%7Bij%7D%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp(o%7Bt%2B1%7D%7Co%7Bt%2B2%7D%2C%5Ccdots%2Co_T%2Ci%7Bt%2B1%7D%3Dqj)p(o%7Bt%2B2%7D%2C%5Ccdots%2CoT%7Ci%7Bt%2B1%7D%3Dqj)a%7Bij%7D%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENb_j(o%7Bt%2B1%7D)a%7Bij%7D%5Cbeta%7Bt%2B1%7D(j)%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%5Cbetat%28i%29%26%3Dp%28o%7Bt%2B1%7D%2C%5Ccdots%2CoT%7Ci_t%3Dq_i%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp%28o%7Bt%2B1%7D%2Co%7Bt%2B2%7D%2C%5Ccdots%2CoT%2Ci%7Bt%2B1%7D%3Dqj%7Ci_t%3Dq_i%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp%28o%7Bt%2B1%7D%2C%5Ccdots%2Co_T%7Ci%7Bt%2B1%7D%3Dqj%2Ci_t%3Dq_i%29p%28i%7Bt%2B1%7D%3Dqj%7Ci_t%3Dq_i%29%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp%28o%7Bt%2B1%7D%2C%5Ccdots%2Co_T%7Ci%7Bt%2B1%7D%3Dqj%29a%7Bij%7D%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENp%28o%7Bt%2B1%7D%7Co%7Bt%2B2%7D%2C%5Ccdots%2Co_T%2Ci%7Bt%2B1%7D%3Dqj%29p%28o%7Bt%2B2%7D%2C%5Ccdots%2CoT%7Ci%7Bt%2B1%7D%3Dqj%29a%7Bij%7D%5Cnonumber%5C%5C%0A%26%3D%5Csum%5Climits%7Bj%3D1%7D%5ENb_j%28o%7Bt%2B1%7D%29a%7Bij%7D%5Cbeta%7Bt%2B1%7D%28j%29%0A%5Cend%7Balign%7D%0A&height=271&width=440#crop=0&crop=0&crop=1&crop=1&id=V4wCW&originHeight=380&originWidth=616&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

于是后向地得到了第一项。

Learning

为了学习得到参数的最优值,在 MLE 中:

11.隐马尔可夫模型 - 图39%0A#card=math&code=%5Clambda%7BMLE%7D%3D%5Cmathop%7Bargmax%7D%5Clambda%20p%28O%7C%5Clambda%29%0A&height=28&width=153#crop=0&crop=0&crop=1&crop=1&id=C0fhX&originHeight=41&originWidth=215&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

我们采用 EM 算法(在这里也叫 Baum Welch 算法),用上标表示迭代:

11.隐马尔可夫模型 - 图40p(Z%7CX%2C%5Ctheta%5Et)dz%0A#card=math&code=%5Ctheta%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D_%7B%5Ctheta%7D%5Cint_z%5Clog%20p%28X%2CZ%7C%5Ctheta%29p%28Z%7CX%2C%5Ctheta%5Et%29dz%0A&height=37&width=280#crop=0&crop=0&crop=1&crop=1&id=cPMSt&originHeight=53&originWidth=391&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

其中,11.隐马尔可夫模型 - 图41 是观测变量,11.隐马尔可夫模型 - 图42 是隐变量序列。于是:

11.隐马尔可夫模型 - 图43p(I%7CO%2C%5Clambda%5Et)%5C%5C%0A%3D%5Cmathop%7Bargmax%7D%5Clambda%5Csum%5Climits_I%5Clog%20p(O%2CI%7C%5Clambda)p(O%2CI%7C%5Clambda%5Et)%0A#card=math&code=%5Clambda%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%5Clambda%5Csum%5ClimitsI%5Clog%20p%28O%2CI%7C%5Clambda%29p%28I%7CO%2C%5Clambda%5Et%29%5C%5C%0A%3D%5Cmathop%7Bargmax%7D%5Clambda%5Csum%5Climits_I%5Clog%20p%28O%2CI%7C%5Clambda%29p%28O%2CI%7C%5Clambda%5Et%29%0A&height=74&width=643#crop=0&crop=0&crop=1&crop=1&id=RCkIF&originHeight=104&originWidth=900&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

这里利用了 11.隐马尔可夫模型 - 图44#card=math&code=p%28O%7C%5Clambda%5Et%29&height=19&width=49#crop=0&crop=0&crop=1&crop=1&id=vWjCl&originHeight=27&originWidth=69&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=) 和11.隐马尔可夫模型 - 图45 无关。将 Evaluation 中的式子代入:

11.隐马尔可夫模型 - 图46p(O%2CI%7C%5Clambda%5Et)%3D%5Csum%5ClimitsI%5B%5Clog%20%5Cpi%7Bi1%7D%2B%5Csum%5Climits%7Bt%3D2%7D%5ET%5Clog%20a%7Bi%7Bt-1%7D%2Cit%7D%2B%5Csum%5Climits%7Bt%3D1%7D%5ET%5Clog%20b%7Bi_t%7D(o_t)%5Dp(O%2CI%7C%5Clambda%5Et)%0A#card=math&code=%5Csum%5Climits_I%5Clog%20p%28O%2CI%7C%5Clambda%29p%28O%2CI%7C%5Clambda%5Et%29%3D%5Csum%5Climits_I%5B%5Clog%20%5Cpi%7Bi1%7D%2B%5Csum%5Climits%7Bt%3D2%7D%5ET%5Clog%20a%7Bi%7Bt-1%7D%2Cit%7D%2B%5Csum%5Climits%7Bt%3D1%7D%5ET%5Clog%20b_%7Bi_t%7D%28o_t%29%5Dp%28O%2CI%7C%5Clambda%5Et%29%0A&height=47&width=527#crop=0&crop=0&crop=1&crop=1&id=Hgw5o&originHeight=66&originWidth=737&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

11.隐马尔可夫模型 - 图47

11.隐马尔可夫模型 - 图48%5D%5Cnonumber%5C%5C%0A%26%3D%5Cmathop%7Bargmax%7D%5Cpi%5Csum%5Climits_I%5B%5Clog%20%5Cpi%7Bi1%7D%5Ccdot%20p(O%2Ci_1%2Ci_2%2C%5Ccdots%2Ci_T%7C%5Clambda%5Et)%5D%0A%5Cend%7Balign%7D%0A#card=math&code=%5Cbegin%7Balign%7D%5Cpi%5E%7Bt%2B1%7D%26%3D%5Cmathop%7Bargmax%7D%5Cpi%5Csum%5ClimitsI%5B%5Clog%20%5Cpi%7Bi1%7Dp%28O%2CI%7C%5Clambda%5Et%29%5D%5Cnonumber%5C%5C%0A%26%3D%5Cmathop%7Bargmax%7D%5Cpi%5Csum%5ClimitsI%5B%5Clog%20%5Cpi%7Bi_1%7D%5Ccdot%20p%28O%2Ci_1%2Ci_2%2C%5Ccdots%2Ci_T%7C%5Clambda%5Et%29%5D%0A%5Cend%7Balign%7D%0A&height=71&width=321#crop=0&crop=0&crop=1&crop=1&id=WbfXw&originHeight=101&originWidth=449&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

上面的式子中,对 11.隐马尔可夫模型 - 图49 求和可以将这些参数消掉:

11.隐马尔可夫模型 - 图50%5D%0A#card=math&code=%5Cpi%5E%7Bt%2B1%7D%3D%5Cmathop%7Bargmax%7D%5Cpi%5Csum%5Climits%7Bi1%7D%5B%5Clog%20%5Cpi%7Bi_1%7D%5Ccdot%20p%28O%2Ci_1%7C%5Clambda%5Et%29%5D%0A&height=37&width=250#crop=0&crop=0&crop=1&crop=1&id=sYIs5&originHeight=53&originWidth=350&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

上面的式子还有对 11.隐马尔可夫模型 - 图51 的约束 11.隐马尔可夫模型 - 图52。定义 Lagrange 函数:

11.隐马尔可夫模型 - 图53%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20%5Cpi_i%5Ccdot%20p(O%2Ci_1%3Dq_i%7C%5Clambda%5Et)%2B%5Ceta(%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cpii-1)%0A#card=math&code=L%28%5Cpi%2C%5Ceta%29%3D%5Csum%5Climits%7Bi%3D1%7D%5EN%5Clog%20%5Cpii%5Ccdot%20p%28O%2Ci_1%3Dq_i%7C%5Clambda%5Et%29%2B%5Ceta%28%5Csum%5Climits%7Bi%3D1%7D%5EN%5Cpi_i-1%29%0A&height=47&width=334#crop=0&crop=0&crop=1&crop=1&id=XXDWg&originHeight=66&originWidth=468&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

于是:

11.隐马尔可夫模型 - 图54%2B%5Ceta%3D0%0A#card=math&code=%5Cfrac%7B%5Cpartial%20L%7D%7B%5Cpartial%5Cpi_i%7D%3D%5Cfrac%7B1%7D%7B%5Cpi_i%7Dp%28O%2Ci_1%3Dq_i%7C%5Clambda%5Et%29%2B%5Ceta%3D0%0A&height=37&width=218#crop=0&crop=0&crop=1&crop=1&id=P3Su9&originHeight=53&originWidth=306&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

对上式求和:

11.隐马尔可夫模型 - 图55%2B%5Cpii%5Ceta%3D0%5CRightarrow%5Ceta%3D-p(O%7C%5Clambda%5Et)%0A#card=math&code=%5Csum%5Climits%7Bi%3D1%7D%5ENp%28O%2Ci_1%3Dq_i%7C%5Clambda%5Et%29%2B%5Cpi_i%5Ceta%3D0%5CRightarrow%5Ceta%3D-p%28O%7C%5Clambda%5Et%29%0A&height=47&width=301#crop=0&crop=0&crop=1&crop=1&id=TouSc&originHeight=66&originWidth=421&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

所以:

11.隐马尔可夫模型 - 图56%7D%7Bp(O%7C%5Clambda%5Et)%7D%0A#card=math&code=%5Cpi_i%5E%7Bt%2B1%7D%3D%5Cfrac%7Bp%28O%2Ci_1%3Dq_i%7C%5Clambda%5Et%29%7D%7Bp%28O%7C%5Clambda%5Et%29%7D%0A&height=41&width=152#crop=0&crop=0&crop=1&crop=1&id=VPjdY&originHeight=59&originWidth=212&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

Decoding

Decoding 问题表述为:

11.隐马尔可夫模型 - 图57%0A#card=math&code=I%3D%5Cmathop%7Bargmax%7D%5Climits_%7BI%7Dp%28I%7CO%2C%5Clambda%29%0A&height=28&width=139#crop=0&crop=0&crop=1&crop=1&id=Tk1sY&originHeight=41&originWidth=194&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

我们需要找到一个序列,其概率最大,这个序列就是在参数空间中的一个路径,可以采用动态规划的思想。

定义:

11.隐马尔可夫模型 - 图58%3D%5Cmax%5Climits%7Bi_1%2C%5Ccdots%2Ci%7Bt-1%7D%7Dp(o1%2C%5Ccdots%2Co_t%2Ci_1%2C%5Ccdots%2Ci%7Bt-1%7D%2Cit%3Dq_i)%0A#card=math&code=%5Cdelta%7Bt%7D%28j%29%3D%5Cmax%5Climits%7Bi_1%2C%5Ccdots%2Ci%7Bt-1%7D%7Dp%28o1%2C%5Ccdots%2Co_t%2Ci_1%2C%5Ccdots%2Ci%7Bt-1%7D%2Ci_t%3Dq_i%29%0A&height=28&width=302#crop=0&crop=0&crop=1&crop=1&id=bamYC&originHeight=41&originWidth=423&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

于是:

11.隐马尔可夫模型 - 图59%3D%5Cmax%5Climits%7B1%5Cle%20i%5Cle%20N%7D%5Cdelta_t(i)a%7Bij%7Dbj(o%7Bt%2B1%7D)%0A#card=math&code=%5Cdelta%7Bt%2B1%7D%28j%29%3D%5Cmax%5Climits%7B1%5Cle%20i%5Cle%20N%7D%5Cdeltat%28i%29a%7Bij%7Dbj%28o%7Bt%2B1%7D%29%0A&height=26&width=197#crop=0&crop=0&crop=1&crop=1&id=lRogy&originHeight=38&originWidth=275&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

这个式子就是从上一步到下一步的概率再求最大值。记这个路径为:

11.隐马尔可夫模型 - 图60%3D%5Cmathop%7Bargmax%7D%5Climits%7B1%5Cle%20i%5Cle%20N%7D%5Cdelta_t(i)a%7Bij%7D%0A#card=math&code=%5Cpsi%7Bt%2B1%7D%28j%29%3D%5Cmathop%7Bargmax%7D%5Climits%7B1%5Cle%20i%5Cle%20N%7D%5Cdeltat%28i%29a%7Bij%7D%0A&height=29&width=166#crop=0&crop=0&crop=1&crop=1&id=SYbiD&originHeight=42&originWidth=233&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)

小结

HMM 是一种动态模型,是由混合树形模型和时序结合起来的一种模型(类似 GMM + Time)。对于类似 HMM 的这种状态空间模型,普遍的除了学习任务(采用 EM )外,还有推断任务,推断任务包括:

  1. 译码 Decoding:11.隐马尔可夫模型 - 图61#card=math&code=p%28z_1%2Cz_2%2C%5Ccdots%2Cz_t%7Cx_1%2Cx_2%2C%5Ccdots%2Cx_t%29&height=18&width=188#crop=0&crop=0&crop=1&crop=1&id=DmQj7&originHeight=26&originWidth=263&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
  2. 似然概率:11.隐马尔可夫模型 - 图62#card=math&code=p%28X%7C%5Ctheta%29&height=18&width=43#crop=0&crop=0&crop=1&crop=1&id=srUwn&originHeight=26&originWidth=61&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
  3. 滤波:11.隐马尔可夫模型 - 图63#card=math&code=%C2%A0p%28zt%7Cx_1%2C%5Ccdots%2Cx_t%29&height=18&width=98#crop=0&crop=0&crop=1&crop=1&id=VAzZk&originHeight=26&originWidth=138&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),Online
    ![](https://g.yuque.com/gr/latex?p(z_t%7Cx
    %7B1%3At%7D)%3D%5Cfrac%7Bp(x%7B1%3At%7D%2Cz_t)%7D%7Bp(x%7B1%3At%7D)%7D%3DC%5Calphat(z_t)%0A#card=math&code=p%28z_t%7Cx%7B1%3At%7D%29%3D%5Cfrac%7Bp%28x%7B1%3At%7D%2Cz_t%29%7D%7Bp%28x%7B1%3At%7D%29%7D%3DC%5Calpha_t%28z_t%29%0A&height=41&width=214#crop=0&crop=0&crop=1&crop=1&id=FP6YQ&originHeight=59&originWidth=299&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
  4. 平滑:11.隐马尔可夫模型 - 图64#card=math&code=p%28zt%7Cx_1%2C%5Ccdots%2Cx_T%29&height=18&width=102#crop=0&crop=0&crop=1&crop=1&id=tPLz8&originHeight=26&originWidth=143&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=),Offline
    ![](https://g.yuque.com/gr/latex?p(z_t%7Cx
    %7B1%3AT%7D)%3D%5Cfrac%7Bp(x%7B1%3AT%7D%2Cz_t)%7D%7Bp(x%7B1%3AT%7D)%7D%3D%5Cfrac%7B%5Calphat(z_t)p(x%7Bt%2B1%3AT%7D%7Cx%7B1%3At%7D%2Cz_t)%7D%7Bp(x%7B1%3AT%7D)%7D%0A#card=math&code=p%28zt%7Cx%7B1%3AT%7D%29%3D%5Cfrac%7Bp%28x%7B1%3AT%7D%2Cz_t%29%7D%7Bp%28x%7B1%3AT%7D%29%7D%3D%5Cfrac%7B%5Calphat%28z_t%29p%28x%7Bt%2B1%3AT%7D%7Cx%7B1%3At%7D%2Cz_t%29%7D%7Bp%28x%7B1%3AT%7D%29%7D%0A&height=41&width=317#crop=0&crop=0&crop=1&crop=1&id=FMkdq&originHeight=59&originWidth=444&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)


根据概率图的条件独立性,有:
11.隐马尔可夫模型 - 图65%3D%5Cfrac%7B%5Calphat(z_t)p(x%7Bt%2B1%3AT%7D%7Czt)%7D%7Bp(x%7B1%3AT%7D)%7D%3DC%5Calphat(z_t)%5Cbeta_t(z_t)%0A#card=math&code=p%28z_t%7Cx%7B1%3AT%7D%29%3D%5Cfrac%7B%5Calphat%28z_t%29p%28x%7Bt%2B1%3AT%7D%7Czt%29%7D%7Bp%28x%7B1%3AT%7D%29%7D%3DC%5Calpha_t%28z_t%29%5Cbeta_t%28z_t%29%0A&height=41&width=307#crop=0&crop=0&crop=1&crop=1&id=sJjna&originHeight=59&originWidth=430&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)


这个算法叫做前向后向算法。
5. 预测:11.隐马尔可夫模型 - 图66%2Cp(x%7Bt%2B1%7D%2Cx%7Bt%2B2%7D%7Cx1%2C%5Ccdots%2Cx_t)#card=math&code=p%28z%7Bt%2B1%7D%2Cz%7Bt%2B2%7D%7Cx_1%2C%5Ccdots%2Cx_t%29%2Cp%28x%7Bt%2B1%7D%2Cx%7Bt%2B2%7D%7Cx_1%2C%5Ccdots%2Cx_t%29&height=18&width=298#crop=0&crop=0&crop=1&crop=1&id=HOspw&originHeight=26&originWidth=418&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
![](https://g.yuque.com/gr/latex?p(z
%7Bt%2B1%7D%7Cx%7B1%3At%7D)%3D%5Csum%7Bzt%7Dp(z%7Bt%2B1%7D%2Czt%7Cx%7B1%3At%7D)%3D%5Csum%5Climits%7Bz_t%7Dp(z%7Bt%2B1%7D%7Czt)p(z_t%7Cx%7B1%3At%7D)%0A#card=math&code=p%28z%7Bt%2B1%7D%7Cx%7B1%3At%7D%29%3D%5Csum%7Bz_t%7Dp%28z%7Bt%2B1%7D%2Czt%7Cx%7B1%3At%7D%29%3D%5Csum%5Climits%7Bz_t%7Dp%28z%7Bt%2B1%7D%7Czt%29p%28z_t%7Cx%7B1%3At%7D%29%0A&height=36&width=369#crop=0&crop=0&crop=1&crop=1&id=DvTlT&originHeight=51&originWidth=516&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)
11.隐马尔可夫模型 - 图67%3D%5Csum%5Climits%7Bz%7Bt%2B1%7D%7Dp(x%7Bt%2B1%7D%2Cz%7Bt%2B1%7D%7Cx%7B1%3At%7D)%3D%5Csum%5Climits%7Bz%7Bt%2B1%7D%7Dp(x%7Bt%2B1%7D%7Cz%7Bt%2B1%7D)p(z%7Bt%2B1%7D%7Cx%7B1%3At%7D)%0A#card=math&code=p%28x%7Bt%2B1%7D%7Cx%7B1%3At%7D%29%3D%5Csum%5Climits%7Bz%7Bt%2B1%7D%7Dp%28x%7Bt%2B1%7D%2Cz%7Bt%2B1%7D%7Cx%7B1%3At%7D%29%3D%5Csum%5Climits%7Bz%7Bt%2B1%7D%7Dp%28x%7Bt%2B1%7D%7Cz%7Bt%2B1%7D%29p%28z%7Bt%2B1%7D%7Cx%7B1%3At%7D%29%0A&height=37&width=414#crop=0&crop=0&crop=1&crop=1&id=vLwkJ&originHeight=53&originWidth=580&originalType=binary&ratio=1&rotation=0&showTitle=false&status=done&style=none&title=)