外点法和内点法虽然好用,但它也有缺点:随着惩罚因子趋向于极限,罚函数的 Hesse 矩阵的条件数无限增大,变得越来越病态,这给计算带来了很大的困难。因此,Lagrangian 法应运而生。
只有等式约束的情况
考虑优化问题
%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%26h%7Bj%7D(%5Cboldsymbol%7Bx%7D)%3D0%2C%20j%3D1%2C%20%5Ccdots%2C%20l%0A%5Cend%7Baligned%7D%20%5Ctag%7B1%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%5Cmin%20%5Cquad%20%26f%28%5Cboldsymbol%7Bx%7D%29%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%26h%7Bj%7D%28%5Cboldsymbol%7Bx%7D%29%3D0%2C%20j%3D1%2C%20%5Ccdots%2C%20l%0A%5Cend%7Baligned%7D%20%5Ctag%7B1%7D%0A&id=jgQpC)
运用乘子法首先需要定义增广 Lagrange 函数:
增广 Lagrange 函数与普通 Lagrange 函数的主要区别在于增加了惩罚项 %5ET%20%5Cboldsymbol%7Bh%7D(%5Cboldsymbol%7Bx%7D)#card=math&code=%5Cfrac%7B%5Csigma%7D%7B2%7D%20%5Cboldsymbol%7Bh%7D%28%5Cboldsymbol%7Bx%7D%29%5ET%20%5Cboldsymbol%7Bh%7D%28%5Cboldsymbol%7Bx%7D%29&id=KQGIi)。
此处不加证明的给出一个结论:对于某一个乘子 ,只要
充分大(不用趋近于无穷),就能通过最小化公式(2),来最小化问题(1)。这样就解决了内点法和外点法在迭代时惩罚因子趋近于无穷,导致数值计算出现病态的问题。那么,如何找到最优乘子
呢?一般方法是,先给定充分大的
和初始估计
,然后在迭代过程中修正
,不加证明的给出迭代式为:
%7D_j%20%3D%20v%5E%7B(k)%7D_j%20-%20%5Csigma%20h_j(%5Cboldsymbol%7Bx%7D%5E%7B(k)%7D)%2C%5Cquad%20j%3D1%2C…%2Cl%20%5Ctag%7B3%7D%0A#card=math&code=v%5E%7B%28k%2B1%29%7D_j%20%3D%20v%5E%7B%28k%29%7D_j%20-%20%5Csigma%20h_j%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%2C%5Cquad%20j%3D1%2C…%2Cl%20%5Ctag%7B3%7D%0A&id=bUWRw)
在迭代过程中如果 %7D_j#card=math&code=v%5E%7B%28k%2B1%29%7D_j&id=uMh00) 收敛过慢,那么需要调大
,收敛快慢用
%7D)%7C%7C%20%2F%20%7C%7Ch(%5Cboldsymbol%7Bx%7D%5E%7B(k-1)%7D)%7C%7C#card=math&code=%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%7C%7C%20%2F%20%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k-1%29%7D%29%7C%7C&id=RU47J) 来衡量。
那么只有等式约束的乘子法计算步骤如下:
- 给定初始点
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%280%29%7D&id=ZSs4y),乘子向量初始估计
,参数
,允许误差
,常数
,
#card=math&code=%5Cbeta%5Cin%280%2C1%29&id=zCfxT),置
- 以
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k-1%29%7D&id=wDS9g) 为初始点,求解无约束问题,假设其极小点为
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D&id=FnJHg)
%0A#card=math&code=%5Cmin_%7B%5Cboldsymbol%7Bx%7D%7D%20%5Cphi%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7Bv%7D%5Ek%2C%5Csigma%29%0A&id=oUYOz)
- 若
%7D)%7C%7C%20%3C%20%5Cepsilon#card=math&code=%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%7C%7C%20%3C%20%5Cepsilon&id=PjYQz),则停止计算,得到点
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D&id=Yh2gh);否则进入步骤 4
- 若
%7D)%7C%7C%20%2F%20%7C%7Ch(%5Cboldsymbol%7Bx%7D%5E%7B(k-1)%7D)%7C%7C%20%5Cge%20%5Cbeta#card=math&code=%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%7C%7C%20%2F%20%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k-1%29%7D%29%7C%7C%20%5Cge%20%5Cbeta&id=pi0Vc),则置
。然后进入步骤 5
- 用公式(3)计算
%7D_j#card=math&code=v%5E%7B%28k%2B1%29%7D_j&id=THhMI),置
,回到步骤 2
只有不等式约束的情况
考虑优化问题
%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%26g%7Bj%7D(%5Cboldsymbol%7Bx%7D)%5Cge%200%2C%20j%3D1%2C%20%5Ccdots%2C%20m%0A%5Cend%7Baligned%7D%20%5Ctag%7B4%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%5Cmin%20%5Cquad%20%26f%28%5Cboldsymbol%7Bx%7D%29%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%26g%7Bj%7D%28%5Cboldsymbol%7Bx%7D%29%5Cge%200%2C%20j%3D1%2C%20%5Ccdots%2C%20m%0A%5Cend%7Baligned%7D%20%5Ctag%7B4%7D%0A&id=EmF3x)
引入变量 把不等式约束问题转化为等式约束问题。
%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%26g%7Bj%7D(%5Cboldsymbol%7Bx%7D)-y_j%5E2%20%3D%200%2C%20%5Cquad%20j%3D1%2C%20%5Ccdots%2C%20m%0A%5Cend%7Baligned%7D%20%5Ctag%7B5%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%5Cmin%20%5Cquad%20%26f%28%5Cboldsymbol%7Bx%7D%29%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%26g%7Bj%7D%28%5Cboldsymbol%7Bx%7D%29-y_j%5E2%20%3D%200%2C%20%5Cquad%20j%3D1%2C%20%5Ccdots%2C%20m%0A%5Cend%7Baligned%7D%20%5Ctag%7B5%7D%0A&id=NawUc)
那么定义增广拉格朗日函数为:
%3Df(%5Cboldsymbol%7Bx%7D)-%5Csum%7Bj%3D1%7D%5Em%20w_j%20(g_j(%5Cboldsymbol%7Bx%7D)%20-%20y_j%5E2)%2B%5Cfrac%7B%5Csigma%7D%7B2%7D%5Csum%7Bj%3D1%7D%5Em%20(gj(%5Cboldsymbol%7Bx%7D)%20-%20y_j%5E2)%5E2%20%5Ctag%7B6%7D%0A#card=math&code=%5Cbar%7B%5Cphi%7D%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7By%7D%2C%5Cboldsymbol%7Bw%7D%2C%7B%5Csigma%7D%29%3Df%28%5Cboldsymbol%7Bx%7D%29-%5Csum%7Bj%3D1%7D%5Em%20wj%20%28g_j%28%5Cboldsymbol%7Bx%7D%29%20-%20y_j%5E2%29%2B%5Cfrac%7B%5Csigma%7D%7B2%7D%5Csum%7Bj%3D1%7D%5Em%20%28g_j%28%5Cboldsymbol%7Bx%7D%29%20-%20y_j%5E2%29%5E2%20%5Ctag%7B6%7D%0A&id=Yx5w8)
因此,我们可以求解公式(6)的极小值。下面我们想办法把 消掉。先考虑如下问题:
%20%5Ctag%7B7%7D%0A#card=math&code=%5Cmin_%7B%5Cboldsymbol%7By%7D%7D%20%5Cquad%20%5Cbar%7B%5Cphi%7D%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7By%7D%2C%5Cboldsymbol%7Bw%7D%2C%7B%5Csigma%7D%29%20%5Ctag%7B7%7D%0A&id=seMju)
用配方法,有
%2B%5Csum%7Bj%3D1%7D%5Em%5Cleft%5B-w_j%20(g_j(%5Cboldsymbol%7Bx%7D)%20-%20y_j%5E2)%20%2B%20%5Cfrac%7B%5Csigma%7D%7B2%7D%5Csum%7Bj%3D1%7D%5Em%20(gj(%5Cboldsymbol%7Bx%7D)%20-%20y_j%5E2)%5E2%5Cright%5D%20%5C%5C%0A%26%3D%20f(%5Cboldsymbol%7Bx%7D)%2B%5Csum%7Bj%3D1%7D%5Em%5Cleft%5B%5Cfrac%7B%5Csigma%7D%7B2%7D%20%5Cleft%5Byj%5E2-%5Cfrac%7B1%7D%7B%5Csigma%7D(%5Csigma%20g_j(%5Cboldsymbol%7Bx%7D)-w_j)%5Cright%5D%5E2-%5Cfrac%7Bw_j%5E2%7D%7B2%5Csigma%7D%5Cright%5D%0A%5Cend%7Baligned%7D%5Ctag%7B8%7D#card=math&code=%5Cbegin%7Baligned%7D%0A%5Cbar%7B%5Cphi%7D%26%3Df%28%5Cboldsymbol%7Bx%7D%29%2B%5Csum%7Bj%3D1%7D%5Em%5Cleft%5B-wj%20%28g_j%28%5Cboldsymbol%7Bx%7D%29%20-%20y_j%5E2%29%20%2B%20%5Cfrac%7B%5Csigma%7D%7B2%7D%5Csum%7Bj%3D1%7D%5Em%20%28gj%28%5Cboldsymbol%7Bx%7D%29%20-%20y_j%5E2%29%5E2%5Cright%5D%20%5C%5C%0A%26%3D%20f%28%5Cboldsymbol%7Bx%7D%29%2B%5Csum%7Bj%3D1%7D%5Em%5Cleft%5B%5Cfrac%7B%5Csigma%7D%7B2%7D%20%5Cleft%5By_j%5E2-%5Cfrac%7B1%7D%7B%5Csigma%7D%28%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29-w_j%29%5Cright%5D%5E2-%5Cfrac%7Bw_j%5E2%7D%7B2%5Csigma%7D%5Cright%5D%0A%5Cend%7Baligned%7D%5Ctag%7B8%7D&id=uBxgJ)
为了使公式(8)关于 取得极小值,
- 当
-w_j%20%5Cge%200#card=math&code=%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29-w_j%20%5Cge%200&id=FnfBS) 时,
-w_j)%0A#card=math&code=y_j%5E2%3D%5Cfrac%7B1%7D%7B%5Csigma%7D%28%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29-w_j%29%0A&id=SCiqt)
- 当
-w_j%20%3C%200#card=math&code=%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29-w_j%20%3C%200&id=SpgZ4) 时,
综上
-w_j%5C%7D%0A#card=math&code=y_j%5E2%3D%5Cfrac%7B1%7D%7B%5Csigma%7D%5Cmax%20%5C%7B0%2C%20%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29-w_j%5C%7D%0A&id=k59hd)
因此,可以定义新的拉格朗日增广矩阵为:
%3Df(%5Cboldsymbol%7Bx%7D)%2B%5Cfrac%7B1%7D%7B2%5Csigma%7D%5Csum%7Bj%3D1%7D%5Em%20%5Cbigg%5B%5Cbig%5B%5Cmax(0%2C%20w_j-%5Csigma%20g_j(%5Cboldsymbol%7Bx%7D))%5Cbig%5D%5E2-w_j%5E2%5Cbigg%5D%5Ctag%7B9%7D%0A#card=math&code=%7B%5Cphi%7D%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7Bw%7D%2C%7B%5Csigma%7D%29%3Df%28%5Cboldsymbol%7Bx%7D%29%2B%5Cfrac%7B1%7D%7B2%5Csigma%7D%5Csum%7Bj%3D1%7D%5Em%20%5Cbigg%5B%5Cbig%5B%5Cmax%280%2C%20w_j-%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29%29%5Cbig%5D%5E2-w_j%5E2%5Cbigg%5D%5Ctag%7B9%7D%0A&id=isIIp)
那么问题(4)可以转化为下述问题:
%20%0A#card=math&code=%5Cmin%20%7B%5Cphi%7D%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7Bw%7D%2C%7B%5Csigma%7D%29%20%0A&id=HuApa)
计算中, 的迭代公式为:
%7D%20%3D%20%5Cmax(0%2Cw_j%5E%7B(k)%7D-%5Csigma%20g_j(%5Cboldsymbol%7Bx%7D%5E%7B(k)%7D))%2C%5Cquad%20j%3D1%2C…%2Cm%20%5Ctag%7B10%7D%0A#card=math&code=w_j%5E%7B%28k%2B1%29%7D%20%3D%20%5Cmax%280%2Cw_j%5E%7B%28k%29%7D-%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%29%2C%5Cquad%20j%3D1%2C…%2Cm%20%5Ctag%7B10%7D%0A&id=yy3N2)
那么只包含不等式约束的乘子法可表示为:
- 给定初始点
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%280%29%7D&id=ArYIb),乘子向量初始估计
,固定参数
,置
- 以
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k-1%29%7D&id=msujy) 为初始点,求解无约束问题,假设其极小点为
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D&id=JLsxZ)
%0A#card=math&code=%5Cmin_%7B%5Cboldsymbol%7Bx%7D%7D%20%5Cphi%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7Bw%7D%5Ek%2C%5Csigma%29%0A&id=hUOuX)
- 若
#card=math&code=g_i%28%5Cboldsymbol%7Bx%7D%29&id=balNw) 满足约束,则停止计算,得到点
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D&id=XzqUb);否则进入步骤 4
- 用公式(10)计算
%7D_j#card=math&code=v%5E%7B%28k%2B1%29%7D_j&id=XqmWx),置
,回到步骤 2
一般约束的情况
考虑优化问题
%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%5Cquad%20%26g%7Bi%7D(%5Cboldsymbol%7Bx%7D)%20%5Cgeq%200%2C%20i%3D1%2C%20%5Ccdots%2C%20m%20%5C%5C%0A%5Cquad%20%26h%7Bj%7D(%5Cboldsymbol%7Bx%7D)%3D0%2C%20j%3D1%2C%20%5Ccdots%2C%20l%0A%5Cend%7Baligned%7D%20%5Ctag%7B11%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%5Cmin%20%5Cquad%20%26f%28%5Cboldsymbol%7Bx%7D%29%20%5C%5C%0A%5Ctext%20%7B%20s.t.%20%7D%20%5Cquad%20%26g%7Bi%7D%28%5Cboldsymbol%7Bx%7D%29%20%5Cgeq%200%2C%20i%3D1%2C%20%5Ccdots%2C%20m%20%5C%5C%0A%5Cquad%20%26h%7Bj%7D%28%5Cboldsymbol%7Bx%7D%29%3D0%2C%20j%3D1%2C%20%5Ccdots%2C%20l%0A%5Cend%7Baligned%7D%20%5Ctag%7B11%7D%0A&id=UURfa)
定义增广拉格朗日函数为:
%3Df(%5Cboldsymbol%7Bx%7D)%2B%5Cfrac%7B1%7D%7B2%5Csigma%7D%5Csum%7Bj%3D1%7D%5Em%20%5Cbigg%5B%5Cbig%5B%5Cmax(0%2C%20w_j-%5Csigma%20g_j(%5Cboldsymbol%7Bx%7D))%5Cbig%5D%5E2-w_j%5E2%5Cbigg%5D-%5Csum%7Bj%3D1%7D%5El%20vj%20h_j(%5Cboldsymbol%7Bx%7D)%2B%5Cfrac%7B%5Csigma%7D%7B2%7D%5Csum%7Bj%3D1%7D%5El%20h%5E2j(%5Cboldsymbol%7Bx%7D)%5Ctag%7B12%7D%0A#card=math&code=%7B%5Cphi%7D%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7Bw%7D%2C%5Cboldsymbol%7Bv%7D%2C%7B%5Csigma%7D%29%3Df%28%5Cboldsymbol%7Bx%7D%29%2B%5Cfrac%7B1%7D%7B2%5Csigma%7D%5Csum%7Bj%3D1%7D%5Em%20%5Cbigg%5B%5Cbig%5B%5Cmax%280%2C%20wj-%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%29%29%5Cbig%5D%5E2-w_j%5E2%5Cbigg%5D-%5Csum%7Bj%3D1%7D%5El%20vj%20h_j%28%5Cboldsymbol%7Bx%7D%29%2B%5Cfrac%7B%5Csigma%7D%7B2%7D%5Csum%7Bj%3D1%7D%5El%20h%5E2_j%28%5Cboldsymbol%7Bx%7D%29%5Ctag%7B12%7D%0A&id=obhhG)
对于两个参数的修正为:
%7D%20%26%3D%20%5Cmax(0%2Cw_j%5E%7B(k)%7D-%5Csigma%20g_j(%5Cboldsymbol%7Bx%7D%5E%7B(k)%7D))%2C%5Cquad%20j%3D1%2C…%2Cm%5C%5C%0Av%5E%7B(k%2B1)%7D_j%20%26%3D%20v%5E%7B(k)%7D_j%20-%20%5Csigma%20h_j(%5Cboldsymbol%7Bx%7D%5E%7B(k)%7D)%2C%5Cquad%20j%3D1%2C…%2Cl%20%20%0A%5Cend%7Baligned%7D%20%5Ctag%7B13%7D#card=math&code=%5Cbegin%7Baligned%7D%0Aw_j%5E%7B%28k%2B1%29%7D%20%26%3D%20%5Cmax%280%2Cw_j%5E%7B%28k%29%7D-%5Csigma%20g_j%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%29%2C%5Cquad%20j%3D1%2C…%2Cm%5C%5C%0Av%5E%7B%28k%2B1%29%7D_j%20%26%3D%20v%5E%7B%28k%29%7D_j%20-%20%5Csigma%20h_j%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%2C%5Cquad%20j%3D1%2C…%2Cl%20%20%0A%5Cend%7Baligned%7D%20%5Ctag%7B13%7D&id=fKp5S)
那么一般约束下乘子法计算步骤如下:
- 给定初始点
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%280%29%7D&id=p3JO4),乘子向量初始估计
,参数
,允许误差
,常数
,
#card=math&code=%5Cbeta%5Cin%280%2C1%29&id=ajbGU),置
- 以
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k-1%29%7D&id=DSBzU) 为初始点,求解无约束问题,假设其极小点为
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D&id=L9JcZ)
%0A#card=math&code=%5Cmin_%7B%5Cboldsymbol%7Bx%7D%7D%20%5Cphi%28%5Cboldsymbol%7Bx%7D%2C%5Cboldsymbol%7Bw%7D%5Ek%2C%5Cboldsymbol%7Bv%7D%5Ek%2C%5Csigma%29%0A&id=CQRUk)
- 若
%7D)%7C%7C%20%3C%20%5Cepsilon#card=math&code=%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%7C%7C%20%3C%20%5Cepsilon&id=Q5coi) 且 则停止计算,得到点
%7D#card=math&code=%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D&id=tBmcv);否则进入步骤 4
- 若
%7D)%7C%7C%20%2F%20%7C%7Ch(%5Cboldsymbol%7Bx%7D%5E%7B(k-1)%7D)%7C%7C%20%5Cge%20%5Cbeta#card=math&code=%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k%29%7D%29%7C%7C%20%2F%20%7C%7Ch%28%5Cboldsymbol%7Bx%7D%5E%7B%28k-1%29%7D%29%7C%7C%20%5Cge%20%5Cbeta&id=NFpcK),置
。然后进入步骤 5
- 用公式(13)计算
%7D_j%2Cw%5E%7B(k%2B1)%7D_j#card=math&code=v%5E%7B%28k%2B1%29%7D_j%2Cw%5E%7B%28k%2B1%29%7D_j&id=RNEuL),置
,回到步骤 2
