在约束优化问题中,原始问题可能不好求解,所以通过将原始问题转化为对偶问题,方便求解。
那么,什么是对偶问题呢?
原始问题
假设%2C%20c_i(x)%2C%20h_j(x)#card=math&code=f%28x%29%2C%20c_i%28x%29%2C%20h_j%28x%29&height=21&id=MdfuT)在
上连续可微,那么原始问题为:
%20%5C%5C%0As.t.%20%5Cquad%20%26ci(x)%20%5Cleqslant%200%2C%20%5Cquad%20i%20%3D%201%2C2%2C…%2Ck%20%20%5C%5C%0A%26h_j(x)%3D0%2C%20%5Cquad%20j%3D1%2C2%2C…%2Cl%0A%5Cend%7Baligned%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%5Cdisplaystyle%20%5Cmin%7Bx%20%5Cin%20R%5En%7D%20%26f%28x%29%20%5C%5C%0As.t.%20%5Cquad%20%26c_i%28x%29%20%5Cleqslant%200%2C%20%5Cquad%20i%20%3D%201%2C2%2C…%2Ck%20%20%5C%5C%0A%26h_j%28x%29%3D0%2C%20%5Cquad%20j%3D1%2C2%2C…%2Cl%0A%5Cend%7Baligned%7D%0A&height=76&id=yjkq9)
上面的的两个条件是对
#card=math&code=f%28x%29&height=20&id=JfUhZ)的约束,我们也称上面的问题为约束优化问题。
解决约束优化问题我们不顺手,但是解决普通的优化问题,我们还是有很多办法的,拿一元函数#card=math&code=f%28x%29&height=20&id=LtpCx)为例,如果函数可微,我们只需要求其导数为0的点,就是极值点。所以,我们这里考虑将约束问题,转化为一个非约束问题,这可以使用广义拉格朗日函数来做到:
%3Df(x)%2B%5Csum%7Bi%3D1%7D%5Ek%20%5Calpha_i%20c_i(x)%20%2B%20%5Csum%7Bj%3D1%7D%5El%20%5Cbeta%7Bj%7Dh_j(x)%0A#card=math&code=L%28x%2C%5Calpha%2C%5Cbeta%29%3Df%28x%29%2B%5Csum%7Bi%3D1%7D%5Ek%20%5Calphai%20c_i%28x%29%20%2B%20%5Csum%7Bj%3D1%7D%5El%20%5Cbeta_%7Bj%7Dh_j%28x%29%0A&height=55&id=E7oUi)
上式里的是拉格朗日乘子,其实也就是一个是实数的参数,要求
。因为c是<=0的,我们后面会求L的max。
所以上面的式子并不难理解,他就是最上面我们约束优化问题的一个综合而已。
如果我们现在假设是常量,那么
#card=math&code=L%28x%2C%5Calpha%2C%5Cbeta%29&height=20&id=JKstG)就变成了以
为参数的函数,我们用
#card=math&code=%5Ctheta_P%28x%29&height=20&id=TpoP8)来表示:
%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20L(x%2C%20%5Calpha%2C%20%5Cbeta)%0A#card=math&code=%5Ctheta_P%28x%29%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20L%28x%2C%20%5Calpha%2C%20%5Cbeta%29%0A&height=31&id=JiY5H)
这里的下标表示这是原始问题,为什么前面多了个max?简单的说,因为增加了约束条件后,新的可行域要么不起作用,要么作用在边界上,因为c是<=0的,所以边界就是max。
假如此时违反了约束条件,即存在某个%20%3E0#card=math&code=c_i%28x%29%20%3E0&height=20&id=Fv9oc)或者存在某个
%20%5Cneq%200#card=math&code=h_j%28x%29%20%5Cneq%200&height=21&id=jr3vM),那么就有上式:
%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20%5Cbigg%5B%20f(x)%2B%5Csum%7Bi%3D1%7D%5Ek%20%5Calphai%20c_i(x)%20%2B%20%5Csum%7Bj%3D1%7D%5El%20%5Cbeta%7Bj%7Dh_j(x)%20%5Cbigg%5D%3D%2B%5Cinfty%0A#card=math&code=%5Ctheta_P%28x%29%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calphai%20%5Cgeqslant0%7D%20%5Cbigg%5B%20f%28x%29%2B%5Csum%7Bi%3D1%7D%5Ek%20%5Calphai%20c_i%28x%29%20%2B%20%5Csum%7Bj%3D1%7D%5El%20%5Cbeta_%7Bj%7Dh_j%28x%29%20%5Cbigg%5D%3D%2B%5Cinfty%0A&height=55&id=DzN5g)
这是因为如果存在某个%20%3E0#card=math&code=c_i%28x%29%20%3E0&height=20&id=weZM4),那么可以令
,从而使得
%20%5Crightarrow%20%2B%5Cinfty#card=math&code=%5Calpha_i%20c_i%28x%29%20%5Crightarrow%20%2B%5Cinfty&height=20&id=USnh6);如果存在某个
%20%5Cneq%200#card=math&code=h_j%28x%29%20%5Cneq%200&height=21&id=mlB1j),则可以令
,使得
%20%5Crightarrow%20%2B%5Cinfty#card=math&code=%5Cbeta_j%20h_j%28x%29%20%5Crightarrow%20%2B%5Cinfty&height=21&id=tpQqI)。其余各
均取0即可。
假如满足约束,那么
%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20%5Cbigg%5B%20f(x)%20%5Cbigg%5D%0A#card=math&code=%5Ctheta_P%28x%29%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20%5Cbigg%5B%20f%28x%29%20%5Cbigg%5D%0A&height=45&id=HdQoq)
因此,
%20%3D%20%5Cbegin%7Bcases%7D%0A%20%20%20f(x)%20%26%5Ctext%7Bif%20%7D%20x%E6%BB%A1%E8%B6%B3%E5%8E%9F%E5%A7%8B%E9%97%AE%E9%A2%98%E7%BA%A6%E6%9D%9F%20%5C%5C%0A%20%20%20%2B%5Cinfty%20%26%5Ctext%7Bif%20%7D%20%E5%85%B6%E4%BB%96%20%0A%5Cend%7Bcases%7D%0A#card=math&code=%5Ctheta_P%28x%29%20%3D%20%5Cbegin%7Bcases%7D%0A%20%20%20f%28x%29%20%26%5Ctext%7Bif%20%7D%20x%E6%BB%A1%E8%B6%B3%E5%8E%9F%E5%A7%8B%E9%97%AE%E9%A2%98%E7%BA%A6%E6%9D%9F%20%5C%5C%0A%20%20%20%2B%5Cinfty%20%26%5Ctext%7Bif%20%7D%20%E5%85%B6%E4%BB%96%20%0A%5Cend%7Bcases%7D%0A&height=54&id=dRELQ)
所以,如果考虑极小化问题:
%3D%5Cminx%20%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calphai%20%5Cgeqslant0%7D%20L(x%2C%20%5Calpha%2C%20%5Cbeta)%0A#card=math&code=%5Cmin%7Bx%7D%20%5CthetaP%28x%29%3D%5Cmin_x%20%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20L%28x%2C%20%5Calpha%2C%20%5Cbeta%29%0A&height=31&id=KljwN)
它与原始最优化问题是等价的,也就是我们一开始要求的有约束的最优化问题,他们有相同的解。上面的式子称为广义拉格朗日函数的极小极大问题,这样一来,原始的有约束的最优化问题,就让我们变成了无约束的广义的拉格朗日函数的极小极大问题。
我们定义原始问题的最优解为:
%0A#card=math&code=p%5E%2A%3D%5Cmin_x%20%5Ctheta_P%28x%29%0A&height=28&id=bPbW7)
对偶问题
我们首先定义:
%3D%5Cmin%20%7BL(x%2C%20%5Calpha%2C%20%5Cbeta)%7D%0A#card=math&code=%5Ctheta_%7BD%7D%28%5Calpha%2C%20%5Cbeta%29%3D%5Cmin%20%7BL%28x%2C%20%5Calpha%2C%20%5Cbeta%29%7D%0A&height=20&id=yHYLc)
然后考虑极大化#card=math&code=%5Ctheta_%7BD%7D%28%5Calpha%2C%20%5Cbeta%29&height=20&id=PXoMS),即:
%3D%5Cmax%7B%5Calpha%2C%5Cbeta%7D%20%5Cmin_x%20L(x%2C%20%5Calpha%2C%20%5Cbeta)%20%20%5C%5C%0A%26s.t.%20%5Cquad%20%5Calpha_i%20%5Cgeqslant%200%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%0A%5Cend%7Baligned%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%26%20%5Cmax%7B%5Calpha%2C%5Cbeta%7D%20%5CthetaD%28%5Calpha%2C%20%5Cbeta%29%3D%5Cmax%7B%5Calpha%2C%5Cbeta%7D%20%5Cmin_x%20L%28x%2C%20%5Calpha%2C%20%5Cbeta%29%20%20%5C%5C%0A%26s.t.%20%5Cquad%20%5Calpha_i%20%5Cgeqslant%200%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%0A%5Cend%7Baligned%7D%0A&height=54&id=iS9m3)
称为原始问题的对偶问题,定义对偶问题的最优值为:
%0A#card=math&code=d%5E%2A%3D%5Cmax_%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20%5Ctheta_D%28%5Calpha%2C%20%5Cbeta%29%0A&height=31&id=SUZc9)
原始问题和对偶问题的关系
我们定义了原始问题,利用拉格朗日函数转化为了无约束的形式,并用表示了其最优解;然后定义了原始问题的对偶问题,并用
表示了对偶问题的最优解,而原始问题的求解比较复杂,所以我们期望用解对偶问题来代替解原始问题,那么原始问题和对偶问题有什么联系呢?为什么我们可以替代呢?
下面有一个定理和一个推论:
定理1:
若原始问题和对偶问题都有最优解,则:
%20%5Cleqslant%20%5Cminx%20%20%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calphai%20%5Cgeqslant0%7D%20L(x%2C%20%5Calpha%2C%20%5Cbeta)%20%3D%20p%5E*%0A#card=math&code=d%5E%2A%3D%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calphai%20%5Cgeqslant0%7D%20%5Cmin_x%20L%28x%2C%20%5Calpha%2C%20%5Cbeta%29%20%5Cleqslant%20%5Cmin_x%20%20%5Cmax%7B%5Calpha%2C%5Cbeta%3B%5Calpha_i%20%5Cgeqslant0%7D%20L%28x%2C%20%5Calpha%2C%20%5Cbeta%29%20%3D%20p%5E%2A%0A&height=31&id=s7PNP)
推论:
设和
,分别是原始问题和对偶问题的可行解,并且
,则
和
,分别是原始问题和对偶问题的最优解。
定理2:
假设#card=math&code=f%28x%29&height=20&id=vFeNq)和
#card=math&code=c_i%28x%29&height=20&id=eiupU)是凸函数,
#card=math&code=h_j%28x%29&height=21&id=giGAc)是仿射函数,并且假设不等式约束
#card=math&code=c_i%28x%29&height=20&id=nF7y5)是严格可行的,即存在x,对所有i有
%3C0#card=math&code=c_i%28x%29%3C0&height=20&id=Y9CZR),则存在
,使得
是原始问题的解,而
是对偶问题的解,并且
%0A#card=math&code=p%5E%2A%3Dd%5E%2A%3DL%28x%5E%2A%2C%5Calpha%5E%2A%2C%5Cbeta%5E%2A%29%0A&height=20&id=RLI4i)
仿射函数是指
KKT条件
定理3:
假设#card=math&code=f%28x%29&height=20&id=ua4LY)和
#card=math&code=ci%28x%29&height=20&id=cBQpG)是凸函数,
#card=math&code=h_j%28x%29&height=21&id=MuFTl)是仿射函数,并且假设不等式约束
#card=math&code=c_i%28x%29&height=20&id=rCFe9)是严格可行的,即存在x,对所有i有
%3C0#card=math&code=c_i%28x%29%3C0&height=20&id=wWWqw),则
和$ \alpha^, \beta*$分别是原始问题和对偶问题的解的充分必要条件是$x, \alpha^, \beta^_$满足KKT条件:
%3D0%20%5C%5C%0A%26%20%5Cnabla%5Calpha%20L(x%5E%2C%20%5Calpha%5E%2C%20%5Cbeta%5E*)%3D0%20%5C%5C%0A%26%20%5Cnabla%5Cbeta%20L(x%5E%2C%20%5Calpha%5E%2C%20%5Cbeta%5E)%3D0%20%5C%5C%0A%26%20%5Calpha_i%20c_i(x%5E)%3D0%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%0A%26%20ci(x%5E)%20%5Cleqslant%200%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%0A%26%20a_i%5E%20%5Cgeqslant%200%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%20%0A%26%20h_j(x%5E*)%3D0%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%0A%5Cend%7Baligned%7D%0A#card=math&code=%5Cbegin%7Baligned%7D%0A%26%20%5Cnabla_x%20L%28x%5E%2A%2C%20%5Calpha%5E%2A%2C%20%5Cbeta%5E%2A%29%3D0%20%5C%5C%0A%26%20%5Cnabla%5Calpha%20L%28x%5E%2A%2C%20%5Calpha%5E%2A%2C%20%5Cbeta%5E%2A%29%3D0%20%5C%5C%0A%26%20%5Cnabla_%5Cbeta%20L%28x%5E%2A%2C%20%5Calpha%5E%2A%2C%20%5Cbeta%5E%2A%29%3D0%20%5C%5C%0A%26%20%5Calpha_i%20c_i%28x%5E%2A%29%3D0%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%0A%26%20c_i%28x%5E%2A%29%20%5Cleqslant%200%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%0A%26%20a_i%5E%2A%20%5Cgeqslant%200%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%20%0A%26%20h_j%28x%5E%2A%29%3D0%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%20%5C%5C%0A%5Cend%7Baligned%7D%0A&height=160&id=cQFej)
其中%3D0%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%0A#card=math&code=%5Calpha_i%20c_i%28x%5E%2A%29%3D0%2C%20%5Cquad%20i%3D1%2C2%2C…%2Ck%0A&height=20&id=QK7bX)
称为KKT的对偶互补条件,由此条件知:若,则
%3D0#card=math&code=c_i%28x%5E%2A%29%3D0&height=20&id=zhTm4)。