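Linear Regression
1. What form does the hypothesis function of linear regression take?
$$h_\omega(x) = \omega^{\top} x + b$$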
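2. What form does the loss function of linear regression take?
Least squares is typically used:
$$J(\omega) = \frac{1}{2}\sum_{i=1}^{m}\left(h_\omega(x_i) - y_i\right)^2$$
where m is the number of samples; the factor 1/2 is there for convenience, since it cancels the 2 produced when the square is differentiated.
Least squares: see "How to understand least squares"
3. Training methods for linear regression: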
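The notes do not list the training methods, so the following is only a minimal sketch under the assumption that the two usual choices are meant: the closed-form normal equation and batch gradient descent on the squared loss. The function names, the toy data, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def squared_loss(X, y, w):
    """J(w) = 1/2 * sum_i (x_i . w - y_i)^2, with the bias folded into X."""
    residual = X @ w - y
    return 0.5 * float(residual @ residual)

def fit_normal_equation(X, y):
    """Closed-form least squares: solve (X^T X) w = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def fit_gradient_descent(X, y, lr=0.01, n_iter=5000):
    """Batch gradient descent on J(w); the gradient is X^T (Xw - y)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w -= lr * (X.T @ (X @ w - y))
    return w

# Toy data generated from y = 1 + 2x without noise; the first column is the bias term.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

w_closed = fit_normal_equation(X, y)
w_gd = fit_gradient_descent(X, y)
```

Both routes should recover roughly the same weights on this toy problem. The closed form is exact but needs a linear solve, while gradient descent only needs matrix-vector products.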
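4. Why introduce ridge regression and Lasso regression? How do they achieve their purpose?
- To address overfitting in linear regression.
- To constrain (limit) the parameters being optimized.
Both do this by adding a regularization term to the loss function:
- Ridge regression loss function
$$J(\omega) = \frac{1}{2}\sum_{i=1}^{m}\left(h_\omega(x_i) - y_i\right)^2 + \lambda\lVert\omega\rVert_2^2$$
It shrinks the coefficients toward zero but never exactly to zero, so it performs no feature selection.
- Lasso regression loss function
$$J(\omega) = \frac{1}{2}\sum_{i=1}^{m}\left(h_\omega(x_i) - y_i\right)^2 + \lambda\lVert\omega\rVert_1$$
It performs feature selection: when a group of predictors is highly correlated, Lasso tends to keep only one of them and shrink the others to zero.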
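The contrast between the two penalties (ridge shrinks, Lasso zeroes out) can be sketched with plain NumPy. This assumes the objectives 1/2·‖Xw − y‖² + λ‖w‖₂² for ridge and 1/2·‖Xw − y‖² + λ‖w‖₁ for Lasso. The scaling of the penalty, the coordinate-descent solver, and the toy data are illustrative choices, not something the notes prescribe.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize 1/2*||Xw - y||^2 + lam*||w||_2^2 by solving (X^T X + 2*lam*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + 2.0 * lam * np.eye(d), X.T @ y)

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_fit(X, y, lam, n_sweeps=100):
    """Minimize 1/2*||Xw - y||^2 + lam*||w||_1 by cyclic coordinate descent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            # Residual with feature j's current contribution removed.
            partial_residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ partial_residual
            w[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j])
    return w

# Toy data: y depends strongly on feature 0 and only weakly on feature 1.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = 2.0 * X[:, 0] + 0.1 * X[:, 1]

w_ridge = ridge_fit(X, y, lam=1.0)   # both coefficients shrink, neither is zero
w_lasso = lasso_fit(X, y, lam=1.0)   # the weak coefficient is set exactly to zero
```

On this toy problem ridge keeps a small non-zero weight on the weak feature, whereas the soft-threshold step in Lasso removes it entirely.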
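5. Purpose of and use cases for ElasticNet regression
MSE + L1 regularization + L2 regularization
Loss function:
$$J(\omega) = \frac{1}{2}\sum_{i=1}^{m}\left(h_\omega(x_i) - y_i\right)^2 + \lambda_1\lVert\omega\rVert_1 + \lambda_2\lVert\omega\rVert_2^2$$
- Purpose: when Lasso drives too many features to zero while ridge regression does not regularize strongly enough (its coefficients decay too slowly), ElasticNet combines the two penalties and can give a better result.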
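As a small worked example of "squared error + L1 + L2", the objective can be written out directly. The function name and the penalty weights `lam1` and `lam2` are assumptions made for illustration. Setting `lam2 = 0` gives a Lasso-style objective and `lam1 = 0` a ridge-style one.

```python
import numpy as np

def elastic_net_objective(X, y, w, lam1, lam2):
    """1/2*||Xw - y||^2 + lam1*||w||_1 + lam2*||w||_2^2."""
    residual = X @ w - y
    data_term = 0.5 * float(residual @ residual)
    l1_term = lam1 * float(np.abs(w).sum())
    l2_term = lam2 * float(w @ w)
    return data_term + l1_term + l2_term

X = np.eye(2)
y = np.array([1.0, 2.0])
w = np.array([1.0, -1.0])

# residual = [0, -3] -> data term 4.5; L1 term 0.5*2 = 1.0; L2 term 0.25*2 = 0.5
value = elastic_net_objective(X, y, w, lam1=0.5, lam2=0.25)  # 6.0
```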
ref: 线性回归+逻辑回归.md
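Logistic Regression
Give a brief introduction to logistic regression.
Logistic regression is a generalized linear model used mainly for classification. The linear regression output Y is passed through the nonlinear sigmoid function, which yields a number S in (0, 1) that can be read as a probability. With a probability threshold of 0.5, S greater than 0.5 is classified as a positive sample and S less than 0.5 as a negative sample, which gives a classifier.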
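The pipeline described above (linear score, then sigmoid, then threshold) can be sketched in a few lines. The weights, bias, and feature values are made up for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1), avoiding overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(weights, bias, features, threshold=0.5):
    """Return (probability, label) for one sample; label is 1 when probability > threshold."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    prob = sigmoid(score)
    return prob, int(prob > threshold)

# Hypothetical model parameters and two samples.
weights, bias = [2.0, -1.0], 0.5
prob_pos, label_pos = predict(weights, bias, [1.0, 0.5])    # score = 2.0
prob_neg, label_neg = predict(weights, bias, [-1.0, 1.0])   # score = -2.5
```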
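What is the loss function of logistic regression?
KL divergence -> cross-entropy
Maximum likelihood, for labels $y_i \in \{0, 1\}$:
$$L(\omega) = \prod_{i=1}^{m} h_\omega(x_i)^{y_i}\left(1 - h_\omega(x_i)\right)^{1 - y_i}$$
Taking the logarithm of the likelihood and negating it gives the cross-entropy loss:
$$J(\omega) = -\sum_{i=1}^{m}\left[y_i \log h_\omega(x_i) + (1 - y_i)\log\left(1 - h_\omega(x_i)\right)\right]$$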
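To make the link between the likelihood and cross-entropy concrete, here is a minimal sketch that assumes labels in {0, 1} and already-computed predicted probabilities. The example values are arbitrary. It checks numerically that the cross-entropy equals the negative log of the likelihood.

```python
import math

def likelihood(y_true, p_pred):
    """Product over samples of p if y == 1, else (1 - p)."""
    result = 1.0
    for y, p in zip(y_true, p_pred):
        result *= p if y == 1 else 1.0 - p
    return result

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Summed binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

y_true = [1, 0, 1]
p_pred = [0.9, 0.2, 0.6]

nll = -math.log(likelihood(y_true, p_pred))
ce = cross_entropy(y_true, p_pred)
```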
How is logistic regression trained? Which optimization algorithms are used?
Gradient descent and Newton's method.
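Both optimizers can be sketched on the mean cross-entropy loss. This is a minimal illustration, assuming labels in {0, 1}, a bias column folded into `X`, and a small non-separable toy data set (on perfectly separable data the maximum-likelihood weights diverge). Step size and iteration counts are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, lr=0.5, n_iter=2000):
    """Batch gradient descent; the gradient of the mean loss is X^T (p - y) / n."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def fit_logistic_newton(X, y, n_iter=10):
    """Newton's method; the Hessian of the mean loss is X^T diag(p(1-p)) X / n."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y)
        hessian = (X.T * (p * (1.0 - p))) @ X / len(y)
        w -= np.linalg.solve(hessian, grad)
    return w

# Overlapping classes, so a finite optimum exists; first column is the bias term.
x = np.array([-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

w_gd = fit_logistic_gd(X, y)
w_newton = fit_logistic_newton(X, y)
```

Newton's method needs far fewer iterations here but pays for a Hessian solve at each step, which is why gradient-based methods are usually preferred when the number of features is large.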
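Why does LR use the sigmoid function?
- The output of a linear model lies in $(-\infty, +\infty)$, and the sigmoid maps it into $(0, 1)$, which is exactly the range of a probability.
- With the sigmoid, the negative log-likelihood loss of logistic regression is a convex function.
- From the maximum entropy model perspective: see "Deriving logistic regression from the maximum entropy model" and "Deriving logistic regression from the maximum entropy model 2" (clearer).
- From the generalized linear model / exponential family perspective: see "Why does logistic regression use sigmoid".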
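The exponential-family argument in the last bullet can be summarized in a short derivation. It is a sketch under the usual generalized-linear-model assumptions: the label follows a Bernoulli distribution with mean $\phi$, and the natural parameter is linear in the features, $\eta = \omega^{\top} x$.

```latex
\begin{aligned}
p(y;\phi) &= \phi^{y}(1-\phi)^{1-y}, \qquad y \in \{0, 1\} \\
          &= \exp\!\left( y \log\frac{\phi}{1-\phi} + \log(1-\phi) \right) \\
\eta &= \log\frac{\phi}{1-\phi}
\;\Longrightarrow\;
\phi = \frac{1}{1+e^{-\eta}} = \frac{1}{1+e^{-\omega^{\top} x}}
\end{aligned}
```

Under these assumptions the sigmoid is not an arbitrary choice: it is the inverse of the log-odds link that the Bernoulli distribution induces.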
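Why discretize features for logistic regression?
- Logistic regression is a generalized linear model with limited expressive power. Once a single variable is discretized into N bins, each bin gets its own weight, which effectively introduces nonlinearity into the model and increases its expressive power and fitting capacity. Discrete features are also easy to add and remove, which makes rapid model iteration easier;
- Inner products of sparse vectors are fast to compute, and the results are easy to store and to scale;
- Easier crossing and feature combination: after discretization, features can be crossed, turning M+N variables into M*N variables, which introduces further nonlinearity and improves expressive power;
- Simpler model: discretization simplifies the logistic regression model and lowers the risk of overfitting;
- Stability: the model is more stable after discretization. For example, if user age is discretized with 20-30 as one bin, a user does not become a completely different person just by getting one year older;
- Robustness to outliers: take a feature that is 1 if age > 30 and 0 otherwise. Without discretization, an outlier such as "age 300" would disturb the model badly.
Reference: https://www.zhihu.com/question/31989952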
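The stability and outlier points can be seen with a small bucketing sketch. The bucket boundaries in `AGE_EDGES` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

AGE_EDGES = [20, 30, 40, 50]  # hypothetical bucket boundaries

def age_bucket(age, edges=AGE_EDGES):
    """Bucket index: 0 is <20, 1 is [20, 30), 2 is [30, 40), 3 is [40, 50), 4 is >=50."""
    return bisect_right(edges, age)

def one_hot_age(age, edges=AGE_EDGES):
    """One-hot encode the bucket so that each bucket gets its own weight in the model."""
    encoded = [0] * (len(edges) + 1)
    encoded[age_bucket(age, edges)] = 1
    return encoded

same_bucket = one_hot_age(25) == one_hot_age(26)      # one year older, same features
outlier_tamed = one_hot_age(300) == one_hot_age(60)   # "age 300" falls in the top bucket
```

A one-year change inside a bucket leaves the encoded features untouched, and an extreme value is simply absorbed by the last bucket instead of scaling the linear score.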
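While training logistic regression, what happens if many features are highly correlated, or if one feature is repeated 100 times? (Logistic regression is very sensitive to collinear features!)
- The conclusion first: provided the loss function eventually converges, many highly correlated features do not hurt the classifier's performance.
- As for the features themselves: suppose there is a single feature and, ignoring sampling, you repeat it 100 times. After training, the amount of data is unchanged, but the feature now appears 100 times. In effect the original feature has been split into 100 parts, each carrying one hundredth of the original weight.
- With random sampling, once training has converged, the 100 features can still be regarded as playing the same role as the original single feature, although many of their weights may cancel each other out with opposite signs.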
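The "weight split into 100 parts" claim is an identity on the linear score and can be checked directly. The sketch below assumes a hypothetical learned weight `w` for the single feature. It also illustrates the cancellation case: any set of 100 weights that sums to `w` gives the same scores.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 1))           # one original feature, 5 samples
w = 0.8                               # hypothetical learned weight

X_dup = np.repeat(x, 100, axis=1)     # the same feature copied 100 times
w_even = np.full(100, w / 100.0)      # each copy carries 1/100 of the weight

# Weights with mixed signs that still sum to w (positive and negative parts cancel).
w_mixed = rng.normal(size=100)
w_mixed += (w - w_mixed.sum()) / 100.0

scores_single = (x * w).ravel()
scores_even = X_dup @ w_even
scores_mixed = X_dup @ w_mixed
```

Because only the sum of the duplicated weights is determined by the data, the individual weights are not identifiable, which is one way to read the warning about collinearity.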
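Why remove highly correlated features during training?
- Removing highly correlated features makes the model more interpretable.
- It can greatly speed up training. If the model contains many highly correlated features, the parameters may not have converged even when the loss function has, which slows training down. More features also mean longer training time in themselves.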
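Why is feature combination (feature crossing) so common in logistic regression models?
Logistic regression is a linear model, and linear models handle nonlinear relationships poorly. Feature combination introduces nonlinear features and increases the model's expressive power. In addition, base features can be seen as global modeling, while combined features are finer-grained and amount to personalized modeling.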
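XOR is a classic illustration: no linear boundary on (x1, x2) separates it, but adding the cross feature x1*x2 makes it linearly separable. The weights below are hand-picked for illustration, not learned.

```python
def add_cross(x1, x2):
    """Append the cross feature x1*x2 to the two base features."""
    return [x1, x2, x1 * x2]

def linear_score(weights, bias, features):
    return bias + sum(w * f for w, f in zip(weights, features))

# XOR truth table: ((x1, x2), label).
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Hand-picked (hypothetical) weights for the features [x1, x2, x1*x2].
weights, bias = [1.0, 1.0, -2.0], -0.5

predictions = [
    int(linear_score(weights, bias, add_cross(*inputs)) > 0)
    for inputs, _ in xor_data
]
```

The model is still linear in its inputs; the nonlinearity lives entirely in the crossed feature.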
If label = {-1, +1}, what is the LR loss function?
Assume label = {-1, +1}; then
$$p(y=1 \mid x) = h_\omega(x)$$
$$p(y=-1 \mid x) = 1 - h_\omega(x)$$
The sigmoid function has the following property:
$$h(-x) = 1 - h(x)$$
so the two cases combine into
$$p(y \mid x) = h_\omega(yx)$$
As before, we estimate with MLE:
$$\begin{aligned}
L(\omega) &= \prod_{i=1}^{m} p(y_i \mid x_i; \omega) \\
&= \prod_{i=1}^{m} h_\omega(y_i x_i) \\
&= \prod_{i=1}^{m} \frac{1}{1+e^{-y_i \omega^{\top} x_i}}
\end{aligned}$$
Taking the logarithm of the above and negating it gives the loss:
$$\begin{aligned}
-\log L(\omega) &= -\log \prod_{i=1}^{m} p(y_i \mid x_i; \omega) \\
&= -\sum_{i=1}^{m} \log p(y_i \mid x_i; \omega) \\
&= -\sum_{i=1}^{m} \log \frac{1}{1+e^{-y_i \omega^{\top} x_i}} \\
&= \sum_{i=1}^{m} \log\left(1+e^{-y_i \omega^{\top} x_i}\right)
\end{aligned}$$
That is, the loss for each sample is:
$$\ell_i(\omega) = \log\left(1+e^{-y_i \omega^{\top} x_i}\right)$$
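A quick numerical check, assuming the mapping t = (y + 1) / 2 between the two label conventions, confirms that the {-1, +1} loss log(1 + e^{-y z}) equals the usual {0, 1} cross-entropy for the same linear score z. The scores tested are arbitrary.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_pm1(y, z):
    """Per-sample loss for y in {-1, +1}: log(1 + exp(-y*z))."""
    return math.log1p(math.exp(-y * z))

def loss_01(t, z):
    """Per-sample cross-entropy for t in {0, 1}."""
    p = sigmoid(z)
    return -(t * math.log(p) + (1 - t) * math.log(1.0 - p))

# Largest gap between the two formulations over a few scores and both labels.
max_gap = max(
    abs(loss_pm1(y, z) - loss_01((y + 1) // 2, z))
    for z in [-3.0, -0.5, 0.0, 1.2, 4.0]
    for y in (-1, 1)
)
```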
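Why can LR be used for CTR prediction? What kind of data suits LR best?
How can LR be parallelized?
Gradient descent for LR can be parallelized along both rows (samples) and columns (features): the total gradient is a sum over samples, and once the prediction error is shared, the gradient component of each feature dimension can be computed independently.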
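Both splitting strategies can be simulated sequentially to show that they reproduce the full gradient. This is only a sketch: the "workers" are loop iterations rather than real processes, and the function names and data are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_gradient(X, y, w):
    """Gradient of the summed cross-entropy: X^T (sigmoid(Xw) - y)."""
    return X.T @ (sigmoid(X @ w) - y)

def row_parallel_gradient(X, y, w, n_workers=3):
    """Split samples across workers; each computes a partial gradient, then they are summed."""
    parts = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    return sum(full_gradient(X_part, y_part, w) for X_part, y_part in parts)

def column_parallel_gradient(X, y, w, n_workers=2):
    """Split features across workers."""
    col_blocks = np.array_split(np.arange(X.shape[1]), n_workers)
    # Step 1: each worker computes partial scores on its feature block; they are summed.
    scores = sum(X[:, cols] @ w[cols] for cols in col_blocks)
    residual = sigmoid(scores) - y
    # Step 2: each worker computes the gradient for its own block of weights.
    return np.concatenate([X[:, cols].T @ residual for cols in col_blocks])

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
y = (rng.random(10) > 0.5).astype(float)
w = rng.normal(size=4)

g_full = full_gradient(X, y, w)
g_rows = row_parallel_gradient(X, y, w)
g_cols = column_parallel_gradient(X, y, w)
```

Row splitting needs one reduction of gradient vectors per step. Column splitting needs a reduction of the per-sample scores instead, which is preferable when the feature dimension is very large.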
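Pros and cons of logistic regression
- Pros
  - LR is highly interpretable, easy to control, and fast to train.
- Cons
  - It is sensitive to multicollinearity among the independent variables. For example, if two highly correlated variables are put into the model together, the weaker one may end up with a coefficient whose sign is flipped against expectation. Factor analysis, variable clustering, or similar methods are needed to pick representative variables and reduce the correlation among candidates.
  - Predictions follow an S-shaped curve, so the conversion from log(odds) to probability is nonlinear. At both ends the probability changes very little as log(odds) changes (the slope is too small), while in the middle it changes a lot and is very sensitive. As a result, variable changes over many ranges have no discriminating effect on the target probability, and a threshold is hard to determine.
  - Accuracy is not very high. Because the form is very simple (much like a linear model), it is hard to fit the true distribution of the data.
  - Logistic regression cannot select features by itself. Sometimes GBDT is used to select features first, followed by logistic regression.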
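The S-curve point can be quantified: the slope of the probability with respect to log(odds) is p(1 - p), which peaks at 0.25 in the middle and falls off quickly in the tails. A minimal sketch, with the sample points chosen arbitrarily:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def slope(log_odds):
    """d(probability) / d(log_odds) = p * (1 - p)."""
    p = sigmoid(log_odds)
    return p * (1.0 - p)

slope_middle = slope(0.0)   # 0.25, the steepest point of the curve
slope_tail = slope(6.0)     # much smaller: the curve is nearly flat here
```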
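What are the differences between linear regression and logistic regression?
- Linear regression mainly handles regression tasks; logistic regression mainly handles classification.
- The output of linear regression is generally continuous; logistic regression outputs a probability in (0, 1) that is usually thresholded into a discrete class label.
- Logistic regression takes the output of linear regression as its input: applying the sigmoid function to the linear output gives the final output.
- The loss function of linear regression is MSE; logistic regression uses maximum likelihood and takes the negative logarithm.
Reference: https://blog.csdn.net/ddydavie/article/details/82668141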
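Why is logistic regression better than linear regression (for classification)?
Logistic regression builds on linear regression by adding a nonlinear sigmoid mapping between features and result: the features are first summed linearly, and the sigmoid is then used for prediction. Linear regression predicts over the whole real line with uniform sensitivity, whereas classification needs a model whose output is confined to the range between 0 and 1. For this kind of problem, logistic regression is therefore more robust than linear regression.
Reference: https://www.deeplearn.me/1788.html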
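How LR and SVM relate
- Similarities
  - Both LR and SVM handle classification, and both are generally used for linear binary classification (with extensions they can handle multi-class problems).
  - Both can take different regularization terms such as L1 and L2, so in many experiments the two algorithms give very similar results.
- Differences
  - LR is a parametric model, while SVM (particularly in its kernelized form) is usually regarded as a non-parametric model.
  - In terms of the objective, logistic regression uses cross-entropy loss and SVM uses hinge loss. Both losses aim to give more weight to the data points that matter most for classification and less weight to the points that matter little.
  - SVM considers only the support vectors, i.e. the few points most relevant to the classification, when learning the classifier. Logistic regression instead uses a nonlinear mapping to greatly reduce the weight of points far from the decision boundary, which relatively raises the weight of the points most relevant to classification.
  - Logistic regression is the simpler model and easier to understand, and it is especially convenient for large-scale linear classification. SVM is more complex to understand and optimize, but once it is converted to its dual problem, classification only needs computations against a small number of support vectors. This is a clear advantage with complex kernels and can greatly simplify the model and the computation.
  - What LR can do, SVM can also do, though possibly with accuracy issues; some of what SVM can do, LR cannot.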
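The difference between the two losses is easiest to see as a function of the margin y·f(x). In this minimal sketch (margin values chosen arbitrarily), hinge loss is exactly zero once the margin exceeds 1, so such points stop influencing the SVM, while logistic loss only decays toward zero and keeps a small contribution from every point.

```python
import math

def logistic_loss(margin):
    """log(1 + exp(-margin)), the LR loss for labels in {-1, +1}."""
    return math.log1p(math.exp(-margin))

def hinge_loss(margin):
    """max(0, 1 - margin), the SVM loss."""
    return max(0.0, 1.0 - margin)

margins = [-1.0, 0.0, 1.0, 2.0, 5.0]
table = [(m, logistic_loss(m), hinge_loss(m)) for m in margins]

for margin, log_l, hinge_l in table:
    print(f"margin={margin:+.1f}  logistic={log_l:.4f}  hinge={hinge_l:.4f}")
```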