1. Ordinary Linear Regression

  • Error is measured with least squares; the model is optimized either by the normal equation (exact solution, slow, unsuitable for high dimensions) or by gradient descent (approximate solution, fast, suitable for high dimensions)
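A minimal sketch of the two solvers above on synthetic data (NumPy only; the data and learning rate are illustrative): the normal equation solves for the weights in one step, while gradient descent iterates toward the same answer.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)

# Normal equation: w = (X^T X)^{-1} X^T y  (exact; cost grows cubically with features)
w_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the MSE loss (approximate; scales to high dimensions)
w_gd = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2 / len(y) * X.T @ (X @ w_gd - y)  # gradient of mean squared error
    w_gd -= lr * grad

print(w_ne)  # both results should be close to [2, -1, 0.5]
print(w_gd)
```

On a small, well-conditioned problem like this the two agree to several decimal places; gradient descent becomes the practical choice once the feature count makes the matrix inverse too expensive.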

2. Regularized Linear Regression

  1. Ridge Regression
    • L2 regularization: adds a squared penalty on the coefficients; the larger α is, the smaller the coefficients become
    • sklearn.linear_model.Ridge(alpha=1, fit_intercept=True, solver="auto", normalize=False)
      • larger alpha (typically 0-10) means stronger regularization and smaller weight coefficients
      • normalize: whether to standardize the features (deprecated in newer scikit-learn; use StandardScaler instead)
    • sklearn.linear_model.RidgeCV(_BaseRidgeCV, RegressorMixin)
      • supports built-in cross-validation
  2. Lasso Regression
    • L1 regularization: the absolute-value penalty is non-differentiable at zero, which drives some coefficients to exactly zero and yields a sparse solution
  3. Elastic Net
    • A combination of the two: with mixing ratio r=0 it reduces to Ridge regression, with r=1 to Lasso regression
  4. Early stopping
    • Stops training once the validation error stops improving past a tolerance threshold
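The four variants above can be compared on one synthetic dataset (a sketch; the alpha values are illustrative): Ridge only shrinks coefficients, Lasso zeroes out the irrelevant ones, Elastic Net mixes both via l1_ratio, and SGDRegressor's early_stopping option halts training when a held-out validation score stops improving.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet, SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features matter; the last three are pure noise
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=200)

ridge = Ridge(alpha=1.0).fit(X, y)                    # L2: shrinks all coefficients
lasso = Lasso(alpha=0.1).fit(X, y)                    # L1: irrelevant ones become exactly 0
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)  # mix of L1 and L2

print(ridge.coef_.round(3))  # all five nonzero, just shrunk
print(lasso.coef_.round(3))  # last three exactly 0 (sparse)
print(enet.coef_.round(3))

# Early stopping: hold out 10% as a validation set and stop
# once the validation score has not improved for n_iter_no_change epochs
sgd = SGDRegressor(early_stopping=True, validation_fraction=0.1,
                   n_iter_no_change=5, random_state=0).fit(X, y)
print(sgd.n_iter_)  # epochs actually run before stopping
```

Printing the three coefficient vectors side by side makes the difference concrete: sparsity comes from L1, pure shrinkage from L2.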

3. Code

```python
from sklearn.datasets import load_boston  # removed in scikit-learn >= 1.2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, SGDRegressor, Ridge, RidgeCV
from sklearn.metrics import mean_squared_error
import joblib
import os

# Load the data
boston = load_boston()

# Split the dataset
x_train, x_test, y_train, y_test = train_test_split(
    boston.data, boston.target, test_size=0.2, random_state=0)

# Standardize: fit the scaler on the training set only,
# then apply the same transform to the test set
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

# Build the model (load a previously saved one if it exists)
if os.path.exists('./test.pkl'):
    estimator = joblib.load('test.pkl')
else:
    estimator = Ridge()
    estimator.fit(x_train, y_train)
print("Model intercept:", estimator.intercept_)  # 22.6212871287129

# Save the model
joblib.dump(estimator, 'test.pkl')

# Evaluate the model
y_pre = estimator.predict(x_test)
print("Predictions:", y_pre)
print("R^2 score:", estimator.score(x_test, y_test))  # 0.6057861591433175
print("Mean squared error:", mean_squared_error(y_test, y_pre))  # 32.10021771755563
```