1. Ordinary linear regression and its regularized variants
- Ridge Regression
    - L2 regularization: a squared-coefficient penalty is added to the loss; the smaller α, the larger the coefficients are allowed to grow
    - sklearn.linear_model.Ridge(alpha=1, fit_intercept=True, solver="auto", normalize=False)
        - The larger alpha (typically 0–10), the stronger the regularization and the smaller the weight coefficients
        - normalize: whether to standardize the features
    - sklearn.linear_model.RidgeCV(_BaseRidgeCV, RegressorMixin)
        - Ridge with built-in cross-validation
- Lasso Regression
    - L1 regularization: the absolute-value penalty is not differentiable at zero, which drives many coefficients exactly to zero and yields a sparse model
- Elastic Net
    - Combines the two penalties with a mixing ratio r: r=0 reduces to Ridge regression, r=1 reduces to Lasso regression
- Early stopping
    - Halt training once the validation error stops improving (e.g., falls below a chosen threshold or no longer decreases)
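
The contrast between the three regularized models above can be sketched on synthetic data; the dataset from `make_regression` and all alpha values here are illustrative choices, not from the original notes:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Synthetic regression data: 10 features, only 3 actually informative
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# Larger alpha -> stronger L2 penalty -> smaller Ridge weights
for alpha in (0.1, 10, 1000):
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(f"Ridge alpha={alpha}: max |w| = {np.abs(ridge.coef_).max():.2f}")

# The L1 penalty drives uninformative coefficients exactly to zero (sparse model)
lasso = Lasso(alpha=1.0).fit(X, y)
print("Lasso zero coefficients:", np.sum(lasso.coef_ == 0))

# Elastic Net mixes both penalties via l1_ratio (r=0 -> Ridge-like, r=1 -> Lasso)
enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)
print("ElasticNet coefficients:", np.round(enet.coef_, 2))
```

Running this shows the Ridge weights shrinking as alpha grows, while Lasso zeroes out the uninformative features outright.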
3. Code
```python
from sklearn.datasets import load_boston  # deprecated; removed in scikit-learn 1.2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, SGDRegressor, Ridge, RidgeCV
from sklearn.metrics import mean_squared_error
import joblib
import os

# Load the data
boston = load_boston()

# Split the dataset
x_train, x_test, y_train, y_test = train_test_split(
    boston.data, boston.target, test_size=0.2, random_state=0)

# Standardize: fit the scaler on the training set only, then transform the test set
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)

# Build the model (or reload a previously saved one)
if os.path.exists('./test.pkl'):
    estimator = joblib.load('test.pkl')
else:
    estimator = Ridge()
    estimator.fit(x_train, y_train)
print("Model intercept:", estimator.intercept_)  # 22.6212871287129

# Save the model
joblib.dump(estimator, 'test.pkl')

# Evaluate the model
y_pre = estimator.predict(x_test)
print("Predictions:", y_pre)
print("R^2 score:", estimator.score(x_test, y_test))  # 0.6057861591433175
print("MSE:", mean_squared_error(y_test, y_pre))  # 32.10021771755563
```
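
The early-stopping idea from section 1 can be sketched with `SGDRegressor`, which holds out a validation slice and stops when the validation score stops improving; the synthetic dataset and parameter values here are illustrative assumptions, not from the original notes:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Illustrative synthetic data (the notes used the Boston dataset)
X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)

# early_stopping=True sets aside validation_fraction of the training data and
# stops once the validation score fails to improve by tol for
# n_iter_no_change consecutive epochs
sgd = SGDRegressor(max_iter=1000, early_stopping=True,
                   validation_fraction=0.1, n_iter_no_change=5,
                   tol=1e-3, random_state=0)
sgd.fit(X, y)
print("Epochs actually run:", sgd.n_iter_)
```

Inspecting `sgd.n_iter_` after fitting shows how many epochs were actually needed before the stopping criterion triggered, typically far fewer than `max_iter`.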
