References
https://www.kaggle.com/code/frtgnn/pycaret-introduction-classification-regression/notebook
https://blog.csdn.net/weixin_42608414/article/details/123822096
https://blog.csdn.net/weixin_35888603/article/details/112663600

1. Introduction to PyCaret

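PyCaret is an open-source, low-code machine learning library for Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that shortens the experiment cycle and makes you more productive.
Compared with other open-source machine learning libraries, PyCaret is low-code: a few lines can replace hundreds of lines of code, which makes experiments much faster and more efficient to run.
Official site: https://www.pycaret.org
Documentation: https://pycaret.readthedocs.io/en/latest/
GitHub: https://www.github.com/pycaret/pycaret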

2. Classification example

Load the dataset:

  import pandas as pd

  train = pd.read_csv('../input/titanic/train.csv')
  test = pd.read_csv('../input/titanic/test.csv')
  sub = pd.read_csv('../input/titanic/gender_submission.csv')

Import the classification module:

  from pycaret.classification import *

Initialize the experiment and check the inferred data types:

  clf1 = setup(data=train,
               target='Survived',
               numeric_imputation='mean',
               categorical_features=['Sex', 'Embarked'],
               ignore_features=['Name', 'Ticket', 'Cabin'],
               silent=True)  # PyCaret 2.x only; the argument was removed in 3.0
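The silent argument is accepted by PyCaret 2.x, where it skips the interactive data-type confirmation, but it was removed in PyCaret 3.0. The following is a minimal sketch of building the setup() keyword arguments according to the installed major version. The helper name titanic_setup_kwargs is hypothetical and not part of PyCaret.

```python
def titanic_setup_kwargs(pycaret_version: str) -> dict:
    """Build setup() keyword arguments for the Titanic example.

    Hypothetical helper for illustration; it is not part of PyCaret.
    """
    kwargs = {
        "target": "Survived",
        "numeric_imputation": "mean",
        "categorical_features": ["Sex", "Embarked"],
        "ignore_features": ["Name", "Ticket", "Cabin"],
    }
    # The major version is the part before the first dot, e.g. "2" in "2.3.10".
    major = int(pycaret_version.split(".")[0])
    if major < 3:
        # PyCaret 2.x only: skip the interactive data-type confirmation prompt.
        kwargs["silent"] = True
    return kwargs
```

One possible use is setup(data=train, **titanic_setup_kwargs(importlib.metadata.version("pycaret"))), which reads the installed version through the standard library.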

PyCaret low-code regression and classification - Figure 1
Run and compare accuracy:

  compare_models()

PyCaret low-code regression and classification - Figure 2
Create a single model:

  lgbm = create_model('lightgbm')

Tune the hyperparameters of the single model:

  tuned_lightgbm = tune_model(lgbm)
  plot_model(estimator=tuned_lightgbm, plot='learning')

PyCaret low-code regression and classification - Figure 3
Plot feature importance:

  plot_model(estimator=tuned_lightgbm, plot='feature')

PyCaret low-code regression and classification - Figure 4
Predict on the test set:

  predict_model(tuned_lightgbm, data=test)
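predict_model returns the input data with prediction columns appended. PyCaret 2.x names the predicted class column Label, while PyCaret 3.x names it prediction_label. The following is a rough sketch of selecting whichever column is present, so the result can be copied into the sub frame loaded earlier. The helper name prediction_column and the output file name submission.csv are assumptions made for illustration.

```python
def prediction_column(columns) -> str:
    """Return the name of the predicted-label column in predict_model output.

    Hypothetical helper; it accepts any container of column names,
    for example a list or a pandas Index.
    """
    for name in ("prediction_label", "Label"):  # PyCaret 3.x name, then 2.x name
        if name in columns:
            return name
    raise KeyError("no prediction column found in: %r" % (list(columns),))


# Possible use with the objects defined earlier (left as comments because it
# needs PyCaret and the Kaggle files):
# preds = predict_model(tuned_lightgbm, data=test)
# sub['Survived'] = preds[prediction_column(preds.columns)].astype(int).values
# sub.to_csv('submission.csv', index=False)
```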

3. Regression example

Load the dataset:

  train = pd.read_csv('../input/house-prices-advanced-regression-techniques/train.csv')
  test = pd.read_csv('../input/house-prices-advanced-regression-techniques/test.csv')
  sample = pd.read_csv('../input/house-prices-advanced-regression-techniques/sample_submission.csv')

Import the regression module:

  from pycaret.regression import *

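Initialize the experiment, then run and compare accuracy (compare_models() requires a prior setup() call; SalePrice is the target column of this dataset):

  setup(data=train, target='SalePrice')
  compare_models()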

PyCaret low-code regression and classification - Figure 5
Create a single model:

  lgbm = create_model('lightgbm')

PyCaret low-code regression and classification - Figure 6

4. Data and code

https://github.com/SeafyLiang/machine_learning_study/blob/master/practical_project/04pycaret低代码实现分类和回归.ipynb

https://www.kaggle.com/code/frtgnn/pycaret-introduction-classification-regression/notebook

5. Possible bugs

5.1 compare_models() returns an empty list

Solution: pass the errors parameter so that the underlying error is raised instead of being suppressed:

  compare_models(errors="raise")
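The check can also be wrapped so that an empty result triggers the diagnostic rerun automatically. The following is a minimal sketch; the wrapper name compare_with_diagnostics is hypothetical, and the compare function is passed in as an argument, so the sketch itself does not import PyCaret.

```python
def compare_with_diagnostics(compare_fn, **kwargs):
    """Call compare_fn; if it returns an empty list, rerun with errors="raise".

    Hypothetical wrapper for illustration. compare_fn is expected to be
    PyCaret's compare_models or any function with a compatible signature.
    """
    result = compare_fn(**kwargs)
    if isinstance(result, list) and len(result) == 0:
        # Every candidate model failed silently; rerun so that the first
        # failure is raised with its full traceback.
        rerun_kwargs = {**kwargs, "errors": "raise"}
        return compare_fn(**rerun_kwargs)
    return result
```

After setup(), this could be called as best = compare_with_diagnostics(compare_models).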
If the error message is: UnicodeEncodeError: 'ascii' codec can't encode characters
then locate the source file named in the traceback and change the ascii encoding there to utf-8.
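The reason the change works can be reproduced with the standard library alone: the ascii codec cannot represent non-ASCII characters such as Chinese text, while utf-8 can. The following is a small stand-alone illustration. The sample string is an assumption, and the snippet does not touch PyCaret's source.

```python
text = "低代码 PyCaret"  # sample string containing non-ASCII characters

try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # Same error class as in the traceback described above.
    ascii_ok = False

# utf-8 can encode the same string, and decoding restores it unchanged.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print("ascii encodable:", ascii_ok)
print("utf-8 round trip matches:", round_trip == text)
```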