Algorithms
1. Black Friday
https://www.kesci.com/mw/project/5f608358ae300e004601a386 [just some visualizations] skip
https://www.kesci.com/mw/project/5c6e79f7336a0d002c19a25c [excellent visualization and analysis: https://www.kaggle.com/dabate/black-friday-examined-eda-apriori]
https://www.kesci.com/mw/project/5d71b4bd8499bc002c0ad4ff [takeaway 1: excellent visualization practice (multiple pie charts), includes association analysis]
https://www.kesci.com/mw/project/5f967f35e0eb3e003be4d452 [excellent example of question-driven data analysis]
2. Black Friday - Regression
https://www.kaggle.com/shivamsingh96/sales-prediction-xgb-regressor
https://www.kaggle.com/mayurdangar/blackfriday-insights-and-model
https://www.kaggle.com/mooventhchiyan/rf-gb-xg-models
https://www.kaggle.com/suryatejach/sales-prediction-black-friday
https://www.kaggle.com/annkurillose/insights-and-sales-prediction
https://www.kaggle.com/muhammadayman/indistinct-features-explanation-lr-with-pytorch
3. Black Friday - Clustering
https://www.kesci.com/mw/project/5fd489751a34b90030b85a74
Visualization
1. Kaggle Black Friday transaction data and customer analysis: MySQL and Tableau
https://blog.csdn.net/Violazou/article/details/105058542
2. Zhihu: Kaggle Black Friday
https://zhuanlan.zhihu.com/p/51576253
No. | URL | Content | Remarks |
1 | https://www.kaggle.com/vinaypratap/black-friday-sale-prediction-xgb-lgb-rf-stacked | 1. Brief data analysis; 2. mode imputation + encoding strategy + standardization + feature selection. Models: random forest, XGB, LGB, ensemble model. df['Age'] = df['Age'].map({'0-17':0,'18-25':0,'26-35':1,'36-45':1,'46-50':1,'51-55':2,'55+':2}) | Usable; file no. 2 |
2 | https://www.kaggle.com/ankitapaithankar/black-friday-sales-prediction | RF + XGB + LGB + CatBoost + stacked model | |
3 | https://www.kaggle.com/shamalip/black-friday-data-exploration | Data visualization only | Usable; a lot of English text |
4 | https://www.kaggle.com/mayurdangar/blackfriday-insights-and-model | 1. train_df.describe(include='all'); 2. excellent question-driven visual analysis | Usable; the questions are in English |
5 | https://www.kaggle.com/shivamsingh96/sales-prediction-xgb-regressor | LR + RF + XGB. df['Age'] = df['Age'].map({'0-17':17,'18-25':25,…; train['Product_ID']=train['Product_ID'].str.slice(2).astype(int); test['Product_ID']=test['Product_ID'].str.slice(2).astype(int); corr=train.corr(); plt.figure(figsize=(20,12)); sns.heatmap(corr,annot=True) | New replacement approach; new correlation analysis |
6 | https://www.kaggle.com/meghakanojia/black-friday-eda | 1. First filter out the complete product list, compute the linear regression score, then use feature crosses to compute the linear regression and Ridge scores. Best fit: Polynomial + Ridge -> degree=10, alpha=8.0; 2. KNN | Nice idea |
7 | https://www.kaggle.com/suryatejach/sales-prediction-black-friday | from sklearn.preprocessing import LabelEncoder, StandardScaler; train['User_ID'] = train['User_ID'] - 1000000; le = LabelEncoder(); train['User_ID'] = le.fit_transform(train['User_ID']); train['Product_ID'] = train['Product_ID'].str.replace('P00', ''); ss = StandardScaler(); train['Product_ID'] = ss.fit_transform(train['Product_ID'].values.reshape(-1, 1)). Correlation table; XGB | New replacement strategy |
8 | https://www.kaggle.com/kushagrakinjawadekar/black-friday-data | Average | |
9 | https://www.kaggle.com/mooventhchiyan/rf-gb-xg-models | Average | Conclusion: User_ID and Product_ID play an important role in the domain, yet the model performs better without them (which seems confusing) |
10 | https://www.kaggle.com/annkurillose/insights-and-sales-prediction | Ensemble voting, R2 0.74566 | |
11 | https://www.kaggle.com/rimjimrazdan/black-friday | 1. Age set to the mean; R2 0.74, RMSE 2500 | |
12 | https://www.kaggle.com/harkiratvasir/black-friday-practice | Typical; RMSE around 3000 | |
13 | https://www.kaggle.com/deeppatel23/black-friday | 1. New fill strategy for p2 and p3 | |
14 | https://www.kaggle.com/muhammadayman/indistinct-features-explanation-lr-with-pytorch | Beautiful visualizations by an expert | |
15 | https://www.kaggle.com/vishnu691999/black-friday-sales-prediction-analytics-vidhya | XGB 2585 | |
16 | https://www.kaggle.com/simrangujrati/predicting-black-friday-sales | Clear | Good structure |
17 | https://www.kaggle.com/aye2121/black-friday-purchase-prediction | Decent | Logical structure worth borrowing; random forest depth tuning |
18 | https://www.kaggle.com/gabrielloye/bt2101-project-notebook | R language | |
19 | https://www.kaggle.com/zaimeali1997/black-friday-sales-analysis | No longer available | |
20 | https://www.kaggle.com/zaimeali1997/black-friday-v2-prediction | Good feature selection approach; filters on a specific factor | |
21 | https://www.kaggle.com/deeprajbasu/blackfriday-genderclassification-knn | KNN, score 83 | Usable; worth studying |
22 | https://www.kaggle.com/jvbj11/black-friday-kim | Average; visualization only; same as no. 24 | |
23 | https://www.kaggle.com/iamhungundji/lasso-regression-analysis | Could not follow the code | Methods used to predict purchase: multiple linear regression, validation RMSE = 6416.093; backward subset selection of variables, validation RMSE = 3230.107; lasso regression, validation RMSE = 2970.445; leaderboard accuracy (test set): 2992.02138074499 |
24 | https://www.kaggle.com/pedrokim/black-friday-kim | Average; visualization only | |
25 | https://www.kaggle.com/yaheaal/deep-learning-using-keras | Expert's notebook | Impressive |
26 | https://www.kaggle.com/mohamedabdullah/beautiful-insights-into-black-friday-data | Data visualization expert | Worth borrowing from |
27 | https://www.kaggle.com/sharmistha96/black-friday | Only some data preprocessing | |
28 | https://www.kaggle.com/prashant111/comprehensive-data-analysis-with-pandas | pandas study | |
29 | https://www.kaggle.com/nimisha21/black-friday-regression-analysis | Very logical visualizations | Worth borrowing from |
30 | https://www.kaggle.com/saifali2998/black-fridaytask-saif-credits-sirfawad | Did not understand it | |
31 | https://www.kaggle.com/vikash1a2b3c/black-friday-using-only-user-and-product-data | Matrix factorization | |
32 | https://www.kaggle.com/florianrougier/big-data-black-friday | Good; clear logic, has conclusions | |
33 | | | |
|
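The encoding tricks noted for notebooks 1, 5 and 7 can be tried side by side on a toy frame. This is only a minimal sketch, not any notebook's actual code: the sample rows and the new column names (Age_group, Product_num, User_code) are assumptions made up for illustration, and pandas category codes stand in for the LabelEncoder that notebook 7 uses.

```python
import pandas as pd

# Toy rows that reuse the Black Friday column names; the values are illustrative only.
df = pd.DataFrame({
    "User_ID": [1000001, 1000002, 1000001, 1000005],
    "Product_ID": ["P00069042", "P00248942", "P00087842", "P00285442"],
    "Age": ["0-17", "26-35", "55+", "46-50"],
})

# Notebook 1 style: collapse the seven age bands into three ordinal groups.
age_groups = {"0-17": 0, "18-25": 0, "26-35": 1, "36-45": 1,
              "46-50": 1, "51-55": 2, "55+": 2}
df["Age_group"] = df["Age"].map(age_groups)

# Notebook 5 style: drop the first two characters and keep the rest as an integer.
df["Product_num"] = df["Product_ID"].str.slice(2).astype(int)

# Notebook 7 style: shift User_ID, then replace each distinct value with a
# dense integer code. Category codes follow sorted order, like a label encoder.
user_shifted = df["User_ID"] - 1000000
df["User_code"] = user_shifted.astype("category").cat.codes

print(df[["Age_group", "Product_num", "User_code"]])
```

Whether the coarse three-group age mapping or the upper-bound mapping from notebook 5 works better is something to check against validation error rather than assume.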
|
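Several rows compare notebooks by RMSE and R2. For reference, here is a minimal plain-Python sketch of the two metrics using their standard definitions. The function names and the sample values are made up for illustration and do not come from any of the notebooks.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

RMSE is in the same units as the target, so figures such as 2500 or 3000 are only comparable between notebooks that evaluate on the same target and a comparable split.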