Slides link
https://c.d2l.ai/stanford-cs329p/_static/notebooks/cs329p_notebook_eda.slides.html#/
Tools and libraries

EDA
First import libraries and data
    # !pip install seaborn pandas matplotlib numpy
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    from IPython import display
    display.set_matplotlib_formats('svg')
    # Alternative to set svg for newer versions
    # import matplotlib_inline
    # matplotlib_inline.backend_inline.set_matplotlib_formats('svg')
    data = pd.read_csv('house_sales.zip')

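A CSV file saved as-is is relatively large, so it can be compressed into a zip or tar archive first; mainstream file readers can read directly from compressed files. Storing data in compressed form is recommended: it helps with both transfer and storage, and reading it can even be faster than reading the uncompressed file (this approach works for text data).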
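A minimal sketch of the compressed round trip described above, assuming pandas is installed. The table contents, the file names `sample.zip` and `sample.csv`, and the temporary directory are made up for illustration and are not part of the course data.

```python
import os
import tempfile

import pandas as pd

# Small made-up table standing in for a real dataset (values are illustrative).
df = pd.DataFrame({"Id": [1, 2, 3], "Sold Price": [350000, 420000, 515000]})

# Hypothetical output location inside a fresh temporary directory.
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "sample.zip")

# Write the CSV straight into a zip archive.
# archive_name is the name of the CSV file stored inside the zip.
df.to_csv(
    zip_path,
    index=False,
    compression={"method": "zip", "archive_name": "sample.csv"},
)

# read_csv infers the compression from the ".zip" extension,
# so no manual unzip step is needed (the archive must hold a single file).
restored = pd.read_csv(zip_path)
print(restored)
```

The same `pd.read_csv` call is what the notebook uses for `house_sales.zip`; only the data here is a stand-in.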
Let’s check the data shape and the first few examples
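A rough sketch of that check, using a tiny made-up DataFrame in place of the house sales data so that it runs on its own. The column names and values are hypothetical; with the real data, the same two calls would be applied to `data` as loaded from `house_sales.zip`.

```python
import pandas as pd

# Hypothetical stand-in for the loaded dataset (not the real house sales columns).
data = pd.DataFrame(
    {
        "Id": [1, 2, 3, 4],
        "Type": ["SingleFamily", "Condo", "SingleFamily", "Townhouse"],
        "Sold Price": [350000, 420000, 515000, 298000],
    }
)

# shape is a (rows, columns) tuple.
n_rows, n_cols = data.shape
print(f"{n_rows} rows, {n_cols} columns")

# head(n) returns the first n rows as a new DataFrame.
first_rows = data.head(2)
print(first_rows)
```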
