Slides link

https://c.d2l.ai/stanford-cs329p/_static/notebooks/cs329p_notebook_eda.slides.html#/

Tools and libraries


1.5 Exploratory Data Analysis (EDA) - Figure 2

EDA


1.5 Exploratory Data Analysis (EDA) - Figure 3

First import libraries and data

    # !pip install seaborn pandas matplotlib numpy
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    from IPython import display

    # Render notebook figures as SVG
    display.set_matplotlib_formats('svg')
    # Alternative to set svg for newer versions
    # import matplotlib_inline
    # matplotlib_inline.backend_inline.set_matplotlib_formats('svg')

    # pandas reads the CSV directly from the zip archive
    data = pd.read_csv('house_sales.zip')

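A CSV file saved as-is is relatively large, so it can be compressed into a zip or tar archive first; mainstream file readers can read directly from a compressed file. Storing the data compressed is recommended: it helps with both transfer and storage, and reading it can even be faster than reading the uncompressed file (this approach works for text data).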
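As a minimal sketch of this point, the snippet below writes a small made-up table to a zip-compressed CSV and reads it back with pandas. The file names `example_sales.zip` and `example_sales.csv` and all column values are hypothetical, chosen only for illustration. `compression="infer"` is the pandas default and is spelled out here just to make the behavior visible.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy data; not taken from the house sales dataset.
toy = pd.DataFrame({
    "Id": [1, 2, 3],
    "Sold Price": [350000.0, 420000.0, 515000.0],
    "Type": ["SingleFamily", "Condo", "SingleFamily"],
})

tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "example_sales.zip")

# Write the CSV straight into a zip archive;
# archive_name sets the name of the file inside the archive.
toy.to_csv(
    zip_path,
    index=False,
    compression={"method": "zip", "archive_name": "example_sales.csv"},
)

# read_csv infers zip compression from the .zip extension.
# The archive must contain exactly one data file.
restored = pd.read_csv(zip_path, compression="infer")

print(restored.shape)
print(restored.head())
```

The same `pd.read_csv` call works for other extensions pandas recognizes, such as `.gz`, `.bz2` and `.xz`, so the loading code does not need to change when the compression format does.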

Let’s check the data shape and the first few examples
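A sketch of that check, assuming a pandas DataFrame named `data` like the one loaded from `house_sales.zip`. So that the snippet runs on its own, a tiny made-up frame stands in for the real table; its columns and values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for `data = pd.read_csv('house_sales.zip')`.
data = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6],
    "Sold Price": [350000.0, 420000.0, 515000.0, 289000.0, 730000.0, 610000.0],
    "Zip": ["94301", "94022", "95014", "94087", "94305", "94040"],
})

# shape is a (number of rows, number of columns) tuple
n_rows, n_cols = data.shape
print(f"{n_rows} rows x {n_cols} columns")

# head() returns the first 5 rows by default; pass n to change that
first_rows = data.head()
print(first_rows)
```

In a notebook, leaving `data.shape` or `data.head()` as the last expression of a cell displays the result without `print`.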
