Line Charts


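In this tutorial, you'll learn just enough Python to create professional-looking line charts. Then, in the exercise that follows, you'll put your new skills to work on a real-world dataset.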



In [1]:

  import pandas as pd
  import matplotlib.pyplot as plt
  %matplotlib inline
  import seaborn as sns
  print("Setup Complete")

  Setup Complete


The dataset for this tutorial tracks global daily streams on the music streaming service Spotify. We focus on five popular songs from 2017 and 2018:

  1. "Shape of You", by Ed Sheeran
  2. "Despacito", by Luis Fonsi
  3. "Something Just Like This", by The Chainsmokers and Coldplay
  4. "HUMBLE.", by Kendrick Lamar
  5. "Unforgettable", by French Montana

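Image 2.1

Notice that the first date in the table is January 6, 2017, which is the release date of "Shape of You" by Ed Sheeran. From the table, you can see that "Shape of You" was streamed 12,287,078 times globally on the day of its release. The other songs have no values in the first row because they had not been released yet.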


As you learned earlier, we load the dataset with pd.read_csv.

In [2]:

  # Path of the file to read
  spotify_filepath = "../input/spotify.csv"
  # Read the file into a variable spotify_data
  spotify_data = pd.read_csv(spotify_filepath, index_col="Date", parse_dates=True)

After running the two lines of code above, we can access the dataset through spotify_data.
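To see what the two extra arguments do, here is a minimal sketch that reads a tiny in-memory CSV instead of ../input/spotify.csv. The two "Shape of You" values are taken from the first rows of the dataset shown in this tutorial; the header layout of the real file is an assumption, and io.StringIO simply stands in for the file on disk.

```python
import io
import pandas as pd

# Two sample rows in the assumed layout of spotify.csv (the header names are an assumption)
csv_text = """Date,Shape of You,Despacito
2017-01-06,12287078,
2017-01-07,13190270,
"""

# index_col="Date" makes the Date column the row index;
# parse_dates=True asks pandas to parse that index as dates rather than plain strings
sample = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(type(sample.index).__name__)  # DatetimeIndex
print(sample.index.name)            # Date
```

Having real dates in the index is what lets the charts later in the tutorial place them on the horizontal axis.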


We can print the first five rows of the dataset with the head method:

In [3]:

  # Print the first 5 rows of the data
  spotify_data.head()

Out [3]:

Date        Shape of You  Despacito  Something Just Like This  HUMBLE.  Unforgettable
2017-01-06      12287078        NaN                       NaN      NaN            NaN
2017-01-07      13190270        NaN                       NaN      NaN            NaN
2017-01-08      13099919        NaN                       NaN      NaN            NaN
2017-01-09      14506351        NaN                       NaN      NaN            NaN
2017-01-10      14275628        NaN                       NaN      NaN            NaN


Empty entries appear as NaN, which is short for "Not a Number".
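As a small illustration of working with these missing entries, the sketch below counts the NaN values in each column and finds the first date on which each column has a value. The frame and its column names are toy values made up for the example, not real stream counts.

```python
import numpy as np
import pandas as pd

# Toy frame: made-up numbers, not real stream counts
toy = pd.DataFrame(
    {"Song A": [10, 12, 11], "Song B": [np.nan, np.nan, 7]},
    index=pd.to_datetime(["2017-01-06", "2017-01-07", "2017-01-08"]),
)

# Number of missing entries in each column
print(toy.isna().sum())

# First date on which each column has a value
print(toy.apply(lambda col: col.first_valid_index()))
```

Applied to spotify_data, the same two calls would show how many days each song is missing and when its data begins.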

We can also look at the last five rows with one small change (.head() becomes .tail()):

In [4]:

  # Print the last five rows of the data
  spotify_data.tail()

Out [4]:

Date        Shape of You  Despacito  Something Just Like This    HUMBLE.  Unforgettable
2018-01-05       4492978  3450315.0                 2408365.0  2685857.0      2869783.0
2018-01-06       4416476  3394284.0                 2188035.0  2559044.0      2743748.0
2018-01-07       4009104  3020789.0                 1908129.0  2350985.0      2441045.0
2018-01-08       4135505  2755266.0                 2023251.0  2523265.0      2622693.0
2018-01-09       4168506  2791601.0                 2058016.0  2727678.0      2627334.0
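Two details of head and tail are worth a quick sketch. Both methods accept an optional row count, and the trailing .0 on four of the columns above appears because pandas stores a column that contains NaN as floating point. The toy frame below uses made-up values and hypothetical column names to show both points.

```python
import numpy as np
import pandas as pd

# Toy data: "full" has a value in every row, "late" starts with missing entries
toy = pd.DataFrame({"full": [1, 2, 3, 4], "late": [np.nan, np.nan, 5, 6]})

print(toy.head(2))  # first two rows
print(toy.tail(1))  # last row only

# "full" keeps an integer dtype; "late" becomes float64 because of the NaN entries
print(toy.dtypes)
```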




In [5]:

  # Line chart showing daily global streams of each song
  sns.lineplot(data=spotify_data)

Out [5]:

  /opt/conda/lib/python3.6/site-packages/pandas/plotting/_matplotlib/converter.py:102: FutureWarning: Using an implicitly registered datetime converter for a matplotlib plotting method. The converter was registered by pandas on import. Future versions of pandas will require you to explicitly register matplotlib converters.
  To register the converters:
  >>> from pandas.plotting import register_matplotlib_converters
  >>> register_matplotlib_converters()
  warnings.warn(msg, FutureWarning)
  <matplotlib.axes._subplots.AxesSubplot at 0x7f5329aa3940>

Image 2.2


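  • sns.lineplot tells the notebook that we want to create a line chart.

    • Every charting command you learn in this course starts with sns, which indicates that the command comes from the seaborn package. For instance, we use sns.lineplot to make line charts. Soon, you'll learn to use sns.barplot and sns.heatmap to make bar charts and heatmaps, respectively.
  • data=spotify_data selects the data that will be used to create the chart.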

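Note that you will always use this same format when you create a line chart, and the only thing that changes with a new dataset is the name of the dataset. So, if you were working with a different dataset named financial_data, for instance, the line of code would be sns.lineplot(data=financial_data).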



In [6]:

  # Set the width and height of the figure
  plt.figure(figsize=(14,6))
  # Add title
  plt.title("Daily Global Streams of Popular Songs in 2017-2018")
  # Line chart showing daily global streams of each song
  sns.lineplot(data=spotify_data)

Out [6]:

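  <matplotlib.axes._subplots.AxesSubplot at 0x7f532994d940>

Image 2.3

The first line of code sets the size of the figure to 14 inches wide by 6 inches high. You can set the size of any figure by copying this line as it appears. If you'd like a custom size, change the values 14 and 6 to the width and height you want.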
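To check that figsize really is (width, height) in inches, a minimal sketch can create a figure and read its size back. The values 10 and 4 and the title text are arbitrary choices for the example.

```python
import matplotlib.pyplot as plt

# Create a figure that is 10 inches wide and 4 inches high (arbitrary example values)
fig = plt.figure(figsize=(10, 4))
plt.title("Example title")

# Read the size back: width first, then height
width, height = fig.get_size_inches()
print(width, height)  # 10.0 4.0
```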





In [7]:

  list(spotify_data.columns)

Out [7]:

  ['Shape of You',
   'Despacito',
   'Something Just Like This',
   'HUMBLE.',
   'Unforgettable']


In [8]:

  # Set the width and height of the figure
  plt.figure(figsize=(14,6))
  # Add title
  plt.title("Daily Global Streams of Popular Songs in 2017-2018")
  # Line chart showing daily global streams of 'Shape of You'
  sns.lineplot(data=spotify_data['Shape of You'], label="Shape of You")
  # Line chart showing daily global streams of 'Despacito'
  sns.lineplot(data=spotify_data['Despacito'], label="Despacito")
  # Add label for horizontal axis
  plt.xlabel("Date")

Out [8]:

  Text(0.5, 0, 'Date')

Image 2.4


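The two sns.lineplot lines in the cell above each add one line to the chart. For instance, consider the first of them, which adds the line for "Shape of You":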

  # Line chart showing daily global streams of 'Shape of You'
  sns.lineplot(data=spotify_data['Shape of You'], label="Shape of You")


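  • Instead of data=spotify_data, we set data=spotify_data['Shape of You']. In general, to plot only a single column, we use this format: put the name of the column in single quotes and enclose it in square brackets. (To make sure you specify the column name correctly, you can print the list of all column names using the command you learned above.)

  • We also add label="Shape of You" so that the line appears in the legend with the corresponding label.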


