第七周：Python - 7.22 Python Pandas 多重索引 - 《数据分析学习笔记》

多重索引
多重索引切片方式
设置多重索引.set_index([索引1，索引2])
重置索引.reset_index()

多重索引

df.groupby(by=['city','education']).mean()

多重索引切片方式

#错误方法：因为city和education是索引不是列，索引不能直接切片，否则报错
df.groupby(by=['city','education']).mean()['北京']
KeyError: '北京'

#方法1：
先将数据框切为series，再对series切片，注意先切第一重索引，再切第二重索引。
df.groupby(by=['city','education']).mean().avg['北京']['硕士']
19.51063829787234
#方法2：也可直接使用loc[]，对索引进行筛选。多重索引时，注意loc中列表的顺序，先第一重再第二重
df.groupby(['city','education']).mean().loc['北京','硕士']
companyId     5.579256e+04
positionId    2.188171e+06
bottom        1.440426e+01
top           2.461702e+01
avg           1.951064e+01
Name: (北京, 硕士), dtype: float64

设置多重索引.set_index([索引1，索引2])

#set_index(),将列设置为索引
df1=df.set_index(['city','education'])
df1

重置索引.reset_index()

#.reset_index()重置索引，将索引转化为列
df1.reset_index()