1. Getting row and column labels

    # cities is an existing DataFrame; its construction is not shown here
    print(cities)
    print(cities.index)    # row labels
    print(cities.columns)  # column labels

Output:

       area  bb
    a    12  12
    b    25  25
    c    56  56
    d    67  67
    e    42  42
    Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
    Index(['area', 'bb'], dtype='object')
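
Since the construction of `cities` is not shown, the following is a minimal sketch that rebuilds a DataFrame with the same labels and values as the printed output. The construction itself is an assumption made for illustration. It also shows how the labels can be turned into plain Python lists.

```python
import pandas as pd

# Hypothetical reconstruction of `cities`, inferred from the printed output above.
cities = pd.DataFrame(
    {"area": [12, 25, 56, 67, 42], "bb": [12, 25, 56, 67, 42]},
    index=["a", "b", "c", "d", "e"],
)

row_labels = cities.index.tolist()       # row labels as a plain list
column_labels = cities.columns.tolist()  # column labels as a plain list

print(row_labels)
print(column_labels)
print(cities.shape)  # (number of rows, number of columns)
```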

2. Selecting values by column name

    import pandas as pd

    population = {'a': 12, 'b': 25, 'c': 56, 'd': 67, 'e': 42}
    area = pd.Series({'a': 11, 'b': 22, 'c': 55, 'd': 66, 'e': 44})
    # The dict keys become column names; rows are aligned on the labels a to e
    data = pd.DataFrame({'population': population, 'area': area})
    # data = pd.DataFrame({1.1: population, 2.2: area})
    print(data)
    print('Select values by the column name area')
    print(data['area'])

Output:

       population  area
    a          12    11
    b          25    22
    c          56    55
    d          67    66
    e          42    44
    Select values by the column name area
    a    11
    b    22
    c    55
    d    66
    e    44
    Name: area, dtype: int64

3. Selecting values via attribute access

    print('Select values via attribute access')
    print(data.area)

Output:

    Select values via attribute access
    a    11
    b    22
    c    55
    d    66
    e    44
    Name: area, dtype: int64
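
Attribute access is convenient, but it only works when the column name is a valid Python identifier that does not collide with an existing DataFrame attribute or method. The sketch below illustrates this with a hypothetical `count` column, which is not part of the original data.

```python
import pandas as pd

# Small illustrative frame; the "count" column is hypothetical.
data = pd.DataFrame(
    {"population": [12, 25], "area": [11, 22], "count": [1, 2]},
    index=["a", "b"],
)

# For an ordinary name, attribute access and bracket access return the same column.
same = data.area.equals(data["area"])

# "count" is also a DataFrame method, so data.count is the method, not the column.
count_is_method = callable(data.count)
count_column = data["count"]  # bracket access always reaches the column

print(same, count_is_method, list(count_column))
```

Bracket access is therefore the safer default, especially when column names come from external data.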

4. Adding a column

    # Element-wise division; the result is stored as a new column
    data['density'] = data['population'] / data['area']
    print(data)

Output:

       population  area   density
    a          12    11  1.090909
    b          25    22  1.136364
    c          56    55  1.018182
    d          67    66  1.015152
    e          42    44  0.954545
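
Assigning with `data['density'] = ...` modifies `data` in place. As an alternative sketch, `DataFrame.assign` returns a new DataFrame and leaves the original untouched. The three-row frame below is a shortened stand-in for the data above.

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56], "area": [11, 22, 55]},
    index=["a", "b", "c"],
)

# assign() returns a copy with the extra column; `data` itself is not changed.
with_density = data.assign(density=data["population"] / data["area"])

print(list(data.columns))
print(list(with_density.columns))
```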

5. Viewing the data row by row with the values attribute

    print(data.values)

Output:

    [[12.         11.          1.09090909]
     [25.         22.          1.13636364]
     [56.         55.          1.01818182]
     [67.         66.          1.01515152]
     [42.         44.          0.95454545]]
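
The integers print as `12.` and `11.` above because a NumPy array has a single dtype, so mixing integer and float columns upcasts everything to float. A minimal sketch of this, using `to_numpy()` (equivalent to `values` here) on a shortened copy of the data:

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56], "area": [11, 22, 55]},
    index=["a", "b", "c"],
)
data["density"] = data["population"] / data["area"]

# Only integer columns: the array keeps an integer dtype.
ints_only = data[["population", "area"]].to_numpy()

# Integer and float columns together: everything is upcast to float.
mixed = data.to_numpy()

print(ints_only.dtype, mixed.dtype)
```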

6. Transposing the data

Transposing returns a new object and does not modify the original DataFrame.

    print(data.T)  # rows and columns swapped
    print(data)    # the original is unchanged

Output:

                        a          b          c          d          e
    population  12.000000  25.000000  56.000000  67.000000  42.000000
    area        11.000000  22.000000  55.000000  66.000000  44.000000
    density      1.090909   1.136364   1.018182   1.015152   0.954545
       population  area   density
    a          12    11  1.090909
    b          25    22  1.136364
    c          56    55  1.018182
    d          67    66  1.015152
    e          42    44  0.954545
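
As a quick check of the claim that transposing leaves the original alone, the following sketch compares shapes and labels before and after. It uses a shortened copy of the data.

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56], "area": [11, 22, 55]},
    index=["a", "b", "c"],
)
data["density"] = data["population"] / data["area"]

transposed = data.T

# Row labels and column labels swap places in the transposed frame.
print(data.shape, transposed.shape)
print(list(transposed.index), list(transposed.columns))

# The original frame still has its own labels.
print(list(data.index), list(data.columns))
```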

7. Implicit and explicit indexing when slicing

    print('Slice a range with explicit (label-based) indexing')
    print(data.loc[:'b', :'area'])
    # The row for 'b' is a float Series (density is float), hence 22.0
    print(data.loc['b']['area'])
    print('Implicit (position-based) indexing')
    print(data.iloc[:3, :2])

Output:

    Slice a range with explicit (label-based) indexing
       population  area
    a          12    11
    b          25    22
    22.0
    Implicit (position-based) indexing
       population  area
    a          12    11
    b          25    22
    c          56    55

Slicing with labels directly in square brackets selects rows:

    print(data['a': 'd'])
    print('The above is equivalent to')
    print(data.loc['a': 'd'])

Output:

       population  area   density
    a          12    11  1.090909
    b          25    22  1.136364
    c          56    55  1.018182
    d          67    66  1.015152
    The above is equivalent to
       population  area   density
    a          12    11  1.090909
    b          25    22  1.136364
    c          56    55  1.018182
    d          67    66  1.015152

The same rows can also be selected by position:

    print(data.loc['a': 'd', :])
    print('The above is equivalent to')
    print(data[:4])  # implicit (position-based) indexing

Output:

       population  area   density
    a          12    11  1.090909
    b          25    22  1.136364
    c          56    55  1.018182
    d          67    66  1.015152
    The above is equivalent to
       population  area   density
    a          12    11  1.090909
    b          25    22  1.136364
    c          56    55  1.018182
    d          67    66  1.015152
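
One detail worth spelling out: a label slice with `loc` includes its end label, while a positional slice with `iloc` excludes its end position. The sketch below also shows `loc` with a row and column label in one step, which avoids the intermediate float row produced by `data.loc['b']['area']`.

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56, 67, 42], "area": [11, 22, 55, 66, 44]},
    index=["a", "b", "c", "d", "e"],
)
data["density"] = data["population"] / data["area"]

by_label = data.loc["a":"c"]   # label slice: the end label 'c' is included
by_position = data.iloc[0:2]   # positional slice: position 2 is excluded

# Row and column label in a single lookup; the value keeps the column's integer dtype.
area_b = data.loc["b", "area"]

print(list(by_label.index))
print(list(by_position.index))
print(area_b)
```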

8. Selecting values NumPy-style

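    # Boolean mask for the rows, list of labels for the columns
    print(data.loc[data.density > 1, ['population', 'density']])
    # Row 0, column 2 by integer position
    print(data.iloc[0, 2])
    # Chained form; a plain data.iloc[0][2] relies on an integer-position
    # fallback that is deprecated in recent pandas releases
    print(data.iloc[0].iloc[2])

Output:

       population   density
    a          12  1.090909
    b          25  1.136364
    c          56  1.018182
    d          67  1.015152
    1.0909090909090908
    1.0909090909090908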
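
Two related patterns, shown as a sketch rather than as part of the original walkthrough: combining several conditions into one mask with `&`, and reading a single scalar with `iat` (by position) or `at` (by label).

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56, 67, 42], "area": [11, 22, 55, 66, 44]},
    index=["a", "b", "c", "d", "e"],
)
data["density"] = data["population"] / data["area"]

# Each condition needs its own parentheses when combined with & or |.
mask = (data["density"] > 1) & (data["population"] > 20)
selected = data.loc[mask, ["population", "density"]]

# Scalar access: iat takes integer positions, at takes labels.
first_density = data.iat[0, 2]
same_by_label = data.at["a", "density"]

print(list(selected.index))
print(first_density == same_by_label)
```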

9. Selecting by a single label or multiple labels

    print('Select by a single label')
    print(data['area'])
    print('Select by multiple labels')
    # A label slice on the columns; both end labels are included
    print(data.loc[:, 'population':'area'])

Output:

    Select by a single label
    a    11
    b    22
    c    55
    d    66
    e    44
    Name: area, dtype: int64
    Select by multiple labels
       population  area
    a          12    11
    b          25    22
    c          56    55
    d          67    66
    e          42    44
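
Besides a label slice, multiple columns can be picked with a list of labels, which also allows non-adjacent columns in any order. The sketch below additionally contrasts single and double brackets: the former returns a Series, the latter a one-column DataFrame.

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56], "area": [11, 22, 55]},
    index=["a", "b", "c"],
)
data["density"] = data["population"] / data["area"]

series = data["area"]                      # single label: a Series
frame = data[["area"]]                     # list with one label: a DataFrame
subset = data[["density", "population"]]   # list of labels, in the order given

print(type(series).__name__, type(frame).__name__)
print(list(subset.columns))
```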

10. Comparing a column against a value

Check, for each row of data, whether density is greater than 1.

    print(data.density > 1)

Output:

    a     True
    b     True
    c     True
    d     True
    e    False
    Name: density, dtype: bool
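
The boolean Series produced by such a comparison is typically used as a mask. A minimal sketch of three common uses (counting matches, keeping matching rows, and keeping the complement):

```python
import pandas as pd

data = pd.DataFrame(
    {"population": [12, 25, 56, 67, 42], "area": [11, 22, 55, 66, 44]},
    index=["a", "b", "c", "d", "e"],
)
data["density"] = data["population"] / data["area"]

mask = data["density"] > 1

n_dense = int(mask.sum())  # True counts as 1, so the sum is the number of matches
dense = data[mask]         # rows where density > 1
sparse = data[~mask]       # ~ inverts the mask: rows where density <= 1

print(n_dense)
print(list(dense.index))
print(list(sparse.index))
```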