Object selection has had a number of user-requested additions in order to support more explicit location based indexing. Pandas now supports three types of multi-axis indexing.

    • .loc is explicit, label-based row indexing. It is primarily label based, but may also be used with a boolean array. .loc will raise KeyError when the items are not found. Allowed inputs are:

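      • A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, not as an integer position).
      • A list or array of labels ['a', 'b', 'c'].
      • A slice object with labels 'a':'f' (unlike ordinary Python slices, both the start and the stop are included).
      • A boolean array (any NA values are treated as False).
      • A callable with one argument that returns valid output for indexing (one of the above).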
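    • .iloc is implicit, position-based row indexing. It is primarily based on integer positions from 0 to length-1 of the axis, but may also be used with a boolean array. Allowed inputs are:

      • An integer, e.g. 5.
      • A list or array of integers [4, 3, 0].
      • A slice object with integers 1:7.
      • A boolean array.
      • A callable with one argument that returns valid output for indexing (one of the above).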

    • .loc, .iloc, and also [] indexing can accept a callable as indexer.
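
The allowed inputs listed above can be tried on a small frame. The following is a minimal sketch, not part of the original notes: the toy frame, its labels 'a' to 'f', and all variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; labels and values are for illustration only.
df = pd.DataFrame(
    np.arange(24).reshape(6, 4),
    index=list("abcdef"),
    columns=["A", "B", "C", "D"],
)

# .loc selects by label.
row_a = df.loc["a"]                            # single label -> Series
rows_list = df.loc[["a", "c"]]                 # list of labels -> DataFrame
rows_slice = df.loc["b":"d"]                   # label slice, both ends included
rows_mask = df.loc[df["A"] > 8]                # boolean array
rows_call = df.loc[lambda d: d["A"] % 8 == 0]  # callable returning a boolean array

# .iloc selects by integer position.
row_0 = df.iloc[0]                             # single position -> Series
pos_list = df.iloc[[4, 3, 0]]                  # list of positions, order preserved
pos_slice = df.iloc[1:3]                       # position slice, stop excluded
pos_call = df.iloc[lambda d: [0, len(d) - 1]]  # callable returning positions

# A label that does not exist raises KeyError.
try:
    df.loc["z"]
    missing = False
except KeyError:
    missing = True

print(list(rows_slice.index), list(pos_slice.index), missing)
```

Note the difference between the two slices: the label slice 'b':'d' keeps row 'd', while the position slice 1:3 stops before position 3.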
    Object Type    Selection         Return Value Type
    Series         series[label]     scalar value
    DataFrame      frame[colname]    Series corresponding to colname
    import pandas as pd
    import numpy as np
    dates = pd.date_range('1/1/2000', periods=8)  # 8 consecutive daily dates starting at 2000-01-01
    print(dates)

    DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04',
                   '2000-01-05', '2000-01-06', '2000-01-07', '2000-01-08'],
                  dtype='datetime64[ns]', freq='D')

    df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
    print(df)

    The default row index is 0, 1, 2, …

              A         B         C         D
    0  0.040279  1.530194 -0.839738  1.769272
    1  1.364984 -1.607406  0.045436  1.390306
    2  1.064731  0.852654  0.741311  0.488171
    3 -1.193471 -0.205841  0.224680  1.799955
    4  0.156241 -0.151240 -1.336287 -0.102478
    5 -1.152899  0.497563  0.789621 -0.780824
    6  2.301429 -0.711661  0.394633 -0.009994
    7  0.509731  0.187269  0.134205  1.489733

    The row index can be set explicitly with the index keyword:

    df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
    print(df)

                       A         B         C         D
    2000-01-01  0.542793  1.045460 -0.942148  0.187426
    2000-01-02  0.516108  0.821478  0.227624  1.503220
    2000-01-03  1.558611  1.042741  0.116858 -0.848084
    2000-01-04 -0.228758 -0.935041  1.318462  0.002611
    2000-01-05  0.420747  3.439259  0.912372  1.345009
    2000-01-06 -0.597713  1.039117 -0.235674  0.010001
    2000-01-07  0.380847  1.370491  0.715843 -1.356307
    2000-01-08  0.114837  0.770705 -0.865508 -0.073762

    A single column of the DataFrame can be selected by its column label; the result is a Series.

    s = df['A']
    print(s)
    
    2000-01-01    0.542793
    2000-01-02    0.516108
    2000-01-03    1.558611
    2000-01-04   -0.228758
    2000-01-05    0.420747
    2000-01-06   -0.597713
    2000-01-07    0.380847
    2000-01-08    0.114837
    Freq: D, Name: A, dtype: float64
    

    To select several columns of the DataFrame, put the column labels in a list. The columns do not have to be adjacent.

    s = df[['A','B']]
    print(s)
    
                       A         B 
    2000-01-01  0.542793  1.045460 
    2000-01-02  0.516108  0.821478 
    2000-01-03  1.558611  1.042741 
    2000-01-04 -0.228758 -0.935041 
    2000-01-05  0.420747  3.439259  
    2000-01-06 -0.597713  1.039117 
    2000-01-07  0.380847  1.370491  
    2000-01-08  0.114837  0.770705
    
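    s = df['A']                # rebind s to column A; it was reassigned to df[['A', 'B']] above
    print(s[dates[5]])         # with s = df['A'], s[dates[5]] is equivalent to df['A'][dates[5]]
    print(df['A'][dates[5]])   # same value as the line above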
    

    -0.5977126285793471
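
The lookup above chains two [] operations (column first, then row label). As a hedged sketch, the same scalar can also be read with a single .loc or .iloc call. The frame below is rebuilt with a fixed random seed, which is an assumption made only so the run is repeatable; its values differ from the output printed in this article.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("1/1/2000", periods=8)
rng = np.random.default_rng(0)  # fixed seed: an assumption, not from the original notes
df = pd.DataFrame(
    rng.standard_normal((8, 4)), index=dates, columns=["A", "B", "C", "D"]
)

chained = df["A"][dates[5]]       # select column A, then the row labelled dates[5]
by_label = df.loc[dates[5], "A"]  # row label and column label in one call
by_position = df.iloc[5, 0]       # row position 5, column position 0

print(chained == by_label == by_position)
```

All three expressions read the same cell, so for reading a value they are interchangeable. The single-call forms avoid building the intermediate Series.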

    df[['B', 'A']] = df[['A', 'B']]  # swap the values of columns A and B
    print(df)

                       A         B         C         D
    2000-01-01  1.045460  0.542793 -0.942148  0.187426
    2000-01-02  0.821478  0.516108  0.227624  1.503220
    2000-01-03  1.042741  1.558611  0.116858 -0.848084
    2000-01-04 -0.935041 -0.228758  1.318462  0.002611
    2000-01-05  3.439259  0.420747  0.912372  1.345009
    2000-01-06  1.039117 -0.597713 -0.235674  0.010001
    2000-01-07  1.370491  0.380847  0.715843 -1.356307
    2000-01-08  0.770705  0.114837 -0.865508 -0.073762
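
One caveat is worth a short sketch. The swap above works with plain [] assignment, but the pandas documentation warns that .loc aligns on column labels before assigning, so the same statement written with .loc leaves the frame unchanged unless the right-hand side is converted to raw values. The small integer frame and the variable names below are assumptions for illustration.

```python
import pandas as pd

# Hypothetical small frame so the effect is easy to read.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Plain [] assignment pairs columns by position, so the values are swapped.
swapped = df.copy()
swapped[["B", "A"]] = swapped[["A", "B"]]

# .loc aligns the right-hand side on column labels first, so nothing changes.
aligned = df.copy()
aligned.loc[:, ["B", "A"]] = aligned[["A", "B"]]

# Passing raw values (no labels to align on) swaps the columns with .loc too.
raw = df.copy()
raw.loc[:, ["B", "A"]] = raw[["A", "B"]].to_numpy()

print(swapped)
print(aligned)
print(raw)
```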