DataFrame.filter

DataFrame.filter(items=None, like=None, regex=None, axis=None)
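Returns a subset of the DataFrame's rows or columns according to the specified index labels.
The filter is applied to the labels of the index, not to the data stored in the DataFrame.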
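
To illustrate the difference between filtering on labels and filtering on data, here is a minimal sketch. The small frame and the variable names by_label and by_value are made up for this illustration; the boolean-mask line is ordinary pandas indexing and is not part of DataFrame.filter.

```python
import pandas as pd

df = pd.DataFrame(
    {"site": ["google", "baidu", "wiki"], "age": [18, 39, 22]},
    index=["first", "second", "third"],
)

# filter() inspects labels only: no column label contains "oo",
# so no column is kept, even though the value "google" contains it.
by_label = df.filter(like="oo")

# To select on the data itself, use a boolean mask instead.
by_value = df[df["site"].str.contains("oo")]
```

Here by_label keeps the three row labels but has no columns, while by_value holds the single row whose site value contains "oo".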

Parameters

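items list-like; keep the labels on the axis that appear in items
like str; keep the labels that contain this substring
regex str; keep the labels that match this regular expression (re.search)
axis 0 or 'index': filter rows; 1 or 'columns': filter columns. When omitted, a DataFrame is filtered on its columns.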
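
The sketch below shows how these parameters interact, as far as I understand the current pandas behavior: axis falls back to the columns, labels in items that do not exist are skipped silently, and items, like and regex cannot be combined. The frame and the variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"site": ["google", "baidu"], "age": [18, 39]},
    index=["first", "second"],
)

# With axis omitted, a DataFrame is filtered on its column labels.
default_axis = df.filter(items=["age"])
explicit_axis = df.filter(items=["age"], axis="columns")
same = default_axis.equals(explicit_axis)

# Labels listed in items that do not exist are ignored, not an error.
kept_columns = list(df.filter(items=["age", "missing"]).columns)

# items, like and regex are mutually exclusive.
try:
    df.filter(items=["age"], like="a")
    raised = False
except TypeError:
    raised = True
```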

Example

  import pandas as pd
  df = pd.DataFrame({'site': ['google', 'baidu', 'wiki'],
                     'age': [18, 39, 22],
                     'price': [1.0, 2.0, 3.0],
                     'color': ['red', 'black', None]},
                    index=['first', 'second', 'third'])
  # Keep only the columns named 'site' and 'age' (axis defaults to columns)
  df.filter(items=['site', 'age'])
  -----------------------------------------------
            site  age
  first   google   18
  second   baidu   39
  third     wiki   22

Example

  import pandas as pd
  df = pd.DataFrame({'site': ['google', 'baidu', 'wiki'],
                     'age': [18, 39, 22],
                     'price': [1.0, 2.0, 3.0],
                     'color': ['red', 'black', None]},
                    index=['first', 'second', 'third'])
  # Keep only the rows labelled 'first' and 'second'
  df.filter(items=['first', 'second'], axis=0)
  -------------------------------------------------------
            site  age  price  color
  first   google   18    1.0    red
  second   baidu   39    2.0  black

Example

  import pandas as pd
  df = pd.DataFrame({'site': ['google', 'baidu', 'wiki'],
                     'age': [18, 39, 22],
                     'price': [1.0, 2.0, 3.0],
                     'color': ['red', 'black', None]},
                    index=['first', 'second', 'third'])
  # Keep the columns whose name ends with 'e'
  df.filter(regex='e$', axis=1)
  --------------------------------------------------------
            site  age  price
  first   google   18    1.0
  second   baidu   39    2.0
  third     wiki   22    3.0

Example

  import pandas as pd
  df = pd.DataFrame({'site': ['google', 'baidu', 'wiki'],
                     'age': [18, 39, 22],
                     'price': [1.0, 2.0, 3.0],
                     'color': ['red', 'black', None]},
                    index=['first', 'second', 'third'])
  # Keep the rows whose label contains the substring 'on'
  df.filter(like='on', axis=0)
  -------------------------------------------
           site  age  price  color
  second  baidu   39    2.0  black

DataFrameGroupBy.filter

DataFrameGroupBy.filter(func, dropna=True, *args, **kwargs)
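Returns a copy of the DataFrame that excludes the filtered-out elements.
The elements of a group are filtered out if the group does not satisfy the boolean criterion specified by func.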
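
Because the criterion is evaluated once per group, rows are kept or dropped a whole group at a time. A minimal sketch, using a made-up frame named orders and a group-size criterion chosen only for illustration:

```python
import pandas as pd

orders = pd.DataFrame(
    {"color": ["red", "black", "red", "blue"], "age": [18, 39, 22, 51]}
)

# func receives each group's sub-frame and returns a single boolean.
# Only the 'red' group has at least two rows, so both red rows survive.
kept = orders.groupby("color").filter(lambda g: len(g) >= 2)
```

The surviving rows keep their original labels and order, here 0 and 2.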

Parameters

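func function applied to each group's sub-frame; it should return True or False
dropna True (default): drop the groups that do not pass the filter; False: keep their rows, filled with NaN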

Example

  import pandas as pd
  df = pd.DataFrame({'site': ['google', 'baidu', 'wiki'],
                     'age': [18, 39, 22],
                     'price': [1.0, 2.0, 3.0],
                     'color': ['red', 'black', None]},
                    index=['first', 'second', 'third'])
  # Keep the groups whose mean age exceeds 30; the row whose color is None
  # belongs to no group and is therefore not returned
  df.groupby('color').filter(lambda x: x['age'].mean() > 30)
  ---------------------------------------------------------
           site  age  price  color
  second  baidu   39    2.0  black
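
For comparison, a sketch of the same kind of call with dropna=False. The frame below is a variation made up for this illustration (no missing color, so every row belongs to a group), and masked is a hypothetical name; the behavior described is what I would expect from current pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    {"site": ["google", "baidu", "wiki"],
     "age": [18, 39, 22],
     "color": ["red", "black", "red"]},
    index=["first", "second", "third"],
)

# The 'red' group has a mean age of 20 and fails the test. With
# dropna=False its rows stay in the result but are filled with NaN.
masked = df.groupby("color").filter(
    lambda x: x["age"].mean() > 30, dropna=False
)
```

The result keeps all three row labels, so it stays aligned with the original frame; only the 'second' row still carries values.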