Get the minimum value of column 2 for each group

    import pandas as pd

    data = pd.DataFrame({
        'aa': ['a', 'b', 'c', 'b', 'c'],
        'b': ['a', 'b', 'c', 'b', 'c'],
        'c': ['a', 'b', 'c', 'b', 'c'],
        'd': ['a', 'b', 'c', 'b', 'c'],
        'e': ['2021-10-01', '2021-10-06', '2021-10-08', '2021-10-07', '2021-10-07'],
    })
    data['e'] = pd.to_datetime(data['e'])  # convert the registration date to datetime
    # Minimum date per group: one row for each value of 'aa'
    data1 = data.groupby('aa').agg({'e': ['min']})
    # Full rows whose date equals the minimum date of their group
    ad_min = data.groupby('aa').apply(lambda t: t[t.e == t.e.min()])
    ad_min.reset_index(drop=True, inplace=True)
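
As an alternative sketch, the same rows can probably be selected without `groupby.apply` (whose treatment of the grouping column differs between pandas versions) by comparing each row with its group minimum through `transform('min')`. The frame below is a trimmed copy of the data above, kept to two columns for brevity; the variable name `group_min` is only illustrative.

```python
import pandas as pd

# Trimmed copy of the example data: only the group key and the date column
data = pd.DataFrame({
    'aa': ['a', 'b', 'c', 'b', 'c'],
    'e': pd.to_datetime(['2021-10-01', '2021-10-06', '2021-10-08',
                         '2021-10-07', '2021-10-07']),
})

# Broadcast each group's minimum date back onto every row of that group
group_min = data.groupby('aa')['e'].transform('min')

# Keep the rows whose date equals the group minimum (tied rows are all kept)
ad_min = data[data['e'] == group_min].reset_index(drop=True)
print(ad_min)
```

This variant keeps the original row order and the `aa` column in the result.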

Get the maximum and minimum values of column 2

    import pandas as pd

    df = pd.DataFrame({'item': ['A', 'A', 'A', 'B', 'B', 'B'],
                       'year': [2010, 2011, 2012, 2016, 2019, 2018],
                       'value': [20, 25, 32, 20, 40, 50]})
    # Minimum and maximum year for each item
    df.groupby('item').agg({'year': ['min', 'max']})
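
If flat column names are preferred over the two-level header that a list of functions produces, named aggregation is one option. A minimal sketch on the same data; the output names `year_min` and `year_max` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'item': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'year': [2010, 2011, 2012, 2016, 2019, 2018],
                   'value': [20, 25, 32, 20, 40, 50]})

# Named aggregation: output_column=(input_column, function)
year_range = df.groupby('item').agg(year_min=('year', 'min'),
                                    year_max=('year', 'max'))
print(year_range)
```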


Sort by column 2, then take the difference between the last and first values of column 3 in each group

    import pandas as pd

    df = pd.DataFrame({'item': ['A', 'A', 'A', 'B', 'B', 'B'],
                       'year': [2010, 2011, 2012, 2016, 2019, 2018],
                       'value': [20, 25, 32, 20, 40, 50]})
    # Sort by year so that first()/last() pick the earliest and latest rows per item
    g = df.sort_values('year').groupby('item')
    out = g['value'].last() - g['value'].first()
    out
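
Note that `last() - first()` after sorting by `year` gives the change in `value` between the earliest and the latest year, which is not necessarily the maximum of `value` minus its minimum. A minimal sketch contrasting the two on the same data; the names `first_to_last` and `value_range` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'item': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'year': [2010, 2011, 2012, 2016, 2019, 2018],
                   'value': [20, 25, 32, 20, 40, 50]})

# Value at the latest year minus value at the earliest year, per item
g = df.sort_values('year').groupby('item')
first_to_last = g['value'].last() - g['value'].first()

# Largest value minus smallest value per item, regardless of year order
value_range = df.groupby('item')['value'].agg(lambda s: s.max() - s.min())

print(first_to_last)
print(value_range)
```

For item `B` the two differ, because its largest `value` falls in 2018 rather than in the latest year.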
