Time deltas

Timedeltas are differences between times, expressed in units such as days, hours, minutes, and seconds. They can be either positive or negative.

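Timedelta is a subclass of datetime.timedelta and behaves similarly, but it is also compatible with types such as np.timedelta64, and it adds its own representation, parsing of many input types, and attributes.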

Parsing

Timedelta() can be constructed from several kinds of arguments:

    In [1]: import datetime
    # numpy and pandas are assumed to be imported already as np and pd
    # strings
    In [2]: pd.Timedelta('1 days')
    Out[2]: Timedelta('1 days 00:00:00')
    In [3]: pd.Timedelta('1 days 00:00:00')
    Out[3]: Timedelta('1 days 00:00:00')
    In [4]: pd.Timedelta('1 days 2 hours')
    Out[4]: Timedelta('1 days 02:00:00')
    In [5]: pd.Timedelta('-1 days 2 min 3us')
    Out[5]: Timedelta('-2 days +23:57:59.999997')
    # like datetime.timedelta
    # note: these MUST be given as keyword arguments
    In [6]: pd.Timedelta(days=1, seconds=1)
    Out[6]: Timedelta('1 days 00:00:01')
    # an integer with a unit
    In [7]: pd.Timedelta(1, unit='d')
    Out[7]: Timedelta('1 days 00:00:00')
    # from a datetime.timedelta or np.timedelta64
    In [8]: pd.Timedelta(datetime.timedelta(days=1, seconds=1))
    Out[8]: Timedelta('1 days 00:00:01')
    In [9]: pd.Timedelta(np.timedelta64(1, 'ms'))
    Out[9]: Timedelta('0 days 00:00:00.001000')
    # negative Timedeltas have this string repr,
    # to be more consistent with datetime.timedelta conventions
    In [10]: pd.Timedelta('-1us')
    Out[10]: Timedelta('-1 days +23:59:59.999999')
    # a missing value (NaT)
    In [11]: pd.Timedelta('nan')
    Out[11]: NaT
    In [12]: pd.Timedelta('nat')
    Out[12]: NaT
    # ISO 8601 duration strings
    In [13]: pd.Timedelta('P0DT0H1M0S')
    Out[13]: Timedelta('0 days 00:01:00')
    In [14]: pd.Timedelta('P0DT0H0M0.000000123S')
    Out[14]: Timedelta('0 days 00:00:00.000000')
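
The input forms listed above are interchangeable. As a minimal sketch (the variable names are illustrative, not part of pandas), the following builds the same 26-hour duration in six ways and checks that the results compare equal:

```python
import datetime

import numpy as np
import pandas as pd

# Six spellings of the same duration: one day and two hours.
candidates = [
    pd.Timedelta("1 days 2 hours"),                     # free-form string
    pd.Timedelta(days=1, hours=2),                      # keyword arguments
    pd.Timedelta(26, unit="h"),                         # integer plus unit
    pd.Timedelta(datetime.timedelta(days=1, hours=2)),  # stdlib timedelta
    pd.Timedelta(np.timedelta64(26, "h")),              # NumPy timedelta64
    pd.Timedelta("P1DT2H0M0S"),                         # ISO 8601 duration
]
all_equal = all(td == candidates[0] for td in candidates)
print(all_equal)

# The strings 'nan' and 'nat' both produce the missing-value marker NaT.
missing = pd.Timedelta("nat")
print(missing is pd.NaT)
```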

New in version 0.23.0: Timedelta can also be constructed from ISO 8601 duration strings.

DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) can also be used to construct a Timedelta.

    In [15]: pd.Timedelta(pd.offsets.Second(2))
    Out[15]: Timedelta('0 days 00:00:02')

Operations among scalars yield another Timedelta scalar.

    In [16]: pd.Timedelta(pd.offsets.Day(2)) + pd.Timedelta(pd.offsets.Second(2)) +\
       ....:     pd.Timedelta('00:00:00.000123')
       ....:
    Out[16]: Timedelta('2 days 00:00:02.000123')

to_timedelta

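pd.to_timedelta() converts a scalar, array, list, or Series in a recognized timedelta format into a Timedelta type. A Series input yields a Series, a scalar input yields a scalar Timedelta, and any other input yields a TimedeltaIndex.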

to_timedelta() can parse a single string:

    In [17]: pd.to_timedelta('1 days 06:05:01.00003')
    Out[17]: Timedelta('1 days 06:05:01.000030')
    In [18]: pd.to_timedelta('15.5us')
    Out[18]: Timedelta('0 days 00:00:00.000015')

It can also parse a list or array of strings:

    In [19]: pd.to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan'])
    Out[19]: TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015', NaT], dtype='timedelta64[ns]', freq=None)

The unit keyword argument specifies the unit of the timedelta:

    In [20]: pd.to_timedelta(np.arange(5), unit='s')
    Out[20]: TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04'], dtype='timedelta64[ns]', freq=None)
    In [21]: pd.to_timedelta(np.arange(5), unit='d')
    Out[21]: TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)
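
The rule that the output container follows the input container can be checked directly. The sketch below is only an illustration with made-up variable names; it feeds to_timedelta a scalar, a list, a Series, and an integer array with a unit:

```python
import numpy as np
import pandas as pd

scalar = pd.to_timedelta("15.5us")                              # scalar in, Timedelta out
from_list = pd.to_timedelta(["1 days", "nan"])                  # list in, TimedeltaIndex out
from_series = pd.to_timedelta(pd.Series(["1 days", "2 days"]))  # Series in, Series out
from_ints = pd.to_timedelta(np.arange(3), unit="s")             # integers read as seconds

print(type(scalar).__name__)
print(type(from_list).__name__)
print(type(from_series).__name__)
print(from_ints[2])
```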

Timedelta limitations

pandas represents Timedeltas at nanosecond resolution using 64-bit integers, which determines the lower and upper bounds of Timedelta.

    In [22]: pd.Timedelta.min
    Out[22]: Timedelta('-106752 days +00:12:43.145224')
    In [23]: pd.Timedelta.max
    Out[23]: Timedelta('106751 days 23:47:16.854775')
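
The upper bound shown above follows from simple arithmetic on the 64-bit nanosecond count. The sketch below (NS_PER_DAY and whole_days are illustrative names) divides the largest signed 64-bit integer by the number of nanoseconds in a day and compares the result with the day count that pandas reports:

```python
import numpy as np
import pandas as pd

NS_PER_DAY = 86_400 * 10**9  # nanoseconds in one 24-hour day

# Largest value a signed 64-bit integer can hold, read as nanoseconds.
int64_max = int(np.iinfo(np.int64).max)
whole_days = int64_max // NS_PER_DAY

print(whole_days)
print(pd.Timedelta.max.days)
```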

Operations

You can operate on Series and DataFrames that hold timedeltas. Subtracting datetime64[ns] Series or Timestamps produces a timedelta64[ns] Series.

    In [24]: s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))
    In [25]: td = pd.Series([pd.Timedelta(days=i) for i in range(3)])
    In [26]: df = pd.DataFrame({'A': s, 'B': td})
    In [27]: df
    Out[27]:
               A      B
    0 2012-01-01 0 days
    1 2012-01-02 1 days
    2 2012-01-03 2 days
    In [28]: df['C'] = df['A'] + df['B']
    In [29]: df
    Out[29]:
               A      B          C
    0 2012-01-01 0 days 2012-01-01
    1 2012-01-02 1 days 2012-01-03
    2 2012-01-03 2 days 2012-01-05
    In [30]: df.dtypes
    Out[30]:
    A     datetime64[ns]
    B    timedelta64[ns]
    C     datetime64[ns]
    dtype: object
    In [31]: s - s.max()
    Out[31]:
    0   -2 days
    1   -1 days
    2    0 days
    dtype: timedelta64[ns]
    In [32]: s - datetime.datetime(2011, 1, 1, 3, 5)
    Out[32]:
    0   364 days 20:55:00
    1   365 days 20:55:00
    2   366 days 20:55:00
    dtype: timedelta64[ns]
    In [33]: s + datetime.timedelta(minutes=5)
    Out[33]:
    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]
    In [34]: s + pd.offsets.Minute(5)
    Out[34]:
    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]
    In [35]: s + pd.offsets.Minute(5) + pd.offsets.Milli(5)
    Out[35]:
    0   2012-01-01 00:05:00.005
    1   2012-01-02 00:05:00.005
    2   2012-01-03 00:05:00.005
    dtype: datetime64[ns]
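
The two type rules at work in the session above are that datetime minus datetime gives a timedelta, and datetime plus timedelta gives a datetime. A minimal sketch with illustrative variable names:

```python
import pandas as pd

dates = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))
deltas = pd.Series([pd.Timedelta(days=i) for i in range(3)])

shifted = dates + deltas    # datetime + timedelta -> datetime
gaps = dates - dates.max()  # datetime - datetime  -> timedelta

print(shifted.dtype, gaps.dtype)
```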

Operations with scalars from a timedelta64[ns] Series:

    In [36]: y = s - s[0]
    In [37]: y
    Out[37]:
    0   0 days
    1   1 days
    2   2 days
    dtype: timedelta64[ns]

Series of timedeltas can hold NaT values:

    In [38]: y = s - s.shift()
    In [39]: y
    Out[39]:
    0      NaT
    1   1 days
    2   1 days
    dtype: timedelta64[ns]

As with datetimes, elements can be set to NaT using np.nan:

    In [40]: y[1] = np.nan
    In [41]: y
    Out[41]:
    0      NaT
    1      NaT
    2   1 days
    dtype: timedelta64[ns]
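
The sketch below replays both sources of NaT in one place: the first difference after shift() has nothing to subtract from, and a later assignment of np.nan is stored as NaT without changing the dtype. Variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))

diffs = s - s.shift()   # first element has no predecessor, so it is NaT
diffs.iloc[1] = np.nan  # np.nan is stored as NaT in a timedelta Series

print(diffs.dtype)
print(diffs.isna().tolist())
```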

Operands can also appear in reversed order (a single object operated with a Series):

    In [42]: s.max() - s
    Out[42]:
    0   2 days
    1   1 days
    2   0 days
    dtype: timedelta64[ns]
    In [43]: datetime.datetime(2011, 1, 1, 3, 5) - s
    Out[43]:
    0   -365 days +03:05:00
    1   -366 days +03:05:00
    2   -367 days +03:05:00
    dtype: timedelta64[ns]
    In [44]: datetime.timedelta(minutes=5) + s
    Out[44]:
    0   2012-01-01 00:05:00
    1   2012-01-02 00:05:00
    2   2012-01-03 00:05:00
    dtype: datetime64[ns]

The min, max, idxmin, and idxmax operations are supported on DataFrames:

    In [45]: A = s - pd.Timestamp('20120101') - pd.Timedelta('00:05:05')
    In [46]: B = s - pd.Series(pd.date_range('2012-1-2', periods=3, freq='D'))
    In [47]: df = pd.DataFrame({'A': A, 'B': B})
    In [48]: df
    Out[48]:
                      A       B
    0 -1 days +23:54:55 -1 days
    1   0 days 23:54:55 -1 days
    2   1 days 23:54:55 -1 days
    In [49]: df.min()
    Out[49]:
    A   -1 days +23:54:55
    B   -1 days +00:00:00
    dtype: timedelta64[ns]
    In [50]: df.min(axis=1)
    Out[50]:
    0   -1 days
    1   -1 days
    2   -1 days
    dtype: timedelta64[ns]
    In [51]: df.idxmin()
    Out[51]:
    A    0
    B    0
    dtype: int64
    In [52]: df.idxmax()
    Out[52]:
    A    2
    B    0
    dtype: int64

The min, max, idxmin, and idxmax operations are supported on Series as well. A scalar result is a Timedelta.

    In [53]: df.min().max()
    Out[53]: Timedelta('-1 days +23:54:55')
    In [54]: df.min(axis=1).min()
    Out[54]: Timedelta('-1 days +00:00:00')
    In [55]: df.min().idxmax()
    Out[55]: 'A'
    In [56]: df.min(axis=1).idxmin()
    Out[56]: 0

You can call fillna on timedeltas, passing a Timedelta as the fill value.

    In [57]: y.fillna(pd.Timedelta(0))
    Out[57]:
    0   0 days
    1   0 days
    2   1 days
    dtype: timedelta64[ns]
    In [58]: y.fillna(pd.Timedelta(10, unit='s'))
    Out[58]:
    0   0 days 00:00:10
    1   0 days 00:00:10
    2   1 days 00:00:00
    dtype: timedelta64[ns]
    In [59]: y.fillna(pd.Timedelta('-1 days, 00:00:05'))
    Out[59]:
    0   -1 days +00:00:05
    1   -1 days +00:00:05
    2     1 days 00:00:00
    dtype: timedelta64[ns]
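
A self-contained version of the fill step, for readers who do not have the earlier y in scope. The Series here is built by hand (an assumption made for illustration) to match the shape of y above:

```python
import pandas as pd

# Two missing gaps followed by a one-day gap, mirroring y above.
gaps = pd.Series(pd.to_timedelta(["nat", "nat", "1 days"]))

# Replace every NaT with a ten-second Timedelta; other values are untouched.
filled = gaps.fillna(pd.Timedelta(seconds=10))
print(filled.tolist())
```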

Timedelta also supports negation, multiplication, and abs:

    In [60]: td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
    In [61]: td1
    Out[61]: Timedelta('-2 days +21:59:57')
    In [62]: -1 * td1
    Out[62]: Timedelta('1 days 02:00:03')
    In [63]: - td1
    Out[63]: Timedelta('1 days 02:00:03')
    In [64]: abs(td1)
    Out[64]: Timedelta('1 days 02:00:03')
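
One detail in the session above is easy to miss: the leading minus sign in the string applies to the whole expression, not only to the day count. The following sketch (with illustrative names) checks that reading and confirms that the three ways of flipping the sign agree:

```python
import pandas as pd

td = pd.Timedelta("-1 days 2 hours 3 seconds")
magnitude = pd.Timedelta(days=1, hours=2, seconds=3)

# The string means -(1 day + 2 hours + 3 seconds).
sign_covers_whole_string = td == -magnitude

# Unary minus, multiplication by -1, and abs() give the same positive value.
flips_agree = (-td == magnitude) and (-1 * td == magnitude) and (abs(td) == magnitude)

print(sign_covers_whole_string, flips_agree)
```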

Reductions

Numeric reduction operations on timedelta64[ns] return Timedelta objects. As usual, NaT values are skipped during evaluation.

    In [65]: y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05', 'nat',
       ....:                                 '-1 days +00:00:05', '1 days']))
       ....:
    In [66]: y2
    Out[66]:
    0   -1 days +00:00:05
    1                 NaT
    2   -1 days +00:00:05
    3     1 days 00:00:00
    dtype: timedelta64[ns]
    In [67]: y2.mean()
    Out[67]: Timedelta('-1 days +16:00:03.333333')
    In [68]: y2.median()
    Out[68]: Timedelta('-1 days +00:00:05')
    In [69]: y2.quantile(.1)
    Out[69]: Timedelta('-1 days +00:00:05')
    In [70]: y2.sum()
    Out[70]: Timedelta('-1 days +00:00:10')
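
To see that NaT is skipped rather than counted, the sketch below rebuilds y2 and compares the reductions with hand-computed values over the three valid entries. The arithmetic in the comments is illustrative, not taken from the pandas documentation.

```python
import pandas as pd

y2 = pd.Series(pd.to_timedelta(["-1 days +00:00:05", "nat",
                                "-1 days +00:00:05", "1 days"]))

valid = y2.count()   # NaT is not counted
total = y2.sum()     # -86395 s - 86395 s + 86400 s = -86390 s
middle = y2.median() # middle of the three valid values
average = y2.mean()  # total divided by the three valid values

print(valid, total, middle, average)
```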

Frequency conversion

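Timedelta Series, TimedeltaIndex, and Timedelta scalars can be converted to other "frequencies" by dividing by another timedelta, or by casting with astype to a specific timedelta type. These operations yield Series and propagate NaT as nan. Note that division by a NumPy scalar is true division, while astype is equivalent to floor division.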

::: tip Note

Floor division rounds the quotient down to the nearest whole number, e.g. 9 // 2 = 4.

Further reading:

Ceiling division rounds the quotient up to the nearest whole number, e.g. 9 / 2 rounded up is 5.

:::

    In [71]: december = pd.Series(pd.date_range('20121201', periods=4))
    In [72]: january = pd.Series(pd.date_range('20130101', periods=4))
    In [73]: td = january - december
    In [74]: td[2] += datetime.timedelta(minutes=5, seconds=3)
    In [75]: td[3] = np.nan
    In [76]: td
    Out[76]:
    0   31 days 00:00:00
    1   31 days 00:00:00
    2   31 days 00:05:03
    3                NaT
    dtype: timedelta64[ns]
    # to days
    In [77]: td / np.timedelta64(1, 'D')
    Out[77]:
    0    31.000000
    1    31.000000
    2    31.003507
    3          NaN
    dtype: float64
    In [78]: td.astype('timedelta64[D]')
    Out[78]:
    0    31.0
    1    31.0
    2    31.0
    3     NaN
    dtype: float64
    # to seconds
    In [79]: td / np.timedelta64(1, 's')
    Out[79]:
    0    2678400.0
    1    2678400.0
    2    2678703.0
    3          NaN
    dtype: float64
    In [80]: td.astype('timedelta64[s]')
    Out[80]:
    0    2678400.0
    1    2678400.0
    2    2678703.0
    3          NaN
    dtype: float64
    # to months (these are constant-length months)
    In [81]: td / np.timedelta64(1, 'M')
    Out[81]:
    0    1.018501
    1    1.018501
    2    1.018617
    3         NaN
    dtype: float64
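
The division route to a numeric Series can be tried on its own. This is a minimal sketch that builds a small Series by hand (rather than from the date ranges above) and relies only on true division and on Series.dt.total_seconds(); the astype results shown in the session belong to the pandas version documented here and are not reproduced.

```python
import numpy as np
import pandas as pd

td = pd.Series(pd.to_timedelta(["31 days", "31 days 00:05:03", "nat"]))

in_days = td / np.timedelta64(1, "D")       # true division, fractional days
in_seconds = td / pd.Timedelta(seconds=1)   # a pandas Timedelta works as the divisor too
total_seconds = td.dt.total_seconds()       # same numbers as in_seconds

print(in_days.tolist())
print(in_seconds.tolist())
```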

Multiplying or dividing a timedelta64[ns] Series by an integer or an integer Series yields another timedelta64[ns] Series.

    In [82]: td * -1
    Out[82]:
    0   -31 days +00:00:00
    1   -31 days +00:00:00
    2   -32 days +23:54:57
    3                  NaT
    dtype: timedelta64[ns]
    In [83]: td * pd.Series([1, 2, 3, 4])
    Out[83]:
    0   31 days 00:00:00
    1   62 days 00:00:00
    2   93 days 00:15:09
    3                NaT
    dtype: timedelta64[ns]

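Floor division (//) of a timedelta64[ns] Series by a scalar Timedelta gives a Series of whole-number quotients. They appear as float64 in the output below, where the NaT entry shows up as NaN.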

    In [84]: td // pd.Timedelta(days=3, hours=4)
    Out[84]:
    0    9.0
    1    9.0
    2    9.0
    3    NaN
    dtype: float64
    In [85]: pd.Timedelta(days=3, hours=4) // td
    Out[85]:
    0    0.0
    1    0.0
    2    0.0
    3    NaN
    dtype: float64

The mod (%) and divmod operations are defined for Timedelta when operating with another timedelta-like or with a numeric argument.

    In [86]: pd.Timedelta(hours=37) % datetime.timedelta(hours=2)
    Out[86]: Timedelta('0 days 01:00:00')
    # divmod against a timedelta-like returns a pair (int, Timedelta)
    In [87]: divmod(datetime.timedelta(hours=2), pd.Timedelta(minutes=11))
    Out[87]: (10, Timedelta('0 days 00:10:00'))
    # divmod against a numeric returns a pair (Timedelta, Timedelta)
    In [88]: divmod(pd.Timedelta(hours=25), 86400000000000)
    Out[88]: (Timedelta('0 days 00:00:00.000000'), Timedelta('0 days 01:00:00'))
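
The quotient and remainder returned by divmod satisfy the usual identity, quotient * divisor + remainder == dividend. The sketch below checks this for a positive and a negative dividend; the variable names are illustrative, and the negative case assumes the same floor semantics as Python integers.

```python
import pandas as pd

step = pd.Timedelta(hours=2)

# Positive dividend: 37 h = 18 * 2 h + 1 h
q_pos, r_pos = divmod(pd.Timedelta(hours=37), step)

# Negative dividend: the quotient is floored, so the remainder stays non-negative.
# -37 h = -19 * 2 h + 1 h
q_neg, r_neg = divmod(pd.Timedelta(hours=-37), step)

print(q_pos, r_pos)
print(q_neg, r_neg)
```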

Attributes

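You can access the fields of a Timedelta or TimedeltaIndex directly through the attributes days, seconds, microseconds, and nanoseconds. These are identical to the values returned by datetime.timedelta; for example, the .seconds attribute represents the number of seconds >= 0 and < 1 day. The values are signed according to whether the Timedelta is signed.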

These fields can also be accessed directly through the .dt property of a Series.

::: tip Note

These attributes are NOT the displayed values of the Timedelta. Use .components to retrieve the displayed values.

:::

For a Series:

    In [89]: td.dt.days
    Out[89]:
    0    31.0
    1    31.0
    2    31.0
    3     NaN
    dtype: float64
    In [90]: td.dt.seconds
    Out[90]:
    0      0.0
    1      0.0
    2    303.0
    3      NaN
    dtype: float64

You can access the field values of a scalar Timedelta directly.

    In [91]: tds = pd.Timedelta('31 days 5 min 3 sec')
    In [92]: tds.days
    Out[92]: 31
    In [93]: tds.seconds
    Out[93]: 303
    In [94]: (-tds).seconds
    Out[94]: 86097
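
The value 86097 for (-tds).seconds looks odd until it is combined with the negative day count. The sketch below (illustrative names only) recombines days and seconds and compares the sum with total_seconds(), then reads the displayed fields through .components:

```python
import pandas as pd

tds = pd.Timedelta("31 days 5 min 3 sec")
neg = -tds

# days is negative while seconds stays within [0, 86400).
recombined = neg.days * 86_400 + neg.seconds

print(neg.days, neg.seconds)
print(recombined, neg.total_seconds())

# .components exposes the fields as they appear in the repr.
parts = tds.components
print(parts.days, parts.minutes, parts.seconds)
```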

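The .components property gives quick access to a reduced form of the timedelta. It returns a DataFrame indexed like the Series. These are the displayed values of the Timedelta.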

    In [95]: td.dt.components
    Out[95]:
       days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
    0  31.0    0.0      0.0      0.0           0.0           0.0          0.0
    1  31.0    0.0      0.0      0.0           0.0           0.0          0.0
    2  31.0    0.0      5.0      3.0           0.0           0.0          0.0
    3   NaN    NaN      NaN      NaN           NaN           NaN          NaN
    In [96]: td.dt.components.seconds
    Out[96]:
    0    0.0
    1    0.0
    2    3.0
    3    NaN
    Name: seconds, dtype: float64

The .isoformat method converts a Timedelta to an ISO 8601 duration string.

New in version 0.20.0.

    In [97]: pd.Timedelta(days=6, minutes=50, seconds=3,
       ....:              milliseconds=10, microseconds=10,
       ....:              nanoseconds=12).isoformat()
       ....:
    Out[97]: 'P6DT0H50M3.010010012S'
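
Because the Timedelta constructor also parses ISO 8601 duration strings, isoformat() output can be fed straight back in. A minimal round-trip sketch with illustrative variable names:

```python
import pandas as pd

original = pd.Timedelta(days=6, minutes=50, seconds=3)

iso = original.isoformat()  # ISO 8601 duration string
restored = pd.Timedelta(iso)  # parse the string back into a Timedelta

print(iso)
print(restored == original)
```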

TimedeltaIndex

To generate an index of time deltas, use either the TimedeltaIndex constructor or timedelta_range().

TimedeltaIndex accepts string-like, Timedelta, timedelta, and np.timedelta64 objects.

np.nan, pd.NaT, and the string nat all represent missing values.

    In [98]: pd.TimedeltaIndex(['1 days', '1 days, 00:00:05', np.timedelta64(2, 'D'),
       ....:                    datetime.timedelta(days=2, seconds=2)])
       ....:
    Out[98]:
    TimedeltaIndex(['1 days 00:00:00', '1 days 00:00:05', '2 days 00:00:00',
                    '2 days 00:00:02'],
                   dtype='timedelta64[ns]', freq=None)

Passing freq='infer' makes TimedeltaIndex set its frequency to the one inferred at creation:

    In [99]: pd.TimedeltaIndex(['0 days', '10 days', '20 days'], freq='infer')
    Out[99]: TimedeltaIndex(['0 days', '10 days', '20 days'], dtype='timedelta64[ns]', freq='10D')

Generating ranges of time deltas

Similar to date_range(), timedelta_range() constructs a regular-frequency TimedeltaIndex. The default frequency for timedelta_range is calendar day:

    In [100]: pd.timedelta_range(start='1 days', periods=5)
    Out[100]: TimedeltaIndex(['1 days', '2 days', '3 days', '4 days', '5 days'], dtype='timedelta64[ns]', freq='D')

timedelta_range accepts various combinations of start, end, and periods:

    In [101]: pd.timedelta_range(start='1 days', end='5 days')
    Out[101]: TimedeltaIndex(['1 days', '2 days', '3 days', '4 days', '5 days'], dtype='timedelta64[ns]', freq='D')
    In [102]: pd.timedelta_range(end='10 days', periods=4)
    Out[102]: TimedeltaIndex(['7 days', '8 days', '9 days', '10 days'], dtype='timedelta64[ns]', freq='D')

The freq parameter accepts a variety of frequency aliases:

    In [103]: pd.timedelta_range(start='1 days', end='2 days', freq='30T')
    Out[103]:
    TimedeltaIndex(['1 days 00:00:00', '1 days 00:30:00', '1 days 01:00:00',
                    '1 days 01:30:00', '1 days 02:00:00', '1 days 02:30:00',
                    '1 days 03:00:00', '1 days 03:30:00', '1 days 04:00:00',
                    '1 days 04:30:00', '1 days 05:00:00', '1 days 05:30:00',
                    '1 days 06:00:00', '1 days 06:30:00', '1 days 07:00:00',
                    '1 days 07:30:00', '1 days 08:00:00', '1 days 08:30:00',
                    '1 days 09:00:00', '1 days 09:30:00', '1 days 10:00:00',
                    '1 days 10:30:00', '1 days 11:00:00', '1 days 11:30:00',
                    '1 days 12:00:00', '1 days 12:30:00', '1 days 13:00:00',
                    '1 days 13:30:00', '1 days 14:00:00', '1 days 14:30:00',
                    '1 days 15:00:00', '1 days 15:30:00', '1 days 16:00:00',
                    '1 days 16:30:00', '1 days 17:00:00', '1 days 17:30:00',
                    '1 days 18:00:00', '1 days 18:30:00', '1 days 19:00:00',
                    '1 days 19:30:00', '1 days 20:00:00', '1 days 20:30:00',
                    '1 days 21:00:00', '1 days 21:30:00', '1 days 22:00:00',
                    '1 days 22:30:00', '1 days 23:00:00', '1 days 23:30:00',
                    '2 days 00:00:00'],
                   dtype='timedelta64[ns]', freq='30T')
    In [104]: pd.timedelta_range(start='1 days', periods=5, freq='2D5H')
    Out[104]:
    TimedeltaIndex(['1 days 00:00:00', '3 days 05:00:00', '5 days 10:00:00',
                    '7 days 15:00:00', '9 days 20:00:00'],
                   dtype='timedelta64[ns]', freq='53H')

New in version 0.23.0.

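Specifying start, end, and periods together generates a range of evenly spaced timedeltas from start to end (both inclusive), with periods being the number of elements in the resulting TimedeltaIndex: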

    In [105]: pd.timedelta_range('0 days', '4 days', periods=5)
    Out[105]: TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)
    In [106]: pd.timedelta_range('0 days', '4 days', periods=10)
    Out[106]:
    TimedeltaIndex(['0 days 00:00:00', '0 days 10:40:00', '0 days 21:20:00',
                    '1 days 08:00:00', '1 days 18:40:00', '2 days 05:20:00',
                    '2 days 16:00:00', '3 days 02:40:00', '3 days 13:20:00',
                    '4 days 00:00:00'],
                   dtype='timedelta64[ns]', freq=None)
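
The two ways of fixing the spacing can be put side by side: either give a step with freq, or give both endpoints plus a count and let the step be derived. The sketch below uses the lowercase 'h' alias and illustrative variable names:

```python
import pandas as pd

# Step given explicitly: four elements, six hours apart.
stepped = pd.timedelta_range(start="1 days", periods=4, freq="6h")

# Step derived from the endpoints: ten elements spread evenly over four days.
even = pd.timedelta_range("0 days", "4 days", periods=10)
spacing = even[1] - even[0]

print(stepped[-1])
print(len(even), spacing)
```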

Using the TimedeltaIndex

Like the other datetime-like indexes, DatetimeIndex and PeriodIndex, a TimedeltaIndex can be used as the index of pandas objects.

    In [107]: s = pd.Series(np.arange(100),
       .....:               index=pd.timedelta_range('1 days', periods=100, freq='h'))
       .....:
    In [108]: s
    Out[108]:
    1 days 00:00:00     0
    1 days 01:00:00     1
    1 days 02:00:00     2
    1 days 03:00:00     3
    1 days 04:00:00     4
                       ..
    4 days 23:00:00    95
    5 days 00:00:00    96
    5 days 01:00:00    97
    5 days 02:00:00    98
    5 days 03:00:00    99
    Freq: H, Length: 100, dtype: int64

Selections work similarly, with coercion on string-likes and slices:

    In [109]: s['1 day':'2 day']
    Out[109]:
    1 days 00:00:00     0
    1 days 01:00:00     1
    1 days 02:00:00     2
    1 days 03:00:00     3
    1 days 04:00:00     4
                       ..
    2 days 19:00:00    43
    2 days 20:00:00    44
    2 days 21:00:00    45
    2 days 22:00:00    46
    2 days 23:00:00    47
    Freq: H, Length: 48, dtype: int64
    In [110]: s['1 day 01:00:00']
    Out[110]: 1
    In [111]: s[pd.Timedelta('1 day 1h')]
    Out[111]: 1

Furthermore, you can use partial string selection, and the range will be inferred:

    In [112]: s['1 day':'1 day 5 hours']
    Out[112]:
    1 days 00:00:00    0
    1 days 01:00:00    1
    1 days 02:00:00    2
    1 days 03:00:00    3
    1 days 04:00:00    4
    1 days 05:00:00    5
    Freq: H, dtype: int64
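
The selection rules above can be condensed into a short sketch. It uses .loc rather than plain brackets, which is a choice made here for explicitness and not something the text prescribes; the names hourly, two_days, and first_six are illustrative.

```python
import numpy as np
import pandas as pd

hourly = pd.Series(
    np.arange(100),
    index=pd.timedelta_range("1 days", periods=100, freq="h"),
)

# Exact label lookup: a string and a Timedelta address the same row.
by_string = hourly.loc["1 day 01:00:00"]
by_object = hourly.loc[pd.Timedelta("1 day 1h")]

# Partial-string slices: a day-level bound covers that whole day,
# an hour-level bound covers that whole hour.
two_days = hourly.loc["1 day":"2 day"]
first_six = hourly.loc["1 day":"1 day 5 hours"]

print(by_string, by_object, len(two_days), len(first_six))
```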

TimedeltaIndex operations

Operations that combine a TimedeltaIndex with a DatetimeIndex preserve NaT values:

    In [113]: tdi = pd.TimedeltaIndex(['1 days', pd.NaT, '2 days'])
    In [114]: tdi.to_list()
    Out[114]: [Timedelta('1 days 00:00:00'), NaT, Timedelta('2 days 00:00:00')]
    In [115]: dti = pd.date_range('20130101', periods=3)
    In [116]: dti.to_list()
    Out[116]:
    [Timestamp('2013-01-01 00:00:00', freq='D'),
     Timestamp('2013-01-02 00:00:00', freq='D'),
     Timestamp('2013-01-03 00:00:00', freq='D')]
    In [117]: (dti + tdi).to_list()
    Out[117]: [Timestamp('2013-01-02 00:00:00'), NaT, Timestamp('2013-01-05 00:00:00')]
    In [118]: (dti - tdi).to_list()
    Out[118]: [Timestamp('2012-12-31 00:00:00'), NaT, Timestamp('2013-01-01 00:00:00')]
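
A compact check of the NaT behaviour: the missing position in the TimedeltaIndex stays missing in both the sum and the difference, while the other positions are combined element by element. Variable names are illustrative.

```python
import pandas as pd

tdi = pd.TimedeltaIndex(["1 days", pd.NaT, "2 days"])
dti = pd.date_range("2013-01-01", periods=3)

summed = dti + tdi
diffed = dti - tdi

print(summed.isna().tolist())
print(diffed.isna().tolist())
```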

Conversions

As with the frequency conversion on a Series described earlier, a TimedeltaIndex can be converted to yield another Index.

    In [119]: tdi / np.timedelta64(1, 's')
    Out[119]: Float64Index([86400.0, nan, 172800.0], dtype='float64')
    In [120]: tdi.astype('timedelta64[s]')
    Out[120]: Float64Index([86400.0, nan, 172800.0], dtype='float64')

Operations with scalars work as well. They can return a different type of index.

    # adding a timedelta and a date yields a DatetimeIndex
    In [121]: tdi + pd.Timestamp('20130101')
    Out[121]: DatetimeIndex(['2013-01-02', 'NaT', '2013-01-03'], dtype='datetime64[ns]', freq=None)
    # subtracting a timedelta from a date yields datetime-like values (Timestamp)
    # note that trying to subtract a date from a Timedelta will raise an exception
    In [122]: (pd.Timestamp('20130101') - tdi).to_list()
    Out[122]: [Timestamp('2012-12-31 00:00:00'), NaT, Timestamp('2012-12-30 00:00:00')]
    # adding a timedelta and a timedelta yields a TimedeltaIndex
    In [123]: tdi + pd.Timedelta('10 days')
    Out[123]: TimedeltaIndex(['11 days', NaT, '12 days'], dtype='timedelta64[ns]', freq=None)
    # dividing by an integer yields a TimedeltaIndex
    In [124]: tdi / 2
    Out[124]: TimedeltaIndex(['0 days 12:00:00', NaT, '1 days 00:00:00'], dtype='timedelta64[ns]', freq=None)
    # dividing by a timedelta yields a Float64Index
    In [125]: tdi / tdi[0]
    Out[125]: Float64Index([1.0, nan, 2.0], dtype='float64')
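
The result type depends on what the TimedeltaIndex is combined with. The sketch below checks the three cases by type and dtype instead of by repr, since the class name printed for a float index differs between pandas versions; the variable names are illustrative.

```python
import pandas as pd

tdi = pd.TimedeltaIndex(["1 days", pd.NaT, "2 days"])

as_dates = tdi + pd.Timestamp("2013-01-01")  # timedelta + date   -> DatetimeIndex
halved = tdi / 2                             # timedelta / number -> TimedeltaIndex
ratio = tdi / tdi[0]                         # timedelta / timedelta -> float index

print(type(as_dates).__name__, type(halved).__name__, ratio.dtype)
```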

Resampling

As with time series resampling, you can resample with a TimedeltaIndex.

    In [126]: s.resample('D').mean()
    Out[126]:
    1 days    11.5
    2 days    35.5
    3 days    59.5
    4 days    83.5
    5 days    97.5
    Freq: D, dtype: float64
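
A self-contained version of the resampling step. It rebuilds the hourly Series and groups it into 24-hour bins; the '24h' alias is used here as an assumed equivalent of the daily bins above, and the count per bin is added to show that the last bin is only partly filled.

```python
import numpy as np
import pandas as pd

hourly = pd.Series(
    np.arange(100),
    index=pd.timedelta_range("1 days", periods=100, freq="h"),
)

# Group the 100 hourly values into consecutive 24-hour bins.
daily_mean = hourly.resample("24h").mean()
daily_count = hourly.resample("24h").count()

print(daily_mean.tolist())
print(daily_count.tolist())
```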