Derivative Aggregation(导数聚合)
译文链接 : Derivative Aggregation(导数聚合)
警告
此功能是实验性的,可能会在将来的版本中完全更改或删除。Elastic将采取最大的努力来解决此问题,但实验功能不受SLA官方功能的支持。
导数管道聚合,其计算父直方图(或日期 - 图形)聚合中指定度量的导数。指定的度量必须是数字,并且必须设置直方图min_doc_count为0(默认为直方图聚合)。
语法
derivative(导数) 聚合结构如下:
"derivative": {"buckets_path": "the_sum"}
derivative(导数) 参数如下:
| 参数名称 | 描述 | 是否必填 | 默认值 |
|---|---|---|---|
| buckets_path | 想要计算导数值的桶路径,点击 the section called “buckets_path Syntaxedit”查看更多细节 |
必填 | |
| gap_policy | 当数据缺口出现时采用的策略,点击the section called “Dealing with gaps in the dataedit”查看更多细节 | 可选 | skip |
| format | 用于规范聚合输出值的格式 | 可选 | null |
一级导数
以下代码段计算每月总销售额的导数:
POST /sales/_search{"size": 0,"aggs" : {"sales_per_month" : {"date_histogram" : {"field" : "date","interval" : "month"},"aggs": {"sales": {"sum": {"field": "price"}},"sales_deriv": {"derivative": {"buckets_path": "sales" #1}}}}}}
| 1 | buckets_path指示这个derivative聚合是想要得到sales_per_month日期直方图聚合中sales聚合值的导数。 |
响应可能如下所示:
{"took": 11,"timed_out": false,"_shards": ...,"hits": ...,"aggregations": {"sales_per_month": {"buckets": [{"key_as_string": "2015/01/01 00:00:00","key": 1420070400000,"doc_count": 3,"sales": {"value": 550.0} #1},{"key_as_string": "2015/02/01 00:00:00","key": 1422748800000,"doc_count": 2,"sales": {"value": 60.0},"sales_deriv": {"value": -490.0 #2}},{"key_as_string": "2015/03/01 00:00:00","key": 1425168000000,"doc_count": 2, #3"sales": {"value": 375.0},"sales_deriv": {"value": 315.0}}]}}}
| 1 | 由于我们至少需要2个数据点来计算导数,因此第一个桶没有值 | | 2 | 导数的单位默认和sales聚合以及父直方图相同。所以在这种情况下,如果价格字段的单位是美元,导数的单位就是美元/月 | | 3 | doc_count表示桶中的文档数 |
二级导数
可以把导数管道聚合链接到另一个管道聚合的结果,计算二级导数。如以下示例所示,它将计算总月销售额的第一和第二阶导数:
POST /sales/_search{"size": 0,"aggs" : {"sales_per_month" : {"date_histogram" : {"field" : "date","interval" : "month"},"aggs": {"sales": {"sum": {"field": "price"}},"sales_deriv": {"derivative": {"buckets_path": "sales"}},"sales_2nd_deriv": {"derivative": {"buckets_path": "sales_deriv" #1}}}}}}
| 1 | 二阶导数的buckets_path指向一阶导数的名称 |
响应可能如下所示:
{"took": 50,"timed_out": false,"_shards": ...,"hits": ...,"aggregations": {"sales_per_month": {"buckets": [{"key_as_string": "2015/01/01 00:00:00","key": 1420070400000,"doc_count": 3,"sales": {"value": 550.0} #1},{"key_as_string": "2015/02/01 00:00:00","key": 1422748800000,"doc_count": 2,"sales": {"value": 60.0},"sales_deriv": {"value": -490.0} #2},{"key_as_string": "2015/03/01 00:00:00","key": 1425168000000,"doc_count": 2,"sales": {"value": 375.0},"sales_deriv": {"value": 315.0},"sales_2nd_deriv": {"value": 805.0}}]}}}
| 1 | 由于我们至少需要2个数据点,所以前两个桶没有二级导数 | | 2 | 一级导数计算二级导数 |
Units(单位)
导数聚合允许指定导数值的单位。这将在响应normalized_value中返回一个额外的字段,汇报在X轴单位下的导数。在下面的例子中,我们计算出每月销售额的导数,但要求销售的导数按天计算:
POST /sales/_search{"size": 0,"aggs" : {"sales_per_month" : {"date_histogram" : {"field" : "date","interval" : "month"},"aggs": {"sales": {"sum": {"field": "price"}},"sales_deriv": {"derivative": {"buckets_path": "sales","unit": "day" #1}}}}}}
| 1 | unit指定用于导数计算的X轴的单位 |
响应可能如下所示:
{"took": 50,"timed_out": false,"_shards": ...,"hits": ...,"aggregations": {"sales_per_month": {"buckets": [{"key_as_string": "2015/01/01 00:00:00","key": 1420070400000,"doc_count": 3,"sales": {"value": 550.0} #1},{"key_as_string": "2015/02/01 00:00:00","key": 1422748800000,"doc_count": 2,"sales": {"value": 60.0},"sales_deriv": {"value": -490.0, #2"normalized_value": -15.806451612903226 #3}},{"key_as_string": "2015/03/01 00:00:00","key": 1425168000000,"doc_count": 2,"sales": {"value": 375.0},"sales_deriv": {"value": 315.0,"normalized_value": 11.25}}]}}}
| 1,2 | value值是原单位:按月 |
| 3 | normalized_value值是请求的单位:按天 |
