把Elasticsearch作为时间序列数据库使用
bing200767
9年前
来自: http://blog.csdn.net//jiao_fuyou/article/details/49663687
这篇文章算是对另一篇《Elasticsearch as a Time Series Data Store》的简单翻译吧,自己的理解吧。
- 首先_source被关闭了,这样原始的json文档不会被重复存储一遍。
- 其次_all也被关闭了。而且每个字段的store都是False,也就是不会单独被存储。
- 这些都关掉了,那么数据存哪里了?存在doc_values里。doc_values用于在做聚合运算的时候,根据一批文档id快速找到对应的列的值。doc_values在磁盘上一个按列压缩存储的文件,非常高效。
curl -XPOST http://172.16.18.116:9200/test -d ' { "settings": { "number_of_shards": 1, "number_of_replicas": 0, "index.query.default_field": "timestamp", "index.mapping.ignore_malformed": false, "index.mapping.coerce": false, "index.query.parse.allow_unmapped_fields": false }, "mappings": { "test": { "_source": {"enabled": false}, "_all": {"enabled": false}, "properties": { "timestamp": { "type": "date", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } }, "appid": { "type": "string", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } }, "result": { "type": "string", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } }, "cmdid": { "type": "string", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } }, "optime": { "type": "integer", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } }, "total_count": { "type": "integer", "index": "no", "store": false, "dynamic": "strict", "doc_values": true, "fielddata": { "format": "doc_values" } } } } } }'
增加一条数据:
curl -XPOST http://172.16.18.116:9200/test/test/1 -d ' { "timestamp": 53534543, "appid": 1, "result": "test", "cmdid": "test", "optime": 53534543, "total_count": 100 } '
查询一下:
curl -XGET http://172.16.18.116:9200/test/test/_search { "took": 1, "timed_out": false, "_shards": { "total": 1, "successful": 1, "failed": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "test", "_type": "test", "_id": "1", "_score": 1 } ] } }
能查到数据,但是看不到原始字段内容,因为没存储也没索引,但是doc_values=true,实际上是保存到了磁盘上的
下面做一下聚合操作:
curl -XPOST http://172.16.18.116:9200/test/test/_search { "aggs": { "timestamp": { "terms": { "field": "timestamp" }, "aggs": { "total_count": {"sum": {"field": "total_count"}} } } } }
结果:
{ "took": 2, "timed_out": false, "_shards": { "total": 1, "successful": 1, "failed": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "test", "_type": "test", "_id": "1", "_score": 1 } ] }, "aggregations": { "timestamp": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": 53534543, "key_as_string": "1970-01-01T14:52:14.543Z", "doc_count": 1, "total_count": { "value": 100 } } ] } } }
可以看到聚合操作可以获取到total_count值。