I'm stuck trying to get the pipeline aggregation to do what I want in practice. I will post what I have below, but here is the idea:
- Define a date range covering the last 10 months and create a bucket for each month in that range. I have that part working.
- Get the min and max of the "magnitude" field in each bucket. The only way I have found is the "stats" agg, because I get a duplicate error when I try to add min and max as separate aggs. But I do not want the other stats. Can I do this without the stats agg?
- Sum the scores. How in the world do I do that? That part is kicking my tail. I do not even know whether the _score field can be summed.
Here is the index I made up to practice on, based on the common earthquakes example:
PUT _bulk
{ "index" : { "_index" : "earthquakes", "_id" : "1" } }
{ "date": "30-09-2020", "magnitude": "3.4", "lon": "74.12", "lat": "43.67" }
{ "index" : { "_index" : "earthquakes", "_id" : "2" } }
{ "date": "30-09-2020", "magnitude": "1.2", "lon": "78.02", "lat": "103.07" }
{ "index" : { "_index" : "earthquakes", "_id" : "3" } }
{ "date": "15-10-2020", "magnitude": "2.5", "lon": "178.02", "lat": "98.41" }
{ "index" : { "_index" : "earthquakes", "_id" : "4" } }
{ "date": "19-11-2020", "magnitude": "1.9", "lon": "14.67", "lat": "100.35" }
{ "index" : { "_index" : "earthquakes", "_id" : "5" } }
{ "date": "13-12-2020", "magnitude": "6.2", "lon": "123.93", "lat": "56.05" }
{ "index" : { "_index" : "earthquakes", "_id" : "6" } }
{ "date": "21-12-2020", "magnitude": "0.2", "lon": "130.31", "lat": "83.41" }
{ "index" : { "_index" : "earthquakes", "_id" : "7" } }
{ "date": "17-01-2021", "magnitude": "0.2", "lon": "10.31", "lat": "98.00" }
{ "index" : { "_index" : "earthquakes", "_id" : "8" } }
{ "date": "23-01-2021", "magnitude": "4.6", "lon": "112.31", "lat": "69.96" }
{ "index" : { "_index" : "earthquakes", "_id" : "9" } }
{ "date": "31-01-2021", "magnitude": "0.4", "lon": "79.43", "lat": "72.14" }
{ "index" : { "_index" : "earthquakes", "_id" : "10" } }
{ "date": "03-02-2021", "magnitude": "7.1", "lon": "120.80", "lat": "50.22" }
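The mapping for this index is not shown in the post. Because the dates are written as day-month-year and the magnitudes are quoted strings, the aggregations below presumably rely on an explicit mapping created before the bulk request. The following is only a sketch under that assumption; the field types and the "dd-MM-yyyy" format are guesses based on the sample documents, not the mapping actually used:

```
PUT earthquakes
{
  "mappings": {
    "properties": {
      "date":      { "type": "date", "format": "dd-MM-yyyy" },
      "magnitude": { "type": "float" },
      "lon":       { "type": "float" },
      "lat":       { "type": "float" }
    }
  }
}
```

With numeric field types, string values such as "3.4" are coerced to numbers at index time by default, so the documents above can be indexed unchanged.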
Here is the aggregation I put together. Note: I had set "size" to 10 while trying to sum the _score field, which did not work out:
GET earthquakes/_search
{
  "size": 0,
  "aggs": {
    "range_mag": {
      "date_range": {
        "field": "date",
        "ranges": [
          {
            "from": "now-10M",
            "to": "now"
          }
        ]
      },
      "aggs": {
        "by_month_mag": {
          "date_histogram": {
            "field": "date",
            "calendar_interval": "month"
          },
          "aggs": {
            "stat_mag": {
              "stats": {
                "field": "magnitude"
              }
            }
          }
        }
      }
    }
  }
}
^ That works for getting the min and max, but it also returns data I do not need. I left out my attempt at summing the score because it was driving me nuts. Is there a better way to get what I am after? Anyway, thanks. I can usually type things out with ease or work through them with the documentation, but aggregations are the one thing I thought I would get and somehow feel stuck on.
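For reference, one possible shape for the request is sketched below. It assumes the duplicate error came from declaring "min" and "max" under a single aggregation name, since each named aggregation accepts only one aggregation type. Giving each metric its own sibling name avoids that. The names "min_mag", "max_mag" and "sum_score" are made up for illustration, and the scripted sum over _score is an assumption about what "sum the scores" means here:

```
GET earthquakes/_search
{
  "size": 0,
  "aggs": {
    "range_mag": {
      "date_range": {
        "field": "date",
        "ranges": [ { "from": "now-10M", "to": "now" } ]
      },
      "aggs": {
        "by_month_mag": {
          "date_histogram": {
            "field": "date",
            "calendar_interval": "month"
          },
          "aggs": {
            "min_mag":   { "min": { "field": "magnitude" } },
            "max_mag":   { "max": { "field": "magnitude" } },
            "sum_score": { "sum": { "script": { "source": "_score" } } }
          }
        }
      }
    }
  }
}
```

Note that this request has no "query" clause, so every document matches with a _score of 1.0 and "sum_score" would simply equal the document count of each bucket. The sum only becomes meaningful once a scoring query is added to the request.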
Read more here: https://stackoverflow.com/questions/66483176/get-min-and-max-of-a-range-without-stats-agg-or-duplicate-aggs-error
Originally published by Shane on Stack Overflow.