groupbycount()

groupbycount(col [, ...cols]) counts the number of occurences of each distinct value of the provided column(s). If multiple columns are provided, their values are treated as a tuple.

Note that groupbycount col is effectively the same as stats by col.

Technical Notes

  • If there are fewer than 1000 distinct groups, groupbycount() will return those groups and their exact counts.

  • If there are >1000 distinct groups, it will return the 1000 most numerous groups, and will approximate their counts using a variant of the count-min sketch—specifically the count-mean-min sketch detailed here.

    • For most inputs, this has a p99 error of 0.1%.

Returns

A table with one row for each distinct value of col [, ...cols], with the following columns:

  • One column for each col provided, containing the value of col.

  • @q.count, containing the number of occurrences of that value.

Example

Analyze AWS CloudTrail logs to find the most frequently called AWS APIs.

%ingest.source_type: "aws:cloudtrail"
| groupbycount(eventSource)

Last updated