I have found in the documentation a description to create a custom user defined function (UDF). However, I could not find any reference to create User Defined Aggregate Function (UDAF), is this possible? Where can I find documentation about it?
Although CrateDB have a very complete list of aggregate function, it can never cover all cases. UDAF would be very useful for its users to adapt the tool to their needs.
Currently CrateDB doesn’t support User Defined Aggregates, however we have certainly considered adding the functionality at some point. Do you have any specific use cases, where you would think that UDA would be helpful? Good examples always help our product management team to better understand the need an prioritize features.
The use case that I had in mind is an aggregate function to aggregate (SUM) a column of type array(integer). This column exists in my case to optimize storage, but I need to do an aggregation that adds elements sharing the same position of the array.
[1, 3, 2]
[5, 2, 4]
------------
[1+5, 3+2, 2+4]
I know this is very specific to my use case, the reason why I would need a UDAF to cover it.
I see … for the use case an unnest might be a workaround
CREATE TABLE agg (
int_arr ARRAY(LONG)
);
INSERT INTO agg VALUES ([1, 3, 2]),([5, 2, 4]);
SELECT array_agg(val_agg) FROM (
SELECT idx, SUM(val) val_agg FROM
(SELECT generate_subscripts(int_arr,1) idx, unnest(int_arr) val
from agg) as unnested
GROUP BY idx
) as unnested_agg