Hi,
still trying to get me feet wet with CrateDB as a Prometheus long term storage back-end. As I was running into timeout problems when querying via Prometheus I switched to “Grafana SQL” to interact directly with CrateDB.
But given that there are 1.5B records in the metrics table after just 2 months, even limiting to very few instances I constantly run into SQL query limits due to way too many rows being returned.
I found Hernan’s nice tutorial quick bite, loaded the user defined functions into our 3 node test cluster, but I am non the wiser, probably due to wrong function prototypes:
crash
yields
cr> SELECT routine_schema, specific_name
FROM information_schema.routines
WHERE routine_name='lttb_with_parallel_arrays';
+----------------+---------------------------------------------------------------+
| routine_schema | specific_name |
+----------------+---------------------------------------------------------------+
| doc | lttb_with_parallel_arrays(integer_array, real_array, integer) |
+----------------+---------------------------------------------------------------+
SELECT 1 row in set (0.011 sec)
preliminary queries seem to work just fine:
SELECT count(timestamp) FROM metrics WHERE timestamp BETWEEN '2023-09-19T05:46:07.927Z' AND '2023-09-20T05:46:07.929Z' AND labels['inst
ance'] = 'a5602' ;
+------------------+
| count(timestamp) |
+------------------+
| 29607 |
+------------------+
SELECT 1 row in set (2.948 sec)
SELECT count(value) FROM metrics WHERE timestamp BETWEEN '2023-09-19T05:46:07.927Z' AND '2023-09-20T05:46:07.929Z' AND labels['instance
'] = 'a5602' ;
+--------------+
| count(value) |
+--------------+
| 29607 |
+--------------+
SELECT 1 row in set (0.049 sec)
but when I fire up the down sampling query it stops short of not finding a proper function (with or without prefixing doc.
:
SELECT lttb_with_parallel_arrays(array(SELECT timestamp FROM metrics WHERE timestamp BETWEEN '2023-09-19T05:46:07.927Z' AND '2023-09-20
T05:46:07.929Z' AND labels['instance'] = 'a5602' ORDER BY timestamp), array(select value from metrics WHERE timestamp BETWEEN '2023-09-
19T05:46:07.927Z' AND '2023-09-20T05:46:07.929Z' AND labels['instance'] = 'a5602' order by timestamp),100);
UnsupportedFunctionException[Unknown function: lttb_with_parallel_arrays((SELECT timestamp FROM (doc.metrics)), (SELECT value FROM (doc.metrics)), 100), no overload found for matching argument types: (timestamp without time zone_array, double precision_array, integer). Possible candidates: doc.lttb_with_parallel_arrays(array(integer), array(real), integer):object]
io.crate.exceptions.UnsupportedFunctionException: Unknown function: lttb_with_parallel_arrays((SELECT timestamp FROM (doc.metrics)), (SELECT value FROM (doc.metrics)), 100), no overload found for matching argument types: (timestamp without time zone_array, double precision_array, integer). Possible candidates: doc.lttb_with_parallel_arrays(array(integer), array(real), integer):object
at io.crate.metadata.Functions.raiseUnknownFunction(Functions.java:357)
[...]
I think I somehow need to convince CrateDB/cast the arrays into integer/real one’s but how?
Thanks a lot in advance
Carsten