Looking for advice on how to implement time-weighted averaging

Hi everyone,

sorry for another greenhorn question.

I am collecting data from energy meters which are oftentimes missing the total used or produced energy (in kWh), so I have to do the aggregation / averaging myself.
Having initially started with TimeScaleDB, I found this nice tutorial and explanation of the theory behind time-weighted averaging:


Is there any equivalent available on CrateDB (tbh I have looked for but not found anything yet), maybe with an user defined function?

Thanks for any clues,


This might be helpful :stuck_out_tongue: Interpolating missing time-series values let us know if that covers your use case.

thanks for the pointer! Though I still have to take a look at that in depth.

So far I have adapted an SQL example I found on the TimescaleDB Blog. This is for adding up the reported measurements of a Shelly 3em Energy Meter.
I am completely unsure if this gives correct results.


WITH setup AS (
    lag(phase_0_momentary) OVER (PARTITION BY date_bin('1 day'::interval, time,0) ORDER BY time) as prev_val0,
    lag(phase_1_momentary) OVER (PARTITION BY date_bin('1 day'::interval, time,0) ORDER BY time) as prev_val1,
    lag(phase_2_momentary) OVER (PARTITION BY date_bin('1 day'::interval, time,0) ORDER BY time) as prev_val2,
		extract('epoch' FROM time) as time_e, 
		extract('epoch' FROM lag(time) OVER (PARTITION BY date_bin('1 day'::interval, time,0) ORDER BY time)) as prev_time_e, 
	FROM  power_consumption), 
nextstep AS (
    CASE WHEN prev_val0 is NULL THEN NULL ELSE (prev_val0 + phase_0_momentary) / 2 * (time_e - prev_time_e) END as weighted_sum0,
    CASE WHEN prev_val1 is NULL THEN NULL ELSE (prev_val1 + phase_1_momentary) / 2 * (time_e - prev_time_e) END as weighted_sum1,
    CASE WHEN prev_val2 is NULL THEN NULL ELSE (prev_val2 + phase_2_momentary) / 2 * (time_e - prev_time_e) END as weighted_sum2,
	FROM setup)
SELECT date_bin('1 day'::interval, time,0) as timebucket,
    (avg(phase_0_momentary)*24)::Numeric(5, 2) as "avg(phase0)", -- the regular average
	  ((sum(weighted_sum0) / (max(time_e) - min(time_e)))*24)::Numeric(5, 2) as time_weighted_average0, -- our derived average
    (avg(phase_1_momentary)*24)::Numeric(5, 2) as "avg(phase1)",
    ((sum(weighted_sum1) / (max(time_e) - min(time_e)))*24)::Numeric(5, 2) as time_weighted_average1,
    (avg(phase_2_momentary)*24)::Numeric(5, 2) as "avg(phase2)",
    ((sum(weighted_sum2) / (max(time_e) - min(time_e)))*24)::Numeric(5, 2) as time_weighted_average2,
    ((avg(phase_0_momentary)+avg(phase_1_momentary)+avg(phase_2_momentary))*24)::Numeric(5, 2) as "avg_total",
    ( ( sum(weighted_sum0) / (max(time_e) - min(time_e)) + sum(weighted_sum1) / (max(time_e) - min(time_e)) + sum(weighted_sum2) / (max(time_e) - min(time_e)) ) *24)::Numeric(5, 2) as twa_total
FROM nextstep
GROUP BY timebucket order by timebucket;

The results seem to be in the general ballpark of the simple averaging, some times a bit more, some times a bit less. It feels about right, but I am not understanding everything this SQL code involves.

timebucket	avg(phase0)	time_weighted_average0	avg(phase1)	time_weighted_average1	avg(phase2)	time_weighted_average2	avg_total	twa_total

1709164800000 (2024-02-29T00:00:00.000Z)  -514.21   -525.76   2972.4   2999.7   2959.3   2918.9   5417.5   5392.8