I would like to check a table for unexpected data time-gaps, grouped by a column.
For example is there a query for an input table:
Datetime, Sym
(timestamp) (symbol)
2024-10-01 08:00:00 AA
2024-10-01 08:00:00 CC
2024-10-01 08:00:00 BB
...
2024-10-01 08:01:00 AA
2024-10-01 08:01:00 BB
2024-10-01 08:01:00 CC
...
2024-10-01 08:02:00 AA
2024-10-01 08:03:00 BB
2024-10-01 08:04:00 CC
which yields result:
Datetime, Sym, Time_gap
2024-10-01 08:04:00 CC 3
2024-10-01 08:03:00 BB 2
2024-10-01 08:02:00 AA 1
...
2024-10-01 08:01:00 AA 1
2024-10-01 08:01:00 BB 1
2024-10-01 08:01:00 CC 1
For each āSymā the time-gap between consecutive rows is calculated as āTime_gapā, with the timestamp of the event, and sorted DESC by āTime_gapā.
Note I showed āTime_gapā as an integer (minutes), but any numeric value (seconds, timestamp) would suffice.
You can use the first_value() window function to get the timestamp from the row before partitioned by Sym. At the moment we need a workaround, as the timestamp data type is not supported for this window function, but please do vote the github issue to raise awareness
You could do this to check gaps in all the rows for today. I am displaying the gap in both seconds and millis for completeness. Then you could add a condition to show only those above a certain threshold.
WITH time_and_prev AS (
SELECT Datetime, Sym, first_value(Datetime::long) OVER (PARTITION BY Sym ORDER BY Datetime ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING ) AS prevTimestamp
FROM table where timestamp IN today()
)
SELECT Datetime, Sym, prevTimestamp::timestamp, datediff('s', Datetime, prevTimestamp::timestamp) as gap_s, (Datetime - prevTimestamp::timestamp) / 1000 as gap_ms FROM time_and_prev ;
Thanks Javier so much for the quick & thorough response!
For this section of your query (I forgot to specify my tableās name, so your query has it as ātableā) :
FROM table where timestamp IN today()
I assume you meant:
FROM table where Datetime IN today()
Unfortunately this gives an error: āunknown function name: today()ā.
Iām not clear why today() didnāt work, but I just replaced the today() call with a str constant (eg: ā2024ā).
You are correct, sorry I messed up the column names!
The today() function was recently added. It might be the case you are running an older version of questdb. In any case, the important part is on the window function, for the where filter you can just use anything you might need.