My current concern is to about generating many time series based on time range from different table. Sample pseudo-query might look like
select a.ts1, a.ts2, x.ts
from a cross join timestamp_sequence_generator(a.ts1, a.ts2, 's') x
where table a
would contain 2 columns ts1, ts2
for the beginning and the end of the series. For example with table a
like:
| ts1 | ts2 |
|-----|-----|
| 1 | 2 |
| 4 | 7 |
results would be like:
| ts1 | ts2 | x.ts |
|-----|-----|------|
| 1 | 2 | 1 |
| 1 | 2 | 2 |
| 4 | 7 | 4 |
| 4 | 7 | 5 |
| 4 | 7 | 6 |
| 4 | 7 | 7 |
Is it possible to generate such multi series with current sql dialect and available functions?
Hi @archmag,
Here is one way to do it, example given in days.
CREATE TABLE archmag (
ts1 timestamp,
ts2 timestamp
) timestamp(ts1) PARTITION BY DAY WAL;
INSERT INTO archmag (ts1, ts2) VALUES ('2030-01-01', '2030-01-02'), ('2030-01-04', '2030-01-07')
SELECT ts1, ts2, ts3 FROM
(
(
SELECT ts1 AS ts3 FROM archmag
SAMPLE BY 1d FROM '2030-01-01' TO '2030-01-08' FILL(NULL)
)
ASOF JOIN archmag
)
WHERE ts3 >= ts1 AND ts3 <= ts2
Start by generating the timestamps using a new feature, SAMPLE BY FROM-TO. This can be used to create arbitrary calendar aligned timestamps. We simply select the timestamp alone, instead of the null filled value. This gives us the timestamps from 1-7, which we name ts3.
Next, we ASOF JOIN this to ts1. I am assuming that ts1 is sorted in ascending order, as in your example. Therefore, the nearest timestamps less than ts3 in the ts1 columns will be matched.
This will match ts1, ts2 as (1,2) against ts3 as (1,2,3).
We then apply a WHERE filter to ensure that the ts3 timestamp is within the (ts1, ts2) range. This gives the result you were hoping for.
Hope this helps, please let me know if you have any further questions.
correct me if I’m wrong but it seems that I need to compute min(ts)
and max(ts)
on application level? Is there a way to compute range '2030-01-01' TO '2030-01-08'
within query context? Like: SAMPLE BY 1d FROM min(ts) TO max(ts) FILL(NULL)
Yes, you will have to predetermine the range constants.
There is some support for ‘effectively constant’ expressions i.e expressions that can be evaluated before reading any rows. But this is different to max(ts)
which acts on the rows directly and returns a single result.
Effectively constant is something like date_trunc('day', dateadd('d', -7, now()))
.