QuestDB stopped writing data to the table, ILP gives no errors, table is not suspended

Hi!
I’m really new to QuestDB and we have been using it for 2 weeks now, so forgive me for asking newbie questions.

I’m inserting data into the object_bboxes_prod table via ILP with @questdb/nodejs-client.
Everything worked for several days, but then I noticed that the data being inserted no longer shows up in the table. ILP and flush give no errors on the client.

When I check wal_tables(), I see that no table is suspended, but writerTxn is below sequencerTxn and the values are not increasing at all.

name                      suspended  writerTxn  writerLagTxnCount  sequencerTxn  errorTag  errorMessage  memoryPressure
object_world_points_prod  false      595380     0                  598399                                0
object_bboxes_prod        false      507437     0                  601784                                0
video_streams_prod        false      183361     0                  1826425                               0
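
For reference, the same check can be scripted; here is a minimal sketch against QuestDB's REST /exec endpoint (localhost:9000 and Node 18+ with global fetch are assumptions, adjust for your setup):

        // Sketch: poll wal_tables() over the REST /exec endpoint and print the WAL apply lag.
        const QUESTDB_HTTP = "http://localhost:9000"; // assumed host/port

        async function walLag() {
            const query = encodeURIComponent(
                "SELECT name, suspended, writerTxn, sequencerTxn FROM wal_tables()"
            );
            const res = await fetch(`${QUESTDB_HTTP}/exec?query=${query}`);
            const { dataset } = await res.json();
            for (const [name, suspended, writerTxn, sequencerTxn] of dataset) {
                console.log(`${name}: suspended=${suspended}, lag=${sequencerTxn - writerTxn}`);
            }
        }

        walLag().catch(console.error);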

QuestDB, together with zfs, consumes 95% of the CPU, and there are no other processes running on this server.

In the logs I see this:

2024-10-01T17:10:37.249202Z I i.q.c.p.WriterPool >> [table=`object_bboxes_prod~16`, thread=21]
2024-10-01T17:10:37.250691Z I i.q.c.TableWriter processing WAL [path=/object_bboxes_prod~16/wal259/14, roLo=102513, roHi=102716, seqTxn=507415, tsMin=2024-07-06T11:18:50.800000Z, tsMax=2024-07-06T11:19:32.500000Z, commitToTs=294247-01-10T04:00:54.775807Z]
2024-10-01T17:10:37.267032Z I i.q.c.TableWriter o3 partition task [table=object_bboxes_prod, partitionTs=2024-07-06T00:00:00.000000Z, partitionIndex=4, last=false, append=false, ro=false, srcOooLo=102513, srcOooHi=102715, srcOooMax=102716, o3RowCount=203, o3LagRowCount=0, srcDataMax=110736346, o3Ts=2024-07-06T11:18:50.800000Z, newSize=110736549, maxTs=2024-09-28T20:30:46.963000Z, pCount=1, flattenTs=false, memUsed=3.368 GiB, rssMemUsed=163.695 MiB]
2024-10-01T17:10:48.536014Z I i.q.c.TableWriter merged partition [table=`object_bboxes_prod`, ts=2024-07-06T00:00:00.000000Z, txn=51097]
2024-10-01T17:10:49.477685Z I i.q.c.w.ApplyWal2TableJob job ejected [table=object_bboxes_prod~16, seqTxn=507415, transactions=1, rows=203, time=12227ms, rate=16rows/s, physicalWrittenRowsMultiplier=545500.24]
2024-10-01T17:10:49.477738Z I i.q.c.p.WriterPool << [table=`object_bboxes_prod~16`, thread=21]
2024-10-01T17:10:49.477745Z I i.q.c.CairoEngine cannot publish WAL notifications, queue is full [current=145521, table=object_bboxes_prod~16]

Any help is really appreciated.

P.S.: Using QuestDB 8.1.1 running inside Docker on an AWS EC2 instance.

What’s the schema of your table? Could it be that you have a symbol column with the default capacity of 256, but you are ingesting high-cardinality values?

I’ve seen a behaviour similar to what you describe in that case. If that’s what’s going on here, the fix would be to create the symbol with a capacity matching the cardinality you expect for the column.

Thank you for the advice, I didn’t know that I needed to set the capacity.
I created the tables without capacity NNNN for the symbols, and some of them have thousands of values per day, so that’s probably the issue.
I recreated the tables with twice the expected capacity and restarted the data imports. We’ll see if it changes things, since the DB worked just fine with the default capacity for the first 2 weeks.

A question about the indexes for symbols:
Let’s say I have 1000 values per day for the object_id symbol, the table is partitioned by day, and I want to store 90 days of data and then delete the older partitions. So basically this gives me 90 * 1000 → 90k object_id values.

  • Will the index on object_id, set to a capacity of 200k, be enough?
  • Will the index clean up values that are no longer in the table after I delete the partitions?

Object ids grow incrementally, so this index will effectively see a sliding window of about 90k ids.

Adding the capacity settings didn’t move the needle. After 12 hours of importing, questdb_wal_apply_rows_per_second dropped from 4000 to 612, and the difference between writerTxn and sequencerTxn started to grow steadily.
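
In case it helps others following along, that metric can be read from QuestDB's Prometheus endpoint; a minimal sketch, assuming metrics are enabled (metrics.enabled=true, QDB_METRICS_ENABLED=true in Docker) and exposed on the default port 9003:

        // Sketch: fetch the WAL apply rate from the Prometheus metrics endpoint.
        // Port 9003 and the metrics.enabled setting are assumptions; adjust for your deployment.
        const METRICS_URL = "http://localhost:9003/metrics";

        async function walApplyRate() {
            const body = await (await fetch(METRICS_URL)).text();
            const line = body
                .split("\n")
                .find((l) => l.startsWith("questdb_wal_apply_rows_per_second"));
            return line ? Number(line.trim().split(/\s+/).pop()) : null;
        }

        walApplyRate().then((rate) => console.log("wal apply rows/sec:", rate));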

I have an EC2 r6i.xlarge instance with 4 CPUs, 32 GB RAM and an EBS volume with 6000 IOPS.
Monitoring shows that the disk’s maximum performance has not been reached; we are somewhere between 1000 and 2000 IOPS, so the problem is not disk throughput.

Looking at the data we import, I see around 60M rows per day, which is probably too much for partitioning by DAY.

I recreated the tables with partitioning by HOUR, which should give around 3M records per partition. I restarted the import and will take a look in another 12 hours.

        CREATE TABLE object_bboxes_prod
        (
          timestamp timestamp,
          object_type symbol,
          division_id symbol capacity 50000,
          camera_id symbol capacity 100000,
          facility_id symbol capacity 20000,
          game_id symbol capacity 2000000,
          session_id symbol capacity 2000000,
          group_id symbol capacity 1000000,
          enabled symbol,
          object_id varchar,
          u1 int,
          v1 int,
          u2 int,
          v2 int
        ) timestamp(timestamp)
        PARTITION BY HOUR;

Hi @Leo_at_trace

50-100m rows per partition is reasonable.

Is your data pre-sorted? If you are sending the data unsorted, the partitions will be split, merged and re-written on every transaction, whereas data sorted in ascending order by timestamp is simply appended to the table.

Partitioning by hour can improve this, but if you store many months of data, you may take a hit in query speed. This is because each partition is a separate folder with separate files, and the overhead of opening and closing these files becomes relevant.

Hi @nwoolmer,
Unfortunately the data comes from multiple processing units working in parallel and there is no way to pre-sort it.

After switching to partitioning by HOUR I was finally able to process 1 day of data without the DB getting stuck. But the import speed is about 1500 rows/sec, which matches what questdb_wal_apply_rows_per_second shows.

  • Is that a reasonable speed for an r6i.xlarge instance with 4 CPUs, 32 GB RAM and an EBS volume with 6000 IOPS when using ILP?
  • What other steps do you suggest to increase the ingestion speed? A bigger instance? A cluster? I’m not sure that adding more CPUs to an instance will speed it up.

P.S.: It took the instance 16 hours to import 90M rows.

That speed is not reasonable. Are you submitting a single row at a time, per processing unit? Are your rows huge?

I benchmarked an m6i.xlarge with the TSBS dataset, and it handles ~750k rows/sec with minor O3 writes from parallel senders.

@nwoolmer thank you for the quick reply, appreciate it.

This is what I do in JS to send the data to the object_bboxes table:

        // Runs inside a loop over rows and their bounding boxes (loop code not shown).
        await tsdb.sender
            .table("object_bboxes_" + tsdb.env)
            .symbol("enabled", "1")
            .symbol("object_type", row.object_type.toString())
            .symbol("division_id", row.division_id.toString())
            .symbol("camera_id", row.camera_id.toString())
            .symbol("facility_id", row.facility_id.toString())
            .symbol("game_id", row.game_id.toString())
            .symbol("session_id", row.processing_session_id.toString())
            .symbol("group_id", row.group_id.toString())
            .stringColumn("object_id", row.session_result_object_id.toString())
            .intColumn("u1", bbox[1])
            .intColumn("v1", bbox[2])
            .intColumn("u2", bbox[3])
            .intColumn("v2", bbox[4])
            .at(startTime + bbox[0], "ms")

Then I call await sender.flush() every 1000 rows.

I had 3 workers importing different chunks of data from the same day. Everything was running inside AWS in the same region and availability zone, so there should be no network slowdowns or similar issues. But as I said, all the data is from just 1 day and it’s not pre-sorted. CPU load was around 90% on all 4 CPUs.

Next thought: are you hitting disk bandwidth limits, since the CPU is not completely maxed out?

It’s one thing not to hit the IOPS limit, but EBS also has disk bandwidth limits. For example, gp3 tops out at 16k IOPS and 1000 MiB/s of throughput.

Also, you can probably increase the batch size client-side. For ILP over HTTP, by default, we flush every 75,000 rows or every 1 second, whichever comes first.
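
For example, something along these lines with the Node.js client; a sketch only, where the configuration-string keys (auto_flush_rows, auto_flush_interval) are assumed from the standard ILP client configuration string and localhost:9000 is a placeholder:

        // Sketch: raise the client-side auto-flush thresholds instead of flushing every 1000 rows.
        // The config keys below are assumptions based on the standard ILP client configuration string.
        const { Sender } = require("@questdb/nodejs-client");

        const sender = Sender.fromConfig(
            "http::addr=localhost:9000;auto_flush_rows=75000;auto_flush_interval=1000;"
        );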

The disk throughput is set to 250 MiB/s and it’s not being reached, so it wasn’t a disk issue.
It seems I was sending too many small transactions, i.e. I flushed the sender every 1000 rows while auto-flush was also turned on.
I turned off auto-flush and now flush manually every 50,000 rows (see the sketch below); the speed is now about 15k rows per second, which is way better than 1500.
I think most of the time is now spent merging data, since our rows are often inserted into the middle of the existing time range. Unfortunately this is how our system works, and there is nothing I can do to force insertions to go at the end of the table.
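
A minimal sketch of what the import loop looks like now (auto-flush off, manual flush every 50,000 rows); auto_flush=off is assumed from the standard client configuration string, and the row fields are trimmed down from the full snippet earlier in the thread:

        // Sketch: disable auto-flush and flush manually every 50,000 rows.
        // auto_flush=off is an assumption based on the standard client configuration string.
        const { Sender } = require("@questdb/nodejs-client");

        const FLUSH_EVERY = 50000;
        const sender = Sender.fromConfig("http::addr=localhost:9000;auto_flush=off;");

        async function importRows(rows) {
            let pending = 0;
            for (const row of rows) {
                await sender
                    .table("object_bboxes_prod")
                    .symbol("object_type", row.object_type.toString())
                    .intColumn("u1", row.u1)
                    .at(row.timestampMs, "ms");
                if (++pending >= FLUSH_EVERY) {
                    await sender.flush(); // one larger transaction instead of many small ones
                    pending = 0;
                }
            }
            await sender.flush(); // flush the remaining rows
            await sender.close();
        }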

2024-10-10T15:37:35.871996Z I i.q.c.TableWriter squashing partitions [table=object_world_points_testing, target=2024-08-03T00.13695, targetSize=8948005, source=2024-08-03T005100-000001.15412, sourceSize=1889993]
2024-10-10T15:37:35.884242Z I i.q.c.TableWriter merged partition [table=`object_bboxes_testing`, ts=2024-08-02T23:00:00.000000Z, txn=29344]