Automatically set .batchSize based on the primary key #128

msmygit · 2023-04-27T16:05:40Z

Automatically set .batchSize to 1 if a table's primary key consists only of the partition key (i.e. the table has no clustering columns). Example tables:

Example 1:

CREATE TABLE IF NOT EXISTS ks1.tbl1 (
  pk1 int,
  pk2 bigint,
  c1 text,
  c2 uuid,
  PRIMARY KEY ((pk1,pk2))
);

Example 2:

CREATE TABLE IF NOT EXISTS ks1.tbl2 (
  c1 text PRIMARY KEY,
  c2 uuid
);

For any other table, use the default .batchSize.
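To make the rule concrete, here is a minimal Java sketch of the decision. The class name, method name, and the idea of passing the key columns as plain lists are all hypothetical and are not taken from the project's code; the sketch assumes the partition key and full primary key column names have already been read from the table metadata.

```java
import java.util.List;

// Hypothetical helper, not part of the project's actual code.
final class BatchSizeChooser {

    private BatchSizeChooser() {}

    /**
     * Returns 1 when the primary key is made up of exactly the partition key
     * columns (no clustering columns); otherwise returns the configured value.
     */
    static int effectiveBatchSize(List<String> partitionKeyColumns,
                                  List<String> primaryKeyColumns,
                                  int configuredBatchSize) {
        boolean primaryKeyIsPartitionKey =
                primaryKeyColumns.size() == partitionKeyColumns.size()
                        && primaryKeyColumns.containsAll(partitionKeyColumns);
        return primaryKeyIsPartitionKey ? 1 : configuredBatchSize;
    }

    public static void main(String[] args) {
        // Example 1: PRIMARY KEY ((pk1, pk2)), so every row is its own partition.
        System.out.println(effectiveBatchSize(
                List.of("pk1", "pk2"), List.of("pk1", "pk2"), 5));
        // A table with a clustering column keeps the configured batch size.
        System.out.println(effectiveBatchSize(
                List.of("pk1"), List.of("pk1", "ck1"), 5));
    }
}
```

With a key like Example 1 or Example 2, each row lives in its own partition, so any batch larger than 1 would necessarily span partitions.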


mieslep · 2023-06-08T09:46:47Z

@msmygit can I suggest we go one step further? We know the partition key, and since we're selecting by token range, the origin records for a given partition should arrive together. Maybe we treat this setting as a "max batch size", allow it to be a really big number, and automatically send the batch whenever the partition key changes?

That is, the batch size would be the smaller of the number of records in the partition and the .batchSize configuration setting. This way, we would avoid multi-partition batches.
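A rough Java sketch of this "flush on partition change" idea follows. Everything here is hypothetical (the class name, the generic row type, and the key-extractor function are illustrative and not the project's API), and it assumes rows belonging to the same partition arrive contiguously, as they would from a token-range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper, not part of the project's actual code.
final class PartitionBatcher {

    private PartitionBatcher() {}

    /**
     * Groups rows into batches that never span more than one partition.
     * A batch is closed when the partition key changes or when it already
     * holds maxBatchSize rows, whichever happens first.
     */
    static <R, K> List<List<R>> batchByPartition(Iterable<R> rows,
                                                 Function<R, K> partitionKeyOf,
                                                 int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        List<List<R>> batches = new ArrayList<>();
        List<R> current = new ArrayList<>();
        K currentKey = null;
        for (R row : rows) {
            K key = partitionKeyOf.apply(row);
            boolean keyChanged = !current.isEmpty() && !Objects.equals(key, currentKey);
            if (keyChanged || current.size() >= maxBatchSize) {
                batches.add(current);          // "send" the finished batch
                current = new ArrayList<>();
            }
            current.add(row);
            currentKey = key;
        }
        if (!current.isEmpty()) {
            batches.add(current);              // flush the trailing batch
        }
        return batches;
    }

    public static void main(String[] args) {
        // Rows are strings whose first character stands in for the partition key.
        List<String> rows = List.of("a1", "a2", "a3", "b1", "c1", "c2");
        System.out.println(batchByPartition(rows, r -> r.substring(0, 1), 2));
    }
}
```

In a real migration the "add to batches" step would execute the batch against the target rather than collect it in memory; the list of lists is only there to make the grouping visible.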

msmygit added this to the Version 4.0 milestone Apr 27, 2023
