- More then 10m records
- Test table has 109 987 200 (~110m) records
- Traditional (rails way)
- Keyset pagination or pagination with relative cursors
- Using Limit-Offset-Count
- Works fine for the first pages (~1000-2000)
- Works fine for small tables (most of railsway/startup cases)
- Get all data
- first page 109 987 200 records
caused by count
=== Render data: index === (20.9ms) === Render pagination: traditional index === (25756.6ms)
- first page (w/o count) 109 987 200 records
=== Render data: wo_count === (4.9ms) === Render pagination: traditional wo_count === (0.6ms)
- page #500000 109 987 200 records
caused by count and offset
=== Render data: index === (1135.1ms) === Render pagination: traditional index === (26448.8ms)
- page #500000 (w/o count) 109 987 200 records
caused by offset
=== Render data: wo_count === (871.4ms) === Render pagination: traditional wo_count === (1.2ms)
- first page 109 987 200 records
- Search by fields
- indexed fields 44 records (by created_at)
thanks to index
=== Render data: index === (6.2ms) === Render pagination: traditional index === (4.0ms)
- non-indexed fields 44 records (by updated_at)
caused by count and offset, and lack of index
=== Render data: index === (30307.6ms) === Render pagination: traditional index === (30863.7ms)
- indexed fields 44 records (by created_at)
- When scope is big but w/ index
- first page 15 811 244 records
=== Render data: index === (34.8ms) === Render pagination: traditional index === (884.9ms)
- page #500000 15 811 244 records
caused by offset
=== Render data: index === (1345.3ms) === Render pagination: traditional index === (647.4ms)
- first page 15 811 244 records
- When scope is big but w/o index
- first page 15 811 244 records
=== Render data: index === (270.2ms) === Render pagination: traditional index === (25992.6ms)
- page #500000 15 811 244 records
caused by offset
=== Render data: index === (1309.3ms) === Render pagination: traditional index === (25278.8ms)
- first page 15 811 244 records
- Get all data
- first page 109 987 200 records
=== Render data: index === (0.5ms) === Render pagination: keyset index === (0.7ms)
- page #500000 109 987 200 records
=== Render data: index === (1.6ms) === Render pagination: keyset index === (0.3ms)
- first page 109 987 200 records
- When scope is big but w/ index
- first page 15 811 244 records
=== Render data: index === (0.7ms) === Render pagination: keyset index === (0.4ms)
- page #500000 15 811 244 records
=== Render data: index === (0.9ms) === Render pagination: keyset index === (0.4ms)
- first page 15 811 244 records
- When scope is big but w/o index
- first page
=== Render data: index === (1831.6ms) === Render pagination: keyset index === (1.5ms)
- second page
=== Render data: index === (1939.9ms) === Render pagination: keyset index === (1.5ms)
- first page
- https://use-the-index-luke.com/no-offset
- https://use-the-index-luke.com/blog/2013-07/pagination-done-the-postgresql-way
- https://www.citusdata.com/blog/2016/03/30/five-ways-to-paginate/
- https://shopify.engineering/pagination-relative-cursors
- https://slack.engineering/evolving-api-pagination-at-slack/
page 500001
- http://localhost:3000/traditional/wo_count?created_from=2016-01-01+00%3A00%3A00&created_to=2017-01-01+00%3A01%3A26&page=500001
- http://localhost:3000/keyset?created_from=2016-01-01+00%3A00%3A00&created_to=2017-01-01+00%3A01%3A26&cursor_next=12500000
-
Don't count
-
it works for the first pages (~1000-2000)
-
it works unless searching on non-indexed fields w/ index http://localhost:3000/traditional?created_from=2016-01-01+00%3A00%3A00&created_to=2016-01-01+00%3A01%3A26 w/o index http://localhost:3000/traditional?updated_from=2016-01-01+00%3A00%3A00&updated_to=2016-01-01+00%3A01%3A26
- indexed field
EXPLAIN ANALYZE SELECT "miner_data".* FROM "miner_data" WHERE (created_at >= '2017-01-01 00:00:00') AND (created_at <= '2017-01-01 23:01:26') ORDER BY "miner_data"."id" ASC limit 25 "Limit (cost=0.57..7.89 rows=25 width=61) (actual time=2167.103..2167.108 rows=25 loops=1)" " -> Index Scan using miner_data_pkey on miner_data (cost=0.57..4655891.37 rows=15893334 width=61) (actual time=2167.101..2167.106 rows=25 loops=1)" " Filter: ((created_at >= '2017-01-01 00:00:00'::timestamp without time zone) AND (created_at <= '2018-01-01 00:01:26'::timestamp without time zone))" " Rows Removed by Filter: 15811200" "Planning Time: 6.494 ms" "Execution Time: 2167.160 ms"
- non indexed field
EXPLAIN ANALYZE SELECT "miner_data".* FROM "miner_data" WHERE (updated_at >= '2017-01-01 00:00:00') AND (updated_at <= '2017-01-01 23:01:26') ORDER BY "miner_data"."id" ASC limit 25 "Limit (cost=0.57..2774.57 rows=25 width=61) (actual time=2383.758..2383.765 rows=25 loops=1)" " -> Index Scan using miner_data_pkey on miner_data (cost=0.57..4655891.37 rows=41960 width=61) (actual time=2383.757..2383.762 rows=25 loops=1)" " Filter: ((updated_at >= '2017-01-01 00:00:00'::timestamp without time zone) AND (updated_at <= '2017-01-01 23:01:26'::timestamp without time zone))" " Rows Removed by Filter: 15811200" "Planning Time: 0.343 ms" "Execution Time: 2383.820 ms"
- indexed field
EXPLAIN ANALYZE SELECT "miner_data".* FROM "miner_data" WHERE (created_at >= '2017-01-01 00:00:00') AND (created_at <= '2017-01-01 22:01:26') ORDER BY "miner_data"."id" ASC limit 25 "Limit (cost=147.83..147.89 rows=25 width=61) (actual time=7.288..7.297 rows=25 loops=1)" " -> Sort (cost=147.83..152.87 rows=2017 width=61) (actual time=7.287..7.290 rows=25 loops=1)" " Sort Key: id" " Sort Method: top-N heapsort Memory: 28kB" " -> Index Scan using index_miner_data_on_created_at on miner_data (cost=0.57..90.91 rows=2017 width=61) (actual time=2.381..6.060 rows=1844 loops=1)" " Index Cond: ((created_at >= '2017-01-01 00:00:00'::timestamp without time zone) AND (created_at <= '2017-01-01 01:01:26'::timestamp without time zone))" "Planning Time: 0.338 ms" "Execution Time: 7.364 ms"
- non indexed field
EXPLAIN ANALYZE SELECT "miner_data".* FROM "miner_data" WHERE (updated_at >= '2017-01-01 00:00:00') AND (updated_at <= '2017-01-01 22:01:26') ORDER BY "miner_data"."id" ASC limit 25 "Limit (cost=0.57..2894.44 rows=25 width=61) (actual time=2064.903..2064.909 rows=25 loops=1)" " -> Index Scan using miner_data_pkey on miner_data (cost=0.57..4655891.37 rows=40222 width=61) (actual time=2064.902..2064.906 rows=25 loops=1)" " Filter: ((updated_at >= '2017-01-01 00:00:00'::timestamp without time zone) AND (updated_at <= '2017-01-01 22:01:26'::timestamp without time zone))" " Rows Removed by Filter: 15811200" "Planning Time: 0.359 ms" "Execution Time: 2064.947 ms"
- Reduce nubmer of records :)
- Count is expensive procedure
- Offset is expensive procedure
SELECT count(*) FROM "miner_data";
=> 109987200
29.075 sec
SELECT (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint
FROM pg_class
WHERE oid = 'miner_data'::regclass;
=> 109987240
00.087 sec
SELECT (CASE WHEN c.reltuples < 0 THEN NULL -- never vacuumed
WHEN c.relpages = 0 THEN float8 '0' -- empty table
ELSE c.reltuples / c.relpages END
* (pg_catalog.pg_relation_size(c.oid)
/ pg_catalog.current_setting('block_size')::int)
)::bigint
FROM pg_catalog.pg_class c
WHERE c.oid = 'public.miner_data'::regclass;
=> 109987240
0.052 sec