feat(pacer): add command to fetch docs filtered by page count from PACER #4901

elisa-a-v · 2025-01-08T21:35:39Z

Implement a Django command to fetch docs from PACER within a given range of page_count. This means we can only fetch docs whose page count we already know.
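As a rough illustration of the selection rule described above, here is a minimal plain-Python sketch. It is not the command's code: the real command presumably queries the database, and the `Doc` class, its fields, and `docs_in_page_range` are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Doc:
    # Hypothetical stand-in for a stored document record.
    pk: int
    page_count: Optional[int]  # None when the page count is not known yet


def docs_in_page_range(
    docs: Iterable[Doc], min_pages: int, max_pages: Optional[int] = None
) -> Iterator[Doc]:
    """Yield docs whose known page count is within [min_pages, max_pages]."""
    for doc in docs:
        if doc.page_count is None:
            # Unknown page count: cannot be matched against a range.
            continue
        if doc.page_count < min_pages:
            continue
        if max_pages is not None and doc.page_count > max_pages:
            continue
        yield doc


docs = [Doc(1, 1200), Doc(2, None), Doc(3, 50), Doc(4, 1000)]
selected = [d.pk for d in docs_in_page_range(docs, min_pages=1000)]
```

Docs with an unknown page count are skipped outright, which is the limitation the description points out.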

This script implements two types of throttling:

- Uses CeleryThrottle to keep the queue from getting too full.
- Uses Celery's built-in task rate_limit to enforce a minimum wait between executions of the PACER fetch task.

Because these use different mechanisms and serve different purposes, in theory they shouldn't conflict.
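To make the difference between the two mechanisms concrete, here is a small standard-library sketch. It does not use Celery or CeleryThrottle; `QueueThrottle` and `MinIntervalLimiter` are hypothetical classes that only mimic the two ideas: pausing the producer while the queue is too long, and spacing out consecutive executions of a task.

```python
import time
from typing import Callable, Optional


class QueueThrottle:
    """Producer side: block while the queue is at or above max_length."""

    def __init__(
        self,
        queue_length: Callable[[], int],
        max_length: int,
        poll_interval: float = 1.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.queue_length = queue_length
        self.max_length = max_length
        self.poll_interval = poll_interval
        self.sleep = sleep

    def maybe_wait(self) -> int:
        """Return how many times we had to sleep before there was room."""
        waits = 0
        while self.queue_length() >= self.max_length:
            self.sleep(self.poll_interval)
            waits += 1
        return waits


class MinIntervalLimiter:
    """Consumer side: enforce a minimum wait between executions."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_run: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous run was too recent; return the delay used."""
        now = self.clock()
        delay = 0.0
        if self.last_run is not None:
            delay = max(0.0, self.min_interval - (now - self.last_run))
        if delay:
            self.sleep(delay)
        self.last_run = now + delay
        return delay
```

The first class only looks at queue length and the second only looks at elapsed time, so neither interferes with the other's decision. That is the sense in which the two throttles should not conflict.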

Note:

This PR also refactors do_pacer_fetch: the logic that builds the PDF fetch task chain is extracted into a new build_pdf_retrieval_task_chain method, which do_pacer_fetch and the new command both use.
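The shape of that refactor can be sketched in plain Python as follows. None of these names or signatures come from the PR: `download_pdf`, `mark_done`, `build_fetch_steps`, `fetch_one`, and `fetch_many` are invented for illustration, and the real code builds a Celery task chain rather than a list of functions.

```python
from typing import Callable, List

Step = Callable[[dict], dict]


def download_pdf(state: dict) -> dict:
    # Hypothetical step: pretend to download the PDF for the document.
    return {**state, "pdf": f"pdf-for-{state['doc_id']}"}


def mark_done(state: dict) -> dict:
    # Hypothetical step: record that the fetch finished.
    return {**state, "status": "done"}


def build_fetch_steps() -> List[Step]:
    """Single place where the ordered fetch steps are defined."""
    return [download_pdf, mark_done]


def run_steps(steps: List[Step], state: dict) -> dict:
    # Feed each step's output into the next one, like a task chain.
    for step in steps:
        state = step(state)
    return state


def fetch_one(doc_id: int) -> dict:
    # Caller 1: an on-demand fetch of a single document.
    return run_steps(build_fetch_steps(), {"doc_id": doc_id})


def fetch_many(doc_ids: List[int]) -> List[dict]:
    # Caller 2: a bulk command that reuses the same step builder.
    return [run_steps(build_fetch_steps(), {"doc_id": d}) for d in doc_ids]
```

Keeping the step list in one builder means a change to the fetch sequence applies to both callers at once.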


elisa-a-v added 9 commits January 6, 2025 21:41

feat(pacer): add command to fetch docs filtered by page count from PACER

14c48e8

test(pacer): introduce tests for command to fetch docs from PACER

b85778a

refactor(recap): abstract PACER doc fetch chain build from do_pacer_f…

9dca383

…etch - introduces new build_pdf_retrieval_task_chain method - refactors do_pacer_fetch to now use that new method instead

refactor(pacer): bulk fetch command now uses CeleryThrottle

6e4b14b

test(pacer): update tests for new implementation

88df431

fix(pacer): enable rate limiting for task to fetch doc from PACER

2915cb9

test(pacer): test rate limiting

e7e7d14

test(pacer): enhance test by adding subTest for round-robin asserts

00eabb5

Merge branch 'main' into 4839-known-big-docs-retrieval

bd3ce86
