bootstrap.sh improvements #10490
base: main
Conversation
Signed-off-by: Nir Ben-Or <[email protected]>
Codecov Report: All modified and coverable lines are covered by tests ✅
# Start a new process group and detach from terminal
set -m
Revert this, it causes problems with spotless.
@@ -394,7 +398,7 @@ process_manifest() {
 if [[ -f "$file_path" ]]; then
     # Skip validation if file is already imported successfully
-    if grep -q "^$file_path IMPORTED" "$TRACKING_FILE" 2>/dev/null; then
+    if [[ "$(read_tracking_status "$(basename "$file_path")")" == "IMPORTED" ]]; then
read_tracking_status internally calls basename, so the outer basename call can be omitted, I believe.
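The reviewer's point can be sketched as follows. The body of `read_tracking_status` below is an assumed stand-in (the real helper lives in bootstrap.sh), but it illustrates why wrapping the argument in an extra `basename` is redundant:

```shell
# Assumed stand-in for bootstrap.sh's helper: it basenames its own
# argument before looking the file up in $TRACKING_FILE.
read_tracking_status() {
    local name
    name=$(basename "$1")
    awk -v f="$name" '$1 == f {print $2}' "$TRACKING_FILE" 2>/dev/null
}

# Because of the internal basename, both of these return the same status:
#   read_tracking_status "/data/part/record.csv.gz"
#   read_tracking_status "$(basename "/data/part/record.csv.gz")"
```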
current_status=$(read_tracking_status "$file")
if [[ "$current_status" != "IMPORTED" ]]; then
    write_tracking_file "$file" "IMPORTED"
fi
Does it really matter if you check the status before writing? It doesn't hurt to always set the status to IMPORTED like before.
fi
done

# Check the exit status of the count query
If the count query fails 3 times, we should fall back to returning true after logging an error. In theory, the import should have returned failure if any rows were missed, and this count is just a safety check. This change alone would have unblocked the customer.
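A minimal sketch of the suggested fallback, with hypothetical helpers `count_rows` and `log` standing in for the script's real psql invocation and logger:

```shell
# Retry the count query up to 3 times; if it keeps erroring, log and fall
# back to success, since the import itself already reports failures and
# this count is only a safety check.
verify_row_count() {
    local table="$1" expected="$2" attempt actual
    for attempt in 1 2 3; do
        if actual=$(count_rows "$table"); then
            # The query ran: a genuine count mismatch is still a real failure.
            [[ "$actual" == "$expected" ]]
            return
        fi
    done
    log "ERROR: count query failed 3 times for ${table}; assuming import is valid"
    return 0
}
```

With no argument, `return` propagates the status of the preceding `[[ ]]` test, so a successful query with a wrong count still fails verification; only repeated query errors take the logged fallback path.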
actual_count=$(psql -v ON_ERROR_STOP=1 -q -Atc "SELECT COUNT(*) FROM ${table};")
psql_status=$?
else
    actual_count=$(psql -v ON_ERROR_STOP=1 -q -Atc "SELECT COUNT(*) FROM ${table} WHERE consensus_timestamp BETWEEN '$start_ts' AND '$end_ts';")
We could also fall back to a quicker query, like WHERE consensus_timestamp IN ($start_ts, $end_ts), and verify that the first and last rows are present.
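A sketch of that cheaper boundary check. `PSQL_CMD` is an assumed indirection (defaulting to `psql`) so the query runner can be substituted; the connection flags mirror the ones the script already uses:

```shell
PSQL_CMD="${PSQL_CMD:-psql}"

# Cheap fallback: instead of counting every row in the range, verify only
# that rows exist at the first and last consensus_timestamp of the part.
boundary_rows_present() {
    local table="$1" start_ts="$2" end_ts="$3" found
    found=$("$PSQL_CMD" -v ON_ERROR_STOP=1 -q -Atc \
        "SELECT COUNT(DISTINCT consensus_timestamp) FROM ${table} \
         WHERE consensus_timestamp IN ('${start_ts}', '${end_ts}');") || return 1
    # Both boundary timestamps must be present.
    [[ "$found" == "2" ]]
}
```

This runs in index-lookup time rather than scanning the whole range, at the cost of being a weaker guarantee than a full count.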
Description:
- Removed the processing_files associative array, relying on the pids array for job tracking
- Added a validate_special_files function with additional validation logic
- Extended the import_file function with additional error handling and verification steps

Notes for reviewer:
The changes to the script are due to an issue a customer ran into, which I was not able to reproduce: a valid file was imported, but the post-import row count returned an incorrect value, and each retry counted again and incremented by the number of rows it counted, over and over.
I have focused on potential weaknesses I identified that could result in such edge cases, and refactored the code to be more robust and safer, to avoid similar issues in the future.
I have repeat-tested the script using a dataset that included two file parts of each large table, plus all small tables, in the following two scenarios:
All repeat tests in both scenarios finished successfully.
I then started a local mirror importer against the bootstrapped DB; it started cleanly and resumed syncing from September 18th timestamps, which is the date this data export (0.113.2) was taken.
Checklist