Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootstrap.sh improvements #10490

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nirbosl
Copy link
Contributor

@nirbosl nirbosl commented Feb 25, 2025

Description:

  • Simplified process tracking by removing the processing_files associative array and relying on the pids array for job tracking
  • Fixed an issue where all child processes would print out the same PID as theirs, which was in fact the script parent-group-ID; each process now properly prints its own job PID, also resulting in PGAPPNAME reflecting the correct PID in the DB COPY jobs.
  • Extended the validate_special_files function with additional validation logic
    • Fixes an issue of an error thrown for special files validation when re-running the script
  • Improved robustness of file validation checks
  • Enhanced the import_file function with additional error handling and verification steps
  • Improved import reliability
  • Improved code organization
  • Maintained the same overall script structure and core functionality

Notes for reviewer:
The changes to the script's code are due to an issue a customer ran into. I was not able to reproduce the issue they ran into (valid file being imported but post-import row-count returns incorrectly then counts again and increments by the amount of rows it counts over and over).

I have focused on potential weaknesses I've identified that could potentially result in such edge-cases, and have refactored and improved their code to be more robust and safer in order to avoid such issues in the future.

I have repeat-tested the script using a dataset that included two large table file parts of each large table, and all small tables in the following two scnearios:

  • Single run import with no interruptions (good path)
  • Started an import, let it complete several files successfully then stop the script and re-run it (to validate resumption logic and proper skip of already-imported files)

All repeat tests in both scenarios finished successfully.

I have then started a local mirror-importer against the bootstrapped DB and it started cleanly and resumed sync from September 18th timestamps, which is the date this data export was taken on (0.113.2).

Checklist

  • Documented (Code comments)
  • Tested (Actual runs of the script, described above in the Notes section.)

Signed-off-by: Nir Ben-Or <[email protected]>
@nirbosl nirbosl requested a review from a team as a code owner February 25, 2025 18:02
Copy link

codecov bot commented Feb 25, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

@steven-sheehy steven-sheehy added bug Type: Something isn't working database Area: Database enhancement Type: New feature and removed bug Type: Something isn't working labels Feb 25, 2025
@steven-sheehy steven-sheehy added this to the 0.125.0 milestone Feb 25, 2025
Comment on lines +5 to +6
# Start a new process group and detach from terminal
set -m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this, it causes problems with spotless.

@@ -394,7 +398,7 @@ process_manifest() {

if [[ -f "$file_path" ]]; then
# Skip validation if file is already imported successfully
if grep -q "^$file_path IMPORTED" "$TRACKING_FILE" 2>/dev/null; then
if [[ "$(read_tracking_status "$(basename "$file_path")")" == "IMPORTED" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_tracking_status internally calls basename so this can be omitted, I believe.

Comment on lines +497 to +500
current_status=$(read_tracking_status "$file")
if [[ "$current_status" != "IMPORTED" ]]; then
write_tracking_file "$file" "IMPORTED"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really matter if you check the status before write? It doesn't harm it to always set status IMPORTED like before.

fi
done

# Check the exit status of the count query
Copy link
Contributor

@steven-sheehy steven-sheehy Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the count query fails after 3 times we should fallback to it returning true after logging an error. In theory, the import should've returned failure if any rows were missed and this is just a safety check. This change alone would've unblocked the customer.

actual_count=$(psql -v ON_ERROR_STOP=1 -q -Atc "SELECT COUNT(*) FROM ${table};")
psql_status=$?
else
actual_count=$(psql -v ON_ERROR_STOP=1 -q -Atc "SELECT COUNT(*) FROM ${table} WHERE consensus_timestamp BETWEEN '$start_ts' AND '$end_ts';")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also fallback to a quicker query like where consensus_timestamp in ($start_ts, $end_ts) and verify the first and last are present.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database Area: Database enhancement Type: New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants