perf: improve bootstrap performance hash bins #655

Merged

Conversation

kairoaraujo
Member

@kairoaraujo kairoaraujo commented Dec 18, 2024

Performance improvement: Use threads to create all delegated roles

Before : Added 2048 hash bins in 119.36732602119446 seconds
Current: Added 2048 hash bins in 1.8565280437469482 seconds

PyPI (PEP 458) and RubyGems size creation:

Added 16384 hash bins in 15.13554310798645 seconds
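For readers unfamiliar with hash bins, here is a minimal sketch of how a target path could be assigned to one of 2048 (2^11) or 16384 (2^14) bins. It is modeled on the SHA-256 prefix scheme used by TUF succinct hash bin delegations. Whether RSTUF computes bins exactly this way is an assumption, and `bin_count`, `bin_name_for` and the `bins` prefix are hypothetical names used only for illustration.

```python
import hashlib


def bin_count(bit_length: int) -> int:
    """Number of hash bins addressed by `bit_length` bits."""
    return 2 ** bit_length


def bin_name_for(target_path: str, bit_length: int, prefix: str = "bins") -> str:
    """Return the (hypothetical) bin role name responsible for `target_path`."""
    # Take the leading 32 bits of the SHA-256 digest of the path.
    digest = hashlib.sha256(target_path.encode("utf-8")).digest()
    leading = int.from_bytes(digest[:4], "big")
    # Keep only the top `bit_length` bits to select the bin index.
    index = leading >> (32 - bit_length)
    # Zero-pad the hex suffix so every role name has the same width.
    width = len(f"{bin_count(bit_length) - 1:x}")
    return f"{prefix}-{index:0{width}x}"


if __name__ == "__main__":
    for bits in (11, 14):
        print(bits, bin_count(bits), bin_name_for("a/file2.tar.gz", bits))
```

Each bin is a separate delegated role with its own metadata, which is why bootstrap time grows with the number of bins.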

@kairoaraujo kairoaraujo force-pushed the performance_bootstrap branch 2 times, most recently from e6a4551 to ea658e6 Compare December 18, 2024 09:17
Performance improvement: Use threads to create all delegated roles

Before : Added 2048 hash bins in 119.36732602119446 seconds
Current: Added 2048 hash bins in 1.8565280437469482 seconds

PyPI (PEP 458) and RubyGems size creation:

Added 16384 hash bins in 15.13554310798645 seconds

Signed-off-by: Kairo Araujo <[email protected]>
@kairoaraujo kairoaraujo force-pushed the performance_bootstrap branch from ea658e6 to fa8f242 Compare December 18, 2024 09:20

codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 19.04762% with 17 lines in your changes missing coverage. Please review.

Project coverage is 74.41%. Comparing base (714a29d) to head (f42ae94).
Report is 137 commits behind head on main.

Files with missing lines                       Patch %   Lines
repository_service_tuf_worker/repository.py    19.04%    17 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (714a29d) and HEAD (f42ae94).

HEAD has 1 fewer upload than BASE.
Flag    BASE (714a29d)    HEAD (f42ae94)
        2                 1
Additional details and impacted files
@@             Coverage Diff              @@
##              main     #655       +/-   ##
============================================
- Coverage   100.00%   74.41%   -25.59%     
============================================
  Files           15       14        -1     
  Lines         1071     1317      +246     
============================================
- Hits          1071      980       -91     
- Misses           0      337      +337     


Member

@MVrachev MVrachev left a comment

Is this tested thoroughly?

I see that we are changing two objects inside process_delegated_role:

  • targets
  • db_target_roles

Have you considered race conditions?
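To make the concern concrete, here is a minimal sketch of one way to keep two shared objects consistent when several threads update them. It is not the code from this PR: the body of `process_delegated_role` below is a hypothetical stand-in, and treating `targets` as a list and `db_target_roles` as a dict is an assumption made for illustration.

```python
import threading


def process_delegated_role(index, targets, db_target_roles, lock):
    """Hypothetical stand-in: register one delegated role in both shared objects."""
    role_name = f"bins-{index}"
    # Hold the lock so both shared objects are updated together,
    # and no other thread observes one updated without the other.
    with lock:
        targets.append(role_name)
        db_target_roles[role_name] = {"version": 1}


def run(number_of_roles):
    targets = []           # assumed shape: list of role names
    db_target_roles = {}   # assumed shape: role name -> record
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=process_delegated_role,
            args=(i, targets, db_target_roles, lock),
        )
        for i in range(number_of_roles)
    ]
    for thread in threads:
        thread.start()
    # Wait for every thread before anyone reads the shared objects.
    for thread in threads:
        thread.join()
    return targets, db_target_roles
```

The order of entries in `targets` depends on thread scheduling, so any consumer that needs a stable order should sort the result.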

3 review threads on repository_service_tuf_worker/repository.py (all resolved, 1 outdated)
@kairoaraujo
Member Author

Is this tested thoroughly?

I see that we are changing two objects inside process_delegated_role:

  • targets
  • db_target_roles

Have you considered race conditions?

Yes, I have considered it, including the case of multiple Workers.

  1. There is no race condition from the task perspective, because Bootstrap runs as a single task and it is the only caller of this code.
  2. From the multithreading perspective, each thread handles exactly one hash bin, and the code waits for all threads to finish before finalizing the upper-level roles with this information.
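The pattern described in these two points can be sketched roughly as follows. This is an illustration rather than the PR's code: `create_bin_role` and `create_all_bins` are hypothetical names, the role contents are placeholders, and the use of `concurrent.futures.ThreadPoolExecutor` is an assumption about how the threads might be managed.

```python
from concurrent.futures import ThreadPoolExecutor


def create_bin_role(index: int) -> tuple[str, dict]:
    """Hypothetical per-bin work: build one hash bin role and return it."""
    name = f"bins-{index:03x}"
    return name, {"version": 1, "targets": {}}


def create_all_bins(number_of_bins: int, max_workers: int = 8) -> dict:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each call handles exactly one bin and returns its own result,
        # so worker threads never write to shared state.
        results = list(pool.map(create_bin_role, range(number_of_bins)))
    # Leaving the with-block guarantees all workers have finished.
    bins = dict(results)
    # Only now is the upper-level role assembled, in the calling thread.
    return {"delegations": sorted(bins), "bins": bins}


if __name__ == "__main__":
    upper = create_all_bins(2048)
    print(len(upper["bins"]), upper["delegations"][0])
```

Returning results from the workers and merging them in one place avoids the need for locks, at the cost of holding all results in memory until the merge.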

Here is the functional test (FT) run with two workers:

Bootstrap

Please enter password to encrypted private key 'JimiHendrix': 
Signed metadata with key 'JimiHendrix'
Metadata is fully signed.
Saved result to 'ceremony-payload.json'
Bootstrap status: ACCEPTED (cd79db2c443344c4ba25097f548ec6fa)
Bootstrap status:  STARTED
Bootstrap status:  SUCCESS

FT

=================================================================================================== test session starts ===================================================================================================
platform linux -- Python 3.12.7, pytest-8.0.2, pluggy-1.5.0 -- /usr/local/bin/python3.12
cachedir: .pytest_cache
metadata: {'Python': '3.12.7', 'Platform': 'Linux-6.10.14-linuxkit-aarch64-with-glibc2.36', 'Packages': {'pytest': '8.0.2', 'pluggy': '1.5.0'}, 'Plugins': {'bdd-html': '0.1.14a0', 'metadata': '3.1.1', 'html': '4.1.1', 'split': '0.10.0', 'bdd': '8.1.0'}}
rootdir: /rstuf-runner/rstuf-umbrella
configfile: pytest.ini
plugins: bdd-html-0.1.14a0, metadata-3.1.1, html-4.1.1, split-0.10.0, bdd-8.1.0
collected 17 items                                                                                                                                                                                                        

tests/functional/artifacts/test_add_artifacts.py::test_adding_an_artifact_using_rstuf_api[630-"716f6e863f744b9ac22c97ec2f677b76ea5f5908bc5bc61510bfc4751384ea7a"-{"key": "value"}-"file1.tar.gz"] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Adding an artifact using RSTUF API
        When the API requester adds a new artifact with 630, "716f6e863f744b9ac22c97ec2f677b76ea5f5908bc5bc61510bfc4751384ea7a", {"key": "value"} and "file1.tar.gz"
        Then the API requester should get status code '202' with 'task_id'
        Then the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        Then the user downloads the new artifact "file1.tar.gz" using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_add_artifacts.py::test_adding_an_artifact_using_rstuf_api[2024-"93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc853eaa0ac8fa3b46f6e863f74"-None-"a/file2.tar.gz"] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Adding an artifact using RSTUF API
        When the API requester adds a new artifact with 2024, "93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc853eaa0ac8fa3b46f6e863f74", None and "a/file2.tar.gz"
        Then the API requester should get status code '202' with 'task_id'
        Then the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        Then the user downloads the new artifact "a/file2.tar.gz" using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_add_artifacts.py::test_adding_an_artifact_using_rstuf_api[532-"d9f34f8cd5cb3b3eb79b3e4b5dae3a16df499a70d8a053eaa0ac8fa5f5908bc5"-{"key": "value"}-"a/b/file3.tar.gz"] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Adding an artifact using RSTUF API
        When the API requester adds a new artifact with 532, "d9f34f8cd5cb3b3eb79b3e4b5dae3a16df499a70d8a053eaa0ac8fa5f5908bc5", {"key": "value"} and "a/b/file3.tar.gz"
        Then the API requester should get status code '202' with 'task_id'
        Then the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        Then the user downloads the new artifact "a/b/file3.tar.gz" using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_performance.py::test_api_requester_multiple_request_and_artifacts[2-2-20] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Performance and Consistence adding and removing artifacts
    Scenario Outline: Multiple requests with multiple artifacts and timeout threshold
        Given the API requester sends 2 requests with 2 artifacts to RSTUF
        When the API requester expects task 'SUCCESS' and status as 'True' before 20 seconds
        Then the downloader using TUF client expects artifacts available in the Metadata Repository
    PASSED


tests/functional/artifacts/test_performance.py::test_api_requester_multiple_request_and_artifacts[2-100-60] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Performance and Consistence adding and removing artifacts
    Scenario Outline: Multiple requests with multiple artifacts and timeout threshold
        Given the API requester sends 2 requests with 100 artifacts to RSTUF
        When the API requester expects task 'SUCCESS' and status as 'True' before 60 seconds
        Then the downloader using TUF client expects artifacts available in the Metadata Repository
    PASSED


tests/functional/artifacts/test_performance.py::test_api_requester_multiple_request_and_artifacts[5-10-60] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Performance and Consistence adding and removing artifacts
    Scenario Outline: Multiple requests with multiple artifacts and timeout threshold
        Given the API requester sends 5 requests with 10 artifacts to RSTUF
        When the API requester expects task 'SUCCESS' and status as 'True' before 60 seconds
        Then the downloader using TUF client expects artifacts available in the Metadata Repository
    PASSED


tests/functional/artifacts/test_performance.py::test_api_requester_multiple_request_and_artifacts[100-2-350] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Performance and Consistence adding and removing artifacts
    Scenario Outline: Multiple requests with multiple artifacts and timeout threshold
        Given the API requester sends 100 requests with 2 artifacts to RSTUF
        When the API requester expects task 'SUCCESS' and status as 'True' before 350 seconds
        Then the downloader using TUF client expects artifacts available in the Metadata Repository
    PASSED


tests/functional/artifacts/test_performance.py::test_api_requester_multiple_request_and_artifacts[50-50-600] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Performance and Consistence adding and removing artifacts
    Scenario Outline: Multiple requests with multiple artifacts and timeout threshold
        Given the API requester sends 50 requests with 50 artifacts to RSTUF
        When the API requester expects task 'SUCCESS' and status as 'True' before 600 seconds
        Then the downloader using TUF client expects artifacts available in the Metadata Repository
    PASSED


tests/functional/artifacts/test_remove_artifacts.py::test_removing_an_artifact_using_rstuf_api[["file1.tar.gz"]] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Removing artifacts using RSTUF api
        Given there are artifacts ["file1.tar.gz"] available for download using TUF client from the metadata repository
        When the API requester deletes all of the following artifacts ["file1.tar.gz"]
        Then the API requester should get status code '202' with 'task_id'
        And the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        And all of the artifacts ["file1.tar.gz"] should not be available for download using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_remove_artifacts.py::test_removing_an_artifact_using_rstuf_api[["file1.tar.gz", "a/file2.tar.gz"]] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Removing artifacts using RSTUF api
        Given there are artifacts ["file1.tar.gz", "a/file2.tar.gz"] available for download using TUF client from the metadata repository
        When the API requester deletes all of the following artifacts ["file1.tar.gz", "a/file2.tar.gz"]
        Then the API requester should get status code '202' with 'task_id'
        And the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        And all of the artifacts ["file1.tar.gz", "a/file2.tar.gz"] should not be available for download using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_remove_artifacts.py::test_removing_an_artifact_using_rstuf_api[["file1.tar.gz", "a/file2.tar.gz", "c/d/file3.tar.gz"]] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Removing artifacts using RSTUF api
        Given there are artifacts ["file1.tar.gz", "a/file2.tar.gz", "c/d/file3.tar.gz"] available for download using TUF client from the metadata repository
        When the API requester deletes all of the following artifacts ["file1.tar.gz", "a/file2.tar.gz", "c/d/file3.tar.gz"]
        Then the API requester should get status code '202' with 'task_id'
        And the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        And all of the artifacts ["file1.tar.gz", "a/file2.tar.gz", "c/d/file3.tar.gz"] should not be available for download using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_remove_artifacts.py::test_removing_artifacts_that_does_exist_and_ignoring_the_rest[["file1.tar.gz"]-["foo"]] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Removing artifacts that does exist and ignoring the rest
        Given there are artifacts ["file1.tar.gz"] available for download using TUF client from the metadata repository
        When the API requester tries to delete all of the following artifacts ["file1.tar.gz"] and ["foo"]
        Then the API requester should get status code '202' with 'task_id'
        And the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        And the API requester should get a lists of deleted artifacts containing ["file1.tar.gz"] and of not found artifacts containing ["foo"]
        And all of the artifacts ["file1.tar.gz"] should not be available for download using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_remove_artifacts.py::test_removing_artifacts_that_does_exist_and_ignoring_the_rest[[]-["foo", "bar"]] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Removing artifacts that does exist and ignoring the rest
        Given there are artifacts [] available for download using TUF client from the metadata repository
        When the API requester tries to delete all of the following artifacts [] and ["foo", "bar"]
        Then the API requester should get status code '202' with 'task_id'
        And the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        And the API requester should get a lists of deleted artifacts containing [] and of not found artifacts containing ["foo", "bar"]
        And all of the artifacts [] should not be available for download using TUF client from the metadata repository
    PASSED


tests/functional/artifacts/test_remove_artifacts.py::test_removing_artifacts_that_does_exist_and_ignoring_the_rest[["file1.tar.gz", "a/file2.tar.gz"]-["foo", "bar"]] <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Adding artifacts in Repository Service for TUF (RSTUF)
    Scenario Outline: Removing artifacts that does exist and ignoring the rest
        Given there are artifacts ["file1.tar.gz", "a/file2.tar.gz"] available for download using TUF client from the metadata repository
        When the API requester tries to delete all of the following artifacts ["file1.tar.gz", "a/file2.tar.gz"] and ["foo", "bar"]
        Then the API requester should get status code '202' with 'task_id'
        And the API requester gets from endpoint 'GET /api/v1/task' status 'Task finished' within 90 seconds
        And the API requester should get a lists of deleted artifacts containing ["file1.tar.gz", "a/file2.tar.gz"] and of not found artifacts containing ["foo", "bar"]
        And all of the artifacts ["file1.tar.gz", "a/file2.tar.gz"] should not be available for download using TUF client from the metadata repository
    PASSED


tests/functional/bootstrap/test_bootstrap.py::test_bootstrap_using_rstuf_command_line_interface_cli <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Bootstrap Repository Service for TUF (RSTUF)
    Scenario: Bootstrap using RSTUF Command Line Interface (CLI)
        Given the repository-service-tuf (rstuf) is installed
        When the admin run rstuf for ceremony bootstrap
        Then the admin gets "Bootstrap status: SUCCESS" and "Bootstrap finished." or "System already has a Metadata"
    PASSED


tests/functional/bootstrap/test_bootstrap.py::test_bootstrap_using_rstuf_cli_with_invalid_payload <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
Feature: Bootstrap Repository Service for TUF (RSTUF)
    Scenario: Bootstrap using RSTUF Command Line Interface (CLI) with invalid payload
        Given the repository-service-tuf (rstuf) is installed
        When the admin run rstuf for ceremony bootstrap with invalid payload JSON
        Then the admin gets "Error 422" or "System LOCKED for bootstrap."
    PASSED


tests/functional/metadata/test_update.py::test_sign_root_metadata_updated <- ../../usr/local/lib/python3.12/site-packages/pytest_bdd/scenario.py 
------------------------------------------------------------------------------------------------------ live log call ------------------------------------------------------------------------------------------------------
2024-12-18 11:24:03 [    INFO] Adding artifacts (test_update.py:26)
2024-12-18 11:24:03 [    INFO] Added task_id: cd749d742b374ab690d3587eaffc8a0e (test_update.py:49)
2024-12-18 11:24:05 [    INFO] Adding artifacts (test_update.py:26)
2024-12-18 11:24:05 [    INFO] Added task_id: c697e7023e5f421990ea45931c6969d9 (test_update.py:49)
2024-12-18 11:24:05 [    INFO] [METADATA UPDATE] Submiting Root Metadata Update (test_update.py:94)
2024-12-18 11:24:05 [    INFO] [METADATA UPDATE] Metadata Updated by 8d3969a01c49446eba5b0674d84f79b4 (test_update.py:120)
2024-12-18 11:24:05 [    INFO] [METADATA UPDATE] Task state: {'task_id': '8d3969a01c49446eba5b0674d84f79b4', 'state': 'STARTED', 'result': {}} (test_update.py:137)
2024-12-18 11:24:06 [    INFO] [METADATA UPDATE] Task state: {'task_id': '8d3969a01c49446eba5b0674d84f79b4', 'state': 'STARTED', 'result': {}} (test_update.py:137)
2024-12-18 11:24:06 [    INFO] [METADATA UPDATE] Task state: {'task_id': '8d3969a01c49446eba5b0674d84f79b4', 'state': 'SUCCESS', 'result': {'message': 'Metadata Update Processed', 'status': True, 'task': 'metadata_update', 'last_update': '2024-12-18T11:24:06.412866Z', 'details': {'role': 'root'}}} (test_update.py:137)
2024-12-18 11:24:06 [    INFO] [METADATA UPDATE] Root Full signed (test_update.py:143)
2024-12-18 11:24:06 [    INFO] [METADATA UPDATE] Signing Metadata if available (test_update.py:176)
2024-12-18 11:24:06 [    INFO] Stop adding artifacts. Total requests: 2 (test_update.py:53)
2024-12-18 11:24:06 [    INFO] [METADATA UPDATE] Metadata Update available (2.root.json) (test_update.py:202)
2024-12-18 11:24:12 [    INFO] Task 1/2 finshed! (test_update.py:225)
2024-12-18 11:24:14 [    INFO] Task 2/2 finshed! (test_update.py:225)
2024-12-18 11:24:14 [    INFO] Verifying test/dazzling_tesla-0.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/infallible_babbage-1.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/quirky_ritchie-2.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/angry_meitner-3.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/pensive_pike-4.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/sweet_chandrasekhar-5.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/relaxed_lumiere-6.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/stoic_pascal-7.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/strange_darwin-8.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/fervent_chebyshev-9.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/jovial_johnson-0.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/gifted_moser-1.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/thirsty_moser-2.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/ecstatic_johnson-3.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/strange_poincare-4.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/dazzling_lewin-5.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/blissful_carson-6.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/pensive_benz-7.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/vibrant_solomon-8.tar.gz (test_update.py:232)
2024-12-18 11:24:14 [    INFO] Verifying test/stoic_murdock-9.tar.gz (test_update.py:232)

Feature: Metadata Update
    Scenario: Metadata Update and Signing
        Given RSTUF is running and operational
        Then the RSTUF is receiving multiple requests
        When the RSTUF Admin User sends a metadata update
        Then the API requester should get status code '202' with 'task_id'
        Then the Admin User runs the CLI to sign the metadata
        Then the '2.root.json' will be available in the TUF Metadata
        Then the user downloads will not have inconsistency during this process
    PASSED

@kairoaraujo kairoaraujo requested a review from MVrachev December 18, 2024 11:28
@kairoaraujo kairoaraujo merged commit 7f4513a into repository-service-tuf:main Dec 18, 2024
29 of 31 checks passed