chore: script that discovers minimum tested versions #9418

Closed
wants to merge 11 commits into from

Collaborator

@emmettbutler emmettbutler commented May 28, 2024

This script discovers the minimum version of every package referenced in riotfile.py's Venv tree (excluding those envs that do not execute a pytest command) and writes that information to a CSV file. It also takes into account the installation dependencies in pyproject.toml.

For use in #9323
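
The discovery step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script from this PR: the `Venv` dataclass is a hypothetical stand-in for riot's class (only `command`, `pkgs`, and `venvs` are modeled), inheritance between parent and child environments is simplified, the CSV column names are invented, and the pyproject.toml dependencies are left out.

```python
import csv
import io
import re
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Venv:
    """Hypothetical stand-in for a riot Venv node (not the real riot class)."""

    command: Optional[str] = None
    pkgs: Dict[str, List[str]] = field(default_factory=dict)
    venvs: List["Venv"] = field(default_factory=list)


def iter_tested_pkgs(
    venv: Venv,
    command: Optional[str] = None,
    pkgs: Optional[Dict[str, List[str]]] = None,
) -> Iterator[Tuple[str, str]]:
    """Yield (package, spec) pairs for leaf envs whose command runs pytest."""
    command = venv.command or command  # a child command overrides the parent's
    merged = dict(pkgs or {})
    merged.update(venv.pkgs)  # child packages override inherited ones
    if venv.venvs:
        for child in venv.venvs:
            yield from iter_tested_pkgs(child, command, merged)
    elif command and "pytest" in command:
        for name, specs in merged.items():
            for spec in specs:
                yield name, spec


def version_key(spec: str) -> Tuple[int, ...]:
    """Naive ordering key: every run of digits in the spec, as integers."""
    return tuple(int(part) for part in re.findall(r"\d+", spec))


def min_specs(root: Venv) -> Dict[str, str]:
    """Keep, per package, the spec whose numeric key is lowest."""
    lowest: Dict[str, str] = {}
    for name, spec in iter_tested_pkgs(root):
        if not spec:  # assumed to mean "latest"; no lower bound to record
            continue
        if name not in lowest or version_key(spec) < version_key(lowest[name]):
            lowest[name] = spec
    return lowest


def write_csv(lowest: Dict[str, str], out) -> None:
    """Write one row per package; the header names are hypothetical."""
    writer = csv.writer(out)
    writer.writerow(["pkg_name", "min_version"])
    for name in sorted(lowest):
        writer.writerow([name, lowest[name]])


if __name__ == "__main__":
    tree = Venv(
        venvs=[
            Venv(command="pytest tests/a", pkgs={"flask": ["~=1.0", "~=2.0"]}),
            Venv(command="black .", pkgs={"black": ["==23.1"]}),
        ]
    )
    buffer = io.StringIO()
    write_csv(min_specs(tree), buffer)
    print(buffer.getvalue())
```

Environments whose command does not mention pytest (the `black` one here) contribute nothing to the output, which is the filtering rule the description mentions.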

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@emmettbutler emmettbutler added the changelog/no-changelog A changelog entry is not required for this PR. label May 28, 2024
@emmettbutler emmettbutler requested a review from a team as a code owner May 28, 2024 21:09
Member

@brettlangdon brettlangdon left a comment

do we need to do this without riot?

are we able to use riot to generate an artifact during build time? e.g. the Build deploy job could generate sdist, wheels, and some other artifacts like supported library versions

then that can be packaged with the other artifacts into the OCI artifact.

(we could even generate the result as Python and have it just embedded into the sitecustomize.py file if we don't want to read something from disk)

@datadog-dd-trace-py-rkomorn

Datadog Report

Branch report: emmett.butler/min_versions
Commit report: 5ade18a
Test service: dd-trace-py

✅ 0 Failed, 140838 Passed, 36443 Skipped, 7h 50m 47.26s Total duration (2h 36m 3.2s time saved)
❄️ 1 New Flaky

New Flaky Tests (1)

  • test_otel_trace_across_fork - test_context.py - Last Failure

     At request <Request GET /test/session/snapshot >:
        At snapshot (token='tests.opentelemetry.test_context.test_otel_trace_across_fork'):
         - Directory: /snapshots
         - CI mode: 1
         - Trace File: /snapshots/tests.opentelemetry.test_context.test_otel_trace_across_fork.json
         - Stats File: /snapshots/tests.opentelemetry.test_context.test_otel_trace_across_fork_tracestats.json
         At compare of 1 expected trace(s) to 1 received trace(s):
          At trace 'internal' (2 spans):
     Received fewer spans (1) than expected (2). Expected unmatched spans: 'internal'
    

@emmettbutler emmettbutler requested a review from brettlangdon May 29, 2024 19:10
@emmettbutler
Collaborator Author

@brettlangdon I realized a reason using import riotfile isn't ideal: it would require the script to live in the same directory as the riotfile, because the riotfile does not live inside a Python package with an __init__.py file.

I think this change is stable now.

@emmettbutler emmettbutler requested a review from brettlangdon May 29, 2024 19:28
@brettlangdon
Member

> @brettlangdon I realized a reason using import riotfile isn't ideal: it would require the script to live in the same directory as the riotfile, because the riotfile does not exist inside of a python module with an __init__ function.
>
> I think this change is stable now.

it just means the script needs to be executed from the root directory no?

we could also just add the root directory to the python path either via shell script or directly in python.
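
Adding the root directory to the Python path directly in Python, as suggested above, might look like the sketch below. The helper name `add_repo_root_to_path` and the assumption that the script sits one directory below the repository root (for example in `scripts/`) are illustrative only.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file: str, levels_up: int = 1) -> Path:
    """Prepend an ancestor directory of the script to sys.path.

    parents[0] is the script's own directory, parents[1] is one level above
    it, and so on. levels_up=1 assumes a layout like <root>/scripts/<script>.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str in sys.path:
        sys.path.remove(root_str)
    sys.path.insert(0, root_str)
    return root


# Intended use at the top of the script (not executed here):
#   add_repo_root_to_path(__file__)
#   import riotfile
```

With this in place the script no longer depends on the current working directory when it imports `riotfile`.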

🤷🏻 not trying to argue for one specific way vs the other

@emmettbutler
Collaborator Author

@brettlangdon yup, you're right. Updated.

Member

@brettlangdon brettlangdon left a comment

overall riot parsing looks good, other than I have a question about how to convert specifiers into actual minimum versions

@emmettbutler
Collaborator Author

@brettlangdon I've adjusted the script to avoid stripping out the specifier information (<, >=, etc) from the version string that it writes to the csv file. Because the determination of "minimum" is done based on naive gt/lt comparison there are probably some edge cases where the wrong thing gets written to the file, but I think that in the majority of cases this approach will be solid enough for use in the single-step guardrail logic.
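
As an illustration of the kind of edge case mentioned above, the sketch below keeps the specifier attached to each version string and picks a "minimum" two ways. It assumes one possible meaning of "naive" (comparing the version text as strings), which may not match what the script in this PR does; `split_spec`, `numeric_key`, and `pick_min` are hypothetical names.

```python
import re
from typing import Callable, List, Tuple


def split_spec(spec: str) -> Tuple[str, str]:
    """Split '>=1.2.3' into ('>=', '1.2.3'); a bare version has operator ''."""
    match = re.match(r"^\s*(~=|==|!=|<=|>=|<|>)?\s*(.+?)\s*$", spec)
    if match is None:
        raise ValueError(f"empty or unparseable spec: {spec!r}")
    return match.group(1) or "", match.group(2)


def numeric_key(version: str) -> Tuple[int, ...]:
    """Order versions by their integer components rather than as text."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def pick_min(specs: List[str], key: Callable) -> str:
    """Return the full spec (operator included) with the smallest version."""
    return min(specs, key=lambda spec: key(split_spec(spec)[1]))


specs = [">=1.9", "~=1.10"]
by_text = pick_min(specs, key=str)             # "1.10" sorts before "1.9" as text
by_number = pick_min(specs, key=numeric_key)   # (1, 9) sorts before (1, 10)
```

Here the string comparison selects `~=1.10` while the numeric comparison selects `>=1.9`, so the two approaches disagree as soon as a component reaches two digits.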

@emmettbutler emmettbutler requested a review from brettlangdon May 31, 2024 15:08
@brettlangdon
Member

> I think that in the majority of cases this approach will be solid enough for use in the single-step guardrail logic.

@emmettbutler what are you thinking, just parsing the specifiers in SSI and then comparing actual versions against them?

if yes, we need to keep in mind that packaging might not be present on the system.

@emmettbutler
Collaborator Author

emmettbutler commented May 31, 2024

@brettlangdon pretty much, yes. It seems to me that including the range markers in the minimum versions file is the best we can do without going to PyPI during that file's creation. I'm trying to avoid having the script go to PyPI because I suspect the large overhead it would add is unnecessary to achieve the guardrails goal.

I think we can compare version specifiers to the degree necessary without the packaging module. That comparison really just boils down to whether or not the minimum version specifier includes a less-than sign. See my other change for an illustration of what I mean.
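
A comparison along these lines that avoids the packaging module could be sketched as below. This is one reading of the idea, not the code from the other change: a recorded spec containing a less-than sign is treated as an upper bound only (so no minimum is enforced), and anything else is treated as a floor. The names `release_tuple` and `meets_minimum` are hypothetical, and pre-release suffixes such as `rc1` are ignored.

```python
import re
from typing import Tuple


def release_tuple(text: str) -> Tuple[int, ...]:
    """Extract the first dotted numeric run: '>=2.31.0rc1' -> (2, 31, 0)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, min_spec: str) -> bool:
    """Return True if `installed` is not below the floor implied by `min_spec`."""
    if "<" in min_spec:
        # Assumption: an upper-bound spec records no lower bound to enforce.
        return True
    have, need = release_tuple(installed), release_tuple(min_spec)
    # Pad with zeros so that '2.20' and '2.20.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

Only the standard library is used, so the check does not depend on packaging being present on the system.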

@emmettbutler emmettbutler enabled auto-merge (squash) June 4, 2024 18:12
@emmettbutler
Collaborator Author

@erikayasuda @ZStriker19 @brettlangdon let me know if you'd like any changes to this approach

@pr-commenter

pr-commenter bot commented Jun 5, 2024

Benchmarks

Benchmark execution time: 2024-06-07 14:16:41

Comparing candidate commit ade7bec in PR branch emmett.butler/min_versions with baseline commit c035b91 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 221 metrics, 9 unstable metrics.

Member

@brettlangdon brettlangdon left a comment

lgtm, but I am worried about including our test dependencies, and it isn't clear to me how to identify those from our riotfile.py 🤔

@emmettbutler emmettbutler requested a review from brettlangdon June 7, 2024 12:10
ignore riot environments that do not run pytest. this seems like a reasonable proxy for ignoring packages that are required only for tests
@emmettbutler emmettbutler requested a review from brettlangdon June 7, 2024 13:42
emmettbutler added a commit that referenced this pull request Jun 11, 2024
This pull request adds "guardrails" to the "library injection" process.
These are early exit conditions from the instrumentation process
intended to avoid sending any traces when undefined behavior is likely.
The code makes this determination on the basis of software versions
present in the application environment, both of Python packages and the
Python runtime itself.

The biggest risk here is that instrumentation is disabled when it's not
intended to be. I think existing tests in `tests/lib-injection` cover
this pretty well. There's a new test added that verifies instrumentation
was cancelled when an unsupported package version is present.
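
As a rough illustration of such an early exit condition (a sketch only, not the code in this change): the function names, the `(3, 7)` runtime floor, and the `force` flag are all hypothetical, and how installed versions are gathered is left outside the sketch.

```python
import re
from typing import Dict, List, Sequence, Tuple


def to_tuple(version: str) -> Tuple[int, ...]:
    """Naive numeric key for a version string: '2.10.1' -> (2, 10, 1)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def below(installed: str, minimum: str) -> bool:
    """True if `installed` sorts strictly below `minimum` after zero padding."""
    have, need = to_tuple(installed), to_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have < need


def unsupported_packages(
    installed: Dict[str, str], minimums: Dict[str, str]
) -> List[str]:
    """Names of installed packages that are older than their recorded minimum."""
    return sorted(
        name
        for name, version in installed.items()
        if name in minimums and below(version, minimums[name])
    )


def should_inject(
    python_version: Sequence[int],
    installed: Dict[str, str],
    minimums: Dict[str, str],
    min_python: Tuple[int, int] = (3, 7),  # hypothetical floor
    force: bool = False,  # hypothetical override switch
) -> bool:
    """Decide whether instrumentation should proceed."""
    if force:
        return True
    if tuple(python_version[:2]) < min_python:
        return False  # unsupported runtime: exit early
    return not unsupported_packages(installed, minimums)
```

Packages that do not appear in `minimums` are ignored here, so an unknown package never blocks instrumentation in this sketch.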

Contains changes from #9418
Related RFC: "[RFC] One Step Guardrails"

## Checklist

- [x] minimum package version checks
- [x] Testing
- [x] replace envvars with inject_force
- [x] figure out what to use instead of pkg_resources
- [x] replace local file path with `DD_TELEMETRY_FORWARDER_PATH`
- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included
in the PR
- [x] Risks are described (performance impact, potential for breakage,
maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/))
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
- [x] If this PR changes the public interface, I've notified
`@DataDog/apm-tees`.

## Reviewer Checklist

- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications
of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Emmett Butler <[email protected]>
Co-authored-by: Emmett Butler <[email protected]>
github-actions bot pushed a commit that referenced this pull request Jun 11, 2024
(cherry picked from commit 0c38e09)
auto-merge was automatically disabled June 11, 2024 14:13

Pull request was closed

@emmettbutler emmettbutler deleted the emmett.butler/min_versions branch June 11, 2024 14:13
emmettbutler pushed a commit that referenced this pull request Jun 11, 2024
…10] (#9512)

Backport 0c38e09 from #9323 to 2.10.

Co-authored-by: Zachary Groves <[email protected]>
emmettbutler pushed a commit that referenced this pull request Sep 9, 2024
(cherry picked from commit 0c38e09)
emmettbutler added a commit that referenced this pull request Sep 9, 2024
…9] (#10563)

Co-authored-by: Zachary Groves <[email protected]>
Co-authored-by: Taegyun Kim <[email protected]>