Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade typeguard, and add test that would have caught typeguard-related regression #2241

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

mmacpherson
Copy link

@mmacpherson mmacpherson commented Jan 31, 2025

This refers to issue #2235.

The proximal issue is that you can't use a JSONType parameter via the Runner API with recent python versions.

That raises an error from the vendored-in typeguard library, which performs runtime type checks. That error arises in turn from updates to cpython, that made a previously-positional argument required to be specified as a keyword. Typeguard has since been patched to address this issue.

This PR aims to update the typeguard dependency, and to add a test that would have caught the issue when it first arose.

Adding the test is the more challenging problem. The current test runner invokes the API tests like so:

top_level_dict, run_level_dict = construct_arg_dicts_from_click_api()
runner = Runner(
                "test_flow.py", show_output=False, env=env, **top_level_dict
            )
result = runner.run(**run_level_dict)

To test an additional parameter via the path that has failed here, one would need somehow to get it into the run_level_dict.

I'm proposing here to augment the run_level_dict parameters via a RUNTIME_PARAMETERS environment variable that may be set in a test.

Running via:

 PYTHONPATH=`pwd`/../../ python run_tests.py --debug --contexts dev-local --graphs single-linear-step --tests BasicParameterTest

After setting RUNTIME_PARAMETERS now yields a test failure; the same "recursive_guard" issue under discussion.

With that in place, I patched typeguard, and that resolves the test failure.

I also ran PYTHONPATH=pwd/../../ python run_tests.py --debug --contexts dev-local --tests BasicParameterTest, and those passed as well. I might be wrong, but I'm expecting that the GH Actions would be relied upon to test other configurations.

They may well be a cleaner way to achieve test coverage of this code path, and I'm happy to modify/iterate if so. Hopefully this at least helps identify the key issues.

@mmacpherson
Copy link
Author

Refactored this, adding a RuntimeParameters test aot modifying BasicParameters, and testing more parameter types there.

One weakness here is that these tests would all pass if the runtime parameters weren't applied. I thought a little about how to prove that they are being applied. For example, if you could detect from the test context whether we were running via the cli executor vs the api executor, then you could set a parameter differently in PARAMETERS vs RUNTIME_PARAMETERS and test that you've gotten the expected one. But I didn't find a good solution.

@romain-intel
Copy link
Contributor

Thanks for the contribution!

I haven't looked in detail yet but a few high level comments:

  • to update any vendored software (including typeguard), please update using the vendor.py script included (and change the file here: https://github.com/Netflix/metaflow/blob/master/metaflow/_vendor/vendor_any.txt to point to the version you are vendoring)
  • for injecting the parameter into run_level_dict; I think a better approach may be to try to do json.loads on the parameters coming in and passing them as python object. Currently we do test for JSONType (ish) but they are all passed in as strings (iirc).

@mmacpherson
Copy link
Author

mmacpherson commented Feb 1, 2025

Thanks kindly @romain-intel.

Pushed a couple commits:

  • Uses the standard vendor path to upgrade typeguard (and typing_extensions, on which it depends).
  • Undid the RUNTIME_PARAMETERS path, and tried what I think you meant.
  • The way that I've done it only applies to JSONType parameters atm. If you want to test that other parameter types also work as intended when passed in to Runner.run() as python objects, we might refactor? Perhaps there's already a metaflow function that would unpack the default (string) values as encoded in MetaflowTest.PARAMETERS into the corresponding python type, that could be used?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants