Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor run system API #173

Merged
merged 27 commits into from
Dec 23, 2022
Merged

refactor run system API #173

merged 27 commits into from
Dec 23, 2022

Conversation

Rykath
Copy link
Collaborator

@Rykath Rykath commented Oct 17, 2022

Goals

simplify run components by:

  • reducing inter-dependencies
  • simpler & cleaner API access
  • easier unit testing
  • better documentation: available options and default values are documented with the API

Major Changes

some run components now have different options
  • Runner
    • run_dirwork_dir (location of run directories, logs and temporary files) - reduce confusion with individual run directories
    • ... todo
the run configuration uses a different structure
run:
    runner:
    interface:
    worker:
    debug: will override "runner.debug" & "worker.debug" when set to "true"

To keep some backwards compatibility the pre and post subsections are also allowed. They are passed to the Worker and should be directly specified there:

run:
    pre: template
    post: numpytxt

is allowed but is better written as:

run:
    worker:
        class: command
        pre: template
        post: numpytxt

The command shorthand will be expanded to a command-worker

run:
    command: ./simulation.py

is expanded to:

run:
    worker:
        class: command
        command: ./simulation.py

Finally the following options are set from other sections of the config:

  • runner.work_dirbase_config.run_dir
  • interface.sizebase_config.ntrain
  • interface.input_configbase_config.input (generated by variables)
  • interface.output_configbase_config.output (generated by variables)

Minor Changes

  • Worker will no longer read the config file, environment variables are used instead
  • if the interface fails, backups are no longer checked for
  • the default Worker is now the CommandWorker
  • custom Workers no longer require Pre- and Postprocessors
  • new Component abstract base class makes registering custom components easier
  • added unit tests for all run components - increases coverage to >82% (except slurm)
  • new searchtype option for simple Active Learning - workaround for Active Learning: search space becomes too large #175

Tasks

  • Worker follows the same component structure as the others: abstract base class
  • Worker does not always create Pre- and Postprocessor
    • the default Worker is the CommandWorker which requires a Pre- and Postprocessor
    • other (custom) Workers work without Pre- and Postprocessor (this means they also create no run directory)
  • use new Component abstract base class for all run components
  • use arguments instead of config
    • options in init are the same as the config options
    • interfacing with Config classes
    • defaults are now found in the class definitions again
  • unit tests
    • Runner- and Worker-Interface
    • Template-Preprocessor and Postprocessors
    • Runner
      • test Slurm-Runner (example & unit test) → coverage of slurm.py: 71% (37/129)
    • Command-Worker
  • update documentation
    • added the two example notebooks to the documentation
  • API example (notebook)
  • ensure compatibility with python 3.7 & 3.8
    • positional only arguments were added in 3.8 - PEP 570
    • the union operator for dictionaries was added in 3.9 - PEP 584
    • defining properties on classmethods was added in 3.9, but removed in 3.11 again
  • add searchtype: halton for Active Learning to allow more dimensions
  • activate black

Ideas

  • split main into different functions, or better yet: create a Study class
  • automatically test all examples
  • the integration of the new run system with the Config is now suboptimal. As the defaults are now processed by the components themselves, the config is not filled anymore.
    • however I see no alternative if the components should be accessible with an easy API (without creating a full study/config first)
    • reinsert the processed config into the config class after the components are instantiated? any better ideas?
    • do we even need/want the config class to be filled with all options?

Rykath added 7 commits July 17, 2022 09:53
Component base class & replace config with arguments
for Runner Worker, Pre-, Postprocessor & Interface
rename Interface to RunnerInterface
Worker follows the same scheme now - abstract base class
default Worker is the CommandWorker with Pre- & Postprocessor - other
Workers no longer require them
format with black

ToDo: config structure, read arguments from config, pass config to
 Worker
add test_interface_resize
refactor using fixtures
parametrize
@Rykath Rykath added enhancement New feature or request project labels Oct 17, 2022
@Rykath Rykath self-assigned this Oct 17, 2022
fix redmod-team#165 by moving pytest config from setup.cfg to pyproject.toml
add dev dependencies
remove pytest alias
test all Postprocessors
test the TemplatePreprocessor

fix redmod-team#154: make TemplatePreprocessor methods to classmethods where possible
 this allows someone to create a subclass of the Template Preprocessor
 and override just some methods
Component.wrap (Pre-, Postprocessor & Worker) now uses functools.wraps
 sets various metadata, e.g. the function name
* test CommandWorker
* with MockWorkerInterface, MockPreprocessor and MockPostprocessor
* set ZeroMQ Interface as default
* a subclass of a Component without an explicit label is no longer
registered
only new Components get autogenerated labels if none are supplied
ToDo: test SlurmRunner

ToDo: integration tests
integrate with config
update integration tests
change github workflow to use the *dev* requirements
 - @classmethod & @Property
 - union of dictionaries
 - positional only arguments
   only needed in __init_subclass__ -> no actual change
@coveralls
Copy link
Collaborator

coveralls commented Oct 31, 2022

Pull Request Test Coverage Report for Build 3663821493

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 1221 of 1933 (63.17%) changed or added relevant lines in 36 files are covered.
  • 345 unchanged lines in 27 files lost coverage.
  • Overall coverage increased (+1.2%) to 68.147%

Changes Missing Coverage Covered Lines Changed/Added Lines %
profit/al/active_learning.py 13 14 92.86%
profit/sur/encoders.py 32 33 96.97%
profit/util/file_handler.py 21 22 95.45%
profit/init.py 0 2 0.0%
profit/run/worker.py 54 56 96.43%
profit/sur/linreg/chaospy_linreg.py 22 24 91.67%
profit/util/component.py 26 28 92.86%
profit/util/util.py 7 10 70.0%
profit/sur/gp/backend/python_kernels.py 9 13 69.23%
profit/run/interface.py 49 54 90.74%
Files with Coverage Reduction New Missed Lines %
profit/run/runner.py 1 86.96%
profit/run/slurm.py 1 15.38%
profit/init.py 2 77.78%
profit/run/worker.py 2 94.19%
profit/util/file_handler.py 2 93.88%
profit/al/simple_al.py 3 83.15%
profit/util/base_class.py 3 84.21%
profit/al/active_learning.py 4 91.53%
profit/sur/gp/backend/python_kernels.py 4 57.14%
profit/sur/gp/gaussian_process.py 4 94.2%
Totals Coverage Status
Change from base Build 2707154646: 1.2%
Covered Lines: 2854
Relevant Lines: 4188

💛 - Coveralls

@redmod-team redmod-team deleted a comment from coveralls Oct 31, 2022
@Rykath Rykath marked this pull request as ready for review November 2, 2022 17:06
@Rykath Rykath requested a review from krystophny November 2, 2022 17:06
@Rykath Rykath linked an issue Nov 13, 2022 that may be closed by this pull request
searchtype `grid` is the current default behaviour of creating a
searchspace using a meshgrid over all input variables ->
`nsearch^ndim` points. This gets too expensive with a moderate number
of input dimensions.
searchtype `halton` is a new option which distributes a fixed number
`nsearch` of samples pseudo-randomly within the searchspace
Copy link
Collaborator

@krystophny krystophny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small changes follow

Copy link
Collaborator

@krystophny krystophny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved before Black

@krystophny krystophny merged commit f6acf68 into redmod-team:master Dec 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request project
Projects
None yet
3 participants