Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to arrays #762

Open
wants to merge 19 commits into
base: develop
Choose a base branch
from
Open

Switch to arrays #762

wants to merge 19 commits into from

Conversation

rcannood
Copy link
Member

Describe your changes

Related issue(s)

Closes #705

Type of Change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New functionality (non-breaking change which adds functionality)
  • Major change (non-breaking change which modifies existing functionality)
  • Minor change (non-breaking change which does not modify existing functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • This change requires a documentation update

Checklist

Requirements:

  • I have read the CONTRIBUTING doc.
  • I have performed a self-review of my code by checking the "Changed Files" tab.
  • My code follows the code style of this project.

Tests:

  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.

Documentation:

  • Proposed changes are described in the CHANGELOG.md.
  • I have updated the documentation accordingly.

Test Environment

@rcannood
Copy link
Member Author

rcannood commented Aug 29, 2024

Status update:

What needs to happen in this PR:

  • BashWrapper
    • Passed argument values get stored in a bash array instead of a string-delimited variable
    • While doing the argument parsing, we need to process quoted strings and any escaped values. This also means that an unquoted UNDEFINED needs to be stored in a way that allows us to know later on that the argument was unset and not that the user actually provided a quoted "UNDEFINED" value instead. Same for UNDEFINED_ITEM. I would be ok with just storing something like @@VIASH_UNDEFINED@@ and @@VIASH_UNDEFINED_ITEM@@.
    • Argument values are translated into a yaml to pass to the script (partially done)
    • Script and yaml are written to a temporary directory. Should this be stored as a file inside the overall bash environment, or should we do it inside the engine (bash or docker)? → It would be nice to store the script and the args in the current VIASH_TEMP dir, so we can still figure out what was run.
    • See Proposed flow of parsing in BashWrapper
  • ExecutableRunner:
    • Ensure that any time we generate code that looks at the VIASH_PAR, that this is able to handle arrays
    • Also handle values set to @@VIASH_UNDEFINED@@ or values set to @@VIASH_UNDEFINED_ITEM@@
  • NextflowRunner: Needs to be updated to also create a yaml from the inputs it receives, and store the script in the same tempdir since we don't need to do any script injection
  • Script: All scripting languages need to be updated such that they don't generate dictionaries from the environment variables, but instead are able to parse YAML using the base functions in that scripting language. While we're at it, I would create an io.viash.language or something like that and add all of the supported scripting languages there.

Proposed flow of parsing in BashWrapper:

  • Preparse: do nothing. Make sure the arrays don't already exist.
  • During parse: -- this logic was originally written in scala but is now a Bash function ViashParseArgumentValue
    • If the argument is multiple false and the environment variable already exists, throw an error
    • If the argument is multiple true and the environment variable is not an array and equals @@VIASH_UNDEFINED@@, throw an error
    • If the argument is multiple true and the environment variable doesn't exist yet, create an array
    • If the passed value is equal to UNDEFINED, set the environment variable to @@VIASH_UNDEFINED@@. (Can we later still check if the variable is an array or not?)
    • If the passed value is equal to UNDEFINED_ITEM, change it into @@VIASH_UNDEFINED_ITEM@@
    • Resolve quotes and escaped characters. Split by multiple sep if the multiple sep is not quoted or escaped.
    • If the argument is multiple true, add the value to the array, else set the environment variable to the value
  • Postparse:
    • If the variable equals @@VIASH_UNDEFINED@@, unset the variable.
    • Else, if the variable is unset and the argument has a default, set the variable to the default values. Ensure that defaults are stored as an array.
  • Prerun:
    • Convert argument values into a yaml, which contains a {par: {...}, meta: {...}, dependencies: {...}}.
    • Ensure that the format of the yaml is as follows:
      par:
        arg1: "value1"
        arg2: "value2"
        arg_multiple:
          - "foo"
          - "bar"
          - null # an UNDEFINED_ITEM was passed
        arg_with_default: null # an UNDEFINED was passed
      meta:
        # etc...
    • ↑ This should allow for easy processing with awk. Notes:
      • That is, each argument gets its own row
      • If the argument is multiple true, each item also gets its own row
      • Strings are always escaped
      • @@VIASH_UNDEFINED@@ is translated into null
      • @@VIASH_UNDEFINED_ITEM@@ in [..., null, ...].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant