Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expression fails with extra commas #525

Open
mathause opened this issue Sep 18, 2024 · 6 comments
Open

Expression fails with extra commas #525

mathause opened this issue Sep 18, 2024 · 6 comments

Comments

@mathause
Copy link
Member

mathause commented Sep 18, 2024

I am writing unit tests for Expression and it fails if there are more commas than parameters, as it splits on them. Example:

from mesmer.mesmer_x import Expression

mesmer.mesmer_x.Expression("norm(loc=c1, scale=np.mean([c2, c3])", "name")

The problems happens here:

sub_expressions = str.split(tmp_expression, ",")
self.parameters_expressions = {}
for sub_exp in sub_expressions:
param, sub = str.split(sub_exp, "=")

ValueError                                Traceback (most recent call last)
Cell In[38], line 1
----> 1 mesmer.mesmer_x.Expression("norm(loc=c1, scale=np.mean([c2, c3])", "name")

File ~/code/mesmer/mesmer/mesmer_x/train_utils_mesmerx.py:94, in Expression.__init__(self, expr, expr_name)
     92 # identify parameters
     93 self.find_parameters_list()
---> 94 self.find_expr_parameters()
     96 # identify coefficients
     97 self.find_coefficients()

File ~/code/mesmer/mesmer/mesmer_x/train_utils_mesmerx.py:134, in Expression.find_expr_parameters(self)
    132 self.parameters_expressions = {}
    133 for sub_exp in sub_expressions:
--> 134     param, sub = str.split(sub_exp, "=")
    135     if param in self.parameters_list:
    136         self.parameters_expressions[param] = sub

ValueError: not enough values to unpack (expected 2, got 1)
@mathause
Copy link
Member Author

I hit comment to early. cc @yquilcaille - I am not sure we should try to fix this but think about if there is a different approach we can take. Such a parsing is inherently difficult...

@yquilcaille
Copy link
Collaborator

My naive opinion would be simply to change the character for the parsing, and use something that would not be used normally in python. Maybe $? But i see the issue (#527), see whether it is a better fix.

@mathause
Copy link
Member Author

I would like to avoid inventing our own language 😉 Instead we could

  1. split on = (although that may fail on functions which require params, e.g. axis=0)
  2. use regex but that would be fiendishly difficult (so 👎 from my side)
  3. require each param to have it's own keyword argument, e.g.:
    Expression("norm", loc="c1", scale="np.mean([c2, c3])", name="name")
    • but would have to use **kwargs if we keep the generic Expression (but again we are in the same situation today: the user needs to know the required params for the distribution)

@mathause
Copy link
Member Author

mathause commented Oct 1, 2024

Two Three more issues:

  1. coefficients can only have one digit:
    expr = mesmer.mesmer_x.Expression("norm(loc=c1, scale=c2 + c10 * __T__)", "name")
    expr.coefficients_list
    # ['c1', 'c2']
  2. covariates cannot contain "c + digit"
    expr = mesmer.mesmer_x.Expression("norm(loc=c1, scale=c2 * __Tc3__)", "name")
    expr.coefficients_list
    # ['c1', 'c2', 'c3']
  3. Wrongly delineated covariates don't raise:
    expr = mesmer.mesmer_x.Expression("norm(loc=c1, scale=c2 * __TZ__C__)", "name")
    expr.inputs_list
    # [np.str_('TZ'), np.str_('')]

@veni-vidi-vici-dormivi
Copy link
Collaborator

Found one more:

Expression cannot contain rational numbers:

expr = mesmer.mesmer_x.Expression("norm(loc=c1 * __pred__, scale = 0.1", "name")
# ValueError: Unknown function '.' in expression '0.1' for 'scale'.Do you need to prepend it with 'np.' (or 'math.')? Currently only numpy and math functions are supported.

I think it is because we use . to recognize functions like np.min. However, ints work nicely.

@mathause
Copy link
Member Author

That error message seems off... But anyway

Separate expressions for each param (idea 3 from above) would not help with the rational numbers 😒

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants