
BREAKING: Change expression types to DynamicExpressions.Expression (from DynamicExpressions.Node) #326

Merged
merged 209 commits into master on Oct 6, 2024

Conversation

MilesCranmer
Owner

@MilesCranmer MilesCranmer commented Jun 24, 2024

These new experimental Expression types store both the operators and the variable names inside the object itself, rather than the plain Node type, which only stores the tree structure and indices into the operator enum.
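
For reference, a rough sketch of the difference, loosely following the DynamicExpressions.jl README (the exact constructor keywords may differ between versions):

using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, -, *, /], unary_operators=[cos, exp])
variable_names = ["x1", "x2"]

# A plain `Node` tree only stores features, constants, and indices into the operator enum:
x1 = Node{Float64}(; feature=1)
x2 = Node{Float64}(; feature=2)
tree = x1 * cos(x2 - 3.2)

# An `Expression` bundles the tree together with the operators and variable names,
# so it can be evaluated and printed on its own:
ex = Expression(tree; operators, variable_names)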

This also adds ParametricExpression, for learning basis expressions whose constants vary depending on a per-row class:

using SymbolicRegression
using Random: MersenneTwister
using Zygote
using MLJBase: machine, fit!, predict

rng = MersenneTwister(0)
X = NamedTuple{(:x1, :x2, :x3, :x4, :x5)}(ntuple(_ -> randn(rng, Float32, 30), Val(5)))
X = (; X..., classes=rand(rng, 1:2, 30))
p1 = rand(rng, Float32, 2)
p2 = rand(rng, Float32, 2)

y = [
    2 * cos(X.x4[i] + p1[X.classes[i]]) + X.x1[i]^2 - p2[X.classes[i]] for
    i in eachindex(X.classes)
]

model = SRRegressor(;
    niterations=10,
    binary_operators=[+, *, /, -],
    unary_operators=[cos, exp],
    populations=10,
    expression_type=ParametricExpression,  # Subtype of `AbstractExpression`
    expression_options=(; max_parameters=2),
    autodiff_backend=:Zygote,
    parallelism=:multithreading,
)

mach = machine(model, X, y)
fit!(mach)
ypred = predict(mach, X)

so it basically learns $y = 2 \cos(x_4 + \alpha) + x_1^2 - \beta$ for parameters $\alpha$ and $\beta$, which can differ according to the classes column (here there are two classes, i.e., two types of behavior).

This ParametricExpression is just one implementation of AbstractExpression, but you can see how much more customizable things are now.
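
For example, after fitting, the discovered expressions can be inspected through MLJ's report; the field names below (`equations`, `equation_strings`, `best_idx`) follow the usual SRRegressor report layout, so check `report(mach)` if they differ:

using MLJBase: report

r = report(mach)
best = r.equations[r.best_idx]            # a `ParametricExpression` object
println(r.equation_strings[r.best_idx])   # human-readable form of the best expression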

Fixes #340. Fixes #337. Fixes #336.


TODO:

  • Allow passing a class feature to MLJ, which will receive special treatment.
  • Debug why some of the tests seem to get stuck and take 3x longer to finish than normal.
  • Consider documenting this, or just leaving it as an experimental undocumented feature until it stabilizes.
  • Add Enzyme backend.
  • Add example to docs.
  • Consider moving to Literate.jl for docs?
  • Fix ResourceMonitor weirdness

@atharvas

I was encountering some issues with constraint parsing in Options.jl; see the comment. I'm not sure why the test cases don't catch the issue.

@MilesCranmer
Owner Author

Going to punt StructuredExpression until later. @eelregit let me know if you are at all interested in this! StructuredExpression would let you evolve sub-expressions within a fixed functional form (a rough sketch of the idea is below). It seems there are a couple of missing methods needed for it to work, but hopefully it won't take too much effort. I'll have to pause on this side of things for now.
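
To illustrate the idea (purely hypothetical: StructuredExpression is not wired up in this PR, and the constructor keywords below, such as `structure`, are assumptions rather than a confirmed API), a fixed functional form where only the sub-expressions are evolved might look like:

using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, -, *, /], unary_operators=[cos, exp])
variable_names = ["x1", "x2", "x3"]

# Two evolvable sub-expressions...
f = parse_expression(:(x1 * x2); operators, variable_names)
g = parse_expression(:(x3 - 1.5); operators, variable_names)

# ...combined through a fixed structure y = f + cos(g):
my_structure(nt) = nt.f + cos(nt.g)
ex = StructuredExpression((; f, g); structure=my_structure, operators, variable_names)

On the SymbolicRegression.jl side, this would presumably be selected via `expression_type=StructuredExpression` plus appropriate `expression_options`, once the missing methods are in place.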

@MilesCranmer
Owner Author

It seems like garbage collection is going crazy in the tests, which is why they are so slow. The reason 1.6 and 1.8 are much faster is, I think, that DispatchDoctor.jl is turned off on those versions. So something about DispatchDoctor.jl is causing the GC to overwork itself... Possibly related to MilesCranmer/DispatchDoctor.jl#57 and MilesCranmer/DispatchDoctor.jl#58?

@MilesCranmer
Owner Author

MilesCranmer commented Oct 6, 2024

Fixed the performance regression in the unit tests with SymbolicML/DynamicExpressions.jl@74c8dc1.

Edit: it still seems to hang around a bit. Judging from the PProf outputs, it's definitely something to do with DispatchDoctor, so it won't affect actual runtime performance, just the testing. Probably fine to merge for now.

@MilesCranmer MilesCranmer merged commit 749cc34 into master Oct 6, 2024
21 checks passed