Stochastic Custom Loss Function #738
-
Hi all and @MilesCranmer, thank you again for your work on this package! With limited compute available, I am trying to improve the efficiency of my custom loss function. I'm wondering whether the following would work with PySR: if the loss is really low and an equation enters the HoF, but the next evaluation returns a high loss (due to the random feature selection), would PySR know to remove that equation from the HoF?

jl.seval("""
using Statistics, DataFrames
const common_bounds = -8:0.5:9  # standardised data with a wide spread; -8 to 9 represent SDs
const feature_length = length(common_bounds)
const num_features = 6
X_fixed = Matrix{Float32}(undef, num_features, feature_length)
""")
elementwise_loss = """
function loss_function(tree::Node, dataset::Dataset{T,L}, options::Options, idx) where {T,L}
# Extract data for the given indices
X = idx === nothing ? dataset.X : view(dataset.X, :, idx)
y = idx === nothing ? dataset.y : view(dataset.y, idx)
weights = idx === nothing ? dataset.weights : view(dataset.weights, idx)
prediction, complete = eval_tree_array(tree, X, options)
if !complete
return L(Inf)
end
penalty = 0
featureCounter = rand(1:5)
for i in 1:5
if i != featureCounter
X_fixed[i, :] .= rand(common_bounds) # Randomly initialise for all features except the one of interest
end
end
X_fixed[6, :] .= rand(0:1) #boolean variable
# Replace the feature of interest with the fixed common bounds
X_fixed[featureCounter, :] .= common_bounds # Vary only the selected feature
s_values, completeSub = eval_tree_array(tree, X_fixed, options)
s_diff = diff(s_values)
if !(all(s_diff .>= 0) || all(s_diff .<= 0))
penalty += 0.5 # Add penalty for non-monotonicity if there is a sign change
end
mse = mean((prediction .- y).^2)
return mse + penalty |
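For context, I'd pass the objective string to PySR roughly like this (the batch size and other settings here are just placeholders):

from pysr import PySRRegressor

model = PySRRegressor(
    loss_function=loss_function,  # the Julia objective string defined above
    batching=True,                # evaluate on random batches for speed
    batch_size=50,                # placeholder value
)
model.fit(X, y)  # X, y: my training data (not shown here)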
-
So PySR expects the loss function for an expression to be deterministic, both because of caching and because of the absolute ordering in the hall of fame. Therefore, if you have randomness, you could either use a fixed seed in the loss (and maybe average the loss over a few different evaluations), or perhaps re-run the search several times (with a warm start), introducing fresh randomness on each run?
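For the fixed-seed option, something along these lines might work as a minimal sketch (it assumes `using Random` is added to your jl.seval block, and n_draws = 4 is an arbitrary choice for the averaging):

seeded_loss = """
function loss_function(tree::Node, dataset::Dataset{T,L}, options::Options, idx) where {T,L}
    X = idx === nothing ? dataset.X : view(dataset.X, :, idx)
    y = idx === nothing ? dataset.y : view(dataset.y, idx)

    prediction, complete = eval_tree_array(tree, X, options)
    if !complete
        return L(Inf)
    end

    # A fixed seed means every call draws the same feature configurations,
    # so the loss is deterministic and safe to cache / rank in the HoF.
    rng = Random.MersenneTwister(0)
    n_draws = 4  # average the penalty over a few random configurations
    penalty = zero(L)

    for _ in 1:n_draws
        featureCounter = rand(rng, 1:5)
        for i in 1:5
            if i != featureCounter
                X_fixed[i, :] .= rand(rng, common_bounds)
            end
        end
        X_fixed[6, :] .= rand(rng, 0:1)
        X_fixed[featureCounter, :] .= common_bounds

        s_values, completeSub = eval_tree_array(tree, X_fixed, options)
        if !completeSub
            return L(Inf)
        end
        s_diff = diff(s_values)
        if !(all(s_diff .>= 0) || all(s_diff .<= 0))
            penalty += L(0.5) / n_draws
        end
    end

    mse = mean((prediction .- y) .^ 2)
    return mse + penalty
end
"""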
-
Hi @MilesCranmer, I think I have a working compromise. When batching=True, the loss function is nondeterministic for the batched calls, but when idx === nothing (at the end of each iteration, when the whole dataset is assessed) it is deterministic again. So the loss function can combine a much faster stochastic check when idx is a batch of indices with a deterministic check when idx === nothing, before the HoF is updated. If you wanted to take advantage of this without batching, could you set batching=True and batch_size=len(y)?
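Here is a minimal sketch of what I mean by the compromise, building on the code above (in the deterministic branch I hold the other features at 0, i.e. their mean on standardised data, which is an arbitrary choice):

hybrid_loss = """
function loss_function(tree::Node, dataset::Dataset{T,L}, options::Options, idx) where {T,L}
    X = idx === nothing ? dataset.X : view(dataset.X, :, idx)
    y = idx === nothing ? dataset.y : view(dataset.y, idx)

    prediction, complete = eval_tree_array(tree, X, options)
    if !complete
        return L(Inf)
    end
    mse = mean((prediction .- y) .^ 2)
    penalty = zero(L)

    if idx === nothing
        # Full-data pass (end of iteration / HoF update): deterministic check.
        # Sweep every feature in turn, holding the others at 0.
        for featureCounter in 1:5
            X_fixed .= zero(Float32)
            X_fixed[featureCounter, :] .= common_bounds
            s_values, completeSub = eval_tree_array(tree, X_fixed, options)
            if !completeSub
                return L(Inf)
            end
            s_diff = diff(s_values)
            if !(all(s_diff .>= 0) || all(s_diff .<= 0))
                penalty += L(0.5)
            end
        end
    else
        # Batched pass: cheap stochastic check on one randomly chosen feature.
        featureCounter = rand(1:5)
        for i in 1:5
            if i != featureCounter
                X_fixed[i, :] .= rand(common_bounds)
            end
        end
        X_fixed[6, :] .= rand(0:1)
        X_fixed[featureCounter, :] .= common_bounds
        s_values, completeSub = eval_tree_array(tree, X_fixed, options)
        if !completeSub
            return L(Inf)
        end
        s_diff = diff(s_values)
        if !(all(s_diff .>= 0) || all(s_diff .<= 0))
            penalty += L(0.5)
        end
    end

    return mse + penalty
end
"""

The idea is that the cheap stochastic branch only steers the batched search, while the deterministic full-data branch is what gets cached and used for the HoF ordering.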