Allow overriding simulators in .sample()
#312
Comments
Agreed that this would be a valuable feature. What do you think about the naming of the argument, @paul-buerkner? I think we could even allow passing it directly, e.g. `simulator.sample(1000, n=50)`.

On the other hand, your usage @Kucharssim could be dangerous if the value of `n` is used inside the context function, e.g.:

```python
def context():
    n = 10
    c = np.zeros(n)
    return dict(n=n, c=c)
```

since we cannot overwrite `n` inside the function. It might also complicate batching.
Yes, that is a good point. Though I think we could just let the user be responsible for this (and be very clear that we can intercept the values only in between each simulator fn, but not inside).
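A plain-Python sketch of what "not inside" means here (purely illustrative, no BayesFlow involved): an override applied to the returned dictionary cannot repair values that were derived from `n` inside the function body.

```python
import numpy as np

def context():
    n = 10
    c = np.zeros(n)       # c is derived from n *inside* the function
    return dict(n=n, c=c)

out = context()
out["n"] = 50             # overriding the returned value after the fact...
print(out["n"])           # 50
print(out["c"].shape)     # (10,)  ...but c was already built with n = 10
```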
Well, actually this feature is already available, because kwargs are passed into the simulators during sampling. One just has to explicitly handle the case when the input is added manually:

```python
import numpy as np
import bayesflow as bf

def context(batch_size, n=None):
    if n is None:
        n = np.random.randint(10, 100)
    return dict(n=n)

def prior():
    mu = np.random.normal(0, 1)
    return dict(mu=mu)

def likelihood(mu, n):
    y = np.random.normal(mu, 1, n)
    return dict(y=y)

simulator = bf.make_simulator([prior, likelihood], meta_fn=context)
data = simulator.sample(100, n=31415)
print(data["n"])          # 31415
print(data["y"].shape)    # (100, 31415)
```

Now I am not sure how that plays with batched input, but I think for the majority of use cases this is sufficient already.
Cool! Which potential issue regarding batching do you see?
@paul-buerkner When users pass a value like this, it lands in the already-batched context of the sampling call, so its shape does not necessarily match the batched simulator output. We could broadcast any outside input against the simulator output, but this has to be clearly communicated somehow.
I don't think we should auto-broadcast for now, because then the example from above with `n` being fixed wouldn't work. So as I understand it, right now we need (a) that users handle the `is None` case themselves and (b) that they take care of any broadcasting manually.

To make this feature more convenient (no `is None` handling, and auto-broadcasting), we would need to build a variable graph, such that the simulator knows which variable is output and input by which other simulators, how they were broadcast, which are meta, etc. This may be something for the future of simulators (tagging @daniel-habermann just FYI), but I think it is not something to tackle right now. What do you think?
@paul-buerkner I think you misunderstand, there is no significant issue with broadcasting here, as we already know the target shape. Since the batched simulator output is available, we could do something like:

```python
if "n" in output:
    output["n"] = broadcast(input["n"], output["n"].shape)
```

Of course, this pseudo-code only works for Tensor output, so we would skip this for other output types, but it highlights the idea.
Okay, so you are saying that this would already work with minimal changes? If so, perhaps @Kucharssim would like to work on that?
Say you have a setup like this:
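For concreteness, a minimal sketch of such a setup, assuming the same context/prior/likelihood structure used in the example in the comments above (the variable names are illustrative):

```python
import numpy as np
import bayesflow as bf

def context(batch_size):
    # number of observations per dataset, drawn at random
    n = np.random.randint(10, 100)
    return dict(n=n)

def prior():
    mu = np.random.normal(0, 1)
    return dict(mu=mu)

def likelihood(mu, n):
    y = np.random.normal(mu, 1, n)
    return dict(y=y)

simulator = bf.make_simulator([prior, likelihood], meta_fn=context)
```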
Sometimes (e.g., during validation), it would be handy to generate a batch of data where some variable does not follow the simulator specification, e.g., generate data with a fixed context:
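For example, an override of the kind being requested might look like the following hypothetical call (the comments above show that a variant of this already works when the context function accepts the keyword explicitly):

```python
# desired: a validation batch with the context fixed to n = 50,
# overriding whatever context() would otherwise draw
validation_data = simulator.sample(500, n=50)
print(validation_data["y"].shape)  # would ideally be (500, 50) under the setup above
```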
Of course, one could make a separate simulator to do this, but that is at times a bit cumbersome and leads to code duplication. Looking at the code, it should be relatively straightforward to implement this; I am curious to hear whether there are any downsides, @stefanradev93, @LarsKue, @paul-buerkner?