Hi everyone,

I’d like to open a discussion on whether and how to integrate active learning into NVIDIA Modulus as a more “standard” component. Currently, Modulus offers powerful PDE solvers and neural operator frameworks (like the FNO implementations), but does not include a built-in mechanism for iterative data acquisition informed by model uncertainty (a.k.a. active learning).
Background
Use Case: In many real-world PDE settings (e.g., Darcy Flow or other high-dimensional systems), running full-fidelity simulations can be costly. Active learning (AL) helps by identifying where the model is most uncertain and focusing additional simulation efforts on those inputs, reducing overall cost.
Lack of Built-In Support: Although users can manually script an AL loop (train → estimate uncertainty → pick new samples → retrain), there’s no off-the-shelf feature or integrated example in Modulus that demonstrates this workflow.
Community Interest: Given the push towards more data-efficient PDE surrogate modeling, I suspect other users might benefit from having a streamlined active learning example or an optional interface in Modulus.
Possible Approaches
Add a Single Example
For instance, an active learning variant of the Darcy Flow FNO tutorial.
Could demonstrate how to measure uncertainty (e.g., MC-Dropout, ensembles) and choose which points to simulate next.
General Toolkit
A more general design to handle selection criteria (uncertainty-based or diversity-based), batch simulation scheduling, iterative retraining, etc.
Possibly a new module or library that interacts seamlessly with Modulus’s PDE data pipelines.
Lightweight Integration
Provide an “active learning loop” script/class that wraps around existing Modulus workflows.
Keep it minimal—just the logic of:
1. Evaluate model
2. Sort by uncertainty
3. Simulate top-K
4. Update training set
5. Retrain
Example (Pseudo-Code)
(Note: Purely illustrative and not tested.)
```python
for round_idx in range(num_rounds):
    # 1. Generate or sample candidate inputs
    candidate_inputs = sample_candidate_permeability_fields()

    # 2. Estimate uncertainty (MC-Dropout or ensembles)
    uncertainties = []
    for cand in candidate_inputs:
        mean_pred, var_pred = estimate_uncertainty(model, cand)
        uncertainties.append(var_pred.mean().item())

    # 3. Pick top uncertain points
    selected = pick_top_k(candidate_inputs, uncertainties, k=samples_per_round)

    # 4. Run PDE solver for new data
    new_data = run_pde_solutions(selected)

    # 5. Add to dataset, retrain
    train_data += new_data
    train_model(model, train_data)
```
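For concreteness, the acquisition step (`pick_top_k` in the pseudo-code) could be as simple as the following pure-Python sketch. The name and signature are my own placeholders, not an existing Modulus API:

```python
def pick_top_k(candidates, uncertainties, k):
    """Return the k candidates with the highest uncertainty scores.

    Hypothetical helper matching the pseudo-code above:
    `candidates` and `uncertainties` are parallel sequences.
    """
    # Rank candidate indices by descending uncertainty
    ranked = sorted(range(len(candidates)),
                    key=lambda i: uncertainties[i],
                    reverse=True)
    return [candidates[i] for i in ranked[:k]]
```

In a real loop one might also mix in a diversity criterion here, so that the top-K picks are not all clustered in the same region of input space.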
Extensibility for Different UQ Methods
One idea is to define a generic UncertaintyEstimator interface that can work with any neural operator (FNO, AFNO, etc.) and can be swapped out for different UQ approaches (e.g., MC-Dropout, ensembles, or a Bayesian library). This keeps the AL loop itself (the “Orchestrator”) relatively unchanged:
MC-Dropout: Insert dropout layers, run multiple forward passes in train() mode.
Ensembles: Keep multiple model instances, measure variance across them.
Bayesian: Potentially integrate external libraries like fortuna, as long as they expose a .forward(...) or similar.
By making the AL orchestrator agnostic to how uncertainty is computed, Modulus could offer a flexible path for advanced users to plug in new approaches with minimal friction.
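As a rough sketch of what such an interface could look like (all names here are hypothetical, not existing Modulus APIs), here is a minimal `UncertaintyEstimator` base class with an ensemble-based implementation; models are represented as plain callables to keep the example self-contained:

```python
from abc import ABC, abstractmethod
from statistics import fmean, pvariance


class UncertaintyEstimator(ABC):
    """Hypothetical interface: map one candidate input to (mean, variance)."""

    @abstractmethod
    def estimate(self, x):
        """Return (mean_prediction, predictive_variance) for input x."""


class EnsembleEstimator(UncertaintyEstimator):
    """Variance across independently trained models (here: plain callables)."""

    def __init__(self, models):
        self.models = models

    def estimate(self, x):
        preds = [m(x) for m in self.models]
        return fmean(preds), pvariance(preds)
```

An MC-Dropout estimator would implement the same `estimate` method by running T stochastic forward passes of a single model with dropout active (`model.train()` in PyTorch) and computing the variance across passes; the AL orchestrator would not need to change.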
Discussion Points
Scope: Should AL be showcased as a single example (e.g., Darcy Flow + MC-Dropout) or should it be a more general feature with an extensible interface?
UQ Methods: MC-Dropout is straightforward to add in PyTorch; ensembles might also be feasible if resources permit. Do we want a flexible interface to accommodate more advanced or custom methods?
Integration: Should this live in the main codebase (e.g., a new modulus.active_learning submodule) or start as a separate examples/active_learning folder?
Community Interest: Are PDE simulation costs a primary concern for most Modulus users, or is the typical usage more about smaller-scale PDEs?
I’d love to hear thoughts from the Modulus team and the community regarding:
Would a built-in or officially supported active learning feature benefit enough users?
If so, which approach (lightweight example vs. deeper integration) seems most appropriate?
Are there any known plans or partial implementations already in the works?
Thanks in advance for your insights!
Best,
Y Georg Maerz
(Again, the pseudo-code above is for demonstration only and is not tested. I’m happy to iterate or help contribute if there’s interest in making this more official.)