Stabilizing Training with Small Batch Sizes using Exponential Moving Average (EMA) #217

Franklalalala · 2024-11-07T09:23:02Z

Background

Small batch sizes are commonly used when training Hamiltonian-prediction models, as seen in the universal DeepH. However, with so few samples per step the gradient estimates are noisy, which can lead to large, erratic updates of the model parameters.

Specific observation (DeePTB on the QH9 dataset with batch size = 1):

Describe the solution you'd like

Address the existing TODO regarding EMA implementation to ensure consistent training behavior across related projects.
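
For reference, the general idea of EMA can be sketched in a few lines. This is only an illustration under assumptions, not the DeePTB or QHNet implementation: the class name `ParamEMA`, the decay value of 0.9, and the toy scalar parameter are all hypothetical, and plain Python floats stand in for model tensors. A shadow copy of each parameter is updated as `shadow = decay * shadow + (1 - decay) * param` after every optimizer step.

```python
class ParamEMA:
    """Hypothetical helper: keeps an EMA (shadow copy) of named scalar parameters."""

    def __init__(self, params, decay=0.99):
        if not 0.0 <= decay < 1.0:
            raise ValueError("decay must be in [0, 1)")
        self.decay = decay
        # Initialise the shadow values from the current parameters.
        self.shadow = dict(params)

    def update(self, params):
        # shadow <- decay * shadow + (1 - decay) * param
        d = self.decay
        for name, value in params.items():
            self.shadow[name] = d * self.shadow[name] + (1.0 - d) * value


# Toy stand-in for noisy batch-size-1 training: the raw parameter
# jumps between +1 and -1 on every step (illustrative values only).
params = {"w": 0.0}
ema = ParamEMA(params, decay=0.9)

raw_history, ema_history = [], []
for step in range(200):
    params["w"] = 1.0 if step % 2 == 0 else -1.0  # simulated erratic update
    ema.update(params)                             # call after each step
    raw_history.append(params["w"])
    ema_history.append(ema.shadow["w"])
```

In this toy run the raw parameter swings across the full range while the shadow value stays close to its mean. In an actual training loop the same update would typically be applied to every parameter tensor after `optimizer.step()`, with the shadow weights used for validation and checkpointing; how that should be wired into the existing trainer is left open here.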

Additional Context

This approach is already integrated into QHNet as a default setting.
