EDM: E(3) Equivariant Diffusion Model for Molecule Generation in 3D.

Official code release for the paper Equivariant Diffusion for Molecule Generation in 3D.

To set up an RDKit environment, it may be easiest to install conda and run:

conda create -c conda-forge -n my-rdkit-env rdkit

Then install the other required packages from there. The code should still run without RDKit installed, though.
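
A common way to keep a dependency such as RDKit optional is to guard the import. The minimal sketch below illustrates that general pattern only; it is not taken from this repository, and the names `HAS_RDKIT` and `require_rdkit` are hypothetical.

```python
# Sketch of an optional-dependency guard (hypothetical names, not the repo's code).
try:
    from rdkit import Chem  # optional dependency
    HAS_RDKIT = True
except ImportError:
    Chem = None
    HAS_RDKIT = False


def require_rdkit(feature: str) -> bool:
    """Return True if RDKit is importable; otherwise print a notice and return False."""
    if not HAS_RDKIT:
        print(f"RDKit not found: skipping {feature}.")
    return HAS_RDKIT


if __name__ == "__main__":
    # The feature label is only an illustrative string.
    require_rdkit("RDKit-based metrics")
```

Code paths that need RDKit can then check `require_rdkit(...)` first and fall back gracefully when it returns False.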

Training the EDM:

python main_qm9.py --n_epochs 3000 --exp_name edm_qm9 --n_stability_samples 1000 --diffusion_noise_schedule polynomial_2 --diffusion_noise_precision 1e-5 --diffusion_steps 1000 --diffusion_loss_type l2 --batch_size 64 --nf 256 --n_layers 9 --lr 1e-4 --normalize_factors [1,4,10] --test_epochs 20 --ema_decay 0.9999
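
The flags `--diffusion_noise_schedule polynomial_2` and `--diffusion_noise_precision 1e-5` select the noise schedule. The sketch below shows one plausible reading of them: a polynomial schedule with power 2 whose endpoints are kept slightly away from 1 and 0 by the precision value. The function name `polynomial_alphas2` and the exact formula are assumptions made for illustration; see the `equivariant_diffusion` package for the actual implementation.

```python
# Illustrative polynomial noise schedule (an assumed form, not the repo's code).
def polynomial_alphas2(timesteps, precision=1e-5, power=2.0):
    """Return alpha_t^2 for t = 0..timesteps, decreasing from 1 - precision to precision."""
    alphas2 = []
    for t in range(timesteps + 1):
        base = (1.0 - (t / timesteps) ** power) ** 2
        # Rescale so that values stay inside [precision, 1 - precision].
        alphas2.append((1.0 - 2.0 * precision) * base + precision)
    return alphas2


if __name__ == "__main__":
    a2 = polynomial_alphas2(1000)
    # In a variance-preserving process, sigma_t^2 = 1 - alpha_t^2.
    print(a2[0], a2[500], a2[-1])
```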

[Figure: a visualization of what happens during training.]

After training

To analyze the sample quality of molecules

python eval_analyze.py --model_path outputs/edm_qm9 --n_samples 10_000
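
The value `10_000` uses an underscore as a digit separator. Assuming the script parses `--n_samples` with `argparse` and `type=int` (an assumption about the script, not something stated here), Python 3.6 and later accept this form, as the small sketch below shows.

```python
import argparse

# Minimal stand-in parser; the real script defines many more options.
parser = argparse.ArgumentParser()
parser.add_argument("--n_samples", type=int, default=100)

args = parser.parse_args(["--n_samples", "10_000"])
print(args.n_samples)  # int("10_000") is parsed as ten thousand
```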

To visualize some molecules

python eval_sample.py --model_path outputs/edm_qm9 --n_samples 10_000

For GEOM-Drugs

First follow the instructions at data/geom/README.md to set up the data.

Training

python main_geom_drugs.py --n_epochs 3000 --exp_name edm_geom_drugs --n_stability_samples 500 --diffusion_noise_schedule polynomial_2 --diffusion_steps 1000 --diffusion_noise_precision 1e-5 --diffusion_loss_type l2 --batch_size 64 --nf 256 --n_layers 4 --lr 1e-4 --normalize_factors [1,4,10] --test_epochs 1 --ema_decay 0.9999 --normalization_factor 1 --model egnn_dynamics --visualize_every_batch 10000

Analyze

python eval_analyze.py --model_path outputs/edm_geom_drugs --n_samples 10_000

Sample

python eval_sample.py --model_path outputs/edm_geom_drugs

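Small note: The GPUs we used for these experiments were pretty large. If the model does not fit in memory, try a smaller configuration, for example a lower --batch_size or --nf. The main reason is that the EGNN runs with fully connected message passing, so the number of edges grows quadratically with the number of atoms, which becomes very memory intensive for large molecules.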
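
To get a feel for why fully connected message passing is memory intensive, the sketch below counts directed edges per batch. It assumes every atom is connected to every other atom without self-loops and that all molecules in the batch have the same size; the function name and the atom counts are illustrative, not taken from the code.

```python
# Rough edge count for fully connected message passing (illustrative assumptions).
def fully_connected_edges(n_atoms: int, batch_size: int = 1) -> int:
    """Directed edges when each atom sends a message to every other atom."""
    return batch_size * n_atoms * (n_atoms - 1)


if __name__ == "__main__":
    for n_atoms in (10, 30, 100, 180):
        print(n_atoms, fully_connected_edges(n_atoms, batch_size=64))
```

Because the count scales with the square of the molecule size, halving the batch size only halves memory, while halving the number of atoms cuts it by roughly four.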

For Conditional Generation

Train a Conditional EDM

python main_qm9.py --exp_name exp_cond_alpha --model egnn_dynamics --lr 1e-4 --nf 192 --n_layers 9 --save_model True --diffusion_steps 1000 --sin_embedding False --n_epochs 3000 --n_stability_samples 500 --diffusion_noise_schedule polynomial_2 --diffusion_noise_precision 1e-5 --dequantization deterministic --include_charges False --diffusion_loss_type l2 --batch_size 64 --normalize_factors [1,8,1] --conditioning alpha --dataset qm9_second_half

The argument --conditioning alpha can be set to any of the following properties: alpha, gap, homo, lumo, mu, Cv. The same applies to the following commands that also depend on alpha.
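
When running several properties, it can help to generate the command arguments programmatically. The sketch below is only an illustration: the helper `conditional_train_args` and the `exp_cond_<property>` naming are hypothetical (they mirror the `exp_cond_alpha` example), and the remaining flags from the full training command are omitted for brevity.

```python
from typing import List

# Properties listed as valid values for --conditioning.
PROPERTIES = ["alpha", "gap", "homo", "lumo", "mu", "Cv"]


def conditional_train_args(prop: str) -> List[str]:
    """Build the property-dependent part of the training command (hypothetical helper)."""
    if prop not in PROPERTIES:
        raise ValueError(f"Unknown property {prop!r}; expected one of {PROPERTIES}")
    return [
        "python", "main_qm9.py",
        "--exp_name", f"exp_cond_{prop}",
        "--conditioning", prop,
        "--dataset", "qm9_second_half",
        # Add the other flags from the full training command here.
    ]


if __name__ == "__main__":
    for prop in PROPERTIES:
        print(" ".join(conditional_train_args(prop)))
```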

Generate samples for different property values

python eval_conditional_qm9.py --generators_path outputs/exp_cond_alpha --property alpha --n_sweeps 10 --task qualitative

You can set the --generators_path argument to outputs/exp_35_conditional_nf192_9l_alpha to use our pre-trained model on alpha.

Train a property classifier network

cd qm9/property_prediction
python main_qm9_prop.py --num_workers 2 --lr 5e-4 --property alpha --exp_name exp_class_alpha --model_name egnn

Additionally, you can replace the argument --model_name egnn with --model_name numnodes to train a baseline classifier that predicts the property based only on the number of nodes.
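
The idea behind a number-of-nodes baseline can be illustrated with a simple lookup: predict the mean property value observed for molecules of the same size. This is a sketch of the concept under that assumption, with a hypothetical function name and made-up numbers; the repository's numnodes model may be implemented differently.

```python
from collections import defaultdict


def fit_numnodes_baseline(num_nodes, targets):
    """Return a predictor that maps a node count to the mean target seen for that count."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for n, y in zip(num_nodes, targets):
        sums[n] += y
        counts[n] += 1
    means = {n: sums[n] / counts[n] for n in sums}
    # Fall back to the global mean for node counts not seen during fitting.
    global_mean = sum(targets) / len(targets)
    return lambda n: means.get(n, global_mean)


if __name__ == "__main__":
    # Toy data: node counts and a made-up property value per molecule.
    predict = fit_numnodes_baseline([3, 3, 5], [1.0, 3.0, 10.0])
    print(predict(3), predict(5), predict(7))
```

A baseline like this shows how much of a property is explained by molecule size alone, which gives context for the scores of the full classifier.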

Evaluate the property classifier on EDM

Evaluate the trained property classifier on the samples generated by the trained EDM model

python eval_conditional_qm9.py --generators_path outputs/exp_cond_alpha --classifiers_path qm9/property_prediction/outputs/exp_class_alpha --property alpha --iterations 100 --batch_size 100 --task edm

To use a pre-trained generator and classifier model for alpha you can use the following arguments: --generators_path outputs/exp_35_conditional_nf192_9l_alpha and --classifiers_path qm9/property_prediction/outputs/exp_class_alpha_pretrained


luisaforozco/e3_diffusion_for_molecules
