A method that efficiently leverages online human feedback to fine-tune Stable Diffusion for a wide range of tasks

An enhanced multimodal representation using weighted point clouds and its theoretical benefits
A 64x64 pre-trained diffusion model is all you need for 1-step high-resolution SOTA generation
A unified framework that enables diverse samplers and state-of-the-art 1-step generation
Applications:
[SoundGen]

Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data

DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability

MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events
### Contact