I am an independent researcher interested in Deep Reinforcement Learning.
My research focuses on increasing the optimization stability of off-policy gradient-based deep reinforcement learning algorithms.
I've written two works in this research direction:
- Stabilizing Q-Learning for Continuous Control
  David Yu-Tung Hui
  MSc Thesis, University of Montreal, 2022
  I presented empirical evidence that LayerNorm prevents off-policy $Q$-learning from diverging in the MuJoCo and DeepMind Control continuous-control environments. I also showed that adding LayerNorm to DDPG enables learning non-trivial behaviors on the dog-run task of DeepMind Control (see the first code sketch below).
  [.pdf] [Errata]
- Double Gumbel Q-Learning
  David Yu-Tung Hui, Aaron Courville, Pierre-Luc Bacon
  Spotlight at NeurIPS 2023
  In this conference paper, we model the noise introduced by a function approximator in $Q$-learning as a heteroscedastic Gumbel distribution and derive a loss function from this noise model that is effective in off-policy continuous control: the resulting algorithm achieved roughly twice the aggregate performance of SAC after 1M training timesteps (see the second code sketch below).
  [.pdf] [Reviews] [Poster (.png)] [5-min talk] [1-hour seminar] [Code (GitHub)]
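
To make the first result concrete, here is a minimal PyTorch sketch of a DDPG-style critic with LayerNorm inserted after each hidden linear layer. The hidden width, activation, and exact placement of LayerNorm are illustrative assumptions, not the precise architecture from the thesis.

```python
import torch
import torch.nn as nn

class LayerNormCritic(nn.Module):
    """DDPG-style Q-network with LayerNorm after each hidden linear layer."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        # Hidden width of 256 and ReLU activations are illustrative choices;
        # the key idea is normalizing the critic's hidden activations.
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.LayerNorm(hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.LayerNorm(hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q(s, a)
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1))
```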
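
For the second paper, the exact objective is given in the paper and code linked above. As a generic illustration of fitting a heteroscedastic Gumbel noise model, the sketch below computes the Gumbel negative log-likelihood from a per-sample location `mu` and log-scale `log_beta` (which, in a Q-learning setting, could be two heads of the critic predicting the TD target); it is not the paper's exact loss.

```python
import torch

def gumbel_nll(target: torch.Tensor, mu: torch.Tensor, log_beta: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood of `target` under a Gumbel(mu, beta) noise model.

    Heteroscedastic: `log_beta` is predicted per sample, so the loss
    automatically down-weights targets the model believes are noisier.
    This is a generic Gumbel NLL, not the exact objective from the paper.
    """
    beta = log_beta.exp()                      # scale beta > 0
    z = (target - mu) / beta                   # standardized residual
    return (log_beta + z + torch.exp(-z)).mean()
```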
In 2023, I graduated with an MSc from Mila, University of Montreal. I'm looking for opportunities where I can continue my research.
For more information about me, see my Google Scholar profile.