Neural Discrete Representation Learning #23

flrngel commented Sep 18, 2018

https://arxiv.org/abs/1711.00937

Abstract

  • the paper proposes a model (VQ-VAE) that learns "discrete representations"
  • differs from VAEs:
    • the encoder network outputs discrete codes (rather than continuous ones)
    • the prior is learnt rather than static
    • circumvents the issue of posterior collapse
      • latents being ignored by the decoder (commonly observed in other VAEs)

1. Introduction

  • learning useful generic representations in an unsupervised fashion is still an open challenge
  • the model conserves the important features of the data in latent space while optimising for maximum likelihood
  • the paper concentrates on learning such representations
  • images can often be described concisely by language, which motivates discrete latents
  • most prior VAEs with discrete latent representations rely on particular parameterizations of the posterior distribution, whereas this paper relies on vector quantization (VQ)
  • posterior collapse means the latents are ignored when the decoder is powerful enough
  • important features can span many dimensions in data space

Model features

  • simple and unsupervised
  • uses discrete latents, does not suffer from posterior collapse, and has no variance issues
  • performs as well as its continuous counterparts
  • generates coherent, high-quality samples on a wide variety of applications

2. Related work

3. VQ-VAE


Order

  1. the encoder parameterises the posterior distribution q(z|x) of the discrete latent random variables z given the data x
  2. posteriors and priors in VAEs are usually assumed normally distributed with diagonal covariance, which allows the Gaussian re-parameterization trick to be used [32, 23] (a minimal sketch follows this list); this framework has been extended with:
  • autoregressive prior and posterior models [14]
  • normalizing flows [10]
  • inverse autoregressive posteriors [22]
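A minimal sketch of the Gaussian re-parameterization trick referenced in point 2 (my own TensorFlow code, not from the paper):

```python
import tensorflow as tf

def reparameterize(mu, log_var):
    # sample z = mu + sigma * eps with eps ~ N(0, I); the noise eps carries the
    # randomness, so gradients can flow through mu and log_var
    eps = tf.random.normal(tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * eps
```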

3.1. Discrete Latent variables

  • K is the size of the discrete latent space (i.e. the number of codebook vectors)
  • D is the dimensionality of each latent embedding vector e_i (a nearest-neighbour lookup sketch follows)
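A sketch of what the discretisation could look like: each encoder output vector is mapped to its nearest neighbour in the K x D codebook (hypothetical TensorFlow code, names are my own):

```python
import tensorflow as tf

def quantize(z_e, codebook):
    # z_e: encoder output of shape (..., D); codebook: embedding table of shape (K, D)
    flat = tf.reshape(z_e, [-1, codebook.shape[1]])                    # (N, D)
    # squared L2 distance from every encoder vector to every embedding e_i -> (N, K)
    d = (tf.reduce_sum(flat ** 2, axis=1, keepdims=True)
         - 2.0 * tf.matmul(flat, codebook, transpose_b=True)
         + tf.reduce_sum(codebook ** 2, axis=1))
    idx = tf.argmin(d, axis=1)                                         # index k of the nearest e_k
    z_q = tf.reshape(tf.gather(codebook, idx), tf.shape(z_e))          # quantized output z_q(x)
    return z_q, idx
```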

3.2. Learning

  • Loss (three terms; see the reconstructed equation after this list)
    • reconstruction loss
    • codebook loss (implemented with stop gradients)
    • commitment loss
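As I recall, the full objective from the paper is (sg[·] is the stop-gradient operator, β weights the commitment term):

    L = \log p(x \mid z_q(x)) + \| \mathrm{sg}[z_e(x)] - e \|_2^2 + \beta \, \| z_e(x) - \mathrm{sg}[e] \|_2^2

The decoder optimises only the reconstruction term, the embeddings only the middle (codebook) term, and the encoder the first and last terms.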

4. Experiments


5. Conclusion

  • capable of modeling very long-term dependencies through the compressed discrete latent space
  • VQ-VAEs capture the important features of the data

My Comments

  • the word "discrete" comes from the embeddings (e_i) used for quantization in VQ-VAE
  • training could be hard because of the hyperparameters in the loss function (e.g. the commitment weight β)
  • tf.stop_gradient is the key ingredient; a sketch is below
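A minimal sketch of how tf.stop_gradient could give both the straight-through gradient and the two auxiliary loss terms (my own code; β = 0.25 is the value suggested in the paper):

```python
import tensorflow as tf

def vq_terms(z_e, z_q, beta=0.25):
    # straight-through estimator: the forward pass uses the quantized z_q, while the
    # backward pass copies gradients from the decoder input straight to z_e
    z_q_st = z_e + tf.stop_gradient(z_q - z_e)
    # codebook (VQ) loss: moves the embeddings e_i towards the encoder outputs
    codebook_loss = tf.reduce_mean(tf.square(tf.stop_gradient(z_e) - z_q))
    # commitment loss: keeps the encoder outputs close to the chosen embedding
    commitment_loss = beta * tf.reduce_mean(tf.square(z_e - tf.stop_gradient(z_q)))
    return z_q_st, codebook_loss + commitment_loss
```

The reconstruction loss from the decoder (fed with z_q_st) is added on top of these two terms.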