Relational recurrent neural networks #17

Open
flrngel opened this issue Jun 19, 2018 · 0 comments

https://arxiv.org/abs/1806.01822
Paper from Deepmind

Abstract

  • the model is built from intuition
  • so it is not entirely clear why it works well
  • shows state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets

1. Introduction

2. Relational reasoning

  • relational reasoning is the process of understanding the ways in which entities are connected
  • the authors claim that many current models can be cast as relational reasoning
    • a convolution can be said to compute relations between pixels (entities)
    • nodes and edges in message passing neural networks are entities
      • learnable node-edge connections can be relations
    • see also [13, 14, 17]
  • but for current models it is unclear whether their inductive biases impose limitations
    • e.g. memory-augmented networks

3. Model

  • the model is assembled from
    1. LSTMs
    2. memory-augmented networks
    3. non-local networks
  • the model applies attention between memories at a single time step
    • this differs from current models, which apply attention across all previous representations

3.1. Allowing memories to interact using MHDPA

  • note that MHDPA computes queries, keys, and values with learned linear projections:
    Q = MW^q, K = MW^k, V = MW^v
    A_θ(M) = softmax(QK^T / √d^k) V = M̃
    where M̃ is the proposed update to the memory M
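The MHDPA update above can be sketched in NumPy. This is a minimal single-head version; the memory size, head dimension, and random projection weights are illustrative assumptions, not the paper's hyperparameters, and the multi-head splitting/concatenation is omitted:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(M, Wq, Wk, Wv):
    """Single-head dot-product attention over memory rows (slots).

    M: (num_slots, d_model) memory matrix; each row is one memory.
    Returns M_tilde, the attended (proposed) memory update.
    """
    Q, K, V = M @ Wq, M @ Wk, M @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (num_slots, num_slots)
    weights = softmax(scores, axis=-1)  # each memory attends over all memories
    return weights @ V                  # M_tilde

rng = np.random.default_rng(0)
num_slots, d_model, d_k = 4, 8, 8       # illustrative sizes
M = rng.normal(size=(num_slots, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
M_tilde = dot_product_attention(M, Wq, Wk, Wv)
print(M_tilde.shape)  # (4, 8)
```

Because the attention is only across the rows of M at one time step, its cost is independent of sequence length, which is the key contrast with attending over all previous representations.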

3.2. Encoding new memories

(equation image: encoding of new memories)

3.3. Introducing recurrence and embedding into an LSTM

see also the official TensorFlow implementation
(equation image: the recurrent memory update embedded into an LSTM)

the MLP term in the equation denotes a row/memory-wise MLP with layer normalization
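The per-step update described above (residual over the attention output, then a row/memory-wise MLP with layer normalization) can be sketched as follows. This is a hedged sketch: the LSTM-style gating of the new memory is omitted, and all sizes and weights are illustrative assumptions rather than the paper's configuration:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row (memory slot) to zero mean, unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def row_wise_mlp(M, W1, W2):
    # The same two-layer MLP (ReLU hidden layer) applied independently
    # to every memory row.
    return np.maximum(M @ W1, 0.0) @ W2

def memory_update(M, attended, W1, W2):
    """One memory step: residual over the attention output, then
    residual over a row-wise MLP, with layer norm after each residual."""
    M = layer_norm(M + attended)
    M = layer_norm(M + row_wise_mlp(M, W1, W2))
    return M

rng = np.random.default_rng(0)
num_slots, d = 4, 8                         # illustrative sizes
M = rng.normal(size=(num_slots, d))
attended = rng.normal(size=(num_slots, d))  # stand-in for the MHDPA output
W1 = rng.normal(size=(d, 16))
W2 = rng.normal(size=(16, d))
M_next = memory_update(M, attended, W1, W2)
print(M_next.shape)  # (4, 8)
```

Because the MLP and layer norm act row-wise, each memory slot is transformed identically and independently after interacting with the others through attention.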

4. Experiments

  • they tested
    • 4.1. Illustrative supervised tasks
    • 4.2. Reinforcement learning
      • Mini Pacman with viewport
    • 4.3. Language Modeling
      • WikiText103
      • Project Gutenberg
      • GigaWord datasets


5. Results

Figure 3: results on the N-th farthest task

6. Discussion

  • the authors propose intuitions for the mechanisms behind complex relational reasoning
  • the authors consider their results primarily as "evidence of improved function"

Appendix

