Below you will find pages that utilize the taxonomy term “Complex Structure”
Deciding <u>W</u>hen <u>T</u>o <u>F</u>eedforward (or WTF gates)
Another paper of mine, titled “Towards Implicit Complexity Control using Variable-Depth DNNs for ASR Systems”, was accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016 in Shanghai, which took place not too long ago.
The idea behind this one comes from the intuition that, in a classification task, some instances should be simpler to classify than others. Similarly, deciding when to stop is an important problem in an RNN setting. Take the bAbI tasks, for example, and go one step further by assuming that the number of logical steps needed to arrive at the answer is not provided for you: the network then needs to know when it is ‘ready’ to give an answer.
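To make the ‘ready to answer’ idea concrete, here is a minimal sketch (not the architecture from the paper) of a scalar halting gate attached to a recurrent update: the network keeps refining its hidden state on the same input until a sigmoid gate crosses a threshold. The dimensions, random weights, and the 0.5 threshold are all placeholder assumptions for illustration.

```python
import numpy as np

# Toy dimensions; weights are randomly initialised for illustration only.
rng = np.random.RandomState(0)
n_in, n_hid = 10, 20
W_x = rng.randn(n_in, n_hid) * 0.1   # input -> hidden
W_h = rng.randn(n_hid, n_hid) * 0.1  # hidden -> hidden (recurrent)
w_halt = rng.randn(n_hid) * 0.1      # hidden -> scalar halting gate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_until_ready(x, max_steps=10, threshold=0.5):
    """Keep updating the hidden state on the same input until the
    (hypothetical) halting gate says the network is 'ready' to answer."""
    h = np.zeros(n_hid)
    for step in range(1, max_steps + 1):
        h = np.tanh(x @ W_x + h @ W_h)
        p_halt = sigmoid(h @ w_halt)   # gate: confidence that we can stop now
        if p_halt > threshold:
            break
    return h, step

x = rng.randn(n_in)
h, steps_used = run_until_ready(x)
print("stopped after", steps_used, "steps")
```

In a trained model the gate would be learned along with the rest of the network, so easy instances would trip it after fewer steps than hard ones.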
Neural Turing Machines FAQ
There’s been some interest in the Neural Turing Machines paper, and I’ve been getting some questions about my implementation via e-mail and in the comments section of this blog. I plan to make this a blog post that I’ll regularly come back to and update with answers to some of these questions as they come up, so do check back!
Learning Gaussian Feature Extractors
While playing around with the MNIST dataset and the example code, I tried to visualise the weights of the connections from the inputs to the hidden layer. These can be thought of as feature extractors of the input.
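For anyone who wants to try the same visualisation, here is a rough sketch of how the input-to-hidden weights can be rendered as images. It assumes a weight matrix `W` of shape (784, n_hidden) taken from an MNIST model; the random `W` below is only there so the snippet runs on its own.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for a learned input-to-hidden weight matrix of shape (784, n_hidden).
n_hidden = 16
W = np.random.randn(784, n_hidden)

fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for i, ax in enumerate(axes.ravel()):
    # Each hidden unit's incoming weights form a 28x28 "feature extractor".
    ax.imshow(W[:, i].reshape(28, 28), cmap="gray")
    ax.axis("off")
plt.tight_layout()
plt.show()
```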
Neural Turing Machines – A First Look
Some time last week, a paper from Google DeepMind caught my attention.
The paper is of particular interest to me because I’ve been thinking about how a recurrent neural network could learn to have access to an external form of memory. The approach taken here is interesting, as it balances seeking memory locations by content similarity with shifting away from those locations by relative position.
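To give a flavour of what that balance looks like, here is a small numpy sketch of the two addressing ingredients: content-based weighting by (sharpened) cosine similarity, followed by a circular shift of those weights over locations. The function names, the sharpening value `beta`, and the shift distribution are my own placeholders for illustration, not code from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def content_addressing(memory, key, beta):
    """Weight each memory row by cosine similarity to the key,
    sharpened by beta and normalised with a softmax."""
    sims = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    return softmax(beta * sims)

def location_shift(weights, shift):
    """Circularly convolve the content-based weights with a
    distribution over offsets (here -1, 0, +1)."""
    shifted = np.zeros_like(weights)
    offsets = range(-(len(shift) // 2), len(shift) // 2 + 1)
    for s, p in zip(offsets, shift):
        shifted += p * np.roll(weights, s)
    return shifted

# Toy example: 8 memory slots of width 4.
memory = np.random.randn(8, 4)
key = np.random.randn(4)
w_c = content_addressing(memory, key, beta=5.0)
w = location_shift(w_c, shift=np.array([0.1, 0.8, 0.1]))  # mostly "stay put"
```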
My focus this time will be on some of the details needed for implementation. Some of these specifics are glossed over in the paper, and I’ll try to infer whatever I can and, perhaps in the next post, have code (in Theano, what else?) to present.
Recursive Auto-encoders: An Introduction
I’ve talked a little bit about recursive auto-encoders a couple of posts ago. In deep learning lingo, an auto-encoder network usually refers to an architecture that takes an input vector and, through a series of transformations, is trained to reproduce that input at its prediction layer. The reason for doing this is to extract features that describe the input. One might think of it as a form of compression: if the network is asked to reproduce an input after passing it through hidden layers with far fewer neurons than the input layer, then some sort of compression has to happen in order for it to produce a good reconstruction.

So let’s consider a network with 8 inputs, 8 outputs, and 3 neurons in the hidden layer. If we feed the network a one-hot encoding of 1 to 8 (setting only the neuron corresponding to the input to 1), and insist that that input be reconstructed at the output layer, guess what happens?
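Here is a minimal numpy sketch of that 8-3-8 setup, assuming sigmoid units, a squared-error loss, and plain batch gradient descent (none of which are mandated by the example above):

```python
import numpy as np

rng = np.random.RandomState(0)
X = np.eye(8)                       # the eight one-hot inputs (and targets)
W1 = rng.randn(8, 3) * 0.5          # input -> hidden
W2 = rng.randn(3, 8) * 0.5          # hidden -> output
b1, b2 = np.zeros(3), np.zeros(8)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(10000):
    # forward pass
    H = sigmoid(X @ W1 + b1)        # 8 x 3 hidden codes
    Y = sigmoid(H @ W2 + b2)        # 8 x 8 reconstructions
    # backward pass (squared-error loss, sigmoid derivatives)
    dY = (Y - X) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dY; b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)

# Inspect the three hidden activations learned for each of the eight inputs.
print(np.round(sigmoid(X @ W1 + b1), 2))
```

Printing the hidden activations typically reveals a compact, roughly binary code for the eight symbols, though how clean it looks depends on the random seed and learning rate.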