Below you will find pages that utilize the taxonomy term “Uncategorized”

Can Chinese Rooms Think?

As a machine learning or CS researcher, it’s easy to get drawn into a philosophical debate about whether machines will ever be able to think like humans. The argument goes back so far that the people who started the field had to grapple with it. It’s also fun to think about, especially with sci-fi always portraying world-ending AI-versus-human showdowns in which the humans prevail because of love, friendship, or humanity.

But there’s a tendency for people in such a debate to wind up talking past each other.

Posts

FizzBuzz in Theano, or, Backpropaganda Through Time.

Dropout using Theano

A month ago I tried my hand at the Higgs Boson Challenge on Kaggle. I tried an approach using neural networks that got me pretty far initially, but other techniques seem to have won out.

“It’s like Hinton diagrams, but for the terminal.”

Which of the two matrix representations below would you rather be looking at? Hinton diagrams are often used for visualising the learnt weights of neural networks. I’ve often found myself trying to imagine what the weights look like. And fortunately for me today, I remembered this project by GitHub’s Zach Holman. Turns out, overriding the way NumPy represents numbers wasn’t too hard, so I hacked myself a cool little solution.
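
As a rough illustration of the idea, here is a minimal sketch that swaps NumPy’s float formatting for block characters scaled by magnitude. It is not the code from the post: the names `BLOCKS`, `hinton_char` and `hinton_string` are made up for this example, and the choice of block characters is an assumption. Unlike a real Hinton diagram, this sketch ignores the sign of each weight.

```python
import numpy as np

# Hypothetical palette: taller block means larger magnitude.
BLOCKS = " ▁▂▃▄▅▆▇█"


def hinton_char(value, max_abs=1.0):
    """Map a number to a block character by |value| / max_abs (sign is ignored)."""
    if max_abs <= 0:
        return BLOCKS[0]
    ratio = min(abs(value) / max_abs, 1.0)
    level = int(round(ratio * (len(BLOCKS) - 1)))
    return BLOCKS[level]


def hinton_string(matrix):
    """Render a float matrix with NumPy's own layout, one block character per entry."""
    matrix = np.asarray(matrix, dtype=float)
    max_abs = np.abs(matrix).max()
    # np.printoptions temporarily overrides how float entries are formatted.
    with np.printoptions(formatter={"float": lambda v: hinton_char(v, max_abs)}):
        return str(matrix)


if __name__ == "__main__":
    weights = np.array([[0.1, -0.9, 0.5], [1.0, 0.0, -0.3]])
    print(hinton_string(weights))
```

Using the context manager rather than `np.set_printoptions` keeps the override local, so ordinary arrays still print as numbers elsewhere in the session.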

Finding Maximum Dot (or Inner) Product

A problem that often arises in machine learning tasks is trying to find a row in a matrix that gives the highest dot product given a query vector. Some examples of such situations:

You’ve performed some kind of matrix factorisation for collaborative filtering for, say, a movie recommendation system, and now, given a new user, you want to pick out a couple of movies that your system predicts they would rate highly.
You have a neural network whose final softmax predictive layer is huge (but you managed to train it, somehow).

In both these cases, the problem boils down to trying to search a collection of vectors to find the one that gives the highest (or the $k$ highest) dot product(s).

A simple way to do this would be to perform a matrix multiplication, and then to find the best scoring vector by scanning through the values. This is effectively performing $N$ dot product computations for a matrix with $N$ rows. Can we do better?
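
The simple baseline described above can be sketched in NumPy as follows. This is only the brute-force scan, not whatever faster method the post goes on to discuss, and the function name `top_k_dot_products` and the toy data are invented for illustration.

```python
import numpy as np


def top_k_dot_products(matrix, query, k=1):
    """Return indices and scores of the k rows with the highest dot product."""
    scores = matrix @ query                      # N dot products, one per row
    k = max(1, min(k, scores.shape[0]))
    # argpartition pulls out the k best scores without sorting all N of them.
    top = np.argpartition(-scores, k - 1)[:k]
    # Sort just those k candidates by descending score.
    top = top[np.argsort(-scores[top])]
    return top, scores[top]


# Toy data: four candidate vectors and one query.
rows = np.array([[1.0, 0.0],
                 [0.0, 2.0],
                 [3.0, 1.0],
                 [-1.0, -1.0]])
query = np.array([1.0, 1.0])

indices, values = top_k_dot_products(rows, query, k=2)
print(indices, values)  # rows 2 and 1, with scores 4.0 and 2.0
```

Every query still touches all $N$ rows here, which is the cost the question above is asking whether we can avoid.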

March Madness with Theano

I’m not particularly familiar with the NCAA Men’s Division I Basketball Championship, but I’ve seen the March Machine Learning Madness challenge come up for a few years now, and I’ve decided to try my hand at it today.

I also haven’t tried a machine learning task like this one. At its simplest (assuming you don’t harvest more data about each team and their players), all you have is a set of game data: who won, who lost, and their respective scores. Intuitively, we should be able to look at tables like these and get a rough sense of who the better teams are. But how do we model it as a machine learning problem?

Naive Bayes Categorisation (with some help from Elasticsearch)

Back in November, I gave a talk during one of the Friday Hackers and Painters sessions at Plug-in@Block 71, aptly titled “How I do categorisation and some naive bayes sh*t” by Calvin Cheng. I promised I’d write a follow-up blog post with the materials I presented during the talk, so here it is.

My Quora Codesprint Submission

(this is x-posted on Quora)

I’ve had some experience in the past with machine learning, but I feel like I still don’t have a proper methodology. I’d like to hear what you guys think about what I’ve done here.