The Winning Lottery Ticket Hypothesis is a fun hypothesis in deep learning.
Essentially, the hypothesis states that a neural network contains a smaller subset of weights that can perform as well or better as the network it is contained in. This subnetwork is the winning lottery ticket. They were able to show that you are able to prune a large majority of the weights from a neural net and still generalize to unseen data. Isn’t this interesting?
The problem is, however, that this approach relies on sparsity, unstructured sparsity to be exact. The idea is that a matrix only has select few entries that are non-zero. This makes sense, considering that our neural network starts out as a dense network and pruning sets weights, or entries, in our linear algebra system zero. The problem being simply that our GPUs and TPUs currently don’t really deal well with sparsity.
Nevertheless, it would be extremely cool, if we could train a large model and get a smaller, equally performant model for free. That would mean we can compress a neural network without and possibly run them on much smaller hardware!
This is where the Generalized Winning Lottery Ticket Hypothesis comes into play. Some very smart mathematicians figured that if we “simply” change the reality of our neural network parameters, i.e. change the basis, we may be able to obtain a form of structured sparsity. Something that computers are much more capable to deal with!
Isn’t this neat? This is certainly a first step into making research into some exciting efficiencies in deep learning possible.
I write about these machine learning insights every week in my newsletter.