In 2017 I built my first neural network from scratch.
No imports, no libraries, just raw Matlab and a course assignment. The instructor had set the course up in a way that tickled my brain just right: it was a competition between groups.
I didn’t even care that I took on a major part of the group project.
I was inspired to build the best network I could!
My brain was in hyperfocus.
I read the course slides and supplemental materials, and exchanged emails with the tutor, something I had never done before. I even read the Deep Learning book! It’s very possible that all my other PhD duties fell by the wayside during those weeks.
But that wasn’t enough. I knew I had implemented the neural network correctly. It was predicting okay; the accuracy was above 80%. But training was slow, and you could see that the model was overfitting quickly. I read up on commonly used techniques for hand-written digit recognition.
I even tried to implement a convolutional neural network, but that was impossible to debug and nearly impossible to train in Matlab on a CPU.
Then I found data augmentations!
Stack Overflow had code ready to perform small transformations on the images: believable changes that still diversified the training data. My training and validation accuracy shot up above 95%. I was stoked!
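The original code was Matlab, but the idea translates directly. Here is a minimal Python sketch of that kind of digit augmentation: small random rotations and pixel shifts that leave the digit recognizable while diversifying the training set. The transformation ranges and function names here are illustrative guesses, not the values from the original assignment.

```python
import numpy as np
from scipy.ndimage import rotate, shift

def augment(image, rng):
    """Apply a small random rotation and translation to one digit image.

    The ranges (±10 degrees, ±2 pixels) are illustrative; the point is
    that the change is small enough to remain a believable digit.
    """
    angle = rng.uniform(-10, 10)          # small rotation, in degrees
    dx, dy = rng.integers(-2, 3, size=2)  # shift by up to 2 pixels
    out = rotate(image, angle, reshape=False, order=1, mode="constant")
    out = shift(out, (dy, dx), order=1, mode="constant")
    return out

rng = np.random.default_rng(0)
digit = np.zeros((28, 28))
digit[8:20, 12:16] = 1.0                  # a crude stand-in for a "1"
augmented = augment(digit, rng)
```

Applying this on the fly to each mini-batch effectively multiplies the amount of training data without storing anything extra.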
Building a neural network from scratch for this small course-wide competition taught me the internals of neural networks. I even implemented a janky form of mini-batch SGD with momentum! While it’s not the most accessible way to learn, I attribute much of my insight into neural networks to my humble beginnings in Matlab.
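For anyone curious what that update looks like, here is a short Python sketch of classical mini-batch SGD with momentum (not the original Matlab code; the learning rate, momentum coefficient, and the toy quadratic objective are all my own illustrative choices). The velocity term accumulates an exponentially decaying sum of past gradients, which smooths the noisy per-batch updates.

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.05, beta=0.9):
    """One SGD update with classical momentum.

    In training, `grad` would be computed on a random mini-batch;
    here we feed it an exact gradient for a toy quadratic objective.
    """
    velocity = beta * velocity - lr * grad  # decay old velocity, add new gradient step
    return w + velocity, velocity

# Toy objective f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = sgd_momentum_step(w, 2 * w, v)
```

After a couple hundred steps the parameters settle near the minimum at zero; with noisy mini-batch gradients the momentum term is what keeps the trajectory from zig-zagging.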
We won second place, but more importantly, I won a passion for a subject that has carried me for 5 years and counting.