My PhD Thesis is out!
After four years in the making, it is done. I defended my thesis and have finally made it public.
I’m incredibly excited about it.
At 272 pages, my thesis is a mathematical foundation for neural networks which is 1) end-to-end, 2) uniform, and 3) prescriptive. It doesn’t deal with only one aspect of neural networks in isolation, such as architectures or backpropagation. Rather, it also tackles optimisers, weight tying and loss functions, and provides a complete picture of supervised learning. This picture is presented in a uniform language: all of the pieces fit together into a coherent whole. This is done without depending on Euclidean spaces or on particulars of the implementation of backpropagation, for instance. Lastly, this framework is not merely descriptive but prescriptive. This means it can be used to implement supervised learning code as-is, as opposed to merely describing specific properties of supervised learning.
To achieve this, I use the language of category theory. Category theory is an incredibly powerful and expressive theory which has in recent years been used as a unifying force among the scientific disciplines, revealing previously unknown commonalities between them. I believe it has the potential to provide a structuralist foundation for deep learning, and to help turn its alchemical practices into a rigorous, transparent and verifiable discipline.
I’ve put a lot of effort into placing category theory in the context of recent developments in deep learning, but also of science as a whole. As such, I hope the introduction of the thesis helps newcomers to category theory understand the nature of its contributions, and that the conclusion gives a sense of the possibilities unlocked by this principled foundation, some of which we’ve explored in our recent paper Categorical Deep Learning: An Algebraic Theory of Architectures.
There’s much more to say but for now — go check out the thesis!
And stay tuned: fantastic developments are around the corner. Monoids, categories, functors, (co)limits and categorical concepts in general have been indispensable tools for me and many other researchers in gaining a deeper understanding of the world. They have enabled us to find patterns that are shared across scientific disciplines, but also to be rigorous and verifiable in our endeavours.
Is there a way to enable deep learning systems to reason in such symbolic and algebraic ways? To generalise, abstract, and find patterns that go beyond a single scientific field, and to do so in interpretable and verifiable ways? This is something I'm exploring next. Stay tuned.