What started as a basic linear algebra calculator project grew into a symbolic tensor system with autodiff, custom matrix ops, attention mechanisms, LayerNorm, GELU, and even a text generation demo trained on the Brown corpus.
I'm still an undergrad, and my main goal is to deeply understand how deep learning actually works under the hood - gradients, attention, backpropagation, optimizers - by building everything step by step, with full visibility into each piece, rather than relying on big frameworks or libraries.
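To give a flavor of what I mean by "full visibility", here's a toy sketch of reverse-mode autodiff on scalars. This is just illustrative Python, not the project's actual API (the real system works on symbolic tensors), but it's the core mechanism everything else is built on:

```python
# Illustrative only: a tiny reverse-mode autodiff over scalars,
# roughly the kind of mechanism the project builds out for tensors.

class Value:
    """A scalar that remembers how it was computed, so gradients
    can flow backwards through the computation graph."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            # d(a + b)/da = 1, d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            # d(a * b)/da = b, d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule
        # from the output back to every input.
        order, visited = [], set()
        def visit(v):
            if v not in visited:
                visited.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()


# Example: z = x * y + x  =>  dz/dx = y + 1, dz/dy = x
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(z.data, x.grad, y.grad)  # 8.0 4.0 2.0
```

Once you can see every gradient accumulate like this, extending the same idea to matrices, attention, LayerNorm, and so on is mostly bookkeeping.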
It’s not fast or production-ready, but that’s not the point. Right now it’s aimed at exploration and understanding: learning how deep learning works by building it from first principles.
It’s still a work in progress (lots to learn and improve in terms of structure, docs, and performance), but I figured it was worth sharing.
I’d love any feedback, questions, ideas, or even just thoughts about what you’d add, change, or do differently. Thanks for reading!