**Summary:** I believe there are better representations of neural networks that make backpropagation and gradient descent faster to understand. I find that representing neural networks as equation graphs, combined with the values at run-time, gets engineers who don’t have the necessary background in machine learning up to speed faster. In this post, I generate a few graphs with Gorgonia to illustrate the examples.

Backpropagation has been explained to death. It really is as simple as applying the chain rule to compute gradients. However, in my recent adventures, I have found that this explanation isn’t intuitive to people who just want to get shit done. As part of my consultancy (hire me!) job^{[1]}, I provide a brief 1-3 day machine learning course to engineers who will maintain the algorithms that I designed. Whilst most of the work I do doesn’t use neural networks^{[2]}, recently there was a case where deep neural networks were involved.

This blog post documents what I found useful for explaining neural networks, backpropagation and gradient descent. It’s not meant to be super heavy with theory – think of it as an enabler for an engineer to hit the ground running when dealing with deep networks. I may gloss over some details, so some basic familiarity with neural networks is recommended.

^{[1]} Really, I need to pay the bills to get my startup funded.

^{[2]} People who think deep learning can solve every problem are either people with deep pockets aiming to solve a very general problem, or people who don’t understand the hype. I have found that most businesses do not have problems that involve a lot of non-linearities. In fact, a large majority of problems can be solved with linear regressions.