A Direct Way of Understanding Backpropagation and Gradient Descent

Summary: I believe that there are better representations of neural networks that aid in faster understanding of backpropagation and gradient descent. I find that representing neural networks as equation graphs, combined with the values at run-time, gets engineers who don’t have the necessary background in machine learning up to speed faster. In this post, I generate a few graphs with Gorgonia to illustrate the examples.

Backpropagation has been explained to death. It really is as simple as applying the chain rule to compute gradients. However, in my recent adventures, I have found that this explanation isn’t intuitive to people who want to just get shit done. As part of my consultancy (hire me!) job[1], I provide a brief 1-3 day machine learning course to engineers who will maintain the algorithms that I designed. Whilst most of the work I do doesn’t use neural networks[2], recently there was a case where deep neural networks were involved.

This blog post documents what I found useful for explaining neural networks, backpropagation and gradient descent. It’s not meant to be super heavy with theory – think of it as an enabler for an engineer to hit the ground running when dealing with deep networks. I may gloss over some details, so some basic familiarity with neural networks is recommended.
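To make the "chain rule on an equation graph" idea concrete, here is a minimal sketch in plain Go. It does not use Gorgonia's API; the function names and the tiny model loss = (w*x + b - y)^2 are assumptions picked purely for illustration. The forward pass records the run-time value at each node, and the backward pass multiplies the upstream gradient by each node's local derivative.

```go
package main

import "fmt"

// forwardBackward evaluates loss = (w*x + b - y)^2 node by node, then walks
// the same graph in reverse, applying the chain rule at each node.
func forwardBackward(w, b, x, y float64) (loss, dw, db float64) {
	// Forward pass: record the run-time value at every node.
	mul := w * x
	pred := mul + b
	diff := pred - y
	loss = diff * diff

	// Backward pass: upstream gradient times local derivative.
	dDiff := 2 * diff // d(loss)/d(diff)
	dPred := dDiff    // d(diff)/d(pred) = 1
	dMul := dPred     // d(pred)/d(mul) = 1
	db = dPred        // d(pred)/d(b) = 1
	dw = dMul * x     // d(mul)/d(w) = x
	return loss, dw, db
}

// train runs plain gradient descent from w = b = 0 and returns the
// fitted parameters.
func train(x, y, lr float64, steps int) (w, b float64) {
	for i := 0; i < steps; i++ {
		_, dw, db := forwardBackward(w, b, x, y)
		w -= lr * dw
		b -= lr * db
	}
	return w, b
}

func main() {
	w, b := train(2, 5, 0.05, 50)
	loss, _, _ := forwardBackward(w, b, 2, 5)
	fmt.Printf("w=%.4f b=%.4f loss=%.3g\n", w, b, loss)
}
```

Gradient descent is then nothing more than nudging each parameter against its gradient, which is all that train does.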


  1. [1]really, I need to pay the bills to get my startup funded
  2. [2]people who think deep learning can solve every problem are either people with deep pockets aiming to solve a very general problem, or people who don’t understand the hype. I have found that most businesses do not have problems that involve a lot of non-linearities. In fact, a large majority of problems can be solved with linear regressions.

How To Make Money

Hey Chewxy, what do you think will happen if one day everyone decides to move their money onto a blockchain and no longer need banks?

That was a question that a friend asked me last week. I thought about the situation and gave some answers based on what I understood of the world and the economy, sketching out in broad strokes what would happen. Essentially the conclusion was “civil unrest and war breaks out”[1].

Then came time to organize Sydney Python. Due to clashing meetup dates with Data Science Sydney, Girl Geek Sydney and other groups, there was a dearth of speakers. So I stepped up and gave a talk based on the hypothetical question. Here are the slides:

The code can be found in the economics simulation GitHub repository.

  1. [1]There were other conclusions too, I give the alternatives at the end of the blog post


I released Gorgonia on Thursday. Gorgonia is a library like Theano or TensorFlow, but mainly written in Go. It provides the necessary primitives for creating and executing neural networks and machine learning algorithms.

According to cloc, these are the stats:

chewxy@chewxy-Gallifrey:~/workspace/goworkspace7/src/github.com/chewxy/gorgonia$ cloc .
     357 text files.
     321 unique files.                                          
     604 files ignored.

http://cloc.sourceforge.net v 1.60  T=0.83 s (296.5 files/s, 55471.5 lines/s)
Language                     files          blank        comment           code
Go                             219           6308           3924          30858
Assembly                        22            585            740           2128
C/C++ Header                     2             55             57            666
C                                2             17             39            458
SUM:                           245           6965           4760          34110

So, it’s a pretty huge library. But the original version was about 80,000 LoC (though most of the lines of code were different experimental variations of assembly code). I managed to cut out 50,000 LoC to get to something more manageable. In this post I want to outline the release of Gorgonia, share some of the reasoning behind the design of the library, and go through some of the weirdness found in the library.

If you’re interested, here’s the video (otherwise, skip to the meat):

And here are the slides:


Yes and No

I was teaching my partner some Mandarin recently and I came to the conclusion that “yes” and “no” are very weird constructs of language.

We were practicing one day, where I’d ask her questions in English and she’d reply in Mandarin. I asked her a yes/no question and she replied 不, to which I surprised myself by pointing out that 不 is only ever used in a negatory manner. People who know some Mandarin would interject and say, but there is 不(bù), 没(méi), and 无(wú) that can be used instead of “no”. Yes, they can, but they’re usually not used without context.

Let’s look at some concrete examples to understand.

On the memory alignment of Go slice values

TL;DR and Meta – I was playing around with some AVX instructions and discovered that there were some problems. I then describe how I investigated the issue, which turned out to be because Go’s slices are not aligned to a 32-byte boundary. I proceed to describe the alignment issue and devise two solutions, of which I implemented one.

On Thursday I decided to do some additional optimization to my Go code. This meant writing some assembly to get some of the AVX goodness into my program (I once gave a talk on the topic of deep learning in Go, where I touched on this issue). I am no stranger to writing assembly in Go, but it’s not something I touch very often, so it can take me a while to remember how to do things. This is one of those things. So this blog post is mainly to remind myself of that.

The values in Go slices are 16-byte aligned. They are not 32-byte aligned.
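A quick way to see this for yourself is to look at the address of a slice's first element. The sketch below is a hedged illustration, not necessarily either of the two solutions from the post: isAligned and alignedFloat32s are hypothetical helper names, and the approach shown (over-allocate, then reslice to the first 32-byte boundary) is just one common workaround. It assumes float32 data and 256-bit AVX loads.

```go
package main

import (
	"fmt"
	"unsafe"
)

// alignment is the boundary (in bytes) that aligned 256-bit AVX loads expect.
const alignment = 32

// isAligned reports whether the first element of s sits on a 32-byte boundary.
func isAligned(s []float32) bool {
	if len(s) == 0 {
		return true
	}
	return uintptr(unsafe.Pointer(&s[0]))%alignment == 0
}

// alignedFloat32s returns n float32s whose backing data starts on a 32-byte
// boundary. It over-allocates by one alignment's worth of elements and
// reslices past the misaligned prefix.
func alignedFloat32s(n int) []float32 {
	const pad = alignment / 4 // a float32 is 4 bytes wide
	buf := make([]float32, n+pad)
	rem := uintptr(unsafe.Pointer(&buf[0])) % alignment
	off := 0
	if rem != 0 {
		off = int(alignment-rem) / 4
	}
	return buf[off : off+n : off+n]
}

func main() {
	plain := make([]float32, 100)
	fmt.Println("plain make() aligned to 32 bytes:", isAligned(plain))
	fmt.Println("padded slice aligned to 32 bytes:", isAligned(alignedFloat32s(100)))
}
```

Whether the plain slice happens to land on a 32-byte boundary depends on the allocator, so the first line may print either value; the padded slice is aligned by construction.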

On Binary Classification of Human Beings

Over the years I have come up with some fun ideas for binary classification of people. They say “those who can’t do, teach”. That’s a binary classification – teachers and doers. I once did something like that, with a longer elaboration: Hackers and Engineers

Abstract Thinking Capabilities

Some people have better abstract thinking capabilities than others. I’ll use an example that makes this a particularly dangerous thought. Consider two young girls, A and B, who are playing with Barbie dolls. Now, let’s say both girls are black and are about, say, 7 years old. They’re playing with white-looking Barbie dolls. Girl A thinks of the doll she’s playing with as a white girl. Girl B thinks of the doll she’s playing with as an abstract representation of what a human female would be like.

I would suggest that both girls will grow up to be very different, based on the way they think alone. Girl A will grow up feeling that her race isn’t represented well in the toys she plays with. Girl B on the other hand will not experience this as structural racism, mainly because – I would think – having better abstract thinking capabilities means a lack of attachment to one’s own identity (I call this the abstraction of identity).

To test this idea, we should be able to devise a test of sorts to measure a person’s abstract thinking capability. We can also devise a test to examine people’s experiences with structural racism. Then correlate the answers. My hypothesis is that higher abstract thinking scores would correlate with lower experience of structural discrimination.

Functions and Instructions

A while ago, a friend gave a group of us an IQ quiz (and I hate those). The question went as follows: given 2 cups, one with a 175ml and the other with a 250ml capacity, extract 100ml, 200ml and 220ml of water from a limitless jug of water. While some among the group were puzzling over how to get those quantities, there were others among the group who immediately called out 220ml as impossible[1].

You see, the problem is a gcd problem in disguise. It’s the very same problem that Euclid faced. Only multiples of the gcd (25, in case you were wondering) can be derived from machinations using the two cups. For us (well, me, at least), it was a gut feeling. Something seemed... wrong about the number.
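The gut feeling can be checked in a few lines. This is a small sketch of the "function" way of looking at the puzzle; the name measurable is made up for illustration, and it relies on the standard two-jug result that an amount can end up in one cup exactly when it is a multiple of the gcd and no larger than the bigger cup.

```go
package main

import "fmt"

// gcd is Euclid's algorithm.
func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// measurable reports whether target ml can be left in one of two cups with
// the given capacities, using only fill, empty and pour operations.
func measurable(capA, capB, target int) bool {
	larger := capA
	if capB > larger {
		larger = capB
	}
	if target < 0 || target > larger {
		return false
	}
	return target%gcd(capA, capB) == 0
}

func main() {
	fmt.Println("gcd of the two cups:", gcd(175, 250))
	for _, ml := range []int{100, 200, 220} {
		fmt.Printf("%dml measurable: %v\n", ml, measurable(175, 250, ml))
	}
}
```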

But that’s not the interesting part. The interesting part was that most of the group who pointed out that 220 was impossible took a very long time puzzling over the steps of how to actually get the other amounts, which are multiples of 25. I especially took a long time to figure out the steps [2]. The other group, who didn’t point out that 220 was impossible, took a far shorter time to figure out the succession of liquid pouring steps to get to their desired amount.
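For contrast, here is a sketch of the "instruction" way of attacking the same puzzle: don't reason about divisibility at all, just search through the pouring steps. The state and solve names are hypothetical, and breadth-first search is my choice for the illustration (it finds a shortest sequence); nobody at the table was running BFS in their head.

```go
package main

import "fmt"

// state is the amount of water currently in cup A and cup B.
type state struct{ a, b int }

func minInt(x, y int) int {
	if x < y {
		return x
	}
	return y
}

// solve searches breadth-first over fill, empty and pour moves. It returns a
// shortest list of steps leaving target ml in either cup, and false if no
// sequence of moves can do so.
func solve(capA, capB, target int) ([]string, bool) {
	type edge struct {
		prev state
		move string
	}
	start := state{0, 0}
	from := map[state]edge{start: {}}
	queue := []state{start}
	for len(queue) > 0 {
		s := queue[0]
		queue = queue[1:]
		if s.a == target || s.b == target {
			// Walk the recorded edges back to the start to recover the steps.
			var steps []string
			for s != start {
				e := from[s]
				steps = append([]string{e.move}, steps...)
				s = e.prev
			}
			return steps, true
		}
		ab := minInt(s.a, capB-s.b) // how much A can pour into B
		ba := minInt(s.b, capA-s.a) // how much B can pour into A
		moves := []struct {
			to   state
			name string
		}{
			{state{capA, s.b}, "fill A"},
			{state{s.a, capB}, "fill B"},
			{state{0, s.b}, "empty A"},
			{state{s.a, 0}, "empty B"},
			{state{s.a - ab, s.b + ab}, "pour A into B"},
			{state{s.a + ba, s.b - ba}, "pour B into A"},
		}
		for _, m := range moves {
			if _, seen := from[m.to]; !seen {
				from[m.to] = edge{s, m.name}
				queue = append(queue, m.to)
			}
		}
	}
	return nil, false
}

func main() {
	for _, ml := range []int{100, 200, 220} {
		steps, ok := solve(175, 250, ml)
		fmt.Printf("%dml: possible=%v, steps=%v\n", ml, ok, steps)
	}
}
```

The search only discovers that 220ml is impossible after exhausting every reachable state, whereas the gcd argument rules it out immediately.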

A quick survey noted that the group who pointed out 220 was impossible without even trying had math based degrees (well, all of us had math based degrees), and the other group had spent more time in their careers programming. This suggests to me that there are two modes of thinking, a distinction I suspect the programming language theory community has long known: algorithmic thinking and machine-based thinking.

One mode of thinking is based on more abstract concepts like functions (Church and lambda calculus), and the other is based more on a concrete Turing machine (instructions in a linear fashion). One is not better than the other, in my opinion.

Ironically though, most of us in the group don’t use functional programming languages in our daily lives (all of us used Python, and most of us eschew the functional bits of it too). Though I did recommend using Haskell as a thinking tool for solving problems[3].

  1. [1]The long story was the question giver meant to say 225ml, not 220ml, but messed up the question
  2. [2]though I’d blame the alcohol I had imbibed by then
  3. [3]thinking of using Haskell in production and the cabal hell makes me want to just go cry into my shitty python virtualenv

Bloody Side Tracking Brain

This morning I woke up with Hungarian Dance No. 5 by Brahms stuck in my head. In my head, it’s superbly orchestrated, super high definition audio – much like sitting in the concert hall and being enveloped by the music of a live orchestra. I also have a very high quality copy of Hungarian Dance on my hard drive. I woke up at 5.45 a.m., went to the gym, and returned at 7 a.m. and showered. It’s now 8.30 a.m., and I have yet to listen to the Hungarian Dance.

Reason? This was my thought pattern upon finishing up on my shower:

Hm, I should go give Hungarian Dance a good listen and get the earworm out of my head. Oh, my headphones have deteriorated in quality now. The flac quality won’t be represented well by the headphones. Better go look and buy new ones. Oh this one looks nice, I should go budget for it… oh wait, you just wanted to listen to Hungarian Dance, why are you shopping for headphones??! [rinse and repeat]

And for the last hour, I’ve been shopping around for headphones. I now have a spreadsheet of headphones that are possible buys.

Ugh. I just need to get shit done. Not waste time with side tracks.

Batman v Superman – A Quick Thought

I watched BvS today. I don’t know what to think about it. Overall, I think the movie was a bit of a mess. But I can’t seem to pinpoint why. Breaking it down by the standard things that people use to judge movies, there doesn’t seem to be anything wrong.


Character-wise, I liked it quite a bit. Superman is at his Superman-est. Batman is also amongst the most Batmanest Batman I’ve encountered. For reference, I consider a lot of things to be Superman-in-character, but the epitome of it would be in Final Crisis, with the Miracle Machine and Nix Uotan. I consider the most Batman-in-Character Batman to be the tortured soul that Dr. Hurt put Batman through in Batman: Black Glove, culminating in Final Crisis and Batman RIP. The movie’s Superman and Batman are probably the closest to those two comic book counterparts ever – though I’d argue that Batman in the movie is a hyperextreme version of the comic book version. Nonetheless, character-wise, they’re very close to the comic book representations, so I have no qualms.

Yes, even with Batman killing – he has caused the death of others in the movie, but never once directly caused the death of a person. Even KGBeast – Bats could have shot KGBeast in the head, cementing a direct kill. Instead, he shot the gas tank that KGBeast was wearing, causing it to blow up. In the mind and morality of Batman, he’s NOT killing someone – it’s the extreme version of “I’m not going to kill you, but I’m not going to save you either”. You can argue it’s a copout, but it’s a very perfectly Batman-ish character decision. It’s this Batman that I can believe shooting a radion bullet at Darkseid.

Supes isn’t a boyscout in this movie – he isn’t even a boyscout in the comics (Captain Marvel has that honour). But Supes has always been about the sacrifice to do something right. He’s the Jesus character, who is almost always conflicted about his actions and their repercussions – the guy who keeps news clippings in his Fortress of Solitude on the people whom he failed. I think Henry Cavill’s Kal El does just that. That sad shot in the Capitol scene nailed it for me. He even says it – Superman’s just a dream of a farmer from Kansas. I can really imagine this Superman being the one who assembles the Miracle Machine and wishes a happy ending for everyone.

Even Jesse Eisenberg’s Luthor, I feel, was Lex Luthor at his core. I consider the All-Star Superman Luthor and Lex Luthor: Man of Steel to be the epitome of the characterization of Lex. The Lex who would not cure his sister’s cancer because he wanted to show up Superman, and simply cannot abide being anything but the best of Man – that’s the Lex Luthor I consider to be Lex Luthor. I mean, he was given an orange ring and became the god of Apokolips. He’ll become the Super Man (JL Rebirth spoilers). That’s the kind of person Lex is. He is a Machiavellian man who thrives in an environment of asymmetric information (though arguably you could say Bruce Wayne is more of that kind of person), and does whatever it takes to get what he wants. The mad scientist, the charming politician/businessman – that’s just the surface of who Lex Luthor is. Eisenberg, I feel, plays Lex with a different surface, but still holds the same core.

Lois and Diana were a bit like blank slates in the movie – they weren’t as central, but I felt their characterizations were decent enough for the movie. Wonder Woman was awesome – she got hit back by Doomsday and all she does is grin. Now THAT’s an Amazonian I can get behind.


And so we consider plot next – the plot of the movie was convoluted, but made quite a bit of sense given some thought. Here was Lex Luthor, a man so torn by the fact that he’s no longer the best – and he’s so used to getting what he wants, mind – that he plots to bring the Superman down. He does this the way a rational person does it: harm Superman’s reputation first – dishonor Superman in the eyes of the general public by making him show up in Namimia and pinning KGBeast’s actions on him. Then engineer a public outcry through the use of the government process. At the same time, like any rational person, he creates a backup plan, because Junebug wouldn’t play ball. He needs access to the government’s storage of the Kryptonian facilities, to gain more knowledge, because knowledge would grant him more power. His political machinations fail, so he gets rid of the evidence (blaming Superman along the way) and goes to his second plan – to set the Bat of Gotham against Superman. His power comes from asymmetric information – he knows the identities of Batman and Superman (as befitting the smartest man in the world), and is able to use that information to get the leverage he needs.

I think the problem was that the story was told from very different points of view. It starts as Bruce Wayne, anchoring the story with his narrative, but somehow that anchor is lost somewhere in the movie, after the chaotic scene switches. And the piecing together of the story is told through Lois, while Superman was given a very small subplot (the Bat of Gotham and his vigilante ways).

The reveal and payoff (the scene with Lois and Swannick and the bullet) was worth it though – it immediately made the seemingly random and disconnected cut scenes in the beginning of the movie feel more like reading the beginnings of a comic book arc (especially Grant Morrison’s work).


I do feel that the editing was quite bad. It was choppy. Narratives didn’t lead from one to another. But like I mentioned in the previous section, it felt a bit like a comic book arc – requiring you to compartmentalize different parts of the arc before a payoff (it’s also one of the reasons why I don’t pull weeklies and would rather read trades).

Against the justification that it was a bit more comic book like, I’m not so sure if the editing is genuinely bad or intentional. The only truly bad bit of editing was when Diana was reading the email – that scene just broke the tension of the Doomsday battle. Another jarring narrative transition that I didn’t quite enjoy was the lead in from Africa to the Congressional hearings.

cfgt, who watched the movie with me, thought the movie was too long and the Capitol scene could have been cut. I disagree – because that scene cemented Luthor as a guy who doesn’t fuck around. It also shows Superman being upset at himself for not being vigilant enough. I did feel that there were certain scenes that were not well explained, and characters not fully fleshed out. It doesn’t really hurt the plot though (although that may be bias, given I have a fairly vast knowledge of the DC universe).

I felt like the movie was too rushed, with not enough time devoted to the development of characters. Lex, for example, was noticeably a lot more deranged after emerging from the Kryptonian birthing matrix – as if Keelex told him something dark and foreboding was to come, and he had decided to take matters into his own hands to have a Super Man that he can control (Doomsday).


In conclusion – I have no idea what to feel about the movie. When you break it down and think about it, each component – acting, character, general plot, etc. – felt really good. It felt like the most comic-booky comic book movie I’ve ever watched, with some stellar acting. But on the whole, it was just a bummer. It felt heavy, and weighted, and joyless. There is a certain sense of doom throughout the movie – like a lead fog, weighing down upon the subconscious.

Perhaps that I have been compelled to actually write a blog post about Batman v Superman is a sign of denial – that I’m trying to actually convince myself it’s a good movie. Or perhaps, it’s a genuinely good movie that I feel is underworked. I have no idea.

p/s: Namimia? C’mon Zack, you could have used Khandaq, or Bialya, or Polokistan. Everyone’d have loved it even more.

Not Enough

Do you sometimes feel like you’re not smart enough, not strong enough… not _ enough to do your pursuits?

At what point do you give up? I am so tired. The alternative – not pursuing what I want to do… is worse.