On Twitter a couple of days ago I posted this tweet which had this image:
What To Test
You’re in a rush. Your product demo was due three months ago. And you still don’t have good unit tests. What do you do? You prioritize.
You wrote a very complex piece of machinery with a lot of moving parts. You sorta know how they all fit together in your mind, but there isn’t any good way to know if the program you wrote actually works the way you want it to. The solution: unit testing. But how to test, and where to start? What do you do? You prioritize.
This post is about how I prioritize what to test. I know how I write my programs, and I write them in a particular way. Feel free to adapt the following to your own process.
Since my program is made mostly of functions, I prioritize my tests by analyzing the function calls of the package/program. Some functions are called many times by different callers; others are top functions which call other functions, and are never called from outside the package. This forms a hierarchy of sorts: some calls are simply more important than others. Since I write mostly in Go nowadays, I’ll use Go as an example. The Go toolchain actually provides us with a lot of useful tools to analyze and prioritize tasks. The most important tool is the callgraph program. This is the invocation of the spell:
callgraph -algo=static -format=graphviz $(go list -tags=release -f '{{.GoFiles}}' | sed -ne 's/\[//p' | sed -ne 's/\]//p') | grep -P '\-\> "(PACKAGENAME|\(\*?PACKAGENAME\.)' | uniq > callgraph.dot
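As a rough illustration of how the resulting callgraph.dot could feed into prioritization, here is a minimal sketch that counts the distinct callers of each function and ranks the functions by that count. It assumes the file contains Graphviz edge lines of the form "caller" -> "callee". The helper names countCallers and rank and the sample edges are hypothetical, and ranking by caller count is only one possible heuristic, not necessarily the one used in the full post.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// edgeRE matches a Graphviz edge line of the form: "caller" -> "callee"
var edgeRE = regexp.MustCompile(`"([^"]+)"\s*->\s*"([^"]+)"`)

// countCallers returns, for each callee, the number of distinct callers.
// Lines that are not edges (for example the digraph header) are skipped.
func countCallers(dot string) map[string]int {
	seen := make(map[[2]string]bool)
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(dot))
	for sc.Scan() {
		m := edgeRE.FindStringSubmatch(sc.Text())
		if m == nil {
			continue
		}
		edge := [2]string{m[1], m[2]}
		if seen[edge] {
			continue // the same caller/callee pair was already counted
		}
		seen[edge] = true
		counts[m[2]]++
	}
	return counts
}

// rank returns callee names sorted by descending caller count,
// breaking ties alphabetically so the order is deterministic.
func rank(counts map[string]int) []string {
	names := make([]string, 0, len(counts))
	for n := range counts {
		names = append(names, n)
	}
	sort.Slice(names, func(i, j int) bool {
		if counts[names[i]] != counts[names[j]] {
			return counts[names[i]] > counts[names[j]]
		}
		return names[i] < names[j]
	})
	return names
}

func main() {
	// Hypothetical sample edges; in practice, read callgraph.dot instead.
	sample := `"pkg.A" -> "pkg.helper"
"pkg.B" -> "pkg.helper"
"pkg.A" -> "pkg.B"`
	counts := countCallers(sample)
	for _, name := range rank(counts) {
		fmt.Printf("%d\t%s\n", counts[name], name)
	}
}
```

Functions near the top of the list are called from the most places, so a bug in them would spread the furthest, which makes them reasonable candidates to test first.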
[Read More]
A Direct Way of Understanding Backpropagation and Gradient Descent
Backpropagation has been explained to death. It really is as simple as applying the chain rule to compute gradients. However, in my recent adventures, I have found that this explanation isn’t intuitive to people who want to just get shit done. As part of my consultancy job (hire me! Really, I need to pay the bills to get my startup funded), I provide a brief 1-3 day machine learning course to engineers who will maintain the algorithms that I designed. Whilst most of the work I do doesn’t use neural networks, recently there was a case where deep neural networks were involved. (People who think deep learning can solve every problem are either people with deep pockets aiming to solve a very general problem, or people who don't understand the hype. I have found that most businesses do not have problems that involve a lot of non-linearities. In fact, a large majority of problems can be solved with linear regressions.)
This blog post documents what I found useful for explaining neural networks, backpropagation and gradient descent. It’s not meant to be super heavy with theory. Think of it as an enabler for an engineer to hit the ground running when dealing with deep networks. I may gloss over some details, so some basic understanding of, or familiarity with, neural networks is recommended.
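To make the "chain rule to compute gradients" remark concrete, here is a minimal sketch for the smallest possible case: a single linear unit pred = w*x + b with squared-error loss L = (pred - y)^2. The chain rule gives dL/dw = dL/dpred * dpred/dw = 2(pred - y) * x, and gradient descent repeatedly nudges w and b against those gradients. The function names, toy data, learning rate and epoch count are all illustrative assumptions and are not taken from the full post.

```go
package main

import "fmt"

// grads returns dL/dw and dL/db for L = (w*x + b - y)^2,
// computed by applying the chain rule through pred = w*x + b.
func grads(w, b, x, y float64) (dw, db float64) {
	pred := w*x + b
	dLdPred := 2 * (pred - y) // dL/dpred
	dw = dLdPred * x          // dpred/dw = x
	db = dLdPred              // dpred/db = 1
	return dw, db
}

// fit runs plain per-sample gradient descent for a fixed number of epochs,
// starting from w = 0 and b = 0.
func fit(xs, ys []float64, lr float64, epochs int) (w, b float64) {
	for e := 0; e < epochs; e++ {
		for i := range xs {
			dw, db := grads(w, b, xs[i], ys[i])
			w -= lr * dw // step against the gradient
			b -= lr * db
		}
	}
	return w, b
}

func main() {
	// Toy data generated from y = 2x + 1 (an assumption for illustration).
	xs := []float64{0, 1, 2, 3}
	ys := []float64{1, 3, 5, 7}
	w, b := fit(xs, ys, 0.05, 2000)
	fmt.Printf("w=%.3f b=%.3f\n", w, b)
}
```

A deep network repeats the same idea layer by layer: each layer multiplies the gradient arriving from above by its own local derivative and passes the result down.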
[Read More]
How To Make Money
Hey Chewxy, what do you think will happen if one day everyone decides to move their money onto a blockchain and no longer need banks?
That was a question that a friend asked me last week. I thought about the situation and gave some answers based on what I understood of the world and the economy, while sketching out, in broad strokes, what would happen. Essentially the conclusion was “civil unrest and war breaks out” (there were other conclusions too; I give the alternatives at the end of the blog post).
Then came time to organize Sydney Python. Due to clashing meetup dates with Data Science Sydney, Girl Geek Sydney and other groups, there was a dearth of speakers. So I stepped up and gave a talk based on the hypothetical question. Here are the slides:
The code can be found in the economics simulation GitHub repository.
[Read More]
Gorgonia
I released Gorgonia on Thursday. Gorgonia is a library like Theano or TensorFlow, but mainly written in Go. It provides the necessary primitives for creating and executing neural networks and machine learning algorithms.
According to cloc, these are the stats:
chewxy@chewxy-Gallifrey:~/workspace/goworkspace7/src/github.com/chewxy/gorgonia$ cloc .
     357 text files.
     321 unique files.
     604 files ignored.
http://cloc.sourceforge.net v 1.60  T=0.83 s (296.5 files/s, 55471.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Go                             219           6308           3924          30858
Assembly                        22            585            740           2128
C/C++ Header                     2             55             57            666
C                                2             17             39            458
-------------------------------------------------------------------------------
SUM:                           245           6965           4760          34110
-------------------------------------------------------------------------------
So, it’s a pretty huge library. But the original version was about 80,000 LoC (though most of those lines of code were different experimental variations of assembly code). I managed to cut about 50,000 LoC to get it down to something more manageable. In this post I want to outline the release of Gorgonia, share some of the reasoning behind the design of the library, and go through some of the weirdness found in the library.
If you’re interested, here’s the video (otherwise, skip to the meat):
And here are the slides:
[Read More]
On the memory alignment of Go slice values
On Thursday I decided to do some additional optimization to my Go code. This meant writing some assembly to get some of the AVX goodness into my program (I once gave a talk on the topic of deep learning in Go, where I touched on this issue). I am no stranger to writing assembly in Go, but it’s not something I touch very often, so it can take me a while to remember how to do certain things. This is one of them. So this blog post is mainly to remind myself of that.
The values in Go slices are 16-byte aligned. They are not 32-byte aligned.
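One way to check the alignment of a slice's backing array is to look at the address of its first element. The sketch below is only an illustration: the helper names isAligned and dataAddr are hypothetical, and what it prints depends on the Go version, the platform and the size of the allocation, so a single run that happens to report 32-byte alignment does not make it a guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether addr is a multiple of n.
// n is assumed to be a power of two (16, 32, ...).
func isAligned(addr, n uintptr) bool {
	return addr&(n-1) == 0
}

// dataAddr returns the address of the first element of a non-empty slice.
func dataAddr(s []float64) uintptr {
	return uintptr(unsafe.Pointer(&s[0]))
}

func main() {
	s := make([]float64, 1024)
	addr := dataAddr(s)
	fmt.Printf("address:         %#x\n", addr)
	fmt.Printf("16-byte aligned: %t\n", isAligned(addr, 16))
	fmt.Printf("32-byte aligned: %t\n", isAligned(addr, 32))

	// A sub-slice starts wherever its first element lives, so s[1:] begins
	// 8 bytes after s and cannot share s's 16-byte alignment.
	sub := s[1:]
	fmt.Printf("sub-slice 16-byte aligned: %t\n", isAligned(dataAddr(sub), 16))
}
```

This matters for hand-written AVX code because aligned 256-bit loads and stores require 32-byte aligned addresses, so assembly that receives arbitrary slices generally has to use the unaligned variants or handle the unaligned head separately.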
[Read More]
Naming Things Poorly
Go Test Files Are Part of the Same Package
Addendum/Errata for “Monads, In My Python?”
I gave a talk about monads at PyConAU. This blog post contains some thoughts about the talk, and some addenda/errata that I was not able to cover in the talk. But first, here’s the talk and associated slides.
[Read More]