Go is a Pretty Average Language

But OCaml is Pretty Great

I had a data visualization problem at work. I had been thinking about set coverage issues and wanted to test some ideas for visualizations - specifically, I wanted to visualize the space of aggregate measures (things like means, etc). It later transpired that I didn’t need it, because my thinking around the issue had been wrong to begin with. But I had written some code and was eager to try it out. By the end it had morphed into something entirely different, but it made for an entertaining night last night nonetheless. [Read More]

A Most Vivid Dream

I have had such a vivid and unusual dream that I feel compelled to note it down for posterity - or for future self-introspection. It’s too long to write in my notebook, and I feel more comfortable typing it out anyway. In the dream I was at a friend’s birthday party. It had gone particularly wrong - I was the organizer, and I had arranged the logistics to cater for two children. [Read More]

Deceptively Simple Is Deceptively Simple

I recently used the words “deceptively simple” to describe lambda calculus. One person reading my paper sent a comment back: “do you mean ‘deceptively complicated’?”. What I had meant was that lambda calculus looks simple on the surface. But once you consider all the meta conditions of alpha-renaming and substitution - you know, the practical things about programming - it isn’t. It turns out the phrase “deceptively simple” requires disambiguation - it can carry either of two diametrically opposite meanings: [Read More]
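To see where the simplicity evaporates, here’s a rough sketch (illustrative Go of my own, not anything from the paper) of naive substitution over lambda terms; the commented bug is exactly the variable-capture problem that alpha-renaming exists to fix.

```go
package main

import "fmt"

// Term is a minimal, hypothetical encoding of untyped lambda calculus terms.
type Term interface{ isTerm() }

// Var is a variable, Lam a lambda abstraction, App an application.
type Var struct{ Name string }
type Lam struct {
	Param string
	Body  Term
}
type App struct{ Fn, Arg Term }

func (Var) isTerm() {}
func (Lam) isTerm() {}
func (App) isTerm() {}

// subst naively replaces free occurrences of name in t with val.
// It is wrong in general: substituting under a binder can capture free
// variables of val - exactly the meta condition (alpha-renaming) that makes
// the calculus less simple than it first looks.
func subst(t Term, name string, val Term) Term {
	switch t := t.(type) {
	case Var:
		if t.Name == name {
			return val
		}
		return t
	case Lam:
		if t.Param == name {
			return t // name is shadowed under this binder; stop here
		}
		// BUG: if t.Param occurs free in val, it gets captured here.
		return Lam{Param: t.Param, Body: subst(t.Body, name, val)}
	case App:
		return App{Fn: subst(t.Fn, name, val), Arg: subst(t.Arg, name, val)}
	}
	return t
}

func main() {
	// (\y. x)[x := y] naively yields \y. y, wrongly capturing the free y.
	fmt.Printf("%#v\n", subst(Lam{Param: "y", Body: Var{Name: "x"}}, "x", Var{Name: "y"}))
}
```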

Namespaces Are Useful

Or: How I Got Bitten by the Dot Import Gotcha

I was extending Gorgonia for a project of mine when I rapidly ran into a dot-import gotcha in Go. Specifically, I was trying to implement a fused version of the existing Conv2d function. The current Conv2d function works well if you want to do image-convolution-related work. It could be quite a bit faster (if anyone from Intel is reading, I’d love some help, in the same way Intel boosted the speeds of Caffe), but that’s not really the concern - different convolution algorithms have different performance characteristics, and should be used accordingly. [Read More]
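For anyone who hasn’t hit the gotcha themselves, here’s a minimal, hypothetical reproduction (not the actual Gorgonia code) of how a dot import dumps a package’s exported names into the local namespace:

```go
package main

import (
	"fmt"
	. "math" // dot import: Pi, Sqrt, Max, ... become bare names in this file
)

func main() {
	// Reads fine - but whose Sqrt and Pi are these? If this file later
	// declares its own Max or Sqrt, the compiler reports a redeclaration
	// error at the declaration, not at the innocuous-looking import.
	fmt.Println(Sqrt(2) * Pi)
}
```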

Do You Need Deep Learning?

Coming from a guy who has his own deep learning library that aims to rival TensorFlow and PyTorch, the answer is: “chances are, no”. Around this time last year I was running a startup, and as a side hustle I was doing consulting work for any parties interested in machine learning. I get a lot of requests from businesses that want to empower themselves with “AI”. The results from any successful consults*Don't let terms like " [Read More]

How To Use Go Interfaces

I occasionally give free Go consults and code reviews on top of my daily work. As such, I tend to read a lot of other people’s code. And while this is really more of a feeling (an aside: “now, you should go, really? You’re a statistician by training, ffs”), I’ve seen an increase in what I call “Java-style” interface usage.

This blog post is a Go-specific recommendation from me, based on my experience writing Go code, on how to use interfaces well.

For this blog post, the running example will span two packages: animal and circus. A lot of what I write about here is about code at the boundary of packages.
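As a taste of that boundary, here’s a hypothetical single-file sketch (the identifiers are mine, not necessarily the post’s) of the animal/circus split, using the common Go idiom of letting the consumer declare the small interface it needs:

```go
package main

import "fmt"

// --- would live in package animal: export concrete types, not interfaces ---

// Lion is a concrete type; it carries no knowledge of who will use it.
type Lion struct{ Name string }

func (l Lion) Roar() string { return l.Name + " roars" }

// --- would live in package circus: the consumer declares what it needs ---

// performer lists only the behaviour circus actually uses, so animal never
// needs to know that circus exists.
type performer interface {
	Roar() string
}

func perform(p performer) { fmt.Println(p.Roar()) }

func main() {
	perform(Lion{Name: "Leo"}) // Lion satisfies performer implicitly
}
```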

[Read More]

Tuples Are Powerful

In this post I lay out the unjustifiable reasons why Gorgonia lacks tuple types. Along the way we revisit the idea of constructing integer types from natural numbers using only tuples and the most basic functionality. I then close this blog post with further thoughts about computation in general and what it holds for Gorgonia's future.

Over the Chinese New Year celebrations, a friend asked (again) about the curious lack of a particular feature in Gorgonia, the deep-learning package for Go: tuples. That led to this tweet (which no one else found funny :( ).

The missing feature is one that I had vehemently objected to in the past. So vehement was my objection that by the first public release of Gorgonia, there was only one remaining reference that it had ever existed (by the time I released Gorgonia to the public, I had worked through three versions of the same idea).
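For a flavour of the construction mentioned above - naturals built from tuples alone, and integers as pairs of naturals - here’s a rough, hypothetical sketch in Go (mine, not Gorgonia code):

```go
package main

import "fmt"

// Nat encodes a natural number as a nested tuple: the empty tuple is zero,
// and a 1-tuple wrapping a natural is its successor.
type Nat struct{ Pred *Nat } // nil Pred means zero

func Zero() Nat      { return Nat{} }
func Succ(n Nat) Nat { return Nat{Pred: &n} }

// toInt converts the tuple encoding back to a machine integer for display.
func toInt(n Nat) int {
	if n.Pred == nil {
		return 0
	}
	return 1 + toInt(*n.Pred)
}

// Int encodes an integer as a 2-tuple of naturals (Pos, Neg), read as Pos - Neg.
type Int struct{ Pos, Neg Nat }

func value(i Int) int { return toInt(i.Pos) - toInt(i.Neg) }

func main() {
	two := Succ(Succ(Zero()))
	three := Succ(two)
	fmt.Println(value(Int{Pos: two, Neg: three})) // prints -1
}
```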

[Read More]

Term Rewriting Chinese Relatives

Learn Chinese AND Functional Programming At the Same Time

I recently attended QFPL’s excellent Haskell course. Tony Morris was a little DRY (it’s a joke: Tony kept mentioning Don’t Repeat Yourself and being lazy), but was nonetheless an excellent presenter (the course shook my confidence in my existing ability to reason in Haskell for a bit, but it was for the better - I had some fundamentals that were broken, and Tony explained some things in a way that fixed them... for now - I have no doubt some basics will be lost to the ether in the next few months). So for the rest of the week I was in a bit of an equational-reasoning mode.

Then my dad sent me a cute link to a calculator that computes the vocatives for Chinese relatives. Given that English is my first language (and hence Chinese is not my default mode of thinking), this kicked off a chain of thoughts about languages and symbols (you’d find a strong correlation between my switching modes of thinking and blog posts - the last time this happened, I wrote about yes and no).

One of the difficult things that many people report with programming languages is the decoupling of syntax and semantics. I’ve often wondered if we might be better off with a syntax based on symbols (rather like APL) - the initial hurdle might be higher, but once that’s done, syntax and semantics are completely decoupled. Then we’d not have flame wars over syntax, but rather a more interesting flame war over semantics and pragmatics.

Another line of thinking I had was about the hypothetical development of computing and logic in a parallel universe where Chinese was the dominant linguistic paradigm - one that I’ve entertained since I visited China for the first time.

Combined, these trains of thought led to this blog post. So let’s learn some Chinese while learning some (really restricted) functional programming! Bear in mind it’s a very rough, unrigorous version.
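As a tiny taste of what term-rewriting Chinese relatives might look like, here’s a rough sketch of my own (the rules are mine and nowhere near complete): relation chains are rewritten until no rule applies.

```go
package main

import (
	"fmt"
	"strings"
)

// rules maps a chain of relations to its vocative; they are applied
// repeatedly until no rule fires (a fixed point).
var rules = [][2]string{
	{"爸爸的爸爸", "爷爷"}, // father's father -> paternal grandfather
	{"爸爸的妈妈", "奶奶"}, // father's mother -> paternal grandmother
	{"妈妈的爸爸", "外公"}, // mother's father -> maternal grandfather
	{"妈妈的妈妈", "外婆"}, // mother's mother -> maternal grandmother
}

// rewrite applies the rules leftmost-first until the term stops changing.
func rewrite(term string) string {
	for {
		next := term
		for _, r := range rules {
			next = strings.Replace(next, r[0], r[1], 1)
		}
		if next == term {
			return term // fixed point reached
		}
		term = next
	}
}

func main() {
	fmt.Println(rewrite("妈妈的妈妈")) // prints 外婆
}
```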

[Read More]

Go For Data Science

This may come as a surprise to many people, but I do a large portion of my data science work in Go. I recently gave a talk on why I use Go for data science. The slides are here, and I’d also like to expand on a few more things after the jump:

[Read More]

Data Empathy, Data Sympathy

Today’s blog post will be a little on the light side as I explore the various things that come up in my experience working as a data scientist.

I’d like to consider myself to have a fairly solid understanding of statistics (I would think it’s accurate to say that I may be slightly above average in statistical understanding compared to the rest of the population). A very large part of my work can be classified as stakeholder management - and this means interacting with other people who may not have as strong a statistical foundation as I have. I’m not very good at it, in the sense that people often think I am hostile when in fact all I am doing is questioning assumptions (I get the feeling people don’t like it, but you can’t get around questioning assumptions).

Since the early days of my work, there’s been a feeling that I’ve not been able to put into words when dealing with stakeholders. I think I finally have the words to express it. Specifically, it was the transference of tacit knowledge that bugged me quite a bit.

Consider an example where the stakeholder is someone who has been working in the field for quite some time. They don’t necessarily have the statistical know-how when it comes to dealing with data, much less the rigour that comes with statistical thinking. More often than not, decisions are driven by gut feel based on what the data tells them. I call these sorts of processes data-inspired (as opposed to data-driven) decision making.

These gut feelings about data can be right or wrong. The stakeholders learn from them, and that learning becomes experiential knowledge - or what economists call tacit knowledge.

The bulk of the work is of course transitioning an organization from being data-inspired to becoming actually data-driven.

[Read More]