Go Test Files Are Part of the Same Package

Just a quick one. I was working on improving the performance of a certain method of mine. I had found the hot loop[1], and I wrote a few benchmark functions to test some ideas.

I was using the testing package’s benchmarking facilities to benchmark the methods, and for one method I had abstracted out some code so that it could run in a goroutine. Here’s the code for the function:


func receiver(ch chan tmp, out chan []float64, wg *sync.WaitGroup) {
    Ys := make([]float64, len(x))
    for v := range ch {
        Ys[v.id] = v.res
        wg.Done()
    }
    out <- Ys
}

Spotted the problem? No? It's the second line: Ys := make([]float64, len(x)). You'll note that x isn't declared anywhere in this file. And yet, the benchmarks ran! I only ran into a problem when I fed the benchmark a weird corner case where I knew funny things would happen.

At first I was puzzled as to why the compiler hadn't caught it. I scoured both files (it was a throwaway package, written solely to test ideas, so it had only two files: throwaway.go and throwaway_test.go).

Here are the top few lines of my throwaway_test.go file:


package throwaway

import (
    "testing"
    "math/rand"
)

var x []float64 = X(784) // <-- THE DECLARATION THAT CAUSED PAIN
func init() { ... }

I had added that variable originally as part of a setup/teardown function, and had then completely forgotten about it. I had been so used to treating the *_test.go files as if they were a separate package that merely imported the functions under test that I forgot they are part of the same package.

The lesson learned today, other than that global variables are evil[2], is that variables declared in the test files can affect the main files if you're not careful, because the test files are part of the same package.

Now I want my hour lost back!

Addendum

As Damian kindly points out:

The above only really happened because I was running go test -bench . a lot. If I had used go build . instead, the compiler would have thrown an error.
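
Another guard is to put throwaway benchmarks in an external test package: Go allows a *_test.go file in the same directory to declare package throwaway_test, and anything declared there can no longer leak into the implementation package. A minimal sketch of what that might look like – the import path and the BenchmarkReceiver body here are made up for illustration:

package throwaway_test // compiled separately from package throwaway

import (
    "testing"

    "example.com/throwaway" // assumed import path for the throwaway package
)

// This x belongs to throwaway_test, so nothing in throwaway.go can see it –
// go test would now refuse to compile the undeclared x inside receiver.
var x []float64 = throwaway.X(784)

func BenchmarkReceiver(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = x // hypothetical benchmark body
    }
}

The trade-off is that an external test package can only reach the exported identifiers of the package under test.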

  1. [1] pprof is an amazing godsend. The days of dicking around in valgrind or cProfile are a distant memory
  2. [2] but sometimes they are necessary, as I would argue they are for this specific test case

Whole Fruit Espresso

I’ve been toying around with new ideas for coffee lately. Here is one that I think went particularly well. It started with red-eyes: you put a shot of espresso in filter coffee to boost the acidity and body of the coffee whilst still keeping its basic aromatics (making espresso kills quite a few of those).

I then moved on to the idea of making cascara red-eyes. If red-eyes were flavourful, perhaps using the pulp of the fruit would yield a different thing altogether? And indeed it did. The hibiscus-y nature of the cascara tea does accentuate the espresso. Then I wondered if I could push it further – what if the cascara “tea” was made under pressure – i.e. as espresso?
Continue reading

Intuitions From The Price Equation

George Price was a rather interesting fellow. A few months ago, I read an interesting piece about his life via HN. If you follow my blog posts (hello to the two of you), you’ll note that altruism and cooperative games are among the things I like to blog about.

Following that article, I discovered the Price equation[1]. While grokking the equation, it suddenly occurred to me that kin selection and group selection were indeed the same thing. It was a gut feeling, and I couldn’t prove it one way or the other.

So what I told you was true... from a certain point of view

I recently had a lot of time on my hands[2], so I thought I’d sit down and try to make sense of my gut feeling that kin selection and group selection were in fact the same thing. Bear in mind I’m neither a professional mathematician nor a professional biologist. I’m not even an academic, and my interest in the Price equation comes from an armchair economist/philosopher point of view. And so, while I grasp a lot of concepts, I may actually have understood them wrongly. In fact, be forewarned that this entire post is the result of me stumbling around.

So, let’s recap what the Price equation looks like (per Wikipedia):

\Delta \bar{z} = \frac{1}{\bar{w}} \operatorname{cov}(w_i, z_i) + \frac{1}{\bar{w}} \operatorname{E}(w_i \Delta z_i)

Simply put, \Delta \bar{z} is the change in the average phenotype between a parent population and its child population. And that change is the sum of two terms:

  1. The selection term – the covariance of fitness and phenotype, \frac{1}{\bar{w}} \operatorname{cov}(w_i, z_i), where \bar{w} is the average fitness of the population, w_i is the fitness of individual (or group) i, and z_i is the phenotype shared in group i.
  2. The transmission term – \frac{1}{\bar{w}} \operatorname{E}(w_i \Delta z_i), the fitness-weighted expected change in phenotype between each group i and its own offspring (a small numerical check in Go follows this list).
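
To make the two terms concrete, here is a tiny numerical sanity check in Go – the three parents, their fitnesses and their offspring phenotypes are numbers made up purely for illustration – showing that the change in mean phenotype computed directly matches the sum of the covariance and expectation terms:

package main

import "fmt"

func main() {
    // Toy population: phenotype z[i] and fitness w[i] (offspring count) for each
    // parent, plus the mean phenotype of that parent's offspring.
    z := []float64{1.0, 2.0, 3.0}
    w := []float64{1.0, 2.0, 3.0}
    zOff := []float64{1.1, 2.0, 2.9} // offspring differ slightly: transmission bias

    n := float64(len(z))

    // Parent-generation averages.
    var wBar, zBar float64
    for i := range z {
        wBar += w[i] / n
        zBar += z[i] / n
    }

    // Child-generation mean phenotype: each parent contributes w[i] offspring.
    var zBarChild, wTotal float64
    for i := range z {
        zBarChild += w[i] * zOff[i]
        wTotal += w[i]
    }
    zBarChild /= wTotal

    // The two Price terms: selection (covariance) and transmission (expectation).
    var cov, transmission float64
    for i := range z {
        cov += (w[i] - wBar) * (z[i] - zBar) / n
        transmission += w[i] * (zOff[i] - z[i]) / n
    }

    fmt.Printf("change in mean phenotype, directly:  %.4f\n", zBarChild-zBar)
    fmt.Printf("change in mean phenotype, via Price: %.4f\n", cov/wBar+transmission/wBar)
}

Both lines print 0.3000: the covariance term does the “selection” work, and the expectation term mops up the small parent-to-offspring drift.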

Continue reading

  1. [1] Funny story. I was quite surprised I hadn’t heard of the Price equation, so I hit the books. I found the equation being referenced very very very very briefly in Martin Nowak’s Evolutionary Dynamics, and that was all
  2. [2] Being laid off does that to you :)

The Skynet Argument Against Social Media

In The Terminator (1984), Skynet sends a T-800 to terminate Sarah Connor. The Terminator had to look up a phone book to find three Sarah Connors, mainly because it didn’t know what Sarah Connor looked like or where she lived.

That made sense in 1984: records could plausibly have been destroyed in the war, because drives were physical, expensive and didn’t hold much. Skynet wouldn’t have known what Sarah Connor looked like, or any of her other personal details. Rewatching The Terminator in 2015, though, the premise no longer holds up. If Skynet were made today, it would simply scour the cloud for information about Sarah Connor. And she’d be cleanly terminated.

There you go, kids. Don’t use social media. Arnold Schwarzenegger and the T-1000 will come kill you.

Algorithms Are Chaotic Neutral

Carina Zona gave the Sunday keynote for PyConAU 2015. It was a very interesting talk about algorithms and the ethics of mining insights from data. She gave examples of data-mining fails – situations where Target discovered a teenage girl was pregnant before her parents even knew, or where machine-learned Google search results implied black people were more likely to be arrested. It was her last few points, about the ethical dilemmas that may occur, that caught my interest, and it is these points that I want to focus the discussion on.

One of the key points that I took away[1] was that the newer and more powerful machine learning algorithms out there inadvertently discriminate along the various power axes (think race, socioeconomic background, gender, sexual orientation, etc.). There was an implicit notion that we should be designing better algorithms to deal with these sorts of biases.

I have experience designing these things, and I quite disagree with that notion. I noted on Twitter that in those examples, the machine learning algorithms were basically exposing/mirroring what they had learned from the data.

Carina did point out that the data itself is biased – for example, film stock in the 1950s was tuned for fairer skin, and therefore the amount of photographic data for darker-skinned people is lacking[2].

But before we dive in deeper, I would like to bring up some caveats:

  • I very much agree with Carina that we have a problem. The point I’m disagreeing on is how we should go about fixing it
  • I’m not a professional ethicist, nor am I a philosopher. I’m really more of an armchair expert
  • I’m not an academic dealing with the topics – I consider myself fairly well read, but I am by no means an expert.
  • I am moderately interested in inequality, inequity and injustice, but I am absolutely uninterested in the squabbles of identity politics, and I only have a passing familiarity with the field.
  • I like to think of myself as fairly rational. It is from this point of view that I’m making my arguments. However, in my experience I have been told that this can be quite alienating/uncaring/insensitive.
  • I will bring my biases to this argument, and I will disclose my known biases wherever possible. However, I may well have missed some, so please tell me.

Continue reading

  1. [1] not necessarily the key points she was trying to communicate – it could just be I have shitty comprehension, hence rendering this entire blogpost moot
  2. [2] This NPR article seems to be the closest reference I have, which by the way is fascinating as hell.

Operator Overloading With Right Associativity In Python

It’s actually quite fun that after years of using something, you can still find new ways of doing things. At the last Sydney Python meetup, there were demonstrations of how Python’s special methods let objects hook into the language’s operators.

Consider this for example:


class Blah(object):
    ''' only the relevant methods shown '''
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # skips checks and stuff
        return self.value + other

>>> b = Blah(2)
>>> b + 2 
4

However, my friend Julian pointed out that the other way around wouldn’t work – the overloaded operator only applies when the Blah instance is on the left-hand side:


>>> b = Blah(2)
>>> 2 + b

Traceback (most recent call last):
  File "", line 1, in 
TypeError: unsupported operand type(s) for +: 'int' and 'Blah'

Last night, as I was preparing my slides and code for my PyConAU talk, I accidentally stumbled upon this. More specifically, I found out about the __radd__, __rmul__, etc. methods.

So, if you implement both the __add__ and __radd__ methods, it also works with your object on the right-hand side:


class Blah(object):
    ''' only the relevant methods shown '''
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # skips checks and stuff
        return self.value + other

    def __radd__(self, other):
        # called when Blah is on the right-hand side
        return self.__add__(other)

>>> b = Blah(2)
>>> b + 2 
4
>>> 2 + b
4


Here’s Julian’s proof of concept showing how the ambiguous case is resolved:


class Multiplier(object):
    def __init__(self, description):
        self.description = description
 
    def __mul__(self, b):
        print ("__mul__ was called on {0}".format(self.description))
 
    def __rmul__(self, b):
        print ("__rmul__ was called on {0}".format(self.description))
 
    def __int__(self):
        return 43
 
 
 
a = Multiplier("a")
b = Multiplier("b")
 
# Confirm Chew's finding still works.
a*5
5*a

# Which gets priority in this ambiguous situation? Turns out __mul__ does.
a*b

# But, we can force it.
int(a)*b
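
For the record, those four expressions print __mul__ was called on a, __rmul__ was called on a, __mul__ was called on a, and __rmul__ was called on b, in that order: Python only falls back to the right operand's __rmul__ when the left operand's __mul__ returns NotImplemented, which is what int does when handed a Multiplier.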

So, there you go… kinda cool, eh?

Writing… Again

This blog has been awfully silent the past year. I guess now that my job has been made redundant, I’m going to return to writing more.

Hah! Here’s to hoping!

Designing SquatCoach

A few months ago, I blogged about my frustrations with logarithmic progression in weightlifting. I highly enjoy linear progressions – who doesn’t enjoy work that is easy? But I was wrong about one thing: I hadn’t actually hit the logarithmic part of the progression. In fact, as at the time of writing this blog post, I am still firmly in the linear progression phase.

So what went wrong? The answer is form. I was basically squatting with exceedingly poor form, using all kinds of stabilizer muscles in an unbalanced way that left me injured often. I took notes and noticed that it was at around 55 to 60 kg that I kept getting injured, and hence the weights I squatted lingered around there. There is an old saying that goes: “Practice Makes Perfect”. That is wrong. The phrase that should really be passed around is “Perfect Practice Makes Perfect”.

The breakthrough came when I got my partner to record me squatting for the first time. I had religiously read /r/fitness and /r/formcheck, so I had a fairly good idea of what good form is. I thought I had good form – I didn’t. One of the first things I noticed was that I wasn’t squatting anywhere near deep enough, despite having thought all along that I was doing an ass-to-grass squat.

After years spent seated in front of a computer, I had no spatial awareness of how deep I was squatting. I had to learn what a deep squat was (developing the flexibility to do one is a tale of its own). I taught my partner how to check for correct form: the hip crease must go lower than the top of the kneecap for it to count as a good squat. And so she began to spot me. But this wasn’t fair to her, as it was eating into her training time. So after a couple of sessions, I went about developing an app that used computer vision to determine whether I was squatting with good form.

The thing about computer vision is that while it’s easy to get started, accuracy is a Difficult goal with a capital D. One can spend a great deal of effort to boost the accuracy by a very minuscule amount. I cut down a lot of that by using various hacks, like coloured sticker dots on the hip crease, knee and barbell tip, to increase the accuracy of the app. By and large, I got it working – for me. But it wasn’t working for my partner, or for a colleague who had become interested in the app (he had separately approached me about the feasibility of an idea similar to SmartSpot, whose idea I love). The killing blow, I think, was that I had irritated some fellow gym-goers by wrapping a gorillapod around their racks or bars in order to set up a static filming point.
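
For what it’s worth, the depth check itself is tiny once the dots are tracked. Here’s a rough sketch in Go – the types, names and the assumption that some tracker hands you per-frame pixel coordinates for the markers are all mine for illustration, not the actual app’s code:

package main

import "fmt"

// Point is a tracked marker position in image coordinates (y grows downward).
type Point struct {
    X, Y float64
}

// deepEnough reports whether the hip-crease marker has dropped below the top
// of the kneecap marker – the form rule described above.
func deepEnough(hipCrease, kneecapTop Point) bool {
    return hipCrease.Y > kneecapTop.Y // a larger Y means lower in the frame
}

func main() {
    fmt.Println(deepEnough(Point{100, 420}, Point{140, 400})) // true: hip crease below the kneecap
    fmt.Println(deepEnough(Point{100, 380}, Point{140, 400})) // false: not deep enough yet
}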

And so it transpired that I would need a new app. The app would have to do these things in order to teach me better squat form:

  • Monitor my form as I squat
  • Inform me when I have hit good form
  • Only one person involved – no interfering with anyone else in the gym

Continue reading

The Bane of Communicating Succinctly

You may have noticed I have not blogged for a while. And if you follow me on Twitter, you’ll note that my tweet rate has also dropped.

Ever increasingly, I find the need to share ideas that cannot be succinctly communicated in a pithy sentence or two. I have a lot of what I consider to be “dangerous” ideas (in the vein of the Festival of Dangerous Ideas), and I think it is imperative to be clear about them.

And so I would sit down and write a blog post about one, only for it to derail into some mega-long essay that at best reads like mindless rambling (for example, see my previous post). It is in these cases that I sometimes feel I’m better off not writing. But sometimes I get passionate about a topic, and start writing a lot.

Then midway through, I’d lose steam. Here are some examples of titles in my archive that went nowhere:

  • Why Do Ceramics Heat Up in Microwave Ovens (3997 words and I lost steam) – this article began life exactly a year ago today
  • Making Friends – A Rant (1301 words, and still incomplete, as I’m still gathering data, though I’m quite sure I’ll lose steam on that too)
  • Scrambled Eggs, The Guide (1514 words, lost steam already)
  • Logarithmic (A musing on non linear progression of things)
  • Track (A musing on being on track for a plan, and why sometimes it’s ok to let go)
  • Graveyard of Sideprojects (originally written when there was a craze over having side projects. I have 200+ side projects that I have not touched for years)
  • The Virtuous Molecule (a blog post about the fallacies of natural products)
  • Reviews: I have 3 book reviews and 2 movie reviews in my drafts, and they are going nowhere

I have since concluded that it’s the length that makes me lose steam.

Yesterday I read Evan Miller’s Four Days of Go. The takeaway is that I wish I could write like him. I actually felt envious that he was able to get his point across directly, and still not be dry.

I have a problem with communicating succinctly. I look at all the work emails that I send out – most explanation-type emails have graphs, definitions and all sorts of background material. Even when I highlight the key takeaway points, written in plain English, they are sometimes missed.

…[P]erfection is attained not when there is nothing more to add, but when there is nothing more to remove

So says the oft-quoted Antoine de Saint-Exupéry.

The problem is I don’t know what to take away. I don’t know what to remove. The typical advice of how to improve writing is to “write more”. I’ve written this blog for more than 10 years in one form or another. I actually need to know how to improve, not just write more.

Argh.