A Story of a Fat Go Binary

Small deliverables are important for CD scenarios and low-resource applications. In my case, I'm building an agent that should run on different kinds of low-resource devices - low RAM, low CPU, and honestly, I can't predict how low.

Go binaries are small and self-contained: when you build a Go program, the resulting single binary is all there is. This is in contrast to platforms like Java, Node.js, Ruby, and Python, where your code may feel small, but there's a mountain of dependencies behind it that you need to pack up for a self-contained deliverable.

While having a self-contained binary is an important convenience, Go doesn't have a built-in way to reason about the size of dependencies, so that you can make informed decisions about including them or not.

In this post we'll introduce gofat, a tool that breaks down the dependency sizes of a Go project.

Building an IoT Agent

Let's breeze through a story that shows how to reason about and build a service - an IoT agent - that we'll deploy to a modest hardware device somewhere across the globe. This will highlight the architecture of such an agent from an operations point of view.

First, we want good CLI ergonomics, so we'll use kingpin, a POSIX-compliant CLI flags/options library; it's been such a great library that I carry it into most of my projects by default.

In fact, I'll use my go-cli-starter project that includes it:

$ git clone https://github.com/jondot/go-cli-starter fattyproject
Cloning into 'fattyproject'...
remote: Counting objects: 55, done.
remote: Total 55 (delta 0), reused 0 (delta 0), pack-reused 55
Unpacking objects: 100% (55/55), done.

As an agent, we want to stay up at all times. For this exercise, we'll do that with a dummy loop that does busywork.

for {
    f := NewFarble(&Counter{})
    f.Bumple()
    time.Sleep(time.Second * 1)
}

Long-running processes accumulate cruft — small traces of memory leaks, forgotten open file descriptors — and even the smallest of leaks becomes huge when we're talking year-long uptimes.

You'd be happy to know that Go has a built-in metrics and health facility called expvar. It's perfect for exposing the agent's inner workings: since an agent is long-running, sometimes we'll have forensics sessions where we want to understand how an agent is doing - memory, GC cycles and so on. expvar can expose these for us, and expvarmon is a pretty cool way to watch them.

To use expvar we need a magic import. Magic because importing it for side effects registers a metrics endpoint on the default HTTP mux. This also means we need to have an HTTP server up and running, so we'll take that from net/http.

import (
    _ "expvar"
    "net/http"

    :
    :
)

go func() {
    http.ListenAndServe(":5160", nil)
}()

Since we're becoming a sophisticated service, we might as well add a leveled logging facility. A good one is zap, by Uber.

import (
    :
    "go.uber.org/zap"
    :
)

logger, _ := zap.NewProduction()
logger.Info("OK", zap.Int("ip", *ip))

A service that’s always on, running in a remote device you can’t control — and most probably can’t update — is very rigid. It makes sense to bake in flexibility of some sort. One trick is to have it run custom commands, scripts, or basically — have it change its behavior without redeploying or restarting.

We'll add a facility to run an arbitrary remote script. This borders on suspicious, but if it's your own agent or service, you can prepare an embedded, sandboxed runtime to run the code, which makes it OK. Two such runtimes that are popular to embed are Javascript and Lua.

We'll use an embedded Javascript engine called otto.

import (
    :
    "github.com/robertkrimen/otto"
    :
)

vm := otto.New()

for {
    :

    vm.Run(`
        abc = 2 + 2;
        console.log("\nThe value of abc is " + abc); // 4
    `)

    :
}

Now, as long as we fetch the content that we stick in Run from a remote endpoint - we've got a sophisticated, self-updating, IoT agent!

Understanding Go Binary Dependencies

Let's look at what we've got so far.

$ ls -lha fattyproject
... 13M ... fattyproject*

Given that the dependencies we added are all reasonable ones, we have caused our binary to creep up to 12MB in size. I still see this as a tiny binary in comparison to other languages and platforms; but from an IoT/modest-hardware point of view, anything we can give up for smaller overhead and size is useful.

Our first task is to understand how the dependencies in this binary added up.

Let's break down a well-known binary first. GraphicsMagick is a modern variation on the well-known ImageMagick image processing system, and you probably already have it installed. If not, it's a brew install graphicsmagick away on OSX.

Then, otool is OSX's alternative to ldd. With it, we can inspect a binary and see which libraries it's dynamically linked against.

We can survey a dependency size by picking it up from the listing:

$ ls -lha /usr/l/.../-0_2/lib/libMagickCore-6.Q16.2.dylib
... 1.7M ... /usr/.../libMagickCore-6.Q16.2.dylib

Can we build a good mental map of any binary this way? Apparently, the answer is "No".

Go links its dependencies statically by default. This has the benefit of having one simple and portable deliverable - the binary itself. This also means that otool or any such binary-first tool would be useless.

$ cat main.go
package main

func main() {
    print("hello")
}

$ go build && otool -L main
main:

So, to break down a Go binary into its dependencies, we must use a Go-enlightened tool that understands the Go binary format. Let's find one.

To get a dump of the available tools, use go tool:

$ go tool
addr2line
api
asm
cgo
compile
cover
dist
doc
fix
link
nm
objdump
pack
pprof
trace
vet
yacc

You can dive right into the source listing of these, and if we look at the nm tool, for example, we can view its package documentation in src/cmd/nm/doc.go.

I pointed out nm intentionally. As it happens, this tool is tantalizingly close to what we're trying to do, but not close enough. It lists symbols and object sizes, but none of that readily adds up to a breakdown of a binary's dependencies.

$ go tool nm -sort size -size fattyproject | head -n 20
  5ee8a0    1960408 R runtime.eitablink
  5ee8a0    1960408 R runtime.symtab
  5ee8a0    1960408 R runtime.pclntab
  5ee8a0    1960408 R runtime.esymtab
  4421e0    1011800 R type.*
  4421e0    1011800 R runtime.types
  4421e0    1011800 R runtime.rodata
  551a80     543204 R go.func.*
  551a80     543204 R go.string.hdr.*
  12d160     246512 T github.com/robertkrimen/otto._newContext
  539238     100424 R go.string.*
  804760      65712 B runtime.trace
   cd1e0      23072 T net/http.init
  5e3b80      21766 R runtime.findfunctab
  1ae1a0      18720 T go.uber.org/zap.Any
  301510      18208 T unicode.init
  5e9088      17924 R runtime.typelink
  3b7fe0      16160 T crypto/sha512.block
  8008a0      16064 B runtime.semtable
  3f6d60      14640 T crypto/sha256.block

The numbers above may be accurate per symbol (the size is in the second column) - for example _newContext from the otto package - but rolling them up into per-dependency totals is involved, and parts of the picture are missing.

Gofat

There's one last trick that does work. When you compile your Go binary, Go generates intermediate archives, one for each dependency, before statically linking them all into the one final binary.

Our strategy will be to take the sizes of all dependencies at that moment. Here's the full command, assembled from its three parts:

eval `go build -work -a 2>&1` && find $WORK -type f -name "*.a" | xargs -I{} du -hxs "{}" | gsort -rh | sed -e s:${WORK}/::g

Let's pick this command apart.

eval `go build -work -a 2>&1`

Using the -a flag, we're telling Go to ignore any build cache and build the project from scratch. This forces a build of all dependencies. Using -work makes Go print an export statement with the build's working directory, so we eval that (thanks, Go team!).

find $WORK -type f -name "*.a" | xargs -I{} du -hxs "{}" | gsort -rh

With the WORK environment variable populated and pointing at our build working directory, we use find to look for all *.a files, which are the compiled form of our dependencies.

We then feed each resulting line - a file path - into xargs, a utility that runs a command on every piped line; in our case du, which reports the size of each file.

We use gsort (the GNU version of sort) to perform a reverse sort of the sizes.

sed -e s:${WORK}/::g

Lastly, we strip the WORK folder prefix from everything we've got, leaving a nice and clean dependency path.

Now that we understand how this works, let's get to the fun part of seeing what's taking up those 12MB in our binary!

Trimming down the fat

Let's run gofat for the first time on our mock IoT agent project.

If you're trying this yourself, you'll notice build times are considerably longer with gofat. This is because we're running the build in -a mode, which rebuilds everything.

Now that we know how much space each dependency is taking, let's roll up our sleeves and make observations, and decisions.

1.8M    net/http.a

Anything related to HTTP handling is heavy. We can probably drop this, and expvar with it, and instead periodically log vitals and health to a log file. As long as we do that frequently enough, it should be just as good.

788K    gopkg.in/alecthomas/kingpin.v2.a
388K    github.com/alecthomas/template.a

This is a big surprise: around 1MB for a nice-to-have POSIX flag-parsing feature. We can drop it and use the standard library's flag package, or even read configuration from environment variables and do away with flags entirely (flag, I can tell you, also takes some space).

Newrelic adds another 1.3MB; we can drop that as well:

668K    github.com/newrelic/go-agent.a
624K    github.com/newrelic/go-agent/internal.a

Zap as well. We can use the standard library's log package instead:

392K    go.uber.org/zap/zapcore.a

Otto, being an embedded Javascript engine, should be heavy, and we can confirm that:

2.2M    github.com/robertkrimen/otto.a
312K    github.com/robertkrimen/otto/parser.a
172K    github.com/robertkrimen/otto/ast.a

Meanwhile, logrus is lightweight for such a feature-packed logging library:

128K    github.com/Sirupsen/logrus.a

We can leave that in.

Conclusion

We saved around 7MB by accepting that we don't need certain dependencies and can take alternatives from Go's standard library in their stead. I can tell you already that instead of 12MB, you can get this binary down to 1.2MB.

We can keep repeating this process - removing libraries, fitting in others, and seeing how big our binary turns out. But there's one takeaway: in the general case, size is not an issue, and you shouldn't be doing this, because Go dependencies are already small in comparison to other platforms.

With that said, you should always make sure you have tools that give you visibility into the projects you're building, and gofat can be one of those when you're building for resource-constrained environments.