Ruby and Go Sitting in a Tree

In this post, we’ll see how to build a Ruby gem with a Go based native extension. I call it scatter, and it’s enabled by the latest Go 1.5 release and its c-shared build mode.

Feel free to drop me a line or tweet if you have questions or comments.

C Shared Libraries and Go, Ruby, Node.js, Python

With Go 1.5, we’ve got a sparkling new go build -buildmode=c-shared mode, with which we can build C shared libraries.

Go, Ruby, Node.js and Python (and others) use C shared libraries as extensions. As an example, if you wanted to parse XML and either your language didn’t have such a parser (as doubtful as it sounds), or wasn’t fast enough at parsing, you could use the robust libxml C library, which has been around almost forever (well, ’99).

The big deal is - Go 1.5 enables you to build libraries such as libxml, in Go instead of C, and then use them from any host language that can use C libraries.

Here’s a crazy thing: Firefox and Go via a Firefox addon.

The Case for Ruby and Go

I don’t believe Ruby is slow for general purpose use cases, but it can be for specialized ones. When you’re using Ruby for the wrong workload, it will appear slow.

With C extensions, and now Go, you don’t have to switch away from Ruby completely to tackle such a workload.

Ruby FFI and C Extensions

FFI should be very easy to use in Ruby given that you already have a properly built C library.

Usually, you would use FFI on an existing, well known library and create what is called a “binding”, much like any of these projects.

So, to make Ruby call C code, for performance or simply to stand on the shoulders of giants, we would:

  • Build a C library and export functions for FFI (which is really just using existing functions), or,
  • Find an existing C library and point FFI at its API

And of course with Go 1.5,

  • Build a Go library and expose it as a C shared library, and point FFI at its API

Getting Started

Building a C shared library based on Go is easy; here’s a simple example. The Go part:

// libsq.go
package main

import "C"

//export sq
func sq(num int) int {
  return num * num
}

// the c-shared build mode requires a main function, even an empty one
func main() {}
Here, the //export comment is meaningful: Go uses it to mark functions for export. Building a C shared library goes like this:

$ go build -buildmode=c-shared -o libsq.so libsq.go

And now we can tie this up with Ruby and FFI. The Ruby part:

# sq.rb
require 'ffi'

module Sq
  extend FFI::Library
  # point at the compiled shared library
  ffi_lib File.expand_path("./libsq.so", File.dirname(__FILE__))
  attach_function :sq, [:int], :int
end

# test it out
puts Sq.sq(4) # => 16

We’re done with the basic example. Seeing how easy this was, you’re probably starting to have a ton of ideas. Note them down.

But before you get started, let’s take a look at an expanded, more practical, example.

Scatter - Go Powered Parallel HTTP Requests Ruby Gem

Scatter is a showcase gem that performs scatter/gather (or fan-out) style requests on a list of URLs in parallel.

For somewhat pragmatic purposes (there are other C based gems that do this) I’d like this kind of heavy lifting to be performed by Go’s HTTP stack, and modeled with Go’s channels and goroutines for concurrency.


We start, as before, by building our Go based C shared library. We’ll make a single function that coordinates all of the HTTP requests.

// somewhere in func scatter_request...
c := make(chan Response)
for _, uri := range cmd.URIs {
  go func(_uri string) {
    c <- makeRequest(_uri)
  }(uri)
}

result := Result{}
for range cmd.URIs {
  resp := <-c // responses arrive in completion order
  result[resp.URI] = resp // each response carries its own URI
}
This is the core of the idea. From this, you can see that there are no locks and no complex or dirty coordination to achieve concurrency.

Pitfalls and Patterns

In the simple sq example earlier, I used ints as the parameters and return values. This was intentional - it was hiding some pitfalls.

You might want to pass complex objects down to Go from Ruby, arrays perhaps, and then you might want to return multiple values from Go back to Ruby like the idiomatic Go result and error.

ruby-ffi solves the complex objects case for you, but not always arrays, and when you want to return multiple values, you really should return a complex object (a struct).

In this example, we want to:

  • Pass multiple URLs, or even varargs
  • Return a big result, which is an aggregation of all of the requests (array or hash)
  • Signal an error if it happened

Unfortunately, we hit all of the mentioned pain points of doing FFI. We have to build structs manually, we have to limit array sizes to work with built-in arrays, or we have to start dealing with pointers and unsafe references.

Staying Happy

I want to stay in my happy zone in terms of developer experience for now. This is why, in this example, we will defer parameter and return value encoding to an application-side codec.

We’ll pass a string in, and get a string out. We’ll encode JSON going in and decode JSON coming out, and this could just as easily be Protobuf, Thrift, msgpack, or Avro instead - whatever you like.

Remember - this is a trick, but not a dirty one. It might very well take you wherever you want to go, without the headache.

Working With Strings

So strings just became super important. And they should be - regardless of what we’re doing here, strings have always been the workhorse data type of software.

But again, I used a deceptively simple example. I intentionally avoided strings, because if you tried simply doing this:

// libsq.go
package main

import "C"
import "fmt"

//export prnt
func prnt(str string) {
  fmt.Println(str)
}
It wouldn’t work.

Although it looks like everything is working, str will be empty. This is because Ruby strings are deceptively different from Go strings. To resolve it, we’ll do this:

// libsq.go
package main

import "C"
import "fmt"

//export prnt
func prnt(data *C.char) {
  fmt.Println(C.GoString(data)) // convert the C string into a Go string
}

If your Go code works on a shared Ruby resource, you need to start juggling the GIL. This can be painful if you don’t have much experience building robust concurrent code. And even if you do, concurrency is hard, and it’s harder when you also need to manage memory without a GC (in C extension land).

So, if we stick to a somewhat naive principle where our Go extension does big things completely, we won’t need to criss-cross between Ruby and Go code and maintain ownership all around.

Shared Memory

You should read about Go and cgo memory management, but hopefully, again, if we stick to the tips in this specific example - as harsh as they may sound - most of it will be irrelevant.

Making a Gem

You now have all of the theory needed to build your own Go powered Ruby gems.

Let’s take a look at the grunt work of laying out such a gem, so that it builds as a hybrid of Go and Ruby and runs successfully after install for your users.

Building Native Extensions

Ruby gems support C Ruby and Java (JRuby) extensions by way of rake-compiler. It covers several concerns while building native extensions:

  • Locating files and resources for building C or Java
  • Generating a machine-specific context (makefiles, configuration) so that the build will address that specific machine’s architecture and bit width
  • Running the build with the relevant tooling
  • Installing and cleaning up

What you need to do is:

  • In your gemspec, point to an extconf.rb with spec.extensions = %w[ext/extconf.rb]. Your extconf file will be a simple and minimal descriptor of the build (out of scope for this post)

And, optionally:

  • Provide instructions per environment (Java, C)
  • Toggle inclusion of prebuilt binaries, that were cross-compiled ahead of time

Building Our Native Go Extension

We don’t have a C codebase, nor do we need the C tooling. So we can do either of these two:

  1. Depend on Go tooling. A Go 1.5+ compiler should exist on the user’s machine, and we’ll build the Go library at gem install time.
  2. Cross-compile and ship all binary artifacts with our gem. At install time, sniff out the platform and pick the correct binary.

We’ll do (1) because it feels cleaner for our purposes and makes debugging easier.

If you’d like to streamline the workflow, I’d recommend (2). This way, the Ruby and Go codebases are allowed to evolve independently (and healthily).

Our Slim extconf and Makefile

Since Go brings its own great toolset, this is how our extconf would look :)

# this is a cheat
puts "running make"
`make build`

And our makefile:

build:
	go build -buildmode=c-shared -o libscatter.so libscatter.go

# fake out clean and install
clean:
install:

.PHONY: build clean install

That’s it. All the rest is standard Ruby gem, and our FFI glue code.

Summing Up

This idea works well - you can clone the scatter repo, install, run benchmarks and more.

Personally, I’ve fantasized about a Go powered Ruby gem ever since seeing and trying to build a Rust based gem, after reading Rust and Skylight.

These kinds of concoctions breathe new life into a community and a platform.

Make Sure It’s Worth It

At the time of this writing, this idea is pretty much novel. There isn’t much official advice, and the road is paved with gotchas (again, see Firefox and Go).

I think that for now, if you limit yourself to what was presented here, you should be on the safe side.

Truth be told, with Go, the friction of building native extensions goes down dramatically. Setting up a Go extension is fun, unlike a full-fledged C extension. Suddenly “is it worth it?” becomes “why not?”.

The Real World

Ruby is fast enough for most things. However, you might find a Go based library useful for packing raw performance and a cleaner concurrent codebase into:

  • An existing project that’s so old it wouldn’t respond well to an architecture refactor
  • An existing team that wouldn’t respond well to swapping an entire tech stack
  • Ruby projects that lack libraries Go already has, or have lower-quality equivalents

Low Level Go

Let’s take a look at Go binary sizes with real life dependencies.

These are two of my own projects, where I knew they had to run on a command line, and across platforms.

But what about when your work is very simple, and requires some classic C-style work with direct operating system calls? Does it “pay off” to build that in Go?

Some would write a small C program and be done with it; being small it would probably be a lot of fun since there’s no abstraction that can stop you anywhere.

However, a one-off C program will probably compromise on portability and perhaps some other arguable properties such as code clarity and ease of maintenance.

Let’s see what it takes to get as close as possible to that C program, while staying within Go.

Reducing the Go Binary Size

Go binaries tend to grow fast with each included dependency. However, let’s make a very important statement: in real life, Go binaries are small enough for this not to matter at all. Moreover, at runtime the resources consumed will be much lower than, say, Java, Ruby or Python, which is perfect.

That being said, if we still want to get close to a C binary, we don’t want to be in the MB range, but in KBs.

How low can we go?

The Go binary will pack the garbage collector, the goroutines scheduler and the dependencies you include via import.

Taking a very minimal CLI program, let’s say we use os to cover basic file handling and flag to parse and access command line arguments. We don’t even do any common I/O operations here.

package main

import (
  "flag"
  "os"
)

func main() {
  flag.Parse()
  f, _ := os.Open(flag.Arg(0))
  f.Close()
}

Binary size: 1.8MB.

Going forward, let’s remove flag and assume we can get to ARGV via os.Args. It’ll be less elegant but, whatever.

package main

import "os"

func main() {
  f, _ := os.Open(os.Args[1])
  f.Close()
}

Binary size: 893KB. That looks quite good. Can we do better?


Going through Go docs, we bump into syscall, Go’s interface into the low level OS primitives:

The primary use of syscall is inside other packages that provide a more portable interface to the system, such as “os”, “time” and “net”. Use those packages rather than this one if you can.

Let’s swap everything with “raw” syscalls:

package main

import "syscall"

func main() {
  fd, _ := syscall.Open("foobar", 0, 0666)
  syscall.Close(fd)
}

Binary size: 544KB. Neat (we’ll stop here - for anything practical, I assure you this is the bare minimum :).

We can shave around 20KB more with strip but let’s forget about that for the moment.

Shake off abstractions

As with C, you can do without abstractions in Go. You can get a lot of mileage out of syscall, as do many of the standard packages, in order to implement the higher level, more streamlined Go API.

Here is a snippet from os (the File.Chdir method), showing that it’s just a thin wrapper around syscall.

func (f *File) Chdir() error {
  if f == nil {
    return ErrInvalid
  }
  if e := syscall.Fchdir(f.fd); e != nil {
    return &PathError{"chdir", f.name, e}
  }
  return nil
}


For a real life example, you can take a look at cronlock, a small utility I’ve built with the conclusions from this article. Since it drives a mission-critical component, it had to be small and simple.

Working without abstractions from time to time is nice; it feels like being a kid with LEGO again, and there’s a special place in my heart for C that relays that feeling. Go makes it a tad more accessible and portable.

Note: syscall is supposed to be deprecated in favor of a better architecture, but has not been yet - and the ideas here should still be valid after the transition. See more here.

Open Sourcing Castbox

In January 2014 I gave a talk at the Israeli Devcon in Tel-Aviv, named “Chromecast Internals”. I announced Castbox at the end of that talk.

Getting that exposure brought up interesting ideas which postponed my plan of open sourcing it, but today I have no option but to bury those plans, due to Google Chromecast changes.

So, much delayed, I’m open sourcing Castbox. The good news is that this project, 8 months later, is more robust (since I wanted to build a business around it).

Castbox still works with most apps and will continue to work until all Chromecast apps migrate to the new Google protocol (which will probably take a while).

You can use it to develop your apps and have a Chromecast without the real Chromecast if you want - on a RaspberryPi for example.

Why Go

I have built several other open source projects in Go in the past 2 years, and have been running Go in production for a long while. However, I have never stated my opinion and point of view on Go, and I hope to cover some of it below.

My goals for this project were to:

  • Have a build for Raspberry Pi
  • Develop on OSX and run on Linux and Windows
  • Have a reasonably happy development experience
  • Be certain that I will consume low resources and run fast

Parsing Binary Data With Node.js

I’ll start by highlighting some of the pillars of binary data, hopefully in a breeze. If you find yourself very attracted to these topics, I recommend this book (you can skip the HLA/assembly parts). Also note that it’s a bit oldschool (I read it more than 10 years ago, but it left quite an impression), so there may be newer and better resources to learn from.


A “computer” word is a sort of unit for grouping bits. For example, a word can be 8, 16, 32, or 64 bits wide. Typically a word’s width is coupled to the CPU architecture’s width (i.e. a 64-bit CPU), but in our case we’ll treat “word” as meaning “a set of N fixed-size bits”, where N is the number of bits.


The term “endian” comes from “end”. When you look at a sequence of bytes and want to convert a group of them into a plain old number, it denotes which end of the number comes first: in big endian, the first part is the big end; in little endian, the first part is the little end.

For example, there are two ways to read a couple of bytes appearing in a binary file, say 01 23: as big endian, they form 0x0123 (291); as little endian, 0x2301 (8961).

Asset Pipeline Internals

Almost a year ago, I wrote about build management for Javascript projects.

In hindsight, a year proved to be a ton of time on the client-side.

Most notably, Grunt (which I only mentioned briefly) took off like a rocket, and so did Yeoman - which I almost instantly adopted as a swiss army knife for my client-side only projects.

Yeoman though, which relies on Grunt, is going through some fundamental changes and looks like it is being re-arranged and re-planned for a while now.

For what it’s worth, I do support the new Yeoman changes, but instead of waiting for it to crystallize, I thought it was time to re-evaluate what’s out there today and see if Yeoman can be replaced altogether (the answer is ‘Yes’, keep reading :).

ZeroMQ and Ruby a Practical Example

For specific high-performance workloads, I wanted to add a new and highly optimized endpoint to Roundtrip.

If you don’t know what Roundtrip is yet, feel free to quickly check out the previous Roundtrip post and come back once you’ve got the idea of what it does.

I had to select both a wire protocol and an actual transport that would be very efficient. To gain an even higher margin over HTTP, I knew I wanted it to be at least binary, and not very chatty.

A good option for this would be Thrift, for example. However I wanted to go as low as I could, because I didn’t really need anything more than the bare simplest RPC mechanism.

However, going with straight-up TCP wouldn’t gain me much, because I typically hold development ease and maintainability as an additional value. There was only one thing I felt offered an awesome development model while being as close to (or even better than, on some occasions) TCP…

Tracking Your Business

You’ve built (or are maintaining) a product with many services that span different machines at the backend. These services all orchestrate together to implement one or more business processes.

How are you tracking it?

Pragmatic Concurrency With Ruby

I come from a parallel computation and distributed systems background by education, and have relatively strong foundations in infrastructural concurrent/parallel libraries and products that I’ve built and maintained over the years, on both the JVM and .NET.

Recently, I’ve dedicated more and more time to building and deploying real concurrent projects in Ruby using JRuby, as opposed to developing with Ruby (MRI) and concurrency as it is there (process-level, and GIL-bound at the thread level). I’d like to share some of that with you.

Feel free to bug me on twitter:

Administrative notes:

This may be a lengthy, information-packed read. You can put the blame on me for this one, because I wanted to increase the value for the reader as much as possible and pack what could have been a lengthy book into a single, highly concentrated, no-bullshit article.

As an experiment, I also put most of the example code in a repository, including the source of this article. Please feel free to fork and contribute in any way; I’ll gladly accept pull requests.

Github repo:



This article was recently translated into Serbo-Croatian by Anja Skrba from - Thanks Anja!

Concurrency is Awesome!

Remember those old 8-bit games you used to play as a child? In hindsight you know they were awesome, but if you’re a gamer, or even just a casual one, and you’re forced to play one today, the graphics will feel bad.

This is because it’s a matter of detail; just like with childhood computer games, as time passes, your brain doesn’t seem to care about (or simply forgets) the finer details.

So a given MRI Ruby developer’s mindset would be that concurrency just works, and that it is easy and awesome. But you might rightly guess, given the level of cynicism going around here, that this isn’t the end of it.

The MRI Ruby GIL gracefully keeps some details away from you: yes, things run concurrently with the help of properly built I/O libraries (for example: historically, the MySQL gem initially didn’t do this properly, which meant your thread would block on I/O), but your Ruby code certainly isn’t running in parallel. It’s just like what your brain did when it covered up for those horrific 8-bit graphics you were sure were still awesome.

Building Your Tools With Thor

Thor is not new; first built as a rake and sake replacement, its first commit is well over 4 years old.

Jump ahead several years, and Thor is part of the foundation of the new-generation Rails generators, and of very popular tools such as Bundler and Foreman.

Recently, @wycats unveiled a fantastic looking (and much deserved) Thor website, and although I started doing Thor based projects over two years ago, I think it’s the right time to write about Thor itself.

Today, Thor can serve as a rake replacement, a great generator-building framework, and a general purpose CLI toolkit.

First Look at Mruby

mruby is a minimalistic Ruby, developed by Matz (Ruby’s creator) and funded by the Japanese Ministry of Economy.

I’ve been waiting for this to go public ever since Matz’s early announcements that he was working on it. This is very exciting.


  $ git clone
  $ make

Compilation is a fantastic, error-free breeze - around 20 seconds.

Hello mruby

Let’s see how this thing works.

$ cd bin
$ cat > hello.rb
puts "hello mruby!"
$ ./mruby hello.rb
hello mruby!