Ruby and Go Sitting in a Tree

In this post, we'll see how to build a Ruby gem with a Go-based native extension, which I called scatter. This is made possible by the Go 1.5 release and its c-shared build mode.

Feel free to drop me a line or tweet if you have questions or comments.

C Shared Libraries and Go, Ruby, Node.js, Python.

With Go 1.5, we've got a sparkling new go build -buildmode=c-shared mode, with which we can build C shared libraries.

Go, Ruby, Node.js and Python (and others) use C shared libraries as extensions. As an example, if you wanted to parse XML and your language either didn't have such a parser (as doubtful as it sounds) or its parser wasn't fast enough, you could use the robust libxml C library, which has existed almost forever (well, since '99).

The big deal is that Go 1.5 enables you to build libraries such as libxml in Go instead of C, and then use them from any host language that can use C libraries.

Here's a crazy thing: Firefox and Go via a Firefox addon.

The Case for Ruby and Go

I don't believe Ruby is slow for general purpose use cases, but it would be for specialized ones. When you're using Ruby for the wrong workload, it might appear slow.

With C extensions, and now Go, you don't have to switch away from Ruby completely to tackle such a workload when switching doesn't make sense.

Ruby FFI and C Extensions

FFI should be very easy to use in Ruby given that you already have a properly built C library.

Usually, you would use FFI on an existing, well known library and create what is called a "binding", much like any of these projects.

So to make Ruby call C code, for performance or simply to stand on the shoulders of giants, we would either:

  • Build a C library and export functions for FFI (which is really just using existing functions), or,
  • Find an existing C library and point FFI at its API

And of course with Go 1.5,

  • Build a Go library and expose it as a C shared library, and point FFI at its API

Getting Started

Building a C shared library based on Go is easy. Here's a simple example. The Go part:

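// libsq.go
package main

import "C"

// C.int matches the :int type that the Ruby side declares via FFI.
// A plain Go int is exported as a 64-bit GoInt on 64-bit platforms.
//export sq
func sq(num C.int) C.int {
  return num * num
}

// main is required by -buildmode=c-shared, but is never called.
func main() {}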

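Here, the export comment is meaningful: cgo uses //export (written with no space after the slashes) to mark functions for export. The import "C" line and the empty main function are also required, since a c-shared library is built from package main. Building as a C shared library goes like this: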

$ go build -buildmode=c-shared -o libsq.so libsq.go

And now we can tie this up with Ruby and FFI. The Ruby part:

# sq.rb
require 'ffi'
module Sq
  extend FFI::Library
  ffi_lib File.expand_path("./libsq.so", File.dirname(__FILE__))
  attach_function :sq, [:int], :int
end

# test it out (sq takes an integer, as declared in attach_function)
puts Sq.sq(4) # prints 16

We're done with the basic example. Seeing how easy this was, you probably are starting to have a ton of ideas. Note them down.

But before you get started, let's take a look at an expanded, more practical, example.

Scatter - Go Powered Parallel HTTP Requests Ruby Gem

Scatter is a showcase gem that performs scatter/gather (or fan-out) style requests on a list of URLs in parallel.

For somewhat pragmatic purposes (there are other C based gems that do this) I'd like this kind of heavy lifting to be performed by Go's HTTP stack, and modeled with Go's channels and goroutines for concurrency.

libscatter.go

We start, as before, by building our Go based C shared library. We'll make a single function that coordinates all of the HTTP requests.

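// somewhere in func scatter_request...
// fan out: one goroutine per URI, each sending its response on channel c
for _, uri := range cmd.URIs {
  go func(_uri string) {
    c <- makeRequest(_uri)
  }(uri)
}

// gather: receive exactly one response per URI
result := Result{}
for range cmd.URIs {
  resp := <-c
  result[resp.uri] = resp
}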

This is the core of the idea. As you can see, there are no locks and no complex or dirty coordination needed to achieve concurrency.
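To make the fragment above easier to try in isolation, here is a minimal, self-contained sketch of the same fan-out and gather pattern. It is not the scatter gem's actual code: the Response type, its fields, the scatter function and the fetch parameter (standing in for makeRequest) are all hypothetical names chosen for illustration, and the fake fetcher avoids real network calls.

```go
package main

import "fmt"

// Response is a hypothetical result type; its fields are assumptions.
type Response struct {
	URI    string
	Status int
}

// scatter starts one goroutine per URI and then gathers exactly
// len(uris) responses from the shared channel. fetch stands in for
// whatever performs the real HTTP request.
func scatter(uris []string, fetch func(string) Response) map[string]Response {
	c := make(chan Response)

	// Fan out: each goroutine sends its own response on c.
	for _, uri := range uris {
		go func(u string) {
			c <- fetch(u)
		}(uri)
	}

	// Gather: one receive per URI, in whatever order they finish.
	result := make(map[string]Response, len(uris))
	for range uris {
		resp := <-c
		result[resp.URI] = resp
	}
	return result
}

func main() {
	// A fake fetcher, so the sketch runs without network access.
	fake := func(uri string) Response {
		return Response{URI: uri, Status: 200}
	}
	res := scatter([]string{"http://a.example", "http://b.example"}, fake)
	fmt.Println(len(res), "responses gathered")
}
```

Passing the fetcher in as a function is only a convenience for experimenting; a real implementation would likely call Go's net/http client with a timeout at that point.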

Pitfalls and Patterns

In the simple sq example earlier, I used ints as the parameters and return values. This was intentional - it was hiding some pitfalls.

You might want to pass complex objects down to Go from Ruby, arrays perhaps, and then you might want to return multiple values from Go back to Ruby like the idiomatic Go result and error.

ruby-ffi solves the complex objects case for you, but not always arrays, and when you want to return multiple values, you really should return a complex object (struct).

In this example, we want to:

  • Pass multiple URLs, or even varargs
  • Return a big result, which is an aggregation of all of the requests (array or hash)
  • Signal an error if it happened

Unfortunately, we hit all of the pain points of FFI mentioned above. We have to build structs manually, we have to limit array sizes to work with built-in arrays, or we have to start dealing with pointers and unsafe references.

Staying Happy

I want to stay in my happy zone in terms of developer experience for now. This is why in this example we will defer parameter and return value coding to an application-side codec.

We'll pass a string in, and get a string out. We'll encode JSON in, and decode JSON out, and this could just as easily be Protobuf, Thrift, msgpack, or Avro instead - whatever you like.

Remember - this is a trick, but not a dirty one. It might very well take you wherever you want to go, without the headache.

Working With Strings

So strings just became super important. And they should be - regardless of what we're doing here, strings have always been the workhorse data type of software.

But again, I used a deceptively simple example. I intentionally avoided strings, because if you tried simply doing this:

// libsq.go
package main
import "C"

//export prnt
func prnt(str string) {
  println(str)
}

It wouldn't work.

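Although it looks like everything is working - str will be empty. This is because Ruby strings are subtly different from Go strings: FFI passes a NUL-terminated C string (a char pointer), while an exported Go function with a string parameter expects Go's own string representation, a pointer plus a length. To resolve it, we'll do this: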

// libsq.go
package main
import "C"

//export prnt
func prnt(data *C.char) {
  // C.GoString copies the NUL-terminated C string into a Go string
  println(C.GoString(data))
}

func main(){}

GIL

If your Go code works on a shared Ruby resource, you need to start juggling the GIL. This can be painful if you don't have much experience with building robust concurrent code. And even if you do, concurrency is hard, and it's harder when you also need to manage memory without a GC (in C extension land).

So, if we stick to a somewhat naive principle where our Go extension does big things completely, we won't need to criss-cross between Ruby and Go code and keep track of ownership all around.

Shared Memory

You should read about Go and CGo memory management, but hopefully, again, if we stick to the tips in this specific example - as harsh as they may sound, most of it would be irrelevant.

Making a Gem

You now have all of the theory needed to build your own Go powered Ruby gems.

Let's take a look at the grunt work of laying out such a gem so that it would build a hybrid of Go and Ruby and run successfully after install for your users.

Building Native Extensions

Ruby gems support C Ruby and Java (JRuby) extensions by way of rake-compiler. It covers several concerns while building native extensions:

  • Locating files and resources for building C or Java
  • Generating a machine-specific context (makefiles, configuration) so that the build will address that specific machine's architecture and bit width
  • Running the build with the relevant tooling
  • Installing and cleaning up

What you need to do is:

  • In your gemspec, point to an extconf.rb with spec.extensions = %w[ext/extconf.rb]. Your extconf file will be a simple and minimal descriptor of the build (out of scope for this post)

And, optionally:

  • Provide instructions per environment (Java, C)
  • Toggle inclusion of prebuilt binaries, that were cross-compiled ahead of time

Building Our Native Go Extension

We don't have a C codebase, nor do we need the C tooling. So we can do either of these two:

  1. Depend on Go tooling. A Go 1.5+ compiler must exist on the user's machine, and we'll build the Go library at gem install time.
  2. Cross compile and include all binary artifacts with our gem. At install time, sniff out the platform and depend on the correct binary.

We'll do (1) because it feels cleaner for our purposes and makes debugging easier.

If you'd like to streamline the workflow, then I'd recommend (2). This way, the Ruby and Go codebases can be released and evolve independently (and healthily).

Our Slim extconf and Makefile

Since Go makes a great toolset, this is what our extconf looks like :)

# this is a cheat
puts "running make"
`make build`

And our makefile:

build:
	go build -buildmode=c-shared -o libscatter.so libscatter.go
# fake out clean and install
clean:
install:

.PHONY: build clean install

That's it. All the rest is standard Ruby gem, and our FFI glue code.

Summing Up

This idea works well - you can clone the scatter repo, install, run benchmarks and more.

Personally, I have fantasized about a Go powered Ruby gem ever since seeing and trying to build a Rust based gem, after reading Rust and Skylight.

These kinds of concoctions breathe new life into a community and a platform.

Make Sure It's Worth It

At the time of this writing, this idea is pretty much novel. There is not much official advice, and the road is paved with gotchas (again, see Firefox and Go).

I think that for now, if you limit yourself to what was presented here, you may be on the safe side.

Truth be told, with Go, the friction for building native extensions goes down dramatically. Setting up such a Go extension is fun, unlike setting up a full-fledged C extension. Suddenly "is it worth it?" becomes "why not?".

The Real World

Ruby is fast enough for most things. However, you might find a Go based library useful for packing raw performance and a cleaner concurrent codebase into:

  • An existing project, that's so old it wouldn't respond well to an architecture refactor
  • An existing team, that wouldn't respond well to swapping an entire tech stack
  • Ruby projects that lack libraries that Go already has, or that only have lower-quality equivalents.