llgo update #10: "hello, world!" redux

It’s about time for another progress update on llgo. I’ve made decent progress recently, so let’s go through what’s new.

Highlights

I’ve been refactoring bits of code and fixing bugs aplenty, so there is a mass of noise in the git commits. In terms of new function, the news is that we now have:

Type switches.
Type assertions.
Labeled statements; goto, labeled break and continue.
The llgo-dist command; more on this below.
String conversions: to/from byte slices; from rune/int.
String range. I’m sure the implementation could be improved.
Implemented sync/atomic using LLVM atomic operations intrinsics.
Various changes to enable linking multiple packages (e.g. exported symbols are now prefixed with their package path).
Additional support for floats (thanks to spate); partial support for complex numbers.
”…args” calls to variadic functions (including slice append).
A self-contained runtime package. I have cloned (and slightly modified in some cases) the Go portion of the runtime package from gc, and combined it with the runtime code I had already written for llgo.
Bridge code for the math package, which mostly just redirects the exported functions to the internal, pure-Go implementations.
System calls (Linux/AMD64 only so far).
Closures; more below.

llgo-dist

I have begun implementing a command that takes care of building llgo, its runtime, and in the future any other tools that might be considered part of llgo (e.g. an in-development linker). This tool will set up the cgo flags given the path to an “llvm-config” program, and build gollvm.

reflect, fmt, oh my!

Last week, I mentioned on Google+ that I managed to get the reflect package working. At least enough of it to get the fmt package to work. At least enough of the fmt package to get fmt.Println(“Hello, world!”) to work… Yep, the holy grail of programming examples now compiles, links, and runs, using llgo. This demonstrates the following things work:

Compilation of the following packages: errors, io, math, os, reflect, runtime, strconv, sync, sync/atomic, syscall, time, unicode/utf8, unsafe.
Package imports (still using the gcimporter from exp/types.)
Linking multiple compiled packages using llvm-link.
Interfaces and reflection (fmt.Println uses reflection to determine the underlying type).
System calls (fmt.Println will eventually issue a system call to write to the stdout file).

Closures

Yes indeed, we now have closures. The code is pretty hackish, so I expect it’s not very solid. I have implemented them using LLVM’s trampoline intrinsics. Essentially you provide LLVM with a function that takes N parameters, give it a block of (executable) memory and an argument to bind, and it fills in the block with function code for a function with N-1 parameters (the Nth one being bound).

Unfortunately I have found that the closures are not playing nicely with lli/JIT, which means the closure unit test I have written fails. If I compile it with llc/gcc, though, it works just fine. So either I’ve done something subtly stupid, or the JIT is clobbering something it shouldn’t. As far as I got with debugging was finding that the bound argument value is wrong when the function is entered.

I expect I’ll probably replace this implementation for a couple of reasons:

Portability: I’d rather avoid platform-specific code like this. For one thing, the PNaCl ABI calls out trampoline intrinsics as being unsupported.
Testability: I should investigate the problems I observed with lli/JIT further, and I’m loath to change implementation to support tests, it is a real problem. I rely heavily on tests to make sure I haven’t broken anything.

Until I find out that using trampolines has a marked benefit to performance in real programs, I intend to replace the current implementation with one that uses a pair of pointers for functions. The bound argument will stored in one pointer, and the function pointer in another. This has implications for all function calls, though it should be simple to achieve good performance in most cases.

What’s next?

Haven’t figured this one out yet. I have been meaning to play more with PNaCl, so I might take some time now to do that. I expect I’ll be slowing down development considerably early 2013, as (a) we’re knocking down our place and rebuilding, and (b) my second child is on the way. I hope to have llgo in a better state for contributions by then, so others can pick up the slack.

I expect in the near future I’ll start playing with clang/cgo integration, as I start playing with PNaCl. I’ll write back when I have something to demonstrate.

Until then.

Posted November 25, 2012. Tags: go, llgo, llvm.

llgo update: Go1, automated tests

This week I finished up Udacity CS373: Programming a Robotic Car, and also finally finished reading GEB. So I’ll hopefully be able to commit some more time to llgo again.

I moved on to Go’s weekly builds a while back, and updated both llgo and gollvm to conform. I’m now on Go 1, as I hope most people are by now, and llgo is in good shape for Go 1 too. That’s not to say that it compiles all of the Go 1 language, just that it runs in Go 1. Apart from that, I’ve just been working through some sample programs to increase the compiler’s capability.

One of the things that I’ve been a bit lazy about with llgo is automated testing, something I’m usually pretty keen on. I’ve grown anxious over regressions as time has gone on in the development, so I’ve spent a little bit of time this week putting together an automated test suite, which I mentioned in golang-nuts a few days ago. The test suite doesn’t cover a great deal yet, but it has picked up a couple of bugs already.

One of the numerous things I like about Go is its well integrated tooling. For testing, Go provides the testing package, and go test tool. So you write your unit tests according to the specifications in the “testing” package, run “go test”, and your tests are all run. This is comparable to, say, Python, which has a similar “unittest” package. It is vastly more friendly than the various C++ unit test frameworks; that’s in large part due to the way the Go language is designed, particularly with regard to how it fits into build systems and is parsed.

In Go, everything you need to build a package is in the source (assuming you use the “go” command).

The only external influences on the build process (environment variables GOOS, GOARCH, GOROOT, etc.) apply to the entire build procedure, not to single compilation units. Each variant will end up in a separate location when built: ${GOPATH}/pkg/${GOOS}_${GOARCH}/<pkgname>.
Platform-specific code is separated into multiple files (xxx_linux.go, xxx_windows.go, …), and they’re automatically matched with the OS/architecture by the “go” command.
Package dependencies are automatically and unambiguously resolved. Compare this with C/C++ headers, which might come from anywhere in the preprocessor’s include path.

So anyway, back to llgo’s testing. It works just like this: I’ve created a separate program for each test case in the llgo/llgo/testdata directory. Each of these programs corresponds to a test case written against the “testing” package, which does the following:

Run the program using “go run”, and store the output.
Redirect stdout to a pipe, and run a goroutine to capture the output to a string.
Compile the program using llgo’s Compile API, and then interpret the resultant bitcode using gollvm’s ExecutionEngine API.
Restore the original stdout, and compare the output with that of the original “go run”.

Pretty obvious I guess, but I was happy with how easy it was to do. Defer made the job of redirecting, restoring and closing file descriptors pain free; the go statement and channels made capturing and communicating the resulting data a cinch.

This is getting a little ramble-ish, so I’ll finish up. While testing, I discovered a problem with the way LLVM types are generated from types.Type’s, which basically means that they need to be cached and reused, rather that generated afresh each time. At the same time I intend to remove all references to LLVM from my clone of the “types” package, and offer my updates back to the Go team. It’s not fully functional yet, but there’s at least a few gaps that I’ve filled in.

One last thing: LLVM 3.1 is due out May 14, so gollvm and llgo will no longer require LLVM from SVN. I really want to eliminate the dependency on llvm-config from the build of gollvm. I’m considering a dlopen/dlsym shim and removing the cgo dependency on LLVM. I’d be keen to hear some opinions, suggestions or alternatives.

Until next time.

Posted April 8, 2012. Tags: go, llgo, gollvm, llvm.

Imports in llgo, jr.

So I realised I’m a doofus the other day, when I started getting closer to completion on producing export metadata in llgo. Rolling my own import mechanism is unnecessary for now. Instead, I can just lean on the import mechanism that exists in the standard library (well, until Go 1 at least): go/types/GcImporter.

I’ve modified llgo to use go/ast/NewPackage, rather than the old code I had that was using go/parser/ParseFiles. The NewPackage function takes an optional “importer” object which will be used for inter-package dependency resolution, whereas ParseFiles does no resolution. The standard GcImporter type may be used to identify exports by interrogating the object and archive files in $GOROOT. The AST that’s generated is filled in with external declarations, so it’s then up to llgo to convert those into LLVM external declarations. Easy peasy.

Now it’s time to come up with a symbol naming scheme. Without having thought about it too hard, I’m going to start off with the assumption that the absolute name of the symbol (package+name), with slashes converted to dots, will do the trick. Once I’ve implemented that, I’ll need to start work on the runtime in earnest. It’s also high time I put some automated tests in place, since things are starting to get a little stabler.

In the long term I’ll probably want to continue on with my original plan, which is to generate module-level metadata in the LLVM bitcode, and then extract this in a custom importer. It should be quite straightforward. Earlier this week I wrapped up some updates to gollvm to add an API to make generating source-level debugging metadata simpler. This will be useful not only for describing exports, but also for what it’s intended: generating DWARF debug information.

In other news: my wife just ordered the 1-4a box set of The Art of Computer Programming for me. At the moment I am slowly making way through Gödel, Escher, Bach: an Eternal Golden Braid, and so far, so good. Looking forward to more light reading for the bus/train!

Posted February 19, 2012. Tags: go, llgo, gollvm, llvm.

llgo: back in business

I’ve been hacking away on llgo, on evenings and weekends when I’ve had the chance. It’s now roughly equivalent in functionality to where it was before I upgraded LLVM to trunk (3.1) and broke everything. There’s a couple of added bonuses too, like proper arbitrary precision constants, and partial support for untyped constants.

Now that the basics are working again, I’ll get back to working on the import/export mechanism. I expect this will expose more design flaws, and will take a while. I still plan to make use of debug metadata, which I am not altogether familiar with. I’ll also need to decide how the linker and the runtime library are going to work.

In other news, I’ve moved Pushy to GitHub. I’m not actively developing it at the moment, but I wanted to consolidate the services I’m consuming. I do have an addition to the Java API in the works: a sort of remote classloader, that will communicate over a Pushy connection to fetch classes/resources. The idea is to make it really quick and easy to run a bit of Java code on a remote machine, without having to deploy the application remotely. I’ll hopefully get around to pushing this change within the coming few weeks.