llgo update #10: "hello, world!" redux

It’s about time for another progress update on llgo. I’ve made decent progress recently, so let’s go through what’s new.

Highlights

I’ve been refactoring bits of code and fixing bugs aplenty, so there is a mass of noise in the git commits. In terms of new functionality, the news is that we now have:

  • Type switches.
  • Type assertions.
  • Labeled statements: goto, labeled break and continue.
  • The llgo-dist command; more on this below.
  • String conversions: to/from byte slices; from rune/int.
  • String range. I’m sure the implementation could be improved.
  • Implemented sync/atomic using LLVM atomic operations intrinsics.
  • Various changes to enable linking multiple packages (e.g. exported symbols are now prefixed with their package path).
  • Additional support for floats (thanks to spate); partial support for complex numbers.
  • “…args” calls to variadic functions (including slice append).
  • A self-contained runtime package. I have cloned (and slightly modified in some cases) the Go portion of the runtime package from gc, and combined it with the runtime code I had already written for llgo.
  • Bridge code for the math package, which mostly just redirects the exported functions to the internal, pure-Go implementations.
  • System calls (Linux/AMD64 only so far).
  • Closures; more below.

llgo-dist

I have begun implementing a command that takes care of building llgo, its runtime, and in the future any other tools that might be considered part of llgo (e.g. an in-development linker). This tool will set up the cgo flags given the path to an “llvm-config” program, and build gollvm.

reflect, fmt, oh my!


Last week, I mentioned on Google+ that I managed to get the reflect package working. At least enough of it to get the fmt package to work. At least enough of the fmt package to get fmt.Println("Hello, world!") to work… Yep, the holy grail of programming examples now compiles, links, and runs using llgo. This demonstrates the following things work:

  1. Compilation of the following packages: errors, io, math, os, reflect, runtime, strconv, sync, sync/atomic, syscall, time, unicode/utf8, unsafe.
  2. Package imports (still using the gcimporter from exp/types).
  3. Linking multiple compiled packages using llvm-link.
  4. Interfaces and reflection (fmt.Println uses reflection to determine the underlying type).
  5. System calls (fmt.Println will eventually issue a system call to write to the stdout file).
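
For the record, here’s the program in question, in full:

package main

import "fmt"

func main() {
    fmt.Println("Hello, world!")
}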

Closures

Yes indeed, we now have closures. The code is pretty hackish, so I expect it’s not very solid. I have implemented them using LLVM’s trampoline intrinsics. Essentially you provide LLVM with a function that takes N parameters, give it a block of (executable) memory and an argument to bind, and it fills in the block with function code for a function with N-1 parameters (the Nth one being bound).

Unfortunately I have found that the closures are not playing nicely with lli/JIT, which means the closure unit test I have written fails. If I compile it with llc/gcc, though, it works just fine. So either I’ve done something subtly stupid, or the JIT is clobbering something it shouldn’t. The furthest I got with debugging was finding that the bound argument value is wrong when the function is entered.

I expect I’ll probably replace this implementation for a couple of reasons:

  • Portability: I’d rather avoid platform-specific code like this. For one thing, the PNaCl ABI calls out trampoline intrinsics as being unsupported.
  • Testability: I should investigate the problems I observed with lli/JIT further, and while I’m loath to change the implementation just to support tests, this is a real problem. I rely heavily on tests to make sure I haven’t broken anything.

Until I find out that using trampolines has a marked benefit to performance in real programs, I intend to replace the current implementation with one that uses a pair of pointers for functions. The bound argument will be stored in one pointer, and the function pointer in another. This has implications for all function calls, though it should be simple to achieve good performance in most cases.
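
To illustrate, here’s a rough sketch of that representation (just the idea, with made-up names; not llgo’s actual code):

package sketch

import "unsafe"

// closure pairs a function's code pointer with a pointer to its bound
// argument (the captured environment). A plain function is the same
// pair with a nil context, so every call site can pass the context
// pointer as a hidden extra argument and use one calling sequence.
type closure struct {
    fn  unsafe.Pointer // code pointer for the underlying function
    ctx unsafe.Pointer // pointer to the bound argument, or nil
}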

What’s next?

Haven’t figured this one out yet. I have been meaning to play more with PNaCl, so I might take some time now to do that. I expect I’ll be slowing down development considerably early 2013, as (a) we’re knocking down our place and rebuilding, and (b) my second child is on the way. I hope to have llgo in a better state for contributions by then, so others can pick up the slack.

I expect in the near future I’ll start playing with clang/cgo integration, as I start playing with PNaCl. I’ll write back when I have something to demonstrate.

Until then.

Posted November 25, 2012. Tags: go, llgo, llvm.

llgo update, milestone

In between gallivanting in Sydney, working, and organising to have a new house built, I’ve squeezed in a little bit of work on llgo. If you’ve been following along on Github, you’ll have seen that things have progressed a bit since last time I wrote.

Aside from a slew of bug fixes and trivialities, llgo now implements:

  • Slice operations (make, append, slice expressions). I’ve only implemented single-element appends so far, i.e. no append(s, a, b, c, …) or append(s, abc…) yet.
  • Named results in functions.
  • Maps - creation, indexing, assignment, and deletion. The underlying implementation is just a dumb linked-list at this point in time (see the sketch after this list). I’ll implement it as a hash map in the future, when there aren’t more important things to implement.
  • Range statements for arrays, slices and maps. I haven’t done strings yet, simply because it requires a bit more thought into iterating through strings runes-at-a-time. I don’t expect it’ll be too much work.
  • Branch statements, except for goto. You can now break, continue, and fallthrough.
  • String indexing, and slicing.
  • Function literals. Once upon a time these were working, but they haven’t been for a while. Now they are again. Note that this does not include support for closures at this stage, so usefulness is pretty limited.
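
As promised in the maps item above, here’s a hypothetical sketch of the linked-list map (the general shape only, not llgo’s actual runtime code):

package sketch

import "unsafe"

// mapEntry is one key/value pair in the list.
type mapEntry struct {
    key, value unsafe.Pointer
    next       *mapEntry
}

// lookup walks the list until the caller-supplied key comparison
// matches. It is O(n), which is why a hash map is the obvious
// eventual replacement.
func lookup(head *mapEntry, key unsafe.Pointer, eq func(a, b unsafe.Pointer) bool) unsafe.Pointer {
    for e := head; e != nil; e = e.next {
        if eq(e.key, key) {
            return e.value
        }
    }
    return nil
}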

Early on in the development of llgo, I decided that rather than implementing the compiler by going through the specification one item at a time, I’d drive the development by attempting to compile a real program. For this, I chose maketables, a program from the unicode standard library package. As of today, llgo can successfully compile the program. That is, it compiles that specific file, maketables.go. It doesn’t yet compile all of its dependencies, and it certainly doesn’t link or produce a usable program.

So now I’ll be working towards getting all of the dependencies compiling, then linking. In the interest of seeing usable progress, I think I might now take a bottom-up approach and start focusing on the core libraries, like runtime and syscall. I’ll report back when I have something interesting to say.

Posted September 9, 2012.

gocov, llgo update

I guess it’s time for a quick update. I’m not very diligent with this blogging thing; too busy having fun, programming. Sorry about that!

Introducing gocov

A couple of weeks ago I announced gocov, a coverage testing tool for the Go programming language. I wrote gocov to quickly get an idea of how broadly tested packages are (namely exp/types, which I’m working on in the background). The tool itself is written in Go, and works by source instrumentation/transformation. Currently gocov only does statement coverage.

Using gocov is relatively simple (if I do say so myself). First, you install gocov by running:

go get github.com/axw/gocov/gocov

This will install the gocov tool into your $GOPATH/bin directory. Once you have it installed, you can test a package (i.e. run its tests, and generate coverage data), by running:

gocov test <path/to/package>

Under the covers, this will run “go test <path/to/package>”, after having gone through the process of instrumenting the source. Once the tests are complete, gocov will output the coverage information as a JSON structure to stdout. So you might want to pipe that output somewhere…

Once you’ve got the coverage information, you’ll probably want to view it. So there are two other gocov commands: report, and annotate. The report command will generate a text report of the coverage of all the functions in the coverage information provided to it. For example:

gocov test github.com/axw/llgo/types | gocov report

… will generate a report that looks something like:


types/exportdata.go readGopackHeader 69.23% (9/13)
types/gcimporter.go gcParser.expect 66.67% (4/6)
types/gcimporter.go gcParser.expectKeyword 66.67% (2/3)

The annotate command will print out the source for a specified function, along with an annotation for each line that was missed. For example:

gocov test github.com/axw/llgo/types | gocov annotate - types.gcParser.expectKeyword

… will output the following:

266         func (p gcParser) expectKeyword(keyword string) {
267             lit := p.expect(scanner.Ident)
268             if lit != keyword {
269 MISS            p.errorf("expected keyword %s, got %q", keyword, lit)
270             }
271         }

As is often the case when I write software, I wrote gocov for my own needs; as such it’s not terribly featureful, only doing what I’ve needed thus far. If you would like to add a feature (maybe HTML output, or branch coverage), feel free to send a pull request on the Github repository, and I’ll take a gander.

Anyway, I hope it’s of use to people. But not too many people, I don’t have time to fix all of my crappy code! (Just kidding, I have no life.)

Update on llgo: interface comparisons, exp/types

I don’t have a lot to report on this front, as I’ve been doing various other things, like that stuff up there, but I can share a couple of bits of mildly interesting news.

I’ve been working a little on the runtime for llgo, and I’m proud to say there’s now an initial implementation of interface comparison in the runtime. This involved filling in the algorithm table for runtime types, implementing the runtime equality function (runtime.memequal), and implementing a runtime function (runtime.compareI2I) to extract and call it. It probably doesn’t sound exciting when put like that, but this is something of a milestone.
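
To give a feel for what’s involved, here’s a hedged sketch of interface equality; the names (iface, rtype, ifaceEq) are placeholders of mine, not llgo’s:

package sketch

import "unsafe"

// rtype stands in for a runtime type descriptor; its algorithm table
// includes an equality function (runtime.memequal for plain memory).
type rtype struct {
    size  uintptr
    equal func(a, b unsafe.Pointer, size uintptr) bool
}

// iface stands in for an interface value: a type descriptor plus a
// pointer to the underlying data.
type iface struct {
    typ  *rtype
    data unsafe.Pointer
}

// ifaceEq is roughly the job of something like runtime.compareI2I:
// the dynamic types must be identical, and then the type's equality
// algorithm decides whether the values match.
func ifaceEq(a, b iface) bool {
    if a.typ != b.typ {
        return false
    }
    if a.typ == nil {
        return true // two nil interfaces compare equal
    }
    return a.typ.equal(a.data, b.data, a.typ.size)
}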

By the way, if you want to actually use the runtime, you can do it like this:

  1. Compile your program with llgo, storing the bitcode in file x.ll.
  2. Compile llgo/runtime/*.go with llgo, storing the bitcode in file y.ll.
  3. Link the two together, using llvm-link: llvm-link -o z.ll x.ll y.ll
And you’re done. The resultant module, housed in z.ll, contains your program and the llgo runtime. Now you can concatenate strings and compare interfaces to your heart’s content. Eventually llgo will contain an integrated linker, which will rewrite symbol names according to package paths.


Finally, on exp/types: I submitted my first two CLs. Some of my ideas for exp/types were ill thought out, so the first was rejected (fairly), and the second needs some rework. I’ll be writing up a design proposal document at some stage, to better document my rationale for the changes. Anyway, I’ll keep plugging away…

Ade!

Posted July 21, 2012.

Unit-testing llgo's runtime

It’s been a while since I last wrote, primarily because I’ve been moving house and was without Internet at home during the process. It’s back now, but now I have Diablo III to contend with.

In my previous post I mentioned that I would create a new branch for working on the llgo runtime. I haven’t done that yet, though I haven’t broken the build either. Rather, I’ve introduced conditional compilation to gollvm for builds against LLVM’s trunk where unreleased functionality is required, e.g. LinkModules. This isn’t currently being used in llgo-proper, so I’ve gotten away without branching so far.

The tag for building gollvm with unreleased functions is “llvmsvn”, so to build gollvm with LLVM’s trunk, including the LinkModules function, do the following:

curl https://raw.github.com/axw/gollvm/master/install.sh -tags llvmsvn | sh

So I didn’t break “the build”, meaning you can still build gollvm/llgo without also building LLVM from source. I did, however, break the llgo unit tests, as they are using the new LinkModules function. If you want to run the unit tests without building LLVM from source, you can comment out the call to llvm.LinkModules in llgo/utils_test.go; of course, you should expect failures due to the runtime not being linked in, but that doesn’t affect all of the tests.

What else is new?
  • I announced on golang-dev a couple of weeks ago that I intend to work on getting exp/types up to snuff. I’ve moved some of the type construction code out of llgo-proper into llgo/types (a fork of exp/types), and eliminated most of the llgo-specific stuff from llgo/types. I’ll need to set aside some time soon to learn how to use Mercurial and create some changelists.
  • A few weeks ago I started playing with llgo and PNaCl, to see how hard it would be to get something running in Chrome. It works (with the IR Translator/external sandbox anyway), but then llgo doesn’t really do much at the moment.
That’s all for now.

Posted June 3, 2012.

An llgo runtime emerges

It’s been a long time coming, but I’m now starting to put together pieces of the llgo runtime. Don’t expect much any time soon, but I am zeroing in on a design at least. The sleuths in the crowd will find that only string concatenation has been implemented thus far, which is pretty boring. Next up, I hope, will be interface-to-interface conversions, and interface-to-value conversions, both of which require (for a sane implementation) a runtime library.

I had previously intended to write the runtime largely in C, as I expected that would be the easiest route. I started down this road writing a basic thread creation routine using pthread, written in C. The code was compiled using Clang, emitting LLVM IR which could be easily linked with the code generated by llgo. It’s more or less the same idea implemented by the gc Go compiler (linking C and Go code, not relying on pthread). Even so, I’d like to write the runtime in Go as much as possible.

Why write the runtime in Go? Well for one, it will make llgo much more self contained, which will make distribution much easier since there won’t be a reliance on Clang. Another reason is based on a lofty, but absolutely necessary goal: that llgo will one day be able to compile itself. If llgo compiles itself, and compiles its own runtime, then we have a great target for compiler optimisations: the compiler itself. In other words, “compiler optimisations should pay for themselves.”

In my last post I mentioned that LLVM 3.1 is coming up fast, and this release has the changes required by llgo. Unfortunately, I’ve just found that the C API lacks an interface for linking modules, so I’m going to have to submit a patch to LLVM again, and the window for inclusion in 3.1 has certainly passed. Rather than break gollvm/llgo’s trunk again, I’ll create a branch for work on the runtime. I’ll post again when I’ve submitted a patch to LLVM, assuming the minor addition is accepted.

Posted April 28, 2012.

llgo update: Go1, automated tests

This week I finished up Udacity CS373: Programming a Robotic Car, and also finally finished reading GEB. So I’ll hopefully be able to commit some more time to llgo again.


I moved on to Go’s weekly builds a while back, and updated both llgo and gollvm to conform. I’m now on Go 1, as I hope most people are by now, and llgo is in good shape for Go 1 too. That’s not to say that it compiles all of the Go 1 language, just that llgo itself builds and runs with Go 1. Apart from that, I’ve just been working through some sample programs to increase the compiler’s capability.

One of the things that I’ve been a bit lazy about with llgo is automated testing, something I’m usually pretty keen on. I’ve grown anxious over regressions as time has gone on in the development, so I’ve spent a little bit of time this week putting together an automated test suite, which I mentioned in golang-nuts a few days ago. The test suite doesn’t cover a great deal yet, but it has picked up a couple of bugs already.

One of the numerous things I like about Go is its well integrated tooling. For testing, Go provides the testing package, and go test tool. So you write your unit tests according to the specifications in the “testing” package, run “go test”, and your tests are all run. This is comparable to, say, Python, which has a similar “unittest” package. It is vastly more friendly than the various C++ unit test frameworks; that’s in large part due to the way the Go language is designed, particularly with regard to how it fits into build systems and is parsed.
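
For anyone who hasn’t used it, a minimal test looks like this; put it in a file ending in _test.go and “go test” will find and run it:

package example_test

import (
    "strings"
    "testing"
)

// Any function named TestXxx with this signature is run by "go test".
func TestToUpper(t *testing.T) {
    if got := strings.ToUpper("hello"); got != "HELLO" {
        t.Errorf("ToUpper(%q) = %q, want %q", "hello", got, "HELLO")
    }
}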

In Go, everything you need to build a package is in the source (assuming you use the “go” command).
  • The only external influences on the build process (environment variables GOOS, GOARCH, GOROOT, etc.) apply to the entire build procedure, not to single compilation units. Each variant will end up in a separate location when built: ${GOPATH}/pkg/${GOOS}_${GOARCH}/<pkgname>.
  • Platform-specific code is separated into multiple files (xxx_linux.go, xxx_windows.go, …), and they’re automatically matched with the OS/architecture by the “go” command (see the example after this list).
  • Package dependencies are automatically and unambiguously resolved. Compare this with C/C++ headers, which might come from anywhere in the preprocessor’s include path.
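
As an example of the platform-specific file naming mentioned above, a (hypothetical) package can provide one file per platform, and the “go” command picks the right one by filename alone:

// eol_linux.go (compiled only when GOOS=linux)
package eol

// Marker is the platform's line terminator.
const Marker = "\n"

// eol_windows.go (compiled only when GOOS=windows)
package eol

// Marker is the platform's line terminator.
const Marker = "\r\n"
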
So anyway, back to llgo’s testing. It works just like this: I’ve created a separate program for each test case in the llgo/llgo/testdata directory. Each of these programs corresponds to a test case written against the “testing” package, which does the following:
  1. Run the program using “go run”, and store the output.
  2. Redirect stdout to a pipe, and run a goroutine to capture the output to a string.
  3. Compile the program using llgo’s Compile API, and then interpret the resultant bitcode using gollvm’s ExecutionEngine API.
  4. Restore the original stdout, and compare the output with that of the original “go run”.
Pretty obvious I guess, but I was happy with how easy it was to do. Defer made the job of redirecting, restoring and closing file descriptors pain free; the go statement and channels made capturing and communicating the resulting data a cinch.
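
Here’s a rough sketch of that capture pattern, assuming a Unix-like platform where syscall.Dup and syscall.Dup2 are available; it’s illustrative only, not llgo’s actual test code:

package sketch

import (
    "bytes"
    "io"
    "os"
    "syscall"
)

// captureStdout runs f with file descriptor 1 redirected to a pipe,
// and returns everything written to stdout while f was running.
func captureStdout(f func()) (string, error) {
    r, w, err := os.Pipe()
    if err != nil {
        return "", err
    }
    defer r.Close()

    saved, err := syscall.Dup(syscall.Stdout)
    if err != nil {
        return "", err
    }
    syscall.Dup2(int(w.Fd()), syscall.Stdout)
    defer func() {
        syscall.Dup2(saved, syscall.Stdout) // restore the real stdout
        syscall.Close(saved)
    }()

    out := make(chan string)
    go func() {
        var buf bytes.Buffer
        io.Copy(&buf, r) // drains the pipe until the write end closes
        out <- buf.String()
    }()

    f()       // run the code whose output we want to capture
    w.Close() // lets the goroutine's io.Copy finish
    return <-out, nil
}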

This is getting a little ramble-ish, so I’ll finish up. While testing, I discovered a problem with the way LLVM types are generated from types.Type values, which basically means that they need to be cached and reused, rather than generated afresh each time. At the same time I intend to remove all references to LLVM from my clone of the “types” package, and offer my updates back to the Go team. It’s not fully functional yet, but there are at least a few gaps that I’ve filled in.

One last thing: LLVM 3.1 is due out May 14, so gollvm and llgo will no longer require LLVM from SVN. I really want to eliminate the dependency on llvm-config from the build of gollvm. I’m considering a dlopen/dlsym shim and removing the cgo dependency on LLVM. I’d be keen to hear some opinions, suggestions or alternatives.

Until next time.

Posted April 8, 2012. Tags: go, llgo, gollvm, llvm.

Imports in llgo, jr.

So I realised I’m a doofus the other day, when I started getting closer to completion on producing export metadata in llgo. Rolling my own import mechanism is unnecessary for now. Instead, I can just lean on the import mechanism that exists in the standard library (well, until Go 1 at least): go/types/GcImporter.

I’ve modified llgo to use go/ast/NewPackage, rather than the old code I had that was using go/parser/ParseFiles. The NewPackage function takes an optional “importer” object which will be used for inter-package dependency resolution, whereas ParseFiles does no resolution. The standard GcImporter type may be used to identify exports by interrogating the object and archive files in $GOROOT. The AST that’s generated is filled in with external declarations, so it’s then up to llgo to convert those into LLVM external declarations. Easy peasy.

Now it’s time to come up with a symbol naming scheme. Without having thought about it too hard, I’m going to start off with the assumption that the absolute name of the symbol (package+name), with slashes converted to dots, will do the trick. Once I’ve implemented that, I’ll need to start work on the runtime in earnest. It’s also high time I put some automated tests in place, since things are starting to get a little stabler.
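
In code, that naming scheme is just this (a sketch of the idea; the helper is mine, not llgo’s):

package sketch

import "strings"

// mangle builds a symbol name from a package path and an identifier,
// converting slashes to dots as described above. For example,
// mangle("encoding/json", "Marshal") returns "encoding.json.Marshal".
func mangle(pkgPath, name string) string {
    return strings.Replace(pkgPath+"/"+name, "/", ".", -1)
}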

In the long term I’ll probably want to continue on with my original plan, which is to generate module-level metadata in the LLVM bitcode, and then extract this in a custom importer. It should be quite straightforward. Earlier this week I wrapped up some updates to gollvm to add an API to make generating source-level debugging metadata simpler. This will be useful not only for describing exports, but also for what it’s intended: generating DWARF debug information.

In other news: my wife just ordered the 1-4a box set of The Art of Computer Programming for me. At the moment I am slowly making my way through Gödel, Escher, Bach: an Eternal Golden Braid, and so far, so good. Looking forward to more light reading for the bus/train!

Posted February 19, 2012. Tags: go, llgo, gollvm, llvm.

llgo: back in business

I’ve been hacking away on llgo, on evenings and weekends when I’ve had the chance. It’s now roughly equivalent in functionality to where it was before I upgraded LLVM to trunk (3.1) and broke everything. There are a couple of added bonuses too, like proper arbitrary-precision constants, and partial support for untyped constants.

Now that the basics are working again, I’ll get back to working on the import/export mechanism. I expect this will expose more design flaws, and will take a while. I still plan to make use of debug metadata, which I am not altogether familiar with. I’ll also need to decide how the linker and the runtime library are going to work.

In other news, I’ve moved Pushy to GitHub. I’m not actively developing it at the moment, but I wanted to consolidate the services I’m consuming. I do have an addition to the Java API in the works: a sort of remote classloader, that will communicate over a Pushy connection to fetch classes/resources. The idea is to make it really quick and easy to run a bit of Java code on a remote machine, without having to deploy the application remotely. I’ll hopefully get around to pushing this change within the coming few weeks.

Posted February 11, 2012. Tags: go, llgo, llvm, pushy.

cmonster 0.2 released

Last week I announced a new version of cmonster (now version 0.2) on the Clang mailing list. I’ve finally updated the Github page for cmonster with some basic examples, and installation instructions.

I asked on the Clang mailing list for some feedback, but so far all I’m hearing is crickets. I’m surprised that nobody’s interested enough to reply, but I’ll freely admit that I’m not particularly good at marketing. If you do check it out, let me know what I can do to make it useful for you.

In other news: I haven’t spent much time on llgo recently, what with Real Life happening all the time. I submitted a patch to LLVM to add improved support for named metadata, which was accepted. I also made a bunch of fixes to gollvm so that it builds and works with LLVM 3.0 (and some additional changes to work with trunk).

There have been some changes to LLVM that mean I can no longer abuse the metadata system by attaching metadata to values: metadata can now be specified only on instructions. This means that I can no longer attach type information to values using metadata, nor identify function receivers in a similar way. So llgo will need to maintain type and receiver (amongst other) information outside of LLVM’s values. This was always going to be necessary; I’d just been putting it off to get something that worked.
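
A hedged sketch of what that side-table bookkeeping could look like (the wrapper and its fields are mine; only gollvm’s llvm.Value is real):

package sketch

import "github.com/axw/gollvm/llvm"

// GoType stands in for whatever Go type representation the compiler
// uses (at the time, llgo's fork of the types package).
type GoType interface{}

// Value wraps an llvm.Value together with the Go-level information
// that can no longer ride along as metadata on the LLVM value itself.
type Value struct {
    value    llvm.Value // the underlying LLVM value
    typ      GoType     // the Go type of the value
    receiver *Value     // method receiver, if any; nil otherwise
}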

I hope to get back to making inroad with llgo soon. I feel like I made pretty good progress on implementing my crazy ideas in 2011. Here’s hoping 2012 works out as well.

Posted January 7, 2012.

Imports in llgo

It’s been a while. I’ve implemented bits and pieces of the Go language: constants (though not properly handling arbitrary precision numbers at this stage), structs, functions (with and without receivers, as declarations and as literals), and goroutines. Much of the implementation is simplistic, not covering all bases. I want to get something vaguely useful working before I go down the path to completeness.

I intended to wait until I had something interesting to show off, but unfortunately I’ve hit a snag with LLVM.

Debug information might sound luxurious, but it’s how I’m intending to encode information about exported symbols, so I can implement imports. The gc compiler creates a header in archives that lists all of this information. LLVM provides a standard set of metadata for describing debug information, using DWARF descriptors.

So I figured I could build on the LLVM metadata, extending it to describe Go-specific information where necessary. The benefit here, aside from reusing an existing well-defined format, is that I’d get DWARF debugging information in my output binaries, something I’d eventually want anyway. Unfortunately it appears that there’s no C API for generating named metadata.

I’ll have a look at extending the LLVM C API now. Down the rabbit hole…

Update - 7 January 2012
I submitted a patch to LLVM to add support for adding named metadata to a module, which has been accepted. This will, presumably, be part of LLVM 3.1.

Posted December 3, 2011.