Coming in Juju 2.3: storage improvements

I’ve just about wrapped up a set of improvements to storage for Juju 2.3, the next “minor” release. If you’re already using, or have been planning to use, Juju’s storage support, read on.

Dynamic storage management

A Juju charm can specify storage requirements: the number of filesystems or block devices its application requires. For example, the PostgreSQL charm requires a filesystem on which to store the database. If you don’t tell Juju otherwise, the storage will go onto the root filesystem, but you can also tell Juju to provide the charm with cloud storage (Amazon EBS, OpenStack Cinder, etc.).

One of the missing pieces that users have been asking for is the ability to manage the lifecycle of storage independently of applications and units, and to reuse existing storage. In Juju 2.3, when you remove an application or unit, the storage attached to the unit(s) will (if possible) be detached, rather than destroyed, and will remain in the model. You can then either remove the storage using juju remove-storage, or attach it to a new unit using the new juju attach-storage command, or the --attach-storage flag added to juju deploy and juju add-unit. To complement juju attach-storage, there is also a new juju detach-storage command.

So to illustrate, you can now deploy PostgreSQL with cloud storage, then remove the application, and redeploy (e.g. with more RAM), using the same storage.

juju deploy postgresql --storage pgdata=10G
…
juju remove-application postgresql
…
juju deploy postgresql --constraints mem=16G --attach-storage pgdata/0
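The lifecycle behind those three commands can be pictured with a small toy model. This is only a sketch for illustration: the `Model` class and its methods are hypothetical and are not Juju's data structures or API. It captures just the rule described above, that removing a unit detaches its storage and leaves it in the model for later reuse.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Model:
    """Toy stand-in for a Juju model (hypothetical, not Juju's real types)."""

    # Maps a storage ID (e.g. "pgdata/0") to the unit it is attached to,
    # or to None when the storage is detached but still in the model.
    storage: Dict[str, Optional[str]] = field(default_factory=dict)

    def add_unit(self, unit: str, storage_id: str) -> None:
        """Attach storage to a unit, reusing a detached instance if present."""
        if self.storage.get(storage_id) is not None:
            raise ValueError(f"{storage_id} is already attached")
        self.storage[storage_id] = unit

    def remove_unit(self, unit: str) -> None:
        """Removing a unit detaches its storage instead of destroying it."""
        for storage_id, owner in self.storage.items():
            if owner == unit:
                self.storage[storage_id] = None

    def remove_storage(self, storage_id: str) -> None:
        """Only an explicit removal takes detached storage out of the model."""
        if self.storage.get(storage_id) is not None:
            raise ValueError(f"{storage_id} is still attached")
        del self.storage[storage_id]


model = Model()
model.add_unit("postgresql/0", "pgdata/0")   # first deploy
model.remove_unit("postgresql/0")            # remove-application
model.add_unit("postgresql/1", "pgdata/0")   # redeploy with --attach-storage
print(model.storage)
```

The unit names are made up for the example; the point is only that "pgdata/0" survives the removal and can be handed to a new unit.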

We’re still working on giving you commands to remove storage from the model without destroying it, and then import it into a new model (possibly new controller). This is required for disaster recovery. Whether this makes it for 2.3 depends on prioritisation; if it doesn’t make it for 2.3, it shouldn’t be far behind.

LXD Storage Provider

One thing that we hadn’t planned for 2.3, but did manage to get done, is a LXD storage provider. LXD has recently added its own storage management API, and Juju 2.3 will have a storage provider that uses it. I originally implemented the Juju side of things as a bit of a hack, behind a feature flag, in order to speed up the development of the aforementioned attach/detach changes. The LXD storage API turned out to be very straightforward to build on, so we decided to release the Juju changes into the wild in case they’re of use to others. This should be particularly useful if you’re developing or testing charms that use storage.

Using the LXD storage provider is as simple as:

juju deploy postgresql --storage pgdata=10G,lxd

Each storage pool using the “lxd” storage provider will create an associated storage pool in LXD. When you create a storage pool in Juju, you need to specify two configuration attributes:

the LXD storage pool name, as the “lxd-pool” attribute
the LXD storage driver, as the “driver” attribute

You can also define driver-specific attributes, which will be passed through to the LXD storage driver verbatim.

Juju predefines a “lxd-zfs” pool, with the following attributes:

lxd-pool=juju-zfs
driver=zfs
zfs.pool_name=juju-zfs

If you deploy an application with storage using the lxd-zfs pool, Juju will create a LXD storage pool called “juju-zfs” with the “zfs” driver, and a ZFS pool called “juju-zfs”. To find out more about the LXD storage driver options, see the LXD storage docs.
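To make the split between the two required attributes and the pass-through ones concrete, here is a minimal sketch. The function name `split_pool_attrs` is hypothetical and this is not how Juju implements it; it only mirrors the rule stated above, using the predefined “lxd-zfs” attributes as input.

```python
from typing import Dict, Tuple

# The two attributes the post says are required for an "lxd" storage pool.
JUJU_KEYS = ("lxd-pool", "driver")


def split_pool_attrs(attrs: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate the required attributes from driver-specific ones.

    Everything that is not "lxd-pool" or "driver" is treated as a
    driver-specific attribute and returned unchanged (verbatim).
    """
    missing = [key for key in JUJU_KEYS if key not in attrs]
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))
    required = {key: attrs[key] for key in JUJU_KEYS}
    driver_specific = {k: v for k, v in attrs.items() if k not in JUJU_KEYS}
    return required, driver_specific


# The attributes of the predefined "lxd-zfs" pool, as listed above.
LXD_ZFS = {"lxd-pool": "juju-zfs", "driver": "zfs", "zfs.pool_name": "juju-zfs"}

required, driver_specific = split_pool_attrs(LXD_ZFS)
print(required)
print(driver_specific)
```

For the “lxd-zfs” pool, the only pass-through attribute is `zfs.pool_name`, which is what names the underlying ZFS pool.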

Posted July 13, 2017. Tags: juju, storage, lxd.

Juju 2.1 and CentOS

In the Juju 2.1 release, I made a couple of small changes to better support CentOS servers.

The first change was to support “manual provisioning” of CentOS machines. Manual provisioning is when you point Juju at a machine, and Juju connects to the machine over SSH and sets it up with a Juju agent. To do this, Juju needs to run a small shell script to discover the OS version and hardware characteristics of the machine. With a minor change to that script, you can now manually provision CentOS machines.

The second change is to support CentOS LXD images. A small change was needed in the Juju code to support the “centos7” OS version, and to alter the way we handle local LXD image aliases. If an image exists locally with the expected alias (e.g. “juju/centos7/amd64”), then we’ll use that and skip looking in the remote image sources. This also improves container startup time when you live in a faraway land like me. Altering Juju is not quite enough though, as there are no existing CentOS images that Juju can use.

Juju (mostly) requires cloud-init to be present on the machines it starts, so that it can inject Juju-specific configuration and scripts to run on startup. Unfortunately, there are no CentOS LXD images that have cloud-init already. To work around this, I wrote a standalone Go program to transform the linuxcontainers.org CentOS image: github.com/axw/juju-lxd-centos-image-builder. Eventually we hope to have pre-canned CentOS LXD images available to Juju, but for now you can use this program to prepare an image for Juju. Run it from the LXD host, and Juju will be able to use the resulting image.
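The local-alias lookup described earlier is what lets Juju pick up such a locally prepared image. A rough sketch of that lookup order follows. The function names and the `remote_lookup` callback are hypothetical, and the alias layout is inferred from the single example “juju/centos7/amd64”; this is not Juju's actual code.

```python
from typing import Callable, Iterable, Optional


def image_alias(series: str, arch: str) -> str:
    """Build the expected local alias, e.g. "juju/centos7/amd64"."""
    return f"juju/{series}/{arch}"


def find_image(
    series: str,
    arch: str,
    local_aliases: Iterable[str],
    remote_lookup: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Prefer a local image with the expected alias; otherwise ask remotes.

    remote_lookup stands in for searching the remote image sources and is
    only called when no local image matches.
    """
    alias = image_alias(series, arch)
    if alias in set(local_aliases):
        return alias
    return remote_lookup(series, arch)


def no_remote_image(series: str, arch: str) -> Optional[str]:
    """Stand-in remote lookup for a series with no usable remote image."""
    return None


print(find_image("centos7", "amd64", ["juju/centos7/amd64"], no_remote_image))
```

Because the local check comes first, a matching local image also avoids the round trip to the remote image sources.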

Posted February 23, 2017. Tags: juju, lxd.

New in Juju 2.1: Prometheus Metrics

Juju is an application modelling tool, enabling “model-driven operations”. I won’t go into detail about what Juju is in this blog post, so if you’re new to Juju I suggest clicking on the link and reading a bit more.

Juju is a distributed application, with a “controller” cluster that manages cloud resources (machines, networks, volumes, etc.), and applications that use those resources. The controller cluster is currently based on top of MongoDB, utilising replica sets for data replication and leadership election.

As well as the controller cluster, Juju agents run on every virtual machine that the controller manages. The controllers, and those agents, each run many fine-grained, but dedicated “workers”. For example, each agent runs a worker to detect block devices and publish that information to the controller cluster; each controller runs a worker to maintain the replica sets in MongoDB.

Many things can go wrong in a distributed system. Network partitions can cause system-wide failures. Bad actors (badly written; less often, malicious) may starve others of resources. Failure to release memory or file handles leads to exhaustion, causing a DoS. Juju has seen its fair share of each of these problems.

To combat such issues, we have recently added Prometheus monitoring to Juju. As of Juju 2.1, Juju controllers and agents will export Prometheus metrics. There are two ways to get at them:

(on controllers) an HTTPS endpoint, https://…:17070/introspection/metrics.
(on Linux agents) an abstract domain socket, @jujud-machine-<machine-ID>
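For the second option, here is a minimal sketch of reading from an agent's abstract domain socket directly on the agent's machine. It assumes the socket speaks plain HTTP and serves the metrics at `/metrics`; that path is an assumption, as it is not stated here. The function names are hypothetical.

```python
import socket


def agent_socket_address(machine_id: str) -> str:
    """Return the abstract socket address for a machine agent.

    On Linux, a leading NUL byte marks a socket name as abstract; tools
    usually display that byte as "@", hence "@jujud-machine-<machine-ID>".
    """
    return "\0jujud-machine-" + machine_id


def fetch_agent_metrics(machine_id: str, path: str = "/metrics") -> str:
    """Send a bare HTTP/1.0 GET over the agent's socket and return the body.

    The default path is an assumption, not something the post confirms.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(agent_socket_address(machine_id))
        request = f"GET {path} HTTP/1.0\r\nHost: localhost\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:
                break  # the server closes the connection after responding
            chunks.append(data)
    response = b"".join(chunks).decode("utf-8", errors="replace")
    _headers, _separator, body = response.partition("\r\n\r\n")
    return body
```

Calling `fetch_agent_metrics("0")` would need to run on machine 0 itself, since abstract domain sockets are local to the host.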

Figure: Juju metrics available from Prometheus

Configuring Prometheus to scrape Juju controllers

To configure Prometheus to scrape metrics from Juju controllers, you will need to add a new scrape target to Prometheus. The metrics endpoint requires authorisation, so you will need to configure a user and password for Prometheus to use:

$ juju add-user prometheus
$ juju change-user-password prometheus
new password: <password>
type new password again: <password>

For this new “prometheus” user to be able to access the metrics endpoint, you must grant the user read access to the controller model:

$ juju grant prometheus read controller

This gives the prometheus user just enough permission to read information from the controller. It cannot make changes, which is what you want for a monitoring application.

Juju serves the metrics over HTTPS, currently with no option of degrading to HTTP. You can configure your Prometheus to skip validation, or you can store the controller’s CA certificate in a file for Prometheus to verify the server’s certificate against:

$ juju controller-config ca-cert > /path/to/juju-ca.crt

We can now add a scrape target to Prometheus. Modify prometheus.yml, adding the following scrape target:

scrape_configs:
  - job_name: juju
    metrics_path: /introspection/metrics
    scheme: https
    static_configs:
      - targets: ['<controller-address>:17070']
    basic_auth:
      username: user-prometheus
      password: <password>
    tls_config:
      ca_file: /path/to/juju-ca.crt

Figure: Juju API requests total metric

Configuring Prometheus to scrape Juju agents

To expose an agent’s metrics, you can deploy the juju-introspection charm onto that agent’s machine. For example, on machine 1, you would run:

juju deploy ~axwalk/juju-introspection --to 1

The metrics of that agent can then be obtained via:

http://<machine-1-address>:19090/agents/machine-1/metrics

Note that this is not an officially supported charm. The code for it is available at: