OpenStack Icehouse RC1 for Ubuntu 14.04 and 12.04

OpenStack Icehouse RC1 packages for Cinder, Glance, Keystone, Neutron, Heat, Ceilometer, Horizon and Nova are now available in the current Ubuntu development release and the Ubuntu Cloud Archive for Ubuntu 12.04 LTS.

To enable the Ubuntu Cloud Archive for Icehouse on Ubuntu 12.04:

sudo add-apt-repository cloud-archive:icehouse
sudo apt-get update

Users of the Ubuntu development release (trusty) can install OpenStack Icehouse without any further steps required.

Other packages which have been updated for this Ubuntu release and are pertinent for OpenStack users include:

  • Open vSwitch 2.0.1 (+ selected patches)
  • QEMU 1.7 (upgrade to 2.0 planned prior to final release)
  • libvirt 1.2.2
  • Ceph 0.78 (firefly stable release planned as a stable release update)

Note that the 3.13 kernel that will be released with Ubuntu 14.04 supports GRE and VXLAN tunnelling via the in-tree Open vSwitch module – so no need to use dkms packages any longer!  You can read more about using Open vSwitch with Ubuntu in my previous post.

Ubuntu 12.04 users should also note that Icehouse is the last OpenStack release that will be backported to 12.04 – however it will receive support for the remainder of the 12.04 LTS support lifecycle (3 years).

Remember that you can always report bugs on packages in the Ubuntu Cloud Archive and Ubuntu 14.04 using the ubuntu-bug tool – for example:

ubuntu-bug nova-compute

Happy testing!

 


Which Open vSwitch?

Since Ubuntu 12.04, we’ve shipped a number of different Open vSwitch versions supporting various different kernels in various different ways; I thought it was about time that the options were summarized to enable users to make the right choice for their deployment requirements.

Open vSwitch for Ubuntu 14.04 LTS

Ubuntu 14.04 LTS will be the first Ubuntu release to ship with in-tree kernel support for Open vSwitch with GRE and VXLAN overlay networking – all provided by the 3.13 Linux kernel. GRE and VXLAN are two of the tunnelling protocols used by OpenStack Networking (Neutron) to provide logical separation between tenants within an OpenStack Cloud.

This is great news from an end-user perspective: the requirement to use the openvswitch-datapath-dkms package disappears, as everything should just *work* with the default Open vSwitch module. This also allows us to have much more integrated testing of Open vSwitch as part of every kernel update that we will release for the 3.13 kernel going forward.

You’ll still need the userspace tooling to operate Open vSwitch; for Ubuntu 14.04 this will be the 2.0.1 release of Open vSwitch.
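
If you just want to pull in the userspace pieces and confirm which version you are running, something along these lines should be all that's needed (ovs-vsctl ships in the openvswitch-switch package):

sudo apt-get install openvswitch-switch
ovs-vsctl --version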

Open vSwitch for Ubuntu 12.04 LTS

As we did for the Raring 3.8 hardware enablement kernel, an openvswitch-lts-saucy package is working its way through the SRU process to support the Saucy 3.11 hardware enablement kernel; if you are using this kernel, you’ll be able to continue to use the full feature set of Open vSwitch by installing this new package:

sudo apt-get install openvswitch-datapath-lts-saucy-dkms

Note that if you are using Open vSwitch on Ubuntu 12.04 with the Ubuntu Cloud Archive for OpenStack Havana, you will already have access to this newer kernel module through the normal package name (openvswitch-datapath-dkms).

DKMS package names

Ubuntu 12.04/Linux 3.2: openvswitch-datapath-dkms (1.4.6)
Ubuntu 12.04/Linux 3.5: openvswitch-datapath-dkms (1.4.6)
Ubuntu 12.04/Linux 3.8: openvswitch-datapath-lts-raring-dkms (1.9.0)
Ubuntu 12.04/Linux 3.11: openvswitch-datapath-lts-saucy-dkms (1.10.2)
Ubuntu 12.04/Linux 3.13: N/A
Ubuntu 14.04/Linux 3.13: N/A
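
If you're not sure which datapath module a particular machine is actually using, a quick check along these lines should tell you the running kernel and whether the loaded module is the in-tree one or a DKMS build (module name assumed to be openvswitch; very old DKMS builds may use a different name):

uname -r
modinfo openvswitch | grep -E '^(filename|version)'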

Hope that makes things clearer…


Call for testing: Juju and gccgo

Today I uploaded juju-core 1.17.0-0ubuntu2 to the Ubuntu Trusty archive.

This version of the juju-core package provides Juju binaries built using both the golang gc compiler and the gccgo 4.8 compiler that we have for 14.04.

The objective for 14.04 is to have a single toolchain for Go that can support x86, ARM and Power architectures. Currently the only way we can do this is to use gccgo instead of golang-go.

This initial build still only provides packages for x86 and armhf; other architectures will follow once we have sorted out exactly how to provide the ‘go’ tool on platforms other than these.

By default, you’ll still be using the golang gc built binaries; to switch to using the gccgo built versions:

sudo update-alternatives --set juju /usr/lib/juju-1.17.0-gcc/bin/juju

and to switch back:

sudo update-alternatives --set juju /usr/lib/juju-1.17.0/bin/juju

Having both versions available should make diagnosing any gccgo specific issues a bit easier.
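
To confirm which build is currently selected (and that the binary on your PATH reports the expected version), something like this should do the trick:

update-alternatives --display juju
juju version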

To push the local copy of the jujud binary into your environment use:

juju bootstrap --upload-tools

This is not recommended for production use but will ensure that you are testing the gccgo built binaries on both client and server.

Thanks to Dave Cheney and the rest of the Juju development team for all of the work over the last few months to update the codebases for Juju and its dependencies to support gccgo!


OpenvSwitch for Ubuntu 12.04.3 LTS

Supporting the OpenvSwitch datapath kernel module packages on Ubuntu 12.04 whilst ensuring compatibility with the hardware enablement kernels that we push out for each point release has been challenging; the patch set I had to implement on top of 1.4.0 to support the Quantal 3.5 kernel was not insignificant!

The upstream-provided datapath kernel module is important for OpenStack users as it provides support for overlay networking using GRE tunnels, which is used extensively by Neutron for separation of Layer 2 tenant networks. Right now the native kernel module does not support this feature (although that is being worked on – hopefully for 14.04 we can drop the datapath module provided by upstream completely).

For the Raring 3.8 kernel that will ship with the Ubuntu 12.04.3 point release we are taking a slightly different approach; instead of patching the hell out of the 1.4.0 OpenvSwitch datapath module again, we will be providing specific packages for the Raring HWE kernel.

If you currently use the openvswitch-datapath-dkms module and want to switch to the Raring HWE kernel then you will need to take the following action:

sudo apt-get install openvswitch-datapath-lts-raring-dkms

There is also an equivalent openvswitch-datapath-lts-raring-source package for users of module-assistant. These packages are based on the 1.9.0 release of OpenvSwitch that we have in Ubuntu 13.04 which provides full compatibility with the 3.8 kernel.
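
Once the DKMS package has built against the Raring HWE kernel, a quick sanity check that the new module is in place might look like the following (assuming the module is named openvswitch, as it is in the 1.9.x series):

dkms status
sudo modprobe openvswitch
modinfo openvswitch | grep ^version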

The userspace tools and daemons, openvswitch-switch for example, are compatible with later datapath module versions so these won’t be upgraded.

These updates are currently in the precise-proposed pocket undergoing verification testing in preparation for release alongside Ubuntu 12.04.3 – see bug 1213021 for full details if you would like to help out with testing.

EOM


Targeted machine deployment with Juju

As I blogged previously, it's possible to deploy multiple charms to a single physical server using KVM, Juju and MAAS with the virtme charm.

With earlier versions of Juju it was also possible to use the 'jitsu deploy-to' hack to deploy multiple charms onto a single server without any separation; however this had some limitations, specifically around the use of 'juju add-unit', which behaved unpredictably and made the hack not particularly useful in real-world deployments. It also does not work with the latest versions of Juju, which no longer use ZooKeeper for co-ordination.

As of the latest release of Juju (available in this PPA and in Ubuntu Saucy), Juju now has native support for specifying which machine a charm should be deployed to:

juju bootstrap --constraints="mem=4G"
juju deploy --to 0 mysql
juju deploy --to 0 rabbitmq-server

This will result in an environment with a bootstrap machine (0) which is also running both mysql and rabbitmq:

$ juju status
machines:
  "0":
    agent-state: started
    agent-version: 1.11.4
    dns-name: 10.5.0.41
    instance-id: 37f3e394-007c-42b9-8bde-c14ae41f50da
    series: precise
    hardware: arch=amd64 cpu-cores=2 mem=4096M
services:
  mysql:
    charm: cs:precise/mysql-26
    exposed: false
    relations:
      cluster:
      - mysql
    units:
      mysql/0:
        agent-state: started
        agent-version: 1.11.4
        machine: "0"
        public-address: 10.5.0.41
  rabbitmq-server:
    charm: cs:precise/rabbitmq-server-12
    exposed: false
    relations:
      cluster:
      - rabbitmq-server
    units:
      rabbitmq-server/0:
        agent-state: started
        agent-version: 1.11.4
        machine: "0"
        public-address: 10.5.0.41

Note that you need to know the identifier of the machine that you are going to "deploy --to" – in all deployments, machine 0 is always the bootstrap node, so the above example works nicely.

As of the latest release of Juju, the 'add-unit' command also supports the --to option, so it's now possible to specifically target machines when expanding service capacity:

juju deploy --constraints="mem=4G" openstack-dashboard
juju add-unit --to 1 rabbitmq-server

I should now have a second machine running both the openstack-dashboard service and a second unit of the rabbitmq-server service:

$ juju status
machines:
  "0":
    agent-state: started
    agent-version: 1.11.4
    dns-name: 10.5.0.44
    instance-id: 99a06a9b-a9f9-4c4a-bce3-3b87fbc869ee
    series: precise
    hardware: arch=amd64 cpu-cores=2 mem=4096M
  "1":
    agent-state: started
    agent-version: 1.11.4
    dns-name: 10.5.0.45
    instance-id: d1c6788a-d120-44c3-8c55-03aece997fd7
    series: precise
    hardware: arch=amd64 cpu-cores=2 mem=4096M
services:
  mysql:
    charm: cs:precise/mysql-26
    exposed: false
    relations:
      cluster:
      - mysql
    units:
      mysql/0:
        agent-state: started
        agent-version: 1.11.4
        machine: "0"
        public-address: 10.5.0.44
  openstack-dashboard:
    charm: cs:precise/openstack-dashboard-9
    exposed: false
    relations:
      cluster:
      - openstack-dashboard
    units:
      openstack-dashboard/0:
        agent-state: started
        agent-version: 1.11.4
        machine: "1"
        public-address: 10.5.0.45
  rabbitmq-server:
    charm: cs:precise/rabbitmq-server-12
    exposed: false
    relations:
      cluster:
      - rabbitmq-server
    units:
      rabbitmq-server/0:
        agent-state: started
        agent-version: 1.11.4
        machine: "0"
        public-address: 10.5.0.44
      rabbitmq-server/1:
        agent-state: started
        agent-version: 1.11.4
        machine: "1"
        public-address: 10.5.0.45

These two features make it much easier to deploy complex services such as OpenStack which use a large number of charms on a limited number of physical servers.

There are still a few gotchas:

  • Charms are running without any separation, so it's entirely possible for charms to stamp all over each other's configuration files and try to bind to the same network ports.
  • Not all of the OpenStack charms are compatible with the latest version of Juju – this is being worked on – check out the OpenStack Charmers branches on Launchpad.

Juju is due to deliver a feature that will provide full separation of services using containers, which will resolve this challenge.
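
For illustration only (this syntax is not available in the release discussed here, and the final form may well differ), container-based placement is expected to look something like:

juju deploy --to lxc:0 mysql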

For the OpenStack charms, the OpenStack Charmers team will be aiming to limit file-system conflicts as much as possible – specifically in charms that won't work well in containers, such as nova-compute, ceph and neutron-gateway, because they make direct use of kernel features and network/storage devices.

Ubuntu OpenStack SRU cadence

At the last Ubuntu Developer Summit, the Ubuntu Server team discussed moving to a fixed cadence for releasing point releases of OpenStack into Ubuntu and the Ubuntu Cloud Archive for 12.04 under the Ubuntu Stable Release update process.

The amount of time between upstream point release and acceptance into Ubuntu and the Ubuntu Cloud Archive is relatively short, but the team felt that a more regular cadence was required to allow users of OpenStack on Ubuntu to plan around upstream point releases.

For future OpenStack point releases the Ubuntu Server team will be following a new cadence for pushing these releases into Ubuntu. This should allow the team to test and promote a point release of OpenStack into Ubuntu within two weeks of the upstream point release. Hopefully this will allow users of OpenStack on Ubuntu to plan upgrades a little more effectively going forwards.

For full details see the SRU Cadence documentation.

EOM

Mixing physical and virtual servers with Juju and MAAS

One of the most common questions I get asked about deploying OpenStack on Ubuntu using Juju and MAAS is:

How can we reduce the number of servers required to deploy a small OpenStack Cloud?

OpenStack has a number of lighter-weight services which won't make full use of anything more than the cheapest of cheap servers in this type of deployment; this includes the cinder, glance, keystone, nova-cloud-controller, swift-proxy, rabbitmq-server and mysql charms.

Ultimately Juju will solve the problem of service density in physical server deployments by natively supporting deployment of multiple charms onto the same physical servers; but in the interim I’ve hacked together a Juju charm, “virtme”, which can be deployed using Juju and MAAS to virtualize a physical server into a number of KVM instances which are also managed by MAAS.

Using this charm in conjunction with juju-jitsu allows you to make the most of a limited number of physical servers; I’ve been using this charm in a raring based Juju + MAAS environment:

juju bootstrap
(mkdir -p raring; cd raring; bzr branch lp:~virtual-maasers/charms/precise/virtme/trunk virtme)
jitsu deploy-to 0 --config config.yaml --repository . local:virtme

Some time later you should have an additional 7 servers registered into the MAAS controlling the environment ready for use. The virtme charm is deployed directly to the bootstrap node in the environment – so at this point the environment is using just one physical server.

The config.yaml file contains some general configuration for virtme:

virtme:
  maas-url: "http://<maas_hostname>/MAAS"
  maas-credentials: "<maas_token>"
  ports: "em2"
  vm-ports-per-net: 2
  vm-memory: 4096
  vm-cpus: 2
  num-vms: 7
  vm-disks: "10G 60G"

virtme uses OpenvSwitch to provide bridging between KVM instances and the physical network; right now this requires a dedicated port on the server to be cabled correctly – this is configured using ‘ports’. Each KVM instance will be configured with ‘vm-ports-per-net’ number of network ports on the OpenvSwitch bridge.
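
Under the hood this is plain OpenvSwitch bridging; conceptually the charm does something roughly like the following with the configured port (the bridge name here is illustrative):

sudo ovs-vsctl add-br br-phy
sudo ovs-vsctl add-port br-phy em2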

virtme also requires a URL and credentials for the MAAS cluster controller managing the environment; it uses this to register the details of the KVM instances it creates back into MAAS. Power control is supported using libvirt; virtme configures the libvirt daemon on the physical server to listen on the network and MAAS uses this to power control the KVM instances.
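
Power control is just libvirt's TCP listener; assuming the daemon is configured as described above, connectivity from the MAAS cluster controller can be checked with something like this (hostname is a placeholder):

virsh -c qemu+tcp://<physical-server>/system list --all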

Right now the specification of the KVM instances is a little clunky – in the example above, virtme will create 7 instances with 2 vCPUs, 4096MB of memory and two disks: a root partition that is 10G and a secondary disk of 60G. I'd like to refactor this into something a little richer to describe instances; maybe something like:

vms:
  small:
    - count: 7
    - cpu: 2
    - mem: 4096
    - networks: [ eth1, eth2 ]
    - disks: [ 10G, 20G ]

Now that the environment has a number of smaller, virtualized instances, I can deploy some OpenStack services onto these units:

juju deploy keystone
juju deploy mysql
juju deploy glance
juju deploy rabbitmq-server
....

leaving your bigger servers free to use for nova-compute:

juju deploy -n 6 --constraints="mem=96G" nova-compute

WARNING: right now libvirt is configured with no authentication or security on its network connection; this has obvious security implications! Future iterations of this charm will probably support SASL or SSH based security.

BOOTNOTE: virtme is still work-in-progress and is likely to change; if you find it useful let me know what you like/hate!


Ubuntu Cloud Archive Bug Reporting

Since its launch, bug reporting for packages sourced from the Ubuntu Cloud Archive for Ubuntu 12.04 LTS has been a little awkward and somewhat manual.

As of apport version 2.0.1-0ubuntu17.2, you can now:

ubuntu-bug <pkgname>

for packages from the Cloud Archive and bugs will get routed to the correct project in Launchpad with lots of extra bug data.

Thanks to those who spent time reporting bugs to date – hopefully this will make your lives a little easier!

EOM

Ubuntu OpenStack Activity Update, February 2013

Folsom 2012.2.1 Stable Release Update

A number of people have asked about the availability of OpenStack 2012.2.1 in Ubuntu 12.10 and the Ubuntu Cloud Archive for Folsom; well, it's finally out!

Suffice to say it took longer than expected, so we are making some improvements to the way we manage these micro-releases going forward which should streamline the process for 2012.2.3.

Cloud Archive Version Tracker

In order to help users and administrators of the Ubuntu Cloud Archive track which versions of what are where, the Ubuntu Server team are now publishing Cloud Archive reports for Folsom and Grizzly.

Grizzly g2 is currently working its way into the Ubuntu Cloud Archive (it's already in Ubuntu Raring) and should finish landing into the updates pocket next week.

News from the CI Lab

We now have Ceph fully integrated into the testing that we do around OpenStack; this picked up a regression in Nova and Cinder in the run-up to 2012.2.1.

This highlights the value of the integration and system testing that we do in the Ubuntu OpenStack CI lab (see my previous post for details on the lab). Identifying regressions was high on the list of initial objectives we agreed for this function!

Focus at the moment is on enabling testing of Grizzly on Raring (it's already up and running for Precise) and working on an approach to testing the OpenStack charm HA work currently in-flight within the team. In full this will require upwards of 30 servers to test, so we are working on a charm that deploys Juju and MAAS (Metal-as-a-Service) on a single, chunky server, allowing for physical-server-like testing of OpenStack in KVM. For some reason seeing 50 KVM instances running on a single server is somewhat satisfying!

This work will also be re-used for more regular, scheduled testing outside of the normal build-deploy-test pipeline for scale-out services such as Ceph and Swift – more to follow on this…

Ceilometer has also been added to the lab; at the moment we are build testing and publishing packages in the Grizzly Trunk PPA; Yolanda is working on a charm to deploy Ceilometer.

Ceph LTS Bobtail

The next Ceph LTS release (Bobtail) is now available in Ubuntu Raring and the Ubuntu Cloud Archive for Grizzly.

One of the key highlights for this release is the support for Keystone authentication and authorization in the Ceph RADOS Gateway.

The Ceph RADOS Gateway provides multi-tenant, highly scalable object storage through Swift and S3 RESTful interfaces.

Integration of the Swift protocol with Keystone completes the complementary story that Ceph provides when used with OpenStack.

Ceph can fulfil ALL storage requirements in an OpenStack deployment; it's integrated with Cinder and Nova for block storage, with Glance for image storage, and can now directly provide integrated, Swift-compatible, multi-tenant object storage.

Juju charm updates to support Keystone integration with Ceph RADOS Gateway are in the Ceph charms in the charm store.
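
Assuming the updated charms, wiring the RADOS Gateway up to Keystone should then just be a relation away; the exact endpoint names may vary between charm revisions, but roughly:

juju add-relation ceph-radosgw keystone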


Eventing Upstart

Upstart is an alternative /sbin/init, a replacement for System V style initialization; it has been the default init in Ubuntu since 9.10 and is also used in RHEL 6 and Google's Chrome OS. It handles starting tasks and services during boot, stopping them during shutdown and supervising them while the system is running.

The key difference from traditional init is that Upstart is event based; processes managed by Upstart are started and stopped as a result of events occurring in the system rather than scripts being executed in a defined order.

This post provides readers with a walk-through of a couple of Upstart configurations and explains how the event driven nature of Upstart provides a fantastic way of managing the processes running on your Ubuntu Server install.

Dissecting a simple configuration

Let's start by looking at a basic Upstart configuration; specifically the one found in Floodlight (a Java-based OpenFlow controller):

description "Floodlight controller"

start on runlevel [2345]
stop on runlevel [!2345]

setuid floodlight
setgid floodlight

respawn

pre-start script
    [ -f /usr/share/floodlight/java/floodlight.jar ] || exit 0
end script

script
    . /etc/default/floodlight
    exec java ${JVM_OPTS} -Dpython.home=/usr/share/jython \
        -Dlogback.configurationFile=/etc/floodlight/logback.xml \
        -jar /usr/share/floodlight/java/floodlight.jar \
        $DAEMON_OPTS 2>&1 >> /var/log/floodlight/floodlight.log
end script

This configuration is quite traditional in that it hooks into the runlevel events that Upstart emits automatically during boot to simulate a System V style system initialization:

start on runlevel [2345]
stop on runlevel [!2345]

These provide a simple way to convert a traditional init script into an Upstart configuration without having to think too hard about exactly which event should start your process; note that configurations can also start on the filesystem event – this is fired when all filesystems have been mounted. For more information about events see the Upstart events man page.
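
Events can also be fired by hand with initctl, which is a handy way of experimenting with what a job will react to; for example (the event name here is made up):

sudo initctl emit my-custom-event
initctl list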

The configuration uses stanzas that tell Upstart to execute the scripts in the configuration as a different user:

setuid floodlight
setgid floodlight

and it also uses a process control stanza:

respawn

This instructs Upstart to respawn the process if it should die for any unexpected reason; Upstart has some sensible defaults on how many times it will attempt to do this before giving it up as a bad job – these can also be specified in the stanza.
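
For example, to tell Upstart to give up if the job respawns more than 10 times within a 5 second window, the stanza would look something like this (the numbers are purely illustrative):

respawn
respawn limit 10 5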

The job has two scripts specified; the first is run prior to actually starting the process that will be monitored:

pre-start script
    [ -f /usr/share/floodlight/java/floodlight.jar ] || exit 0
end script

In this case it's just a simple check to ensure that the floodlight package is still installed; Upstart configurations are treated as conffiles by dpkg so won't be removed unless you purge the package from your system. The final script is the one that actually execs the process that will be monitored:

script
    . /etc/default/floodlight
    exec java ${JVM_OPTS} -Dpython.home=/usr/share/jython \
        -Dlogback.configurationFile=/etc/floodlight/logback.xml \
        -jar /usr/share/floodlight/java/floodlight.jar \
        $DAEMON_OPTS 2>&1 >> /var/log/floodlight/floodlight.log
end script

Upstart will keep an eye on the Floodlight Java process during its lifetime.

And now for something clever…

The above example is pretty much a direct translation of an init script into an Upstart configuration; when you consider that an Upstart configuration can be triggered by any event being detected, the scope of what you can do with it increases exponentially.

Ceph, the highly scalable, distributed object storage solution which runs on Ubuntu on commodity server hardware, provides a great example of how this can extend to events occurring in the physical world.

Let's look at how Ceph works at a high level.

A Ceph deployment will typically spread over a large number of physical servers; three will be running the Ceph Monitor daemon (MON) and will be acting in quorum to monitor the topology of the Ceph deployment. Ceph clients connect to these servers to retrieve the cluster map, which they then use to determine where the data they are looking for resides.

The rest of the servers will be running Ceph Object Storage daemons (OSDs); these are responsible for storing/retrieving data on physical storage devices. The recommended configuration is to have one OSD per physical storage device in any given server. Servers can have quite a few direct-attached disks, so this could be tens of disks per server – in a larger deployment you may have hundreds or thousands of OSDs running at any given point in time.

This presents a challenge: how does the system administrator manage the Ceph configuration for all of these OSDs?

Ceph takes an innovative approach to address this challenge using Upstart.

The devices supporting the OSDs are prepared for use using the 'ceph-disk-prepare' tool:

ceph-disk-prepare /dev/sdb

This partitions and formats the device with a specific layout and UUID so it can be recognized as an OSD device; this is supplemented with an Upstart configuration which fires when devices of this type are detected:

description "Ceph hotplug"

start on block-device-added \
DEVTYPE=partition \
ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-062c0ceff05d

task
instance $DEVNAME

exec /usr/sbin/ceph-disk-activate --mount -- "$DEVNAME"

This Upstart configuration is a 'task'; this means that it's not long-running, so Upstart does not need to provide ongoing process management once 'ceph-disk-activate' exits (no respawn or stopping, for example).

The 'ceph-disk-activate' tool mounts the device, prepares it for OSD usage (if not already prepared) and then emits the 'ceph-osd' event with a specific OSD id which has been allocated uniquely across the deployment; this triggers a second Upstart configuration:

description "Ceph OSD"

start on ceph-osd
stop on runlevel [!2345]

respawn
respawn limit 5 30

pre-start script
    set -e
    test -x /usr/bin/ceph-osd || { stop; exit 0; }
    test -d "/var/lib/ceph/osd/${cluster:-ceph}-$id" || { stop; exit 0; }

    install -d -m0755 /var/run/ceph

    # update location in crush; put in some suitable defaults on the
    # command line, ceph.conf can override what it wants
    location="$(ceph-conf --cluster="${cluster:-ceph}" --name="osd.$id" --lookup osd_crush_location || : )"
    weight="$(ceph-conf --cluster="$cluster" --name="osd.$id" --lookup osd_crush_initial_weight || : )"
    ceph \
        --cluster="${cluster:-ceph}" \
        --name="osd.$id" \
        --keyring="/var/lib/ceph/osd/${cluster:-ceph}-$id/keyring" \
        osd crush create-or-move \
    -- \
        "$id" \
    "${weight:-1}" \
    root=default \
    host="$(hostname -s)" \
    $location \
       || :

    journal="/var/lib/ceph/osd/${cluster:-ceph}-$id/journal"
    if [ -L "$journal" -a ! -e "$journal" ]; then
       echo "ceph-osd($UPSTART_INSTANCE): journal not present, not starting yet." 1>&2
       stop
       exit 0
    fi
end script

instance ${cluster:-ceph}/$id
export cluster
export id

exec /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f

This Upstart configuration does some Ceph configuration in its pre-start script to tell Ceph where the OSD is physically located and then starts the ceph-osd daemon using the unique OSD id.
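
Because the job declares an instance keyed on the cluster name and OSD id (both of which are exported), individual OSDs can also be controlled by hand; for example, with a made-up id:

sudo start ceph-osd id=12
sudo status ceph-osd cluster=ceph id=12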

These two Upstart configurations are used whenever OSD formatted block devices are detected; this includes on system start (so that OSD daemons startup on boot) and when disks are prepared for first use.

So how does this extend into the physical world?

The 'block-device-added' event can happen at any point in time – for example:

  • One of the disks in server-X dies; the data centre operations staff have a pool of pre-formatted Ceph OSD replacement disks and replace the failed disk with a new one; Upstart detects the new disk and bootstraps a new OSD into the Ceph deployment.
  • server-Y dies with a main-board burnout; the data centre operations staff swap out the server, move the disks from the dead server into the new one and install Ceph onto the system disk; Upstart detects the OSD disks on reboot and re-introduces them into the Ceph topology in their new location.

In both of these scenarios no additional system administrator action is required; considering that a Ceph deployment might contain hundreds of servers and thousands of disks, automating activity around physical replacement of devices in this way is critical in terms of operational efficiency.

Hats off to Tommi Virtanen@Inktank for this innovative use of Upstart – rocking work!

Summary

This post illustrates that Upstart is more than just a simple, event-based replacement for init.

The Ceph use case shows how Upstart can be integrated into an application providing event based automation of operational processes.

Want to learn more? The Upstart Cookbook is a great place to start…
