Upstart is an alternative /bin/init, a replacement for System V style initialization; it has been the default init in Ubuntu since 9.10 and is also used in RHEL 6 and Google's Chrome OS. It handles starting tasks and services during boot, stopping them during shutdown and supervising them while the system is running.
The key difference from traditional init is that Upstart is event based; processes managed by Upstart are started and stopped as a result of events occurring in the system rather than scripts being executed in a defined order.
This post walks through a couple of Upstart configurations and explains how the event-driven nature of Upstart provides a fantastic way of managing the processes running on your Ubuntu Server install.
Dissecting a simple configuration
Let's start by looking at a basic Upstart configuration; specifically the one shipped with Floodlight (a Java-based OpenFlow controller):
description "Floodlight controller"
start on runlevel [2345]
stop on runlevel [!2345]
setuid floodlight
setgid floodlight
respawn
pre-start script
[ -f /usr/share/floodlight/java/floodlight.jar ] || exit 0
end script
script
. /etc/default/floodlight
exec java ${JVM_OPTS} -Dpython.home=/usr/share/jython \
-Dlogback.configurationFile=/etc/floodlight/logback.xml \
-jar /usr/share/floodlight/java/floodlight.jar \
$DAEMON_OPTS >> /var/log/floodlight/floodlight.log 2>&1
end script
This configuration is quite traditional in that it hooks into the runlevel events that Upstart emits automatically during boot to simulate a System V style system initialization:
start on runlevel [2345]
stop on runlevel [!2345]
These provide a simple way to convert a traditional init script into an Upstart configuration without having to think too hard about exactly which event should start your process. Some configurations also start on the filesystem event, which is fired when all filesystems have been mounted. For more information about events, see the Upstart events man page.
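As a minimal sketch of this pattern (the job name and daemon path here are illustrative, not from a real package), a traditional daemon can be wrapped like this:

```text
description "Example daemon"

start on runlevel [2345]
stop on runlevel [!2345]

respawn

exec /usr/sbin/example-daemon
```

With no script stanzas needed, a simple exec is often all it takes to bring an existing daemon under Upstart's supervision.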
The configuration uses stanzas that tell Upstart to execute the scripts in the configuration as a different user and group:
setuid floodlight
setgid floodlight
and it also uses a process control stanza:
respawn
This instructs Upstart to respawn the process should it die for any unexpected reason; Upstart has some sensible defaults for how many times it will attempt to do this before giving up as a bad job – these can also be tuned explicitly.
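The defaults can be overridden with the respawn limit stanza; for example, to allow up to 10 restarts within a 5 second window before the job is given up on:

```text
respawn
respawn limit 10 5
```

The first number is the restart count and the second the interval in seconds.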
The job has two scripts specified; the first is run prior to actually starting the process that will be monitored:
pre-start script
[ -f /usr/share/floodlight/java/floodlight.jar ] || exit 0
end script
In this case it's just a simple check to ensure that the Floodlight package is still installed; Upstart configurations are treated as conffiles by dpkg, so they won't be removed unless you purge the package from your system. The final script is the one that actually exec's the process that will be monitored:
script
. /etc/default/floodlight
exec java ${JVM_OPTS} -Dpython.home=/usr/share/jython \
-Dlogback.configurationFile=/etc/floodlight/logback.xml \
-jar /usr/share/floodlight/java/floodlight.jar \
$DAEMON_OPTS >> /var/log/floodlight/floodlight.log 2>&1
end script
Upstart will keep an eye on the Java process for Floodlight during its lifetime.
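A side note on the logging redirection: in sh, redirection order matters, so the append to the log file has to come before 2>&1 if stderr is to end up in the log as well. A standalone sketch (the file here is a throwaway temp file, nothing to do with Floodlight):

```shell
#!/bin/sh
# Demonstrate that '>>file 2>&1' captures both streams: stdout is
# redirected to the log first, then stderr is duplicated onto it.
# (The reverse order, '2>&1 >>file', would duplicate stderr onto the
# terminal before stdout is moved, leaving stderr out of the log.)
log=$(mktemp)

{ echo "stdout line"; echo "stderr line" 1>&2; } >>"$log" 2>&1

captured=$(cat "$log")
echo "$captured"
rm -f "$log"
```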
And now for something clever…
The above example is pretty much a direct translation of an init script into an Upstart configuration; when you consider that an Upstart configuration can be triggered by any event being detected, the scope of what you can do with it increases dramatically.
Ceph, the highly scalable, distributed object storage solution which runs on Ubuntu on commodity server hardware, provides a great example of how this can extend to events occurring in the physical world.
Let's look at how Ceph works at a high level.
A Ceph deployment will typically spread over a large number of physical servers; three of these will run the Ceph Monitor daemon (MON), acting in quorum to track the topology of the Ceph deployment. Ceph clients connect to these servers to retrieve this map, which they then use to determine where the data they are looking for resides.
The rest of the servers will be running Ceph Object Storage daemons (OSDs); these are responsible for storing and retrieving data on physical storage devices. The recommended configuration is one OSD per physical storage device in any given server. Servers can have quite a few direct-attached disks, so this could be tens of disks per server – in a larger deployment you may have hundreds or thousands of OSDs running at any given point in time.
This presents a challenge; how does the system administrator manage the Ceph configuration for all of these OSDs?
Ceph takes an innovative approach to this challenge using Upstart.
The devices backing the OSDs are prepared for use with the ‘ceph-disk-prepare’ tool:
ceph-disk-prepare /dev/sdb
This partitions and formats the device with a specific layout and partition type GUID so it can be recognized as an OSD device; this is supplemented with an Upstart configuration which fires when devices of this type are detected:
description "Ceph hotplug"
start on block-device-added \
DEVTYPE=partition \
ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-062c0ceff05d
task
instance $DEVNAME
exec /usr/sbin/ceph-disk-activate --mount -- "$DEVNAME"
This Upstart configuration is a ‘task’; this means that it is not long running, so Upstart does not need to provide ongoing process management once ‘ceph-disk-activate’ exits (no respawn or stopping, for example).
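To make the trigger concrete: the block-device-added event carries properties that udev extracts from the partition table, and the start on stanza matches against them. The following standalone sketch performs the same GUID comparison; the udev property listing is hand-written sample data, not captured from a real disk:

```shell
#!/bin/sh
# The partition type GUID that ceph-disk-prepare stamps onto OSD
# partitions; the Upstart job matches block-device-added events on it.
OSD_GUID="4fbd7e29-9d25-41b8-afd0-062c0ceff05d"

# Hand-written sample of the environment a block-device-added event
# would carry for a freshly prepared OSD partition.
sample_event_env() {
cat <<EOF
DEVNAME=/dev/sdb1
DEVTYPE=partition
ID_PART_ENTRY_TYPE=$OSD_GUID
EOF
}

# The same match the 'start on' stanza expresses declaratively.
if sample_event_env | grep -q "^ID_PART_ENTRY_TYPE=$OSD_GUID$"; then
    result="Ceph OSD partition detected"
fi
echo "$result"
```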
The ‘ceph-disk-activate’ tool mounts the device, prepares it for OSD usage (if not already prepared) and then emits the ‘ceph-osd’ event with a specific OSD id which has been allocated uniquely across the deployment; this triggers a second Upstart configuration:
description "Ceph OSD"
start on ceph-osd
stop on runlevel [!2345]
respawn
respawn limit 5 30
pre-start script
set -e
test -x /usr/bin/ceph-osd || { stop; exit 0; }
test -d "/var/lib/ceph/osd/${cluster:-ceph}-$id" || { stop; exit 0; }
install -d -m0755 /var/run/ceph
# update location in crush; put in some suitable defaults on the
# command line, ceph.conf can override what it wants
location="$(ceph-conf --cluster="${cluster:-ceph}" --name="osd.$id" --lookup osd_crush_location || : )"
weight="$(ceph-conf --cluster="$cluster" --name="osd.$id" --lookup osd_crush_initial_weight || : )"
ceph \
--cluster="${cluster:-ceph}" \
--name="osd.$id" \
--keyring="/var/lib/ceph/osd/${cluster:-ceph}-$id/keyring" \
osd crush create-or-move \
-- \
"$id" \
"${weight:-1}" \
root=default \
host="$(hostname -s)" \
$location \
|| :
journal="/var/lib/ceph/osd/${cluster:-ceph}-$id/journal"
if [ -L "$journal" -a ! -e "$journal" ]; then
echo "ceph-osd($UPSTART_INSTANCE): journal not present, not starting yet." 1>&2
stop
exit 0
fi
end script
instance ${cluster:-ceph}/$id
export cluster
export id
exec /usr/bin/ceph-osd --cluster="${cluster:-ceph}" -i "$id" -f
This Upstart configuration does some Ceph configuration of its own in the pre-start script – registering the OSD's physical location and weight in the CRUSH map – and then starts the ceph-osd daemon using the unique OSD id.
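The job leans heavily on the sh default-expansion idiom "${cluster:-ceph}": if the ceph-osd event carries no cluster variable, everything falls back to the default cluster name ‘ceph’. A quick standalone illustration (the id and cluster values here are made up):

```shell
#!/bin/sh
id=3            # illustrative OSD id, as carried by the ceph-osd event

unset cluster   # simulate an event emitted without cluster=
no_cluster="instance ${cluster:-ceph}/$id"

cluster=prod    # simulate an event emitted with cluster=prod
with_cluster="instance ${cluster:-ceph}/$id"

echo "$no_cluster"
echo "$with_cluster"
```

So a single job file serves every OSD in every cluster on the machine, with the instance stanza keeping each one distinct.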
These two Upstart configurations are used whenever OSD formatted block devices are detected; this includes on system start (so that OSD daemons startup on boot) and when disks are prepared for first use.
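Because each OSD runs as a separate instance of the same job, individual daemons can also be addressed by their instance variables from the command line; for example (the cluster name and OSD id here are illustrative):

```text
status ceph-osd cluster=ceph id=3
stop ceph-osd cluster=ceph id=3
start ceph-osd cluster=ceph id=3
```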
So how does this extend into the physical world?
The ‘block-device-added’ event can happen at any point in time – for example:
- One of the disks in server-X dies; the data centre operations staff keep a pool of pre-formatted Ceph OSD replacement disks and swap the failed disk for a new one; Upstart detects the new disk and bootstraps a new OSD into the Ceph deployment.
- server-Y dies with a main-board burnout; the data centre operations staff swap out the entire server: they remove the disks from the dead machine, insert them into the replacement and install Ceph onto its system disk; Upstart detects the OSD disks on reboot and re-introduces them into the Ceph topology in their new location.
In both of these scenarios no additional system administrator action is required; considering that a Ceph deployment might contain hundreds of servers and thousands of disks, automating activity around physical replacement of devices in this way is critical for operational efficiency.
Hats off to Tommi Virtanen@Inktank for this innovative use of Upstart – rocking work!
Summary
This post illustrates that Upstart is more than just a simple, event based replacement for init.
The Ceph use case shows how Upstart can be integrated into an application providing event based automation of operational processes.
Want to learn more? The Upstart Cookbook is a great place to start…