Extreme OpenStack: Scale testing OpenStack Messaging

Just prior to the Paris OpenStack Summit in November, the Ubuntu Server team had the opportunity to repeat and expand on the scale testing of OpenStack Icehouse that we did in the first quarter of last year with AMD and SeaMicro. HP were kind enough to grant us access to a few hundred servers in their Discovery Lab; specifically, three chassis of HP ProLiant Moonshot m350 cartridges (540 in total).

The m350 is an 8-core Intel Atom based server with 16GB of RAM and 64GB of SSD-based direct-attached storage. They are designed for scale-out workloads, so not an immediately obvious choice for an OpenStack Cloud, but for the purposes of stretching OpenStack to the limit, having lots of servers is great: it puts load on central components in Neutron and Nova by giving them a large number of hypervisor edges to manage.

Over and above re-validating our previous Icehouse scale test on the new Juno release of OpenStack, we had a few additional objectives for this round of testing:

  • Messaging: The default messaging solution for OpenStack on Ubuntu is RabbitMQ; alternative messaging solutions have been supported for some time – we wanted to specifically look at how ZeroMQ, a broker-less messaging option, scales in a large OpenStack deployment.
  • Hypervisor: The testing done previously was based on the libvirt/kvm stack with Nova; the LXC driver was available in an early alpha release, so poking at this looked like it might be fun.

As you would expect, we used the majority of the same tooling that we used in the previous scale test:

  • MAAS (Metal-as-a-Service) for deployment of physical server resources
  • Juju: installation and configuration of OpenStack on Ubuntu

In addition, we also decided to switch over to OpenStack Rally for the actual testing and benchmarking activities. During our previous scale test this project was still in its infancy, but it's grown a lot of features in the last 9 months, including better support for configuring Neutron network resources as part of test context set-up.

Messaging Scale

The first comparison we wanted to make was between RabbitMQ and ZeroMQ. RabbitMQ has been the messaging workhorse for Ubuntu OpenStack deployments since our first release, but larger clouds place high demands on a single message broker, both in terms of connection concurrency and message throughput. ZeroMQ removes the central broker from the messaging topology entirely, switching to a more directly connected edge topology.
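
Switching between the two is a configuration change in each OpenStack service rather than anything architectural; in practice the OpenStack charms manage these settings for you. As a rough sketch only, assuming Juno-era oslo.messaging option names and an illustrative broker address, the relevant lines in a service's configuration file (e.g. nova.conf) look something like:

    [DEFAULT]
    # Broker-based messaging via RabbitMQ (the Ubuntu OpenStack default);
    # the broker address is illustrative.
    rpc_backend = rabbit
    rabbit_host = 10.0.0.10

or, for the broker-less option:

    [DEFAULT]
    # Direct, broker-less messaging via ZeroMQ; the hostname is illustrative.
    # A matchmaker (ring-file or Redis based) also needs to be configured.
    rpc_backend = zmq
    rpc_zmq_host = compute-01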

The ZeroMQ driver in Oslo Messaging has been a little unloved over the last year or so; however, some general stability improvements have been made, so it felt like a good time to take a look and see how it scales. For this part of the test we deployed a cloud of:

  • 8 Nova Controller units, configured as a cluster
  • 4 Neutron Controller units, configured as a cluster
  • Single MySQL, Keystone and Glance units
  • 300 Nova Compute units
  • Ganglia for monitoring
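
For a sense of how this topology is expressed, the sketch below shows the Juju commands for a deployment of this shape; charm and service names are indicative of the Ubuntu OpenStack charms, and configuration options, clustering (hacluster) setup and most relations are omitted for brevity:

    juju deploy -n 8 nova-cloud-controller
    juju deploy -n 4 neutron-api
    juju deploy -n 300 nova-compute
    juju deploy mysql
    juju deploy keystone
    juju deploy glance
    juju deploy rabbitmq-server
    juju deploy ganglia
    # ...followed by juju add-relation calls to wire the services together,
    # e.g. juju add-relation nova-cloud-controller rabbitmq-server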

In order to push the physical servers as hard as possible, we also increased the number of worker processes for the API services above the default (cores x 4 rather than cores x 2) and raised the CPU and RAM allocation ratios for the Nova scheduler. We then completed an initial 5000-instance boot/delete benchmark against a single RabbitMQ broker with a concurrency level of 150. Rally takes these as configuration options for the test runner – in this test Rally executed 150 boot-delete tests in parallel, for 5000 iterations in total:

action             | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count
total              | 28.197    | 75.399    | 220.669   | 105.064       | 117.203       | 100.0%  | 5000
nova.boot_server   | 17.607    | 58.252    | 208.41    | 86.347        | 97.423        | 100.0%  | 5000
nova.delete_server | 4.826     | 17.146    | 134.8     | 27.391        | 32.916        | 100.0%  | 5000
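
For reference, the Rally task driving a run like this is a short JSON (or YAML) definition; the sketch below is illustrative only – the flavor and image names are placeholders – but the runner section is where the 5000 iterations and the concurrency of 150 are expressed, and the network context sets up the Neutron resources mentioned earlier:

    {
        "NovaServers.boot_and_delete_server": [
            {
                "args": {
                    "flavor": {"name": "m1.small"},
                    "image": {"name": "trusty-server-cloudimg"}
                },
                "runner": {
                    "type": "constant",
                    "times": 5000,
                    "concurrency": 150
                },
                "context": {
                    "users": {"tenants": 1, "users_per_tenant": 1},
                    "network": {}
                }
            }
        ]
    }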

Having established a baseline for RabbitMQ, we then redeployed and repeated the same test with ZeroMQ; we immediately hit issues with concurrent instance creation. After some investigation and re-testing, the cause was found to be Neutron's use of fanout messages for communicating with hypervisor edges: the ZeroMQ driver in Oslo Messaging has an inefficiency in that it creates a new TCP connection for every message it sends, so when Neutron attempted to send fanout messages to all hypervisor edges with a concurrency level of anything over 10, the overhead of creating so many TCP connections caused the workers on the Neutron control nodes to back up, and Nova started to time out instance creation during network setup.
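
To make that inefficiency concrete, the Python sketch below (plain pyzmq, not the actual Oslo Messaging driver code, with a made-up endpoint address and assuming a receiver is listening there) contrasts opening a fresh connection per message – roughly the pattern the Juno-era driver followed – with reusing a single connected socket:

    import zmq

    # Illustrative endpoint only – not a real Oslo Messaging address or port.
    ENDPOINT = "tcp://neutron-agent.example.com:9501"

    def send_one_connection_per_message(messages):
        """Anti-pattern: a new context, socket and TCP handshake per message."""
        for msg in messages:
            ctx = zmq.Context()
            sock = ctx.socket(zmq.PUSH)
            sock.connect(ENDPOINT)
            sock.send(msg)
            sock.close()
            ctx.term()

    def send_reusing_socket(messages):
        """Reuse one connected socket for a whole batch of fanout messages."""
        ctx = zmq.Context()
        sock = ctx.socket(zmq.PUSH)
        sock.connect(ENDPOINT)
        for msg in messages:
            sock.send(msg)
        sock.close()
        ctx.term()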

So the verdict on ZeroMQ scalability with OpenStack? Lots of promise but not there yet….

In the last charm release we introduced a new feature to the OpenStack Charms for Juju that allows Nova and Neutron to use different RabbitMQ brokers, so we completed one last messaging test to look at this (the shape of the split-broker deployment is sketched after the results below):

action             | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count
total              | 26.073    | 114.469   | 309.616   | 194.727       | 227.067       | 98.2%   | 5000
nova.boot_server   | 19.9      | 107.974   | 303.074   | 188.491       | 220.769       | 98.2%   | 5000
nova.delete_server | 3.726     | 6.495     | 11.798    | 7.851         | 8.355         | 98.2%   | 5000

Unfortunately we had some networking problems in the lab which caused slowdowns and errors during instance creation, so this specific test proved a little inconclusive. However, by running split brokers, we were able to determine that:

  • Neutron peaked at ~10,000 messages/sec
  • Nova peaked at ~600 messages/sec
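
For completeness, the split-broker topology itself is expressed in Juju by deploying the rabbitmq-server charm twice under different service names and relating the Nova-facing and Neutron-facing services to different brokers. The sketch below is indicative only – the service names are illustrative, and the exact relations used by the new charm feature (including pointing the compute-node Neutron agents at the right broker) are best taken from the charm documentation:

    juju deploy rabbitmq-server rabbitmq-nova
    juju deploy rabbitmq-server rabbitmq-neutron
    juju add-relation nova-cloud-controller rabbitmq-nova
    juju add-relation neutron-api rabbitmq-neutron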

It’s also worth noting that the SSDs that the m350 cartridges use do make a huge difference, as the servers don’t suffer from the normal iowait times associated with spinning disks.

So in summary, RabbitMQ still remains the de facto choice for messaging in an Ubuntu OpenStack Cloud; it scales vertically very well – add more CPU and memory to your server and you can deal with a larger cloud – and benefits from fast storage.

ZeroMQ has a promising architecture but needs more work in the Oslo Messaging driver layer before it can be considered useful across all OpenStack components.

In my next post we’ll look at how hypervisor choice stacks up…


9 thoughts on “Extreme OpenStack: Scale testing OpenStack Messaging”

  1. Stephan Adig says:

    What about clustered RabbitMQ? And what about a master/master setup, possibly with IPVS in front?

    • JavaCruft says:

      The RabbitMQ charm for Juju does support native clustering of RabbitMQ (it's part of the Ubuntu OpenStack HA reference architecture), but we did not cover this deployment topology during this test.

  2. boris-42 says:

    James,

    Nice stuff – it would also be nice to see the output of the

    rally task report

    command; could you please share it?

    Rally also has a special section for user stories:
    https://github.com/stackforge/rally/tree/master/doc/user_stories
    It would be extremely helpful if you could contribute to it!

  3. Dariush Marsh-Mossadeghi says:

    Did you do any tuning of the underlying network stack on any of the nodes? It could well be that zmq is bottlenecking due to the underlying TCP stack configuration parameters.

    • JavaCruft says:

      We used stock Ubuntu defaults for the underlying network stack; the OpenStack charms should be implementing best practice – if you have some recommended defaults for zmq I’d be happy to incorporate them!

  4. Nick says:

    Great job. The fanout and receiver part has been discussed on the mailing list, I think, but there's no design proposal.

    By the way, Ceilometer support is done and we'll submit it for review soon.

    Nick

