Tag Archives: Load testing

Why A Locust Swarm Is A Good Thing

locust_io_master_slave2_00Recently, I gave a talk on how “A Locust Swarm Can be a Good Thing!” at Stir Trek in Columbus.  The talk covered our experience of load testing and preparing for hundreds of thousands of users on the first day of the .Realtor site launch.

This was a challenging environment with lots of external dependencies of varying capacity and failure tolerance.  We knew from the beginning that we’d never be able to load test all of our dependencies at once, so we had to figure out ways to test in isolation without spending all our time rewriting our test infrastructure.  Additionally, our load testing started late enough that we would not be able to coordinate an independent test with all of our dependencies in time.  We also needed to figure out what our users would do without having ever observed user behavior on the real site.  We built a model user funnel to capture our expectations and continually tweaked it as we discovered new wrinkles.  This funnel formed the basis for our load test script and allowed us to prioritize our integration concerns.

In the end, we learned a lot about making our workflows asynchronous, linux kernel optimization, decoding performance metrics, and building giant DDoS clouds of load test slaves.  We also learned that load testing should start “earlier”.  Conversations about load and user behavior drive new requirements and testing can uncover fundamental infrastructure problems.  Decoding and isolating performance problems can require a lot of guessing and experimentation; things that are difficult to do thoroughly with an unmovable launch date.  It’s also difficult to make large scale changes to an application with confidence under time pressure.  One of my key takeaways is to be nicer to external partners.  The point of load testing is to find the breaking points of a system and most people don’t like when their toys get broken. Building trust and safety into that relationship is very important before trying to figure out where and how something went wrong.

Check out the slides to the original talk here.

Stir Trek is an excellent conference with extremely thoughtful organizers and friendly people.  20 people came to talk to me in person after my talk, which was great!  Tickets sell out very quickly, but I recommend getting in next year if you can!

IoT Course Week 11: Load Testing

Screen Shot 2015-10-15 at 3.07.21 PM

Last week, we dove into into the importance of incorporating and collecting analytics through your connected device, how that information helps provide business value, and played with some of the ways that information can be displayed using some pretty graphs.

This Week

This week, we’ll continue our focus on non-functional requirements and start load testing. With connected devices, if the device can’t call home to its shared services, it loses a lot of its value as a smart device. These services need to be highly reliable, but things get interesting when thousands or millions of devices decide to call home at the same time.

To load test, we’ll generate concurrent usage on system until a limit, bottleneck, unexpected behavior, or issue is discovered. This usage should model real-life usage as close as possible, so the analytics we put in place last week will be a valuable resource. In instances where we don’t have data to work with, we can build out user funnels and extrapolate based on anticipated usage. Bad things will happen if we ship thousands of products without any idea how our system will react under the load. This data will also be a useful baseline for capacity planning and system optimization experiments.

The Lampi system has two shared services that we need to put under load. One is the Django web server that handles login, and the other is the MQTT broker that handles sending messages to the lamp.

Load Testing with Locust

To test the web server we use Locust. Locust has become a LeanDog favorite due to its simple design, scalability, extensibility, and scriptability. We’ve used it to generate loads of 200,000 simultaneous users distributed across the US, Singapore, Ireland, and Brazil. These simulated users (locusts) walked through multi-page workflows at varying probabilities, modeling the end-to-end user interaction, complete with locusts dropping out of the user funnel at known decision points.

Locusts are controlled via a locustfile.py. The one below shows a user logging in and going to the home page:

from locust import HttpLocust, TaskSet, task

class UserBehavior(TaskSet):

def on_start(self):
self.login()

def login(self):
response = self.client.get("/accounts/login/?next=/")
csrftoken = response.cookies.get('csrftoken', '')
self.client.post("/accounts/login/?next=/", {"csrfmiddlewaretoken": csrftoken, "username": {{USERNAME}}, "password": {{PASSWORD}} })

@task(1)
def load_page(self):
self.client.get("/")

class WebsiteUser(HttpLocust):
task_set = UserBehavior
min_wait = 5000
max_wait = 9000

In order to run locust, we’ll need a machine outside of the system to simulate a number of devices. Locust is a python package, so it can run on most OSes. It uses a master/slave architecture so you can distribute the simulated users and allow for more and more load.

Once you install locust and start the process, you control the test via a web interface.

Screen Shot 2016-05-04 at 12.09.46 PM

Locust will aggregate the requests to a particular endpoint and provide statistics and errors for those requests.  

Screen Shot 2016-05-04 at 12.09.55 PM

Screen Shot 2016-05-04 at 12.10.03 PM

Load Testing with Malaria

To test MQTT we used a fork of Malaria. Malaria was designed to exercise MQTT brokers. Like locust, Malaria spawns a number of processes to publish MQTT messages. Unlike locust, it’s not easy to script; you have to fork it to do parametric testing or randomize data.

usage: malaria publish [-D DEVICE_ID] [-H HOST] [-p PORT] [-n MSG_COUNT] [-P PROCESSES]

Publish a stream of messages and capture statistics on their timing

optional arguments:
-D DEVICE_ID (Set the device id of the publisher)
-H HOST (MQTT host to connect to (default: localhost))
-p PORT, (Port for remote MQTT host (default: 1883))
-n MSG_COUNT (How many messages to send (default: 10))
-P PROCESSES (How many separate processes to spin up (default: 1))

By modulating MSG_COUNT and PROCESSES you can control the load being sent to the broker.

Running some Example loads
Small load: Using 1 process, send 10 messages, from device id [device_id]

loadtest$ ./malaria publish -H [broker_ip] -n 10 -P 1 -D [device_id]

Produces results similar to this:

Clientid: Aggregate stats (simple avg) for 1 processes
Message success rate: 100.00% (10/10 messages)
Message timing mean 344.51 ms
Message timing stddev 2.18 ms
Message timing min 340.89 ms
Message timing max 347.84 ms
Messages per second 4.99
Total time 14.04 secs

Large load: Using 8 processes, send 10,000 messages each, from device id [device_id]

loadtest$ ./malaria publish -H 192.168.0.42 -n 10000 -P 8 -D [device_id]

Monitoring The Broker

MQTT provides a set of topics that allow you to monitor the broker.

This command will show all the monitoring topics (note that the $ is escaped with a backslash):

cloud$ mosquitto_sub -v -t \$SYS/#

The sub topics ‘…\load...’ are of particular interest.

Gather data

Before we start testing, we should figure out what metrics we want to measure. Resources on the shared system (CPU, memory, bandwidth, file handles) are good candidates for detecting capacity issues. Focusing on the user experience (failure rate, response time, latency) will help you hone in on the issues that will incur support costs or retention problems. Building the infrastructure to gather, analyze and visualize those metrics can be a significant part of the load testing process – but those tools are also necessary to do useful operational support in production. For the class, students used sysstat, locust, mqtt and malaria to gather metrics. A production-like system might use AWS Cloudwatch, New Relic, Nagios, Cacti, Munin, or a combination of other excellent tools.

The point of load testing is to find the limits and then decide what to do about them. There will be a point where the cost to rectify the issue is greater than any immediate benefit, load testing will help you find that bar. During the class, limits of a 1000 simultaneous users for web and 5,000-10,000 MQTT messages per process were common.

Final project

For their final project two students from the class, Matthew Bentley and Andrew Mason, decided to take on some of the problems with mqtt-malaria and extend Locust to publish MQTT messages. Using Locust they were able to scale their load test infrastructure across many machines and put a broker under more stress. In their previous testing with malaria, they found the point where a single device could send no more messages (at a reasonable rate), but they could not scale malaria to determine at what point the broker would not process any additional connected devices’ messages. Through their efforts, they reached 100% CPU on the broker, pushing 1 million messages a minute to 4000 users. As a result of their work they also open sourced their contribution to locust.