More Rigorous Testing for MQTT Servers

I’ve had a sustained interest in a rigorous approach to test design ever since I studied formal methods in the 1990s. This is reflected in the domain name of this blog. ((All the test material referred to in this post is in the Eclipse Paho project.))

Testing both client libraries and servers for adherence to the MQTT specification seems a natural candidate. Any form of rigorous testing will need more investment than a few manual experiments, but the payoff is potentially large (unless you really like a boring, repetitive life). The MQTT standardization process is nearly complete. A major goal of any standardization is to encourage implementations, so there will be plenty of opportunity for a rigorous test suite to recoup the investment made in it.

If a test suite is a haphazard collection of acquired test programs, it tends to suffer from gaps in coverage and a lack of consistent approach and motivation. When the object of that test suite changes, the suite will be hard to maintain because you have to work out the intent of each of the component test programs. Even explaining the exact reasoning behind an individual test program can be hard. With the approach outlined here I hope to make the goals and attainments of the test suite much more transparent, and maintenance easier and more logical.

I start with the MQTT 3.1.1 Specification. The standardized MQTT specification is a great improvement over the previous version because it is has been argued over intensively, with the (planned for) result that many of the ambiguities have been resolved. Each identified conformance statement has been numbered and listed in Appendix B. I use that list as a starting point for measuring test coverage of the complete MQTT specification. ((It has been pointed out to me, by Peter Niblett of IBM, that this list has been constructed by fallible human beings, and thus itself is not rigorous.  I knew that already, but you have to start from somewhere, and this is better than any other specification I have previously tested against. ))

An example statement from the table:

Normative Statement Number Normative Statement
[MQTT-1.5.3-1] The character data in a UTF-8 encoded string MUST be well-formed UTF-8 as defined by the Unicode specification [Unicode] and restated in RFC 3629 [RFC3629]. In particular this data MUST NOT include encodings of code points between U+D800 and U+DFFF. If a Server or Client receives a Control Packet containing ill-formed UTF-8 it MUST close the Network Connection.

What we need first is a model of the expected behaviour of an MQTT implementation, a test oracle. I had previously written an MQTT server in Python, for test purposes, so I updated it for MQTT 3.1.1. I also seeded it with log entries for conformance statements from the specification at appropriate points ((Look for"[MQTT … statements in this example. )) During the execution of this server, any conformance statements encountered are logged. At the end of the test, we can measure the number of conformance statements that that test invoked in the server. We can assess a test suite both by the conformance statements invoked, and by those which are not.

Next we need a model of the test inputs, the MQTT packets that we want to send to the model MQTT server.  We wouldn’t need this model if we could exercise the whole input space without worrying, but for any non-trivial software, this space is so vast that we need to focus.  Even a computer needs to focus.  The first model I’ve created is intended to cover the basic MQTT test scenarios. It responds to QoS 2 publish flows correctly for instance. In the basic scenarios, we want to make sure that the “golden paths” are executed, and some of the error cases. Other input models will be needed for complete coverage — when a large message takes longer to send than the keep alive interval, for instance.

With the Python script the input model is used to generate a sequence of inputs which are sent to the model MQTT server.  The outputs from the model MQTT server are recorded along with the inputs, to result in a self-checking test case.  The exact same sequence of inputs are not used more than once in the same test. The resulting test case can be used directly against a real MQTT server with the There are some ‘little’ complications — MQTT packet IDs do not have to be the same on each run, so they have to be remapped. But in general we replay the same recorded scenario against the MQTT server we want to test.

Currently, the generated test suites reach about 55% coverage of the conformance statements. I have to work on this input model, and maybe some others to increase that coverage to as near 100% as I can manage.   Another complication is that some behaviours of MQTT are optional or can vary, so we have to be able to generate test suites for each of these variations. I aim to be working on these and other items in the upcoming weeks and months before EclipseCon NA next year.

Some more information is available on the Paho wiki. I hope this post is helpful in explaining my intentions. Please do ask questions. I expect at least two other follow on posts:

  • More Rigorous Testing for MQTT Clients
  • A “model” implementation of an MQTT server

so this won’t be the last word on the subject.

2 Replies to “More Rigorous Testing for MQTT Servers”

  1. Ian this looks great. The conformance statements certainly feel like the most appropriate place to start. A model based testing approach is something I have little experience with but certainly seems appropriate for testing specification implementations where behaviour is well defined. I presume the idea here is to generate a test client to invoke the various transitions in a broker model and vice versa? How would you go about building the models? Can these be extracted from say a reference implementation?

    1. Martyn – there are two types of models: output (results) and input. The Python MQTT broker in the test package is the output model – that is, the implementation that will produce the expected behaviour for any test input. You can call it a reference implementation. The input model describes the data to be used in the test generation, things like client ID values, topic names and payloads to be used. When you use the data from the input model against the output model, and repeat, you get a complete test case. An example is here: I’ve been improving that output though since then. Ignoring the DEBUG lines, you get sequences of actions with their associated results:

      CALL connect with {‘sock’: , ‘cleansession’: True, ‘clientid’: ”}
      , Connacks(DUP=False, QoS=0, Retain=False, ReturnCode=0))
      RESULT from connect is 0

      I’m starting to think a tutorial would be a good idea.

Leave a Reply

Your email address will not be published. Required fields are marked *