A Story of MQTT 5.0

The MQTT protocol has been around since the late 90s when it was created to enable the monitoring of a long distance oil pipeline. It went through several iterations before landing on version 3.1, published by IBM.

The next step was standardisation, at the OASIS standards body. As anyone who has taken part in a standardisation committee will know, this process is necessarily bureaucratic and slow. To speed up adoption, the main imperative was minimising disruption to existing implementations, as set out in the Technical Committee (TC) charter.

As a result, wholesale changes to the MQTT 3.1 specification were not allowed in the 3.1.1 standard. This meant that many irritating flaws could not be fixed nor widely sought enhancements included. This is where MQTT 5.0 comes in. While we still wanted to minimise disruption (no-one wanted to repeat the experiences of say AMQP 0.9 to 1.0), we also wanted to address the MQTT wish-list as far as possible so that major changes would not be needed for a long time to come. Whether we succeeded in that aim, time will tell.

To help us make sense of the multitude of items on that wish list, as I wrote in September 2016, they were grouped into four Big Ideas:

  • Improved error reporting
  • Extensible metadata
  • Scalability and large scale systems
  • Resource Constrained Clients and Performance Improvements

At that time, many of the solutions were not decided upon, but now with the availability of Committee Specification 01, I can write about the details. We are in the final stages of the standardisation process for 5.0. We hope to complete the process of rubber stamping in the next few months, and expect no substantive changes during that time.

Improved Error Reporting

Negative responses, or nacks, were the biggest omission from earlier versions of MQTT. If the client or server had a problem with the request or packet from the other end, the only recourse in many circumstances was to close the TCP connection. Connect packets were the exception to this: the connack always had a return code. MQTT 3.1.1 added a negative response code to subscribe requests because there was space available in the “granted QoS” field of the suback packet. Publish requests, however, were still not catered for. This is now remedied.

Reason Codes

As all ack packets now have reason codes, they have been consolidated into one set, which starts like this:

Reason codes are one byte. Values from 0 to 0x7F (127) inclusive indicate successful outcomes, those from 0x80 (128) to 0xFF unsuccessful. So as the subscribe response can have an error code:

msc {
    arcgradient="4";

    c [label="client"], b [label="broker"];

    c => b [label="connect"];
    b => c [label="connack(rc=0)"];

    c => b [label="subscribe"];
    b => c [label="suback(rc=0x80)"];

    ...;
}

so can the publish, when the QoS is greater than 0:

msc {
    arcgradient="4";

    c [label="client"], b [label="broker"];

    c => b [label="connect"];
    b => c [label="connack(rc=0)"];
    ...;
    c => b [label="publish, qos=1"];
    b => c [label="puback(rc=0x80)"];
    ...;
    c => b [label="publish, qos=2"];
    b => c [label="pubrec(rc=0x80)"];
    ...;
    c => b [label="publish, qos=2"];
    b => c [label="pubrec(rc=0x0)"];
    c => b [label="pubrel(rc=0x80)"];
    ...;
    c => b [label="publish, qos=2"];
    b => c [label="pubrec(rc=0x0)"];
    c => b [label="pubrel(rc=0x0)"];
    b => c [label="pubcomp(rc=0x80)"];
    ...;
}

For the QoS 2 exchange, it stops if any of the reason codes are 0x80 or above. This is a major improvement on previous versions of MQTT, where continuing with the exchange or terminating the connection were the only options.
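
The rule is simple enough to express in a few lines. This is only a sketch of the logic described above, not code from any MQTT library: the function names are my own, and the QoS 2 exchange is reduced to the list of reason codes carried by its acks.

```python
def is_failure(reason_code: int) -> bool:
    """Return True if a one-byte MQTT 5.0 reason code signals failure."""
    if not 0 <= reason_code <= 0xFF:
        raise ValueError("reason codes are a single byte")
    return reason_code >= 0x80


def qos2_steps_completed(reason_codes):
    """Count the acks of a QoS 2 exchange (pubrec, pubrel, pubcomp)
    that are processed before the exchange stops.

    The exchange stops at the first ack carrying a failure reason code.
    """
    completed = 0
    for rc in reason_codes:
        completed += 1
        if is_failure(rc):
            break
    return completed


# pubrec succeeds, pubrel fails: the pubcomp is never sent.
steps = qos2_steps_completed([0x00, 0x80, 0x00])
```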

Server initiated disconnect

In MQTT versions prior to 5.0, only the client could send a disconnect packet. This meant that in any case where the server wanted to end the conversation with a client, there was no option but to just terminate the TCP connection. A common case is when the server shuts down – there is no error in the interaction between broker and client, but the client has no idea what’s happening. In MQTT 5.0, the server can send a disconnect packet with a “Server shutting down” reason code:

msc {
    arcgradient="4";

    c [label="client"], b [label="broker"];

    c => b [label="connect"];
    b => c [label="connack(rc=0)"];
    ...;
    c => b [label="publish, qos=1"];
    b => c [label="disconnect(rc=139)"];
}

In this case, the client might wait for a while before attempting a reconnect, knowing that the server might not be available for a while.
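
How long to wait is up to the client. The sketch below shows one possible policy, an exponential backoff that starts from a longer delay when the server has said it is shutting down. The function name and all of the delay values are assumptions chosen for illustration; only the reason code value 139 (0x8B) comes from the diagram above.

```python
SERVER_SHUTTING_DOWN = 0x8B  # 139, the reason code in the diagram above


def reconnect_delay(reason_code, attempt,
                    base=1.0, shutdown_base=30.0, cap=300.0):
    """Seconds to wait before reconnect attempt number `attempt` (0-based).

    `reason_code` is the code from the server's disconnect packet, or None
    if the TCP connection simply dropped. The delays are hypothetical.
    """
    start = shutdown_base if reason_code == SERVER_SHUTTING_DOWN else base
    return min(cap, start * (2 ** attempt))
```

A client that receives `disconnect(rc=139)` would then wait 30 seconds before its first retry under these assumed values, rather than hammering a server that is known to be down.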

Extensible metadata

The big change here is that, in addition to reason codes, each packet (apart from pings) can have properties. This is an extract of the full list:

Properties can be used to add extra information to responses, such as a reason string, or extra parameters to requests. A lot of the rest of the changes rely on properties because now we had a mechanism for adding that extra information to packets, we had to use it!

Request/response

The new request/response capability makes good use of properties. The requester subscribes to the topic it expects to receive responses on, then sets the value of the “Response Topic” property to that topic name. The responder simply uses that property to set the topic name for its response.

msc {
    arcgradient="4";

    c1 [label="client1"], b [label="broker"], c2 [label="client2"];

    c1 => b [label="subscribe, topic=resp"];
    b => c1 [label="suback"];

    c1 => b [label="publish, response_topic=resp"];
    b => c2 [label="publish, response_topic=resp"];

    ...;

    c2 => b [label="publish, topic=resp"];
    b => c1 [label="publish, topic=resp"];
}

The “Correlation Data” property can be used to set an id for each request, so that replies can be matched to requests by the requester.
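
To make the flow concrete, here is a toy, in-memory simulation of the exchange. It is not a real MQTT client or broker: `TinyBroker`, the topic name "req" and the correlation ids are all invented for this sketch, and the two properties are passed as plain keyword arguments.

```python
from collections import defaultdict


class TinyBroker:
    """Toy in-memory broker: exact-match topics only, no QoS, no network."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload,
                response_topic=None, correlation_data=None):
        # Properties travel with the message to every subscriber.
        for callback in list(self.subscribers[topic]):
            callback(payload, response_topic, correlation_data)


broker = TinyBroker()
replies = {}


# Requester (client1): subscribe to the response topic before asking.
def on_reply(payload, response_topic, correlation_data):
    replies[correlation_data] = payload


broker.subscribe("resp", on_reply)


# Responder (client2): reply on whatever topic the request names,
# echoing the correlation data so the requester can match the reply.
def on_request(payload, response_topic, correlation_data):
    broker.publish(response_topic, payload.upper(),
                   correlation_data=correlation_data)


broker.subscribe("req", on_request)

broker.publish("req", "ping", response_topic="resp", correlation_data=b"req-1")
broker.publish("req", "hello", response_topic="resp", correlation_data=b"req-2")
```

After the two requests, `replies` maps each correlation id to its own answer, which is exactly the matching the requester needs when several requests are outstanding.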

Payload format indicator

There were fairly contentious discussions about how much flexibility there should be in payload format settings. Some were in favour of user definable payload formats. Others felt that if people could define their own formats it was no better than the current position, unless some body kept an approved list of format indicators and their meanings. That seemed a step too far for MQTT. MIME types were discussed, but the final approach is minimalistic – just two values: binary, as 3.1.1, or UTF-8 data.
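
A receiver that wants to check a payload against its indicator needs very little code. In this sketch the function name is my own; the two values, 0 for unspecified bytes and 1 for UTF-8 data, are the ones defined for the property.

```python
def payload_matches_indicator(payload: bytes, indicator: int) -> bool:
    """Check a payload against its payload format indicator.

    0 means unspecified bytes (as in 3.1.1), so anything is acceptable.
    1 means the payload must be valid UTF-8.
    """
    if indicator == 0:
        return True
    if indicator == 1:
        try:
            payload.decode("utf-8")
        except UnicodeDecodeError:
            return False
        return True
    raise ValueError("payload format indicator must be 0 or 1")
```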

Enhancements for Scalability

Improved error reporting helps scalability because exchanges between servers and clients become more efficient. Properties are again crucial to the following functions.

Simplified session state

One of the other big irritations with 3.1.1, along with the lack of nacks for publish commands, is the behaviour of the “clean session” flag. In earlier versions of MQTT, this started out as the “clean start” flag, where the session state was only cleaned up at the start of a session, not at the end. This was good for clients, because it meant you could ensure a clean starting point, and leave the session around in case you needed to reconnect. Not so good for servers, because clients would tend to leave the state lying around for ever.

Later on, this flag was changed to “clean session”, cleaning the session state both at the start and end of the session. Good for servers. For clients, if they want to ensure a clean slate to start with, but then want to have session state saved, they have to connect twice:

msc {
    arcgradient="4";

    c [label="client"], b [label="broker"];

    c => b [label="connect cleansession=true"];
    b => c [label="connack"];
    c => b [label="disconnect"];
    ...;
    c => b [label="connect cleansession=false"];
}

We knew we should fix this situation once and for all. The “clean session” flag becomes “clean start” once more – session state is only cleaned up at the start of the session. Then there is the “session expiry interval” property, a four-byte integer value in seconds which defaults to zero if omitted. If it is set to 0xFFFFFFFF (UINT_MAX), the session does not expire. To accomplish the above scenario:

msc {
    arcgradient="4", wordwraparcs=on;

    c [label="client"], b [label="broker"];

    c => b [label="connect cleanstart=true, expiry_interval=0xFFFFFFFF"];
    ...;
}

The MQTT-SN “offline keep alive” scenario is also catered for. By setting the expiry interval to a suitable non-zero value, the client can ensure that the session state is saved as long as it reconnects regularly. If the client disappears entirely, the session state will be cleaned up at some point. Both clients and servers are happy.
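
The interaction between the two settings can be sketched as a pair of predicates, roughly as a server might evaluate them when a client reconnects. The function names are hypothetical, and time is passed in as a plain number of seconds to keep the example self-contained.

```python
NEVER_EXPIRES = 0xFFFFFFFF  # UINT_MAX: the session does not expire


def session_expired(expiry_interval, seconds_since_disconnect):
    """Has the stored session state expired?

    An interval of 0 (the default) means the session ends as soon as
    the network connection does.
    """
    if expiry_interval == NEVER_EXPIRES:
        return False
    return seconds_since_disconnect >= expiry_interval


def resume_session(clean_start, expiry_interval, seconds_since_disconnect):
    """Will a reconnecting client pick up its previous session state?"""
    if clean_start:
        return False  # state is discarded at the start of the session
    return not session_expired(expiry_interval, seconds_since_disconnect)
```

For the "offline keep alive" case, a client that sets an interval of one hour and reconnects every half hour keeps its state; one that vanishes for two hours does not.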

Shared subscriptions

To allow load balancing of high throughput topics, the concept of shared subscriptions is introduced to MQTT. Messages on these topics are sent to one of a group of subscribers rather than to them all. The subscriber indicates that the subscription is shared simply by subscribing to a special topic pattern:


$share/{ShareName}/{filter}

where ShareName is the name of the shared subscription group, and filter is the usual topic filter used in the subscribe request.
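
As an illustration, a server could split the pattern like this and then hand each message to one member of the group. The parsing follows the pattern above. The round-robin choice of subscriber is just one possible policy, picked for simplicity, and the function and client names are invented for the sketch.

```python
from itertools import cycle


def parse_shared_filter(topic_filter):
    """Split '$share/{ShareName}/{filter}' into (share_name, filter).

    Returns (None, topic_filter) for an ordinary, non-shared subscription.
    """
    prefix = "$share/"
    if not topic_filter.startswith(prefix):
        return None, topic_filter
    share_name, sep, real_filter = topic_filter[len(prefix):].partition("/")
    if not share_name or not sep or not real_filter:
        raise ValueError("expected $share/{ShareName}/{filter}")
    return share_name, real_filter


share_name, real_filter = parse_shared_filter("$share/workers/sensors/+/temp")

# Each message goes to ONE member of the group (round-robin here).
group = cycle(["worker-a", "worker-b"])
deliveries = [next(group) for _ in range(4)]
```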

Optional server capabilities

Some server functionality is expensive to implement at large scale. In MQTT 5.0, the server can advertise the limitations on the functionality it provides in the connack properties. Some examples:

Retain Available
are retained messages supported?
Maximum QoS
the maximum publish QoS the server will accept
Maximum Packet Size
the maximum packet size the server will accept
Receive maximum
the maximum number of concurrent QoS 1 and 2 messages the server will handle

Resource Constrained Clients and Performance Improvements

Various features fall into this category, including some already described. Some further examples follow.

Nolocal subscriptions

Up until MQTT 5.0, the publisher of a message would receive that message back if it was subscribed to the same topic. People often find this out in their first experience of writing an MQTT application, when they implement a shared chat room. There is now a subscribe option, noLocal, which, when set, indicates that the publishing application should not receive its own messages.

msc {
    arcgradient="4", wordwraparcs=on;

    c [label="client"], b [label="broker"];

    c => b [label="subscribe topic=a"];
    b => c [label="suback"];
    c => b [label="publish topic=a"];
    b => c [label="publish topic=a"];
    ...;
    c => b [label="subscribe topic=b, noLocal"];
    b => c [label="suback"];
    c => b [label="publish topic=b"];
    ...;
}
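
On the server side, the delivery decision comes down to one extra condition. A minimal sketch, with a hypothetical function name and subscriptions reduced to (client id, noLocal flag) pairs for a single topic:

```python
def recipients(publisher_id, subscriptions):
    """Return the client ids that should receive a published message.

    `subscriptions` is a list of (client_id, no_local) pairs for the topic.
    A subscriber is skipped only if it set noLocal AND is the publisher.
    """
    return [client_id
            for client_id, no_local in subscriptions
            if not (no_local and client_id == publisher_id)]


chat_room = [("alice", True), ("bob", True), ("carol", False)]
```

With these subscriptions, a message from alice reaches bob and carol but not alice, while a message from carol, who did not set noLocal, still comes back to her.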

Retained message control

Options on the subscribe request have been added to control when retained messages are sent:

  • 0 = Send retained messages at the time of the subscribe
  • 1 = Send retained messages at subscribe only if the subscription does not currently exist
  • 2 = Do not send retained messages at the time of the subscribe

This could help particularly with the implementation of MQTT bridges from one broker to another.
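
The three values translate directly into a small decision function. The name is my own; the values and their meanings are those listed above.

```python
def send_retained_on_subscribe(option, subscription_exists):
    """Should retained messages be sent in response to this subscribe?

    `option` is the retained message option from the subscribe request;
    `subscription_exists` says whether the subscription was already present.
    """
    if option == 0:
        return True
    if option == 1:
        return not subscription_exists
    if option == 2:
        return False
    raise ValueError("option must be 0, 1 or 2")
```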

Topic aliases

This capability exists in MQTT-SN, to reduce the size of the publish packet when long topic names are used. The publish request allows a numeric topic alias to be specified, which can be used in subsequent publish packets. Topic aliases on the client and the server are independent of each other, in much the same way as packet ids are.

msc {
    arcgradient="4", wordwraparcs=on;

    c [label="client"], b [label="broker"];

    c => b [label="publish topic=long_name,alias=1"];
    c => b [label="publish alias=1"];
    ...;
    b => c [label="publish topic=server_long_name, alias=1"];
    b => c [label="publish alias=1"];
    ...;
}

Topic aliases only exist for the lifetime of a TCP connection.
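
The receiving end of a connection only needs a small lookup table to resolve aliases. The class below is a sketch under my own naming. It keeps one table per direction of one connection, which is how the independence of client and server aliases shows up in code, and a fresh table would be created for every new TCP connection.

```python
class AliasTable:
    """Topic aliases received on one direction of one connection."""

    def __init__(self):
        self._names = {}

    def resolve(self, topic, alias=None):
        """Return the full topic name for an incoming publish."""
        if alias is None:
            return topic              # no alias in use
        if topic:
            self._names[alias] = topic  # define (or redefine) the alias
            return topic
        return self._names[alias]     # KeyError if the alias is unknown


from_client = AliasTable()   # aliases set by the client
from_server = AliasTable()   # aliases set by the server, independent

first = from_client.resolve("long_name", alias=1)
second = from_client.resolve("", alias=1)
other = from_server.resolve("server_long_name", alias=1)
```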

Specifying client limitations

To help protect implementations on small devices, the client can specify its limitations using properties on the connect packet. Some examples:

Maximum Packet Size
the maximum packet size the client can accept
Receive maximum
the maximum number of concurrent QoS 1 and 2 messages the client can handle

It is an administrative action or decision on the part of the server to decide what to do with messages that it receives bound for a client for which that message exceeds the constraints. This is not particularly different from 3.1.1 where the message would be sent anyway, and then the client might be forced to disconnect as its only recourse. At a minimum, the server should probably emit a warning log message.
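
One possible server policy, sketched below, is to withhold the oversized message and log a warning. Dropping is only one of the choices open to a server, and the function name and logger name are assumptions made for this example.

```python
import logging

log = logging.getLogger("broker.sketch")  # hypothetical logger name


def deliverable(packet_size, client_max_packet_size):
    """Decide whether a packet may be sent to a client.

    `client_max_packet_size` is the Maximum Packet Size the client gave
    on connect, or None if it did not specify one.
    """
    if client_max_packet_size is not None and packet_size > client_max_packet_size:
        log.warning("packet of %d bytes exceeds client maximum of %d; not sent",
                    packet_size, client_max_packet_size)
        return False
    return True
```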

Eclipse Paho Progress

A release of the Eclipse Paho project is planned for June 2018 with its first implementations of MQTT 5.0. I first implemented a broker to test against in the Paho test project. It combines 3.1.1 and 5.0 implementations, and has been used by James Sutton to implement the Java MQTT 5.0 support. It is used in Travis and AppVeyor continuous integration tests for the MQTT 5.0 branches. Example output when you start it up is shown below.

My own C clients, embedded and main, are planned to have a June release. The MQTT 5.0 implementations continue in the embedded mqttv5 and mqttv5 branches. Please do give your feedback or thoughts on these implementations as they progress via the GitHub issues:

Where are we with MQTT-SN?

This question was posed to me recently, with the added observation that it seemed no progress had been made in the last two years.  From my perspective this isn’t quite true.  Although the specification is still maintained by IBM, no movement has been made to standardize it at OASIS like MQTT has been. That is not to say that the subject of MQTT-SN has not arisen in the MQTT OASIS Technical Committee (TC) discussions, it has. But amongst the numerous improvements that we wanted to make to MQTT 3.1.1, and an ambitious timescale — publishing the new version MQTT 5.0 this year (2017) — addressing non TCP networks in MQTT was put to one side.

There is a continual wavering in my mind as to the importance of MQTT-SN. On one hand, edge computing platforms are getting ever more powerful; on the other, low power consumption is very important for battery-powered devices. The powerful edge processors are likely to have TCP stacks for which MQTT is appropriate, but there are still use cases for which low power use is the crucial factor. UDP and Bluetooth Low Energy (BLE) are typical transports used in this case.

If there is enough interest in a standardized MQTT-SN, its features could be addressed by the MQTT OASIS TC. There are a number of options. Incorporating MQTT-SN or its features into the main MQTT specification is one. However MQTT currently requires a reliable underlying network transport (TCP). To change that assumption we would have to consider very carefully the implications, which could take a significant amount of time. It has been suggested that MQTT-SN could be dealt with as a committee note: I’m not sure what form that would take.

In the meantime, I’ve been trying to get more MQTT-SN capabilities added to the Eclipse® Paho project. After I contributed an initial MQTT-SN packet library a couple of years ago, and the RSMB broker with MQTT-SN support to the Eclipse Mosquitto project at its inception, things stalled for a while (but not for a lack of interest on my part). There are forks of RSMB with fixes to the MQTT-SN support, and various gateways and clients around. One of those, written by Tomoaki Yamaguchi, had gained some support and I forget exactly how it happened, but Tomoaki has now contributed an MQTT-SN transparent gateway to Paho.

Over the past months this gateway has matured nicely. I recently used it to replace the use of RSMB in this experiment of Benjamin Cabé’s and it worked well. If we add BLE support to the gateway, then the Node.js BLE to UDP MQTT-SN forwarder in that experiment would not be required either. The gateway is written with the intention of allowing other transports to be added, so this should be eminently feasible.

I do remember seeing an email or post or something recently describing some other MQTT-SN components written by the poster. But now I can’t find or remember where I saw it. If this wasn’t a dream, and you know of such a thing, or it was you, please do let me know.

Where do we go from here?  Two avenues: practical implementations and the specification.  After the MQTT 5.0 standard is published, we can see if there is any interest in the OASIS TC in pursuing the application of MQTT to non-TCP networks.  In the meantime, I will continue working on and encouraging MQTT-SN contributions of the current specification to Paho and elsewhere.

If you are interested in MQTT-SN and would like to see it considered by the OASIS MQTT TC, or make any other comments to the TC, then you can use the mailing list.  If you know of any other implementations or have any questions, or suggestions for Paho, then I will be happy to hear of them.

Using the Eclipse Paho “Test” Broker to Help Test MQTT

I may not have finished all my goals for the test material in the Eclipse Paho project, as outlined in this blog post of mine, but some components are still useful in their current state.

Recently my IBM and Paho colleague, James Sutton, needed to check the behaviour of the Java and Android clients when receiving an error code in response to an MQTT subscribe request. This error code, returned in the MQTT suback packet, was introduced in the last, and current, version of MQTT, 3.1.1. It takes the form of an 0x80 value in the granted QoS (Quality of Service) field, for which only 0, 1 or 2 are valid values – the set of integers which QoS can take in MQTT.

Now you could fire up a broker like Eclipse Mosquitto and configure it to disallow a subscription to a certain topic. If you know the broker well enough, this may be quick and easy. It’s possible you might have to do some fishing around, and figure out whether that broker is actually returning 0x80 as you wanted.

So James turned to the Paho test broker (startbroker.py in this repo.) (You must use Python 3 to run it, not Python 2). As it stands, this broker will return 0x80 if you try to subscribe to the topic “test/nosubscribe” (see line 322 in MQTTBrokers.py). That’s pretty easy. If you checked out that file, you will notice another two topics: “test/QoS 1 only” and “test/QoS 0 only”, which will return granted QoSs of a maximum of 1 and 0 respectively. These behaviours can be hard to elicit out of a standard MQTT broker.

Ultimately I guess I should make the specific topics which these responses are attached to configurable, but they aren’t right now.
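
The behaviour attached to those three topics amounts to the following. This is a paraphrase written for this post, not the code in MQTTBrokers.py, and the function name is made up.

```python
SUBSCRIBE_FAILURE = 0x80  # suback return code for a refused subscription


def granted_qos(topic, requested_qos):
    """Granted QoS (or failure code) for a subscribe to one topic,
    mimicking the special test topics described above."""
    if topic == "test/nosubscribe":
        return SUBSCRIBE_FAILURE
    if topic == "test/QoS 1 only":
        return min(requested_qos, 1)
    if topic == "test/QoS 0 only":
        return min(requested_qos, 0)
    return requested_qos
```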

There are some other characteristics of this broker, which single it out as a “test” broker rather than a product:

  • the goal of the coding is clarity rather than performance. You can tell me whether I succeeded
  • there is no persistence: if you want to simulate stopping and restarting a broker, this broker can just remain running, and disconnect all clients
  • MQTT specification conformance statements are embedded in the code, so that when a test suite is run against the broker, it can tell you which statements were encountered, and which weren’t.
  • the broker has parameters to choose behaviours which can vary but still conform to the MQTT specification:
    1. whether to publish QoS 2 messages on PUBREL or not
    2. whether multiple matching subscriptions result in one publication, or more than one
    3. whether queued QoS 0 messages are dropped if the client is disconnected, or not
    4. whether zero_length_clientids are allowed

    it’s possible more might be added in the future.

The main missing feature of this broker is TLS – but it does have WebSocket support. TLS is on my to-do list, as are updates for the next version of MQTT, 5, which we are working on and which is tentatively scheduled for completion next year.

As outlined in the blog post, this broker is meant to help with broker testing as well, as an oracle.

For some more information, see the Eclipse Paho website for the test tools.

You can use issues on this project to ask for new features or identify bugs, or pull requests to offer your own contributions.

Why doesn’t MQTT have a payload format?

I was talking to Andy Stanford-Clark on Friday, and asked him if he and Arlen ever considered using a payload format in MQTT.  I was interested to know this because we are discussing approaches to MQTT metadata in the standardization process.

Many of the industrial protocols that MQTT was intended to replace, enhance or work alongside go to great lengths to specify the details of the payload.  Data types, metadata, objects, all of these things may be defined in the data.  MQTT does none of these things – the payload is completely undefined.

Andy’s answer follows.

MQTT is intended as a transport, not a protocol.   Its job is to get the data from one machine to another, not to define what that data is.   Industrial protocols have often conflated the two jobs, so that they are inseparable.  This makes them more complex and difficult to understand than they need to be.

MQTT can be used as a basis for a protocol which also defines the data to be exchanged, but the job of getting the data from one machine to another is already done.  Andy gave the example of a company who used MQTT to define a new protocol for the first time.  Because MQTT took care of the connection between client and server and transporting the data between them, the process of defining the new protocol took half a day.  Andy asked them how long defining a protocol would have previously taken: the answer was “two weeks”.

Since MQTT was first defined in 1998, data serialization formats have come into and gone out of fashion.  XML was the “obvious” choice for a while.  JSON and Google Protocol Buffers may be the favourite options today.  MQTT is agnostic to them all, it can carry any of them, so it has not outlived its usefulness.

When people first learn about MQTT, they often ask about the lack of payload format, the implicit question being, what use is MQTT then?   The answer being, whatever you want to use it for.  Combine it with some data formatting and exchange sequences, then you have a protocol.  Maybe we will see some standardized protocols in the future which will be based on MQTT, using it for the data transport.

What Will The Internet of Things Ever Do For Us?

There have been any number of warnings in recent years about the potential negative impacts of what has come to be known as the Internet of Things. Warnings about security:

Their own devices
Hacking the planet

about privacy and the control of global corporations:

Against the Smart City – Adam Greenfield
The Epic Struggle of the Internet of Things – Bruce Sterling

and user interfaces amongst others.

These ideas have a long history in fiction: the argument with a front door in Philip K. Dick’s Ubik, and the discussions with toasters in Hitchhiker’s Guide to the Galaxy and Red Dwarf, for instance. Now these warnings and stories have particular relevance because we are on the cusp of seeing them turn into reality.

I am conscious of the pitfalls we face, but I also have a positive outlook on the Internet of Things. When I started working with embedded devices and MQTT, it wasn’t called that – we were just getting data and commands from and to devices. As time went on, the “Internet of” prefix became a shorthand phrase that we would use to describe connected objects of any sort – see “The Internet of Cows”. The “Internet of Things” phrase has caught on, though, for better or worse.

These are some of the obvious positive potential benefits:

  • medical monitoring – of pacemakers for instance
  • environmental monitoring and prediction – flooding, snowfall
  • infrastructure monitoring – power, water, fuel

but I also have an image of a world where other things are possible, important or trivial:

  • opening the curtains with a wave of my hand
  • nano machines that can cure disease from the inside
  • street furniture that changes colour to match my outfit (Only Forward)
  • painted portraits that respond to me and my questions
  • an active view of the sky on my bedroom ceiling
  • a car which can change colour each day to match my mood or fancy

which have all appeared in fiction before, whether as science or magic. I will probably be explaining the unnecessary if I recall Arthur C. Clarke’s aphorism “any sufficiently advanced technology is indistinguishable from magic”.

So I have an image of a world where magical acts are possible, where books I have read come to life. Where the intrusive technology we have today can fade into the background if we want it to. Where art, science and society can find new ways to influence and improve our daily lives.

Of course we do need to be aware of the dangers that can potentially accrue. But whatever activities we are involved in have dangerous implications, from accidents or intentional criminal behaviour. So those dangers in themselves are no reason to dismiss the entire field. Technology may also have awkward interfaces that cause more problems than they solve, which stems from a lack of forethought, of design.

Adam Greenfield, in his book Everyware: The dawning age of ubiquitous computing has considered just about all of the possibilities. He set out some principles for development and use of “Everyware”, which I quote directly.

Ubiquitous systems must:

  1. default to harmlessness
  2. be self-disclosing
  3. be conservative of face (not take actions to unduly embarrass users)
  4. be conservative of time (must not introduce undue complications into ordinary operations)
  5. be deniable (users must be able to opt out at any time)

I think it is integral to the above aims that the ethos driving these systems is openness, so that control of our technological environment is not entirely left to our governments and corporations. The more transparent the whole process is, the better. (I do like to remember from time to time, however, that governments and corporations are people too.) Essential infrastructure needs to be secure from attack, from malevolent intent, but each of us should be able to choose what happens to data we own. I see reason to be optimistic that the advance of technology is empowering the individual as well as the organization.

Today we allow organizations to obtain data about us in exchange for free services we value. Maybe we should consider whether a service which charges but gives us control over our data is a better bargain.