IoT Platform Design Doc: Multiple Switches, One Light

Check out this new series by the folks at Cesanta providing you with a behind-the-scenes view into the development of the Mongoose IoT platform. This first one is all about the visual design of the "on/off switch" control.

Today, we are sharing the first of our design docs with you. If you remember, our design docs show a history of our engineering process and shed some light on how we developed the Mongoose IoT Platform. This one is all about the visual design of the “on/off switch” control.

This is the original doc. Where we felt a little more context was needed, we added it in blue. Otherwise, it is the original document our engineering team used. So excuse the bad jokes!

Multiple Switches, One Light

Flip a switch, DWIM


Make an intuitive light switch button.

Problem Statement

Let multiple users have a widget on their phones controlling a light they might not see. They want to easily turn the light on and off, and they want to know the state of the light at all times. They might even stomp on each other's feet by switching the light on concurrently.

The control should be simple to understand and use. It should possibly require only one gesture (this could be a single tap). It should also use the least possible amount of screen real estate.

Multi-second latency between user action and actual state change is possible. The smart switch device might be unreachable. The user sooner or later will try to flip the switch while she can actually see the light bulb. If it doesn't turn on while the widget cockily claims otherwise, hilarity, and not necessarily the positive sort, will ensue.

Prefer the behaviour that matches the physical world the most.

Physical World Analogue

Multi-way Switches

Real life light switches with multiple control points are often built with this scheme:

[Image: IoT Platform Design Doc Switches 1]

The light switch position thus doesn't tell whether the light is on or off. You usually don't care as long as the switch is in the room the light is in. In cases where the switch might control a light that's in another room, even the real world solutions have to resort to a way to visually represent the switch state. See next section.

Remote Status Reporting

Here's an example of a switch I have in my house:

[Image: IoT Platform Design Doc Switches 2]

However, this works fine just because the status reporting is immediate. If we add any lag in the status reporting feedback, the user will be tempted to press the button again thinking the first press didn't go through. Let's explore that in the next section.


Games don't make you violent, lag does.

You flipped the switch, you don't expect a glitch.

In real life, light switches just work. When it comes to the Internet of Silly Things... well, it may not work without a lag.


Q: How many software engineers does it take to change a light bulb?

A: None. That's a hardware problem.

I can't think of a light switch with a failure indicator; probably there are some specialized interfaces used by traffic light control centres. But, for the rest of us mortals, the most common failure indicator is:

[Image: IoT Platform Design Doc Switches 3]


Look & Feel

[Image: IoT Platform Design Doc Switches 4]

I delegate the actual UI style to Yaroslav. The oil icon is obviously a joke. Or is it?

The classic yellow warning triangle icon will probably do the trick.

The key idea is that the switch is divided into two parts:

  1. the desired state (on/off)
  2. an optional issue indicator

Users don't care about small lags; they know technology sucks, and occasionally it will work just fine. However, when the lag exceeds that embarrassing moment of silence, wouldn't it be great if the app breaks the ice and just cheers you up with a merry spinner?

The delay before the spinner is shown depends on the actual use case of the switch and should thus be made configurable in the MUDL. (MUDL is another component of Mongoose and out of the scope of this design doc.)

What if the device is dead, e.g. the batteries are exhausted? Should it just spin forever until we hear back from the device? Does the spinner mean we're actually keeping on trying? Are we really?

I suggest that after some length of inactivity we mark the switch as faulty and change the spinner to some other, more stable and more red icon. If possible, this decision should also take into consideration other indicators such as device activity (as used by the device activity icon on our device page).

This failure icon could also be actionable, e.g. open a browser page with troubleshooting info, or open a dashboard of the device activity and reported battery level etc.

Having two status icons (spinner and failure indicator) doesn't add much complexity; actually, I'd argue that having a way to report known error conditions will make it easier to report day-to-day operational glitches in the right way.
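As a rough sketch of the spinner and failure logic described above, the indicator can be derived from how long the last command has been waiting for confirmation. The function name and the two options (`spinnerDelayMs`, `failureTimeoutMs`) are hypothetical, they are not MUDL settings, and the thresholds are arbitrary.

```javascript
// Hypothetical helper: picks the issue indicator for a switch widget.
// pendingSinceMs is null when the last command has been confirmed.
function issueIndicator(pendingSinceMs, nowMs, opts) {
  if (pendingSinceMs === null) return 'none';
  var waited = nowMs - pendingSinceMs;
  if (waited >= opts.failureTimeoutMs) return 'failure'; // stable, red icon
  if (waited >= opts.spinnerDelayMs) return 'spinner';   // still trying
  return 'none';                                         // small lags stay invisible
}

// Assumed thresholds, for illustration only.
var opts = { spinnerDelayMs: 500, failureTimeoutMs: 30000 };
console.log(issueIndicator(null, 1000, opts));   // none
console.log(issueIndicator(1000, 1200, opts));   // none
console.log(issueIndicator(1000, 2000, opts));   // spinner
console.log(issueIndicator(1000, 40000, opts));  // failure
```

Keeping the decision in one pure function makes it easy to feed in extra signals (such as device activity) later without touching the widget rendering.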

If the user navigates away from the switch app and subsequently a failure condition happens, the user can be notified about it with the means appropriate to the given platform.

The rest of this Design Doc doesn't describe how this status reporting is to be wired in the Mongoose Universal Display Language (MUDL), sorry.


[Image: IoT Platform Design Doc Switches 5]

For the sake of simplicity, let's consider the smart light bulb / smart switch as being completely passive: it receives commands to set its own state and if in doubt (e.g. after power up) it will ask for it (e.g. via Pub Sub).

If the smart light does have a say in its own state (e.g. a local switch or a local policy), that part of the logic will be treated as any other client (switch button, automated schedule).

Last Writer Wins Behaviour

Each user agent hosting a control will publish the switch state to a Pub Sub topic. The widget state is synced up by subscribing to the same topic.

Let's spell this out by using the MUDL primitives:

switch: Lamp 1
 publish: /lamp1
 retain: true
 subscribe: /lamp1

Repeating the topic twice is not nice. For this common setup we'll provide some syntactic sugar:

switch: Lamp 1
 topic: /lamp1


This kinda works, but what happens, for example, when your device is temporarily disconnected from the broker? Upon reconnection, a new subscription is made and we'll receive the last stored value, which might not be the last state the user selected on this client. The switch will flip back to the previous state and the user will be confused. Perhaps, in the meantime, the toggle event that was sent right before the disconnection was received by the broker, and the broker will send it back to the client (MQTT queues messages in order) causing yet another switch state flip.

Similarly, artifacts/glitches in the switch behaviour will happen when more than one concurrent client is operating the switch. Also keep in mind that the order in which published messages reach the MQTT broker might not match the order in which the buttons were pressed on the various devices.

Furthermore, we cannot rely on timestamps to detect stale updates because of clock skew. Passing timestamps around would also require more complicated logic, and if we're going that far there is probably a better way (see next section).

However, all the clients (and the target lamp) will eventually settle on one state. This behaviour is something that we can work with but it doesn't do a good job at preserving the user's intent.
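The reconnection glitch is easy to reproduce without a real broker. The sketch below uses a toy in-memory stand-in for a single retained topic (the `Broker` object is hypothetical, not an MQTT client) to show the user's tap being overwritten by the stale retained value.

```javascript
// Toy in-memory stand-in for a broker with one retained topic (not real MQTT).
function Broker() { this.retained = 0; this.subs = []; }
Broker.prototype.publish = function (value) {
  this.retained = value;
  this.subs.forEach(function (cb) { cb(value); });
};
Broker.prototype.subscribe = function (cb) {
  this.subs.push(cb);
  cb(this.retained);            // a new subscriber gets the retained value first
};

var broker = new Broker();
var widget = { value: 0 };

broker.publish(0);              // lamp is off, retained value is 0
widget.value = 1;               // user taps "on" while offline; the publish is lost
broker.subscribe(function (v) { widget.value = v; });  // reconnect and resubscribe

console.log(widget.value);      // 0: the user's tap has been overwritten
```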

Concurrent Behaviour

Let's start with the syntactic sugar form:

switch: Lamp 1
 topic: /lamp1
 multiway: true

This expands to:

switch: Lamp 1
 publish: /lamp1//count
 retain: true
 subscribe: /lamp1/+/count
 on-message: >
   function(topic, msg) {
     var client = /.*\/(.*)\/count/.exec(topic)[1];
     this.state = this.state || {};

     if (client != this.client_id) {
       // remember the latest bit published by another client
       this.state[client] = parseInt(msg, 10);
     } else if (!this.restored) {
       // recover our own count once, from the retained message
       this.count = parseInt(msg, 10);
       this.restored = true;
     }

     var effective = this.count;
     for (var k in this.state) {
       effective ^= this.state[k];
     }

     // the UI will react on the change of the value property
     this.value = !!effective;
   }
 on-change: function() { this.count ^= 1 }

Here I introduced user-defined logic in the widgets, which is out of the scope of this document. We could as well hardcode this logic in the switch widget behaviour; this is just for illustrative purposes.

The custom widget property "state" was a bad choice for the name, but I'm too tired to change it everywhere; again this is for illustrative purposes.


First, some ground rules, goals and terminology:

  1. client: user agent that hosts a given control instance.
  2. use only basic MQTT functionality (no persistent sessions or high QoS levels).
  3. network connection can go down at each client, at any time.
  4. the broker can go offline.
  5. user can still flip the switch and have her intent attended to.
  6. state should converge easily in case of lost messages.
  7. no server side support necessary.

The core idea can be summarized as:

A state-based increment-only modulo 2 CRDT where the CRDT state is stored in a Pub Sub "basetopic/+/count" topic hierarchy.

A CRDT (Conflict-free replicated data type) is an eventually consistent data structure that converges to the desired state. The desired state must account for every switch flip event that occurs.

Quite a mouthful! Let's break this down in the following 4 subsections.

Increment Only CRDT Counter

Let's first make a quick recap of how a CRDT counter is implemented:

Imagine a bunch of clients, each identified with a unique ID and keeping a local counter.

Each time a local counter changes, it gets sent to the other peers along with the ID of the client:

{id, counter}

Each peer receives the updates and builds a view of the global state:

{id1: count1, id2: count2, ...}

This is done by:

if (client != this.client_id) {
  this.state[client] = parseInt(msg, 10);
}

This view contains the latest partial counters a given client knows about the other clients.

Keep in mind that if the state updates are broadcast to all clients using one single shared channel, it's the responsibility of each client to ignore updates coming from itself, since they might contain an old state.

The actual value of the counter is simply a sum of all the partial counters, including the client's local count. At any moment in time, the local state might temporarily diverge from what other clients know, but it will eventually be known to all, thus all clients will converge to the same state.

If a connection loss occurs, each client can simply rebroadcast its current state.
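A minimal sketch of such an increment-only counter follows. The `GCounter` name and its methods are made up for illustration. Note one deliberate difference from the snippet above: it merges with `Math.max`, the usual state-based merge, so that duplicated or reordered updates are harmless, whereas the widget code simply overwrites the stored value.

```javascript
// Illustrative increment-only (grow-only) counter; names are hypothetical.
function GCounter(id) {
  this.id = id;
  this.counts = {};             // {id1: count1, id2: count2, ...}
  this.counts[id] = 0;
}
GCounter.prototype.increment = function () {
  this.counts[this.id] += 1;
  return { id: this.id, counter: this.counts[this.id] };  // update to broadcast
};
GCounter.prototype.receive = function (update) {
  if (update.id === this.id) return;      // ignore our own, possibly old, echo
  var known = this.counts[update.id] || 0;
  this.counts[update.id] = Math.max(known, update.counter);
};
GCounter.prototype.value = function () {
  var sum = 0;
  for (var k in this.counts) sum += this.counts[k];
  return sum;
};

var a = new GCounter('a'), b = new GCounter('b');
var a1 = a.increment(), a2 = a.increment(), b1 = b.increment();
b.receive(a2); b.receive(a1);   // delivered out of order
a.receive(b1); a.receive(b1);   // delivered twice
console.log(a.value(), b.value());  // 3 3
```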

Modulo 2

In the case of a switch, all we care about is the parity of the counter. Since the sum modulo 2 equals the modulo 2 sum of the members each taken modulo 2, we don't need to keep actual counters: each client can just broadcast the current desired state of its local switch. As with the real-life multi-way switch, that position might not reflect the actual state of the target "light"; it's just the position of the local switch.

A modulo 2 sum can be done by just xor-ing all the partial mod 2 sums:

var effective = this.count;
for (var k in this.state) {
  effective ^= this.state[k];
}

Now, effective holds the desired state of the switch and we must tell the widget to render that:

this.value = !!effective;
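To make the parity argument concrete, here is a small check that the parity of the summed counters matches the xor of the individual parities. The flip counts are invented sample values.

```javascript
// Parity of the full sum of per-client flip counts.
function parityOfSum(counts) {
  var sum = 0;
  for (var i = 0; i < counts.length; i++) sum += counts[i];
  return sum % 2;
}

// Same result using only one bit per client.
function xorOfParities(counts) {
  var effective = 0;
  for (var i = 0; i < counts.length; i++) effective ^= counts[i] % 2;
  return effective;
}

var flips = [3, 0, 5, 2];   // hypothetical per-client flip counts
console.log(parityOfSum(flips), xorOfParities(flips));          // 0 0
console.log(parityOfSum([1, 2, 4]), xorOfParities([1, 2, 4]));  // 1 1
```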

Pub Sub storage

So, how do we broadcast the state map and let each client catch up the quickest?

A simple scheme is:

  1. client N publishes its local "counter" to: /lamp1/N/count and sets the retain flag.
  2. each client subscribes to /lamp1/+/count

The "+" component in the topic allows us to perform wildcard subscriptions. Each client will receive any message matching that pattern, including copies of the last messages when it first subscribes.

We thus mapped:

{id1: count1, id2: count2, ...}

to:

/lamp1/1/count -> count1
/lamp1/2/count -> count2
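The mapping above can be sketched as a pair of helpers that rebuild the state map from a set of retained messages. Both function names are hypothetical, the matcher only handles the single-level "+" wildcard (not "#"), and the retained messages are sample data.

```javascript
// Matches an MQTT-style filter containing "+" against a concrete topic.
function matchesFilter(filter, topic) {
  var f = filter.split('/'), t = topic.split('/');
  if (f.length !== t.length) return false;
  for (var i = 0; i < f.length; i++) {
    if (f[i] !== '+' && f[i] !== t[i]) return false;
  }
  return true;
}

// Rebuilds {id: count} from retained messages keyed by topic.
function buildStateMap(filter, retained) {
  var map = {};
  for (var topic in retained) {
    if (!matchesFilter(filter, topic)) continue;
    var id = topic.split('/')[2];  // "/lamp1/<id>/count" -> ["", "lamp1", id, "count"]
    map[id] = parseInt(retained[topic], 10);
  }
  return map;
}

// Sample retained messages; the /lamp2 entry must be filtered out.
var retained = { '/lamp1/1/count': '1', '/lamp1/2/count': '0', '/lamp2/1/count': '1' };
var stateMap = buildStateMap('/lamp1/+/count', retained);
console.log(stateMap);  // client 1 -> 1, client 2 -> 0
```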

Recovering Local State

The widget keeps a local count of the number of times it has been flipped. This is obviously not the same as the rendered state of the switch.

A client could persist this count locally or it could just obtain its own last count through the wildcard subscription to /lamp1/+/count.

For that we need an exception to our rule that ignores our own status update:

if (client != this.client_id) {
  this.state[client] = parseInt(msg, 10);
} else if (!this.restored) {
  this.count = parseInt(msg, 10);
  this.restored = true;
}

In case of reconnection, the client will again receive a copy of all states, including its own (possibly stale) state. In the meantime, the user might have flipped the switch and we don't want to lose that event, hence we must recover the state only once.
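The restore-once rule can be exercised without a broker. The `Widget` object below is a hypothetical stand-in that mirrors the handler logic, and the client names and messages are invented.

```javascript
// Hypothetical stand-in for the switch widget, mirroring the handler logic.
function Widget(clientId) {
  this.client_id = clientId;
  this.count = 0;
  this.state = {};
  this.restored = false;
}
Widget.prototype.flip = function () { this.count ^= 1; return this.count; };
Widget.prototype.receive = function (client, msg) {
  if (client !== this.client_id) {
    this.state[client] = parseInt(msg, 10);
  } else if (!this.restored) {
    this.count = parseInt(msg, 10);   // recover our own count, once
    this.restored = true;
  }
};
Widget.prototype.effective = function () {
  var e = this.count;
  for (var k in this.state) e ^= this.state[k];
  return !!e;
};

var w = new Widget('phone');
w.receive('phone', '1');   // first subscription: recover our retained count
w.receive('tablet', '0');
w.flip();                  // user flips while offline; count goes 1 -> 0
w.receive('phone', '1');   // reconnect: the stale retained copy arrives again
console.log(w.count, w.effective());  // 0 false
```

Without the `restored` flag, the last `receive` call would reset `count` to 1 and silently undo the user's flip.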


The nice property about this scheme is that, regardless of networking glitches and broker reconnections, the switch state will account for all user interactions, like a real world multi-way switch would.

There are a few refinements to be made. For example, if a client has been disconnected for a long time, it would be confusing if its local state got incorporated into the global switch state when it comes back online. This case falls into the "Failures" subsection of the third heading.

Issues with this scheme are:

  1. More complex
  2. Requires unique client IDs for each concurrent actor.
  3. Adds some structure to the topic names.
  4. The size of the state map grows with the number of clients. If the clients are all ephemeral (e.g. on a public demo like demo.cesanta.com) we need to coalesce the status map periodically, which implies we have to track idle clients.
  5. The physical switch also has to apply the same operations (summarized in "Passive code" example below) and thus also keep a state map.

On the other hand:

  1. The complexity is mostly conceptual; the actual operations are simple and most of the logic can be captured in one short JS snippet executed on every received message.
  2. With some logic hosted on the cloud (e.g. cloud functions) we can offload the heavy logic from the device, and let the device be controlled with a simple Pub Sub message (or a clubby method invocation that sets the state).
  3. The state is only 1 bit per user agent, so the device can easily be controlled by N devices using only N/8 bytes of storage, provided that the client IDs are small integers.
  4. The switch target doesn't need a persistent state. It could just forget about the state and obtain it from scratch with a new wildcard subscription.
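As an illustration of the 1-bit-per-client point, a device could pack the state map into a byte array. This is only a sketch: `BitState` is a hypothetical name, and it assumes client IDs are small integers starting at 0.

```javascript
// Hypothetical packed state map: one bit per client, IDs are small integers.
function BitState(maxClients) {
  this.bytes = new Uint8Array(Math.ceil(maxClients / 8));
}
BitState.prototype.set = function (id, bit) {
  var mask = 1 << (id % 8);
  if (bit) this.bytes[id >> 3] |= mask;
  else this.bytes[id >> 3] &= ~mask;
};
BitState.prototype.effective = function () {
  var acc = 0;
  for (var i = 0; i < this.bytes.length; i++) acc ^= this.bytes[i];
  // fold the byte down to the parity of its 8 bits
  acc ^= acc >> 4; acc ^= acc >> 2; acc ^= acc >> 1;
  return acc & 1;
};

var s = new BitState(16);   // 16 clients fit in 2 bytes
s.set(0, 1); s.set(9, 1); s.set(12, 1);
console.log(s.bytes.length, s.effective());  // 2 1
s.set(9, 0);
console.log(s.effective());                  // 0
```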


This scheme can be generalised to work with sliders or up/down counter controls (e.g. volume controls) by using a PN CRDT counter, composed of two increment-only counters.
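A PN counter for something like a volume control might look roughly like this. The `PNCounter` name, its methods, and the client names are illustrative only; how the two maps would be laid out in Pub Sub topics is not covered here.

```javascript
// Illustrative PN counter: two increment-only maps, one for ups, one for downs.
function PNCounter(id) {
  this.id = id;
  this.inc = {};   // increments per client
  this.dec = {};   // decrements per client
  this.inc[id] = 0;
  this.dec[id] = 0;
}
PNCounter.prototype.up = function () { this.inc[this.id] += 1; };
PNCounter.prototype.down = function () { this.dec[this.id] += 1; };
PNCounter.prototype.merge = function (other) {
  var k;
  for (k in other.inc) this.inc[k] = Math.max(this.inc[k] || 0, other.inc[k]);
  for (k in other.dec) this.dec[k] = Math.max(this.dec[k] || 0, other.dec[k]);
};
PNCounter.prototype.value = function () {
  var total = 0, k;
  for (k in this.inc) total += this.inc[k];
  for (k in this.dec) total -= this.dec[k];
  return total;
};

var phone = new PNCounter('phone'), remote = new PNCounter('remote');
phone.up(); phone.up(); phone.up();   // volume +3 on the phone
remote.down();                        // volume -1 on the remote
phone.merge(remote); remote.merge(phone);
console.log(phone.value(), remote.value());  // 2 2
```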

Server-side Support

The scheme proposed so far is nice because it doesn't strictly require any specialised server-side component. And, if we offer custom JS event handlers, it doesn't even require us to implement this particular logic in our "Universal Mobile App" (this is coming soon to Mongoose IoT Platform).

However, let's consider the possibility of having some "switch controller" backend code.

switch: Lamp 1
 publish: /switch_controller/lamp1
 on-change: function() { this.count ^= 1 }
 subscribe: /lamp1

Incidentally, this also shows a use case for the extra flexibility of specifying different action and source topics.

If the switch holds by default a property like count (i.e. a local state), we might even avoid implementing on-change user-defined callbacks. E.g.:

switch: Lamp 1
 publish: /switch_controller/lamp1
   value: 0
 subscribe: /lamp1

Standalone Working Example

I tried this out with a standalone webapp where the core logic was:

// assumes the MQTT.js client library is available as `mqtt`
class MultiSwitcher {
 constructor(url, topic, client, cb) {
   this.client_id = client;
   this.cb = cb;

   this.utopic = topic + '/+/count';
   this.etopic = topic + '/' + this.client_id + '/count';

   this.mqtt = mqtt.connect(url);

   this.mqtt.on('connect', () => this.mqtt.subscribe(this.utopic));
   this.mqtt.on('message', (topic, msg) => {
     this.receive(topic, JSON.parse(msg));
   });
 }

 flip() {
   this.count ^= 1;
   this.mqtt.publish(this.etopic, JSON.stringify(this.count),
                     { retain: true });
 }

 receive(topic, msg) {
   var client = /.*\/(.*)\/count/.exec(topic)[1];

   this.state = this.state || {};

   if (client != this.client_id) {
     this.state[client] = parseInt(msg);
   } else if (!this.restored) {
     this.count = msg;
     this.restored = true;
   }

   var effective = this.count;
   for (var k in this.state) {
     effective ^= this.state[k];
   }

   // this assumes the UI library will debounce quick state changes
   // that happen when we receive the first batch of retained messages
   // from the wildcard subscription.
   this.cb(!!effective);
 }
}

Passive Code

Simplified code that shows how a light switch can figure out whether it ought to be on or off:

var mqtt   = require('mqtt');
var client = mqtt.connect('mqtt://localhost');
var widget; // value missing from the original doc; set it to the widget's name

client.on('connect', () => {
  client.subscribe('/v1/' + widget + '/+/count');
});

var state = {};
client.on('message', (topic, message) => {
  // the second-to-last topic component is the ID of the publishing client
  var id = /.*\/(.*)\/count/.exec(topic)[1];
  state[id] = parseInt(message.toString(), 10);
  render();
});

function render() {
  var effective = 0;
  for (var k in state) {
    effective ^= state[k];
  }
  console.log("Lamp is", effective ? "on" : "off");
  // do the real thing, yay!
  // TODO: debounce this call
  GPIO.write(2, effective);
}
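The debounce TODO could be handled along these lines. This is one possible sketch, not the platform's implementation: `debounce` and its `timers` parameter are hypothetical, and the timers are injectable only so the behaviour can be demonstrated without actually waiting.

```javascript
// Collapses a burst of calls into a single call with the most recent value.
// `timers` is an optional hook for substituting fake timers.
function debounce(fn, waitMs, timers) {
  timers = timers || {
    set: function (f, ms) { return setTimeout(f, ms); },
    clear: function (h) { clearTimeout(h); }
  };
  var handle = null;
  return function (value) {
    if (handle !== null) timers.clear(handle);
    handle = timers.set(function () {
      handle = null;
      fn(value);               // only the most recent value is applied
    }, waitMs);
  };
}

// Fake timers: callbacks are queued and run by hand.
var pending = [];
var fakeTimers = {
  set: function (f) { pending.push(f); return pending.length - 1; },
  clear: function (h) { pending[h] = null; }
};

var writes = [];
var write = debounce(function (v) { writes.push(v); }, 50, fakeTimers);
write(1); write(0); write(1);                    // burst of retained messages
pending.forEach(function (f) { if (f) f(); });   // "time passes"
console.log(writes.length, writes[0]);           // 1 1
```

On the device, the debounced function would wrap the `GPIO.write` call so the lamp does not flicker while the first batch of retained messages arrives.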


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
