According to Google Trends, the Internet of Things started gaining popularity around 2013. It is now a prominent topic, yet it is difficult to find case studies describing the architecture of entire systems that use IoT to solve specific business requirements. This post is an attempt to demystify some of the IoT tools and technologies that can be used in a production system.
The system described in this article was built around the following business model:
PrintPrince provides printer rental services to other companies. Pricing is based on the number of pages printed per month, so PrintPrince’s main task is to keep printers operational. Currently, the main tool for printer maintenance and monitoring is a CRM. All service activities are planned manually, and printer status (pages printed, toner level) is unknown unless a technician checks it. PrintPrince decided to build an IoT solution that allows it to monitor, manage, and make forecasts for a fleet of printers. It is essential that the system integrates with PrintPrince’s existing solutions.
The system architecture is based on the Microsoft Azure cloud and uses the Azure IoT Hub component for bi-directional communication with real devices.
Black arrows indicate the main communication channel; red arrows indicate commands sent from the cloud to the printer (cloud-to-device, abbreviated C2D). The architecture includes:
- IoT Hub – provides two-way communication between printers and the cloud back-end. It also enables per-device access control: every printer must be registered in IoT Hub and receives its own unique access key.
- Service Bus – there are three Event Hubs created within a single Service Bus namespace (blue blocks on the diagram above). They enable asynchronous communication between system components. They operate in a “publish-subscribe” manner by supporting consumer groups, which allow the input streams to be distributed to multiple recipients.
- Machine Learning – responsible for forecasting the time left until a specific printer’s toner is drained. This is calculated with a “Boosted Decision Tree Regression” algorithm. The learning part was performed on a dataset containing six months of printing history from multiple devices. Each time a printer sends a message to the back-end, the forecast is recalculated using the current toner level, the number of days since the last toner replacement, and the current day of the week.
- Stream Analytics – as the name indicates, it processes streaming data in real time. It supports multiple inputs, aggregation and analysis, and multiple outputs, which in this case are:
- Azure SQL database where all system events are registered,
- Event Hub responsible for C2D channel.
This component is programmed directly in the Azure Portal with a language that strongly resembles T-SQL. One interesting feature is the ability to call a function defined in the Machine Learning component. Thanks to this, every event (or group of events) flowing through the system can be analyzed by the trained machine learning model.
- Azure SQL database – all events in the system are registered in this database.
- Web App – hosts a standard ASP.NET MVC 5 web application – a dashboard for technicians and account managers. It has been built with Bootstrap, SlickGrid, and Leaflet.js, with SSO authentication implemented using the Microsoft.Owin.OpenId library.
- Web Jobs – these execute all the event-processing logic driven by data from the Event Hubs. They are specialized services resembling microservices and run in the background as Continuous WebJobs deployed to a separate Web App component.
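The toner forecast performed by the Machine Learning component can be sketched locally. The article names Azure ML’s “Boosted Decision Tree Regression”; scikit-learn’s `GradientBoostingRegressor` is used below as a stand-in for it, and the synthetic dataset and drain rate are purely illustrative assumptions, not the production data:

```python
# Illustrative stand-in for the Azure ML "Boosted Decision Tree
# Regression" module, using scikit-learn's gradient boosting.
# Features mirror the article: toner level, days since the last
# refill, day of week. Training data is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)

# Synthetic history: [toner_level_pct, days_since_refill, day_of_week]
n = 500
toner = rng.uniform(0, 100, n)
days_since = rng.uniform(0, 60, n)
dow = rng.integers(0, 7, n)
X = np.column_stack([toner, days_since, dow])

# Assume toner drains roughly 2% per day -> days left ~ toner / 2, plus noise
y = toner / 2.0 + rng.normal(0, 1.0, n)

model = GradientBoostingRegressor(n_estimators=100, max_depth=3)
model.fit(X, y)

# Recalculate the forecast for a printer that just reported in:
# 40% toner, 10 days since refill, Wednesday
days_left = model.predict([[40.0, 10.0, 2]])[0]
print(round(days_left, 1))
```

In the real system this recalculation happens on every incoming printer message, triggered from a Stream Analytics query rather than from application code.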
Components outside the Azure cloud:
- Microsoft Dynamics CRM – the main data store for the web application and Power BI reports. This includes customer data, printer data (status, location, failure audit), and financial data. All operational data gathered from the printers in real time is synchronized to this system, so it represents the most up-to-date printer state.
- Microsoft Power BI – contains a set of reports that aid account managers and analysts in their daily tasks.
IoT is not just about the Internet; it is also about “Things” – network printers in this particular case. It would be difficult to test and present the system’s capabilities using real, large devices. With this in mind, two solutions were implemented:
- Web emulator – a subpage within the dashboard that simulates a real device. It can send and receive events and change its UI depending on the events’ contents. It also provides refill, failure-generation, and print-mode features. The latter starts a printing animation and begins sending status events in “fast-forward” mode, just as a real printer would when operating over a long period of time (weeks, months). When the toner is drained, the emulation stops. With this feature, it is possible to simulate several toner drain cycles within minutes.
- Real device – a Pipsta thermal printer connected to a Raspberry Pi running a custom Python program. It was prepared mainly for presentation purposes. The mini-printer can be controlled with an NFC-enabled smartphone. It has all the features of the web emulator, but it gives a much better feel for and understanding of the solution during a demo.
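The emulator’s “fast-forward” print mode can be sketched in a few lines: instead of waiting real time between status events, the loop emits weeks of printing history back-to-back until the toner is drained. The event shape and drain rate below are illustrative assumptions:

```python
# Minimal sketch of the emulator's "fast-forward" print mode: status
# events for a whole toner lifetime are generated in a tight loop.
# Payload fields and the 0.02%-per-page drain rate are assumptions.
import json

def fast_forward_cycle(toner_pct=100.0, pages_per_event=250, drain_per_page=0.02):
    """Yield status events until the toner is drained."""
    pages_total = 0
    while toner_pct > 0:
        pages_total += pages_per_event
        toner_pct = max(0.0, toner_pct - pages_per_event * drain_per_page)
        yield json.dumps({
            "event": "Status",
            "tonerLevel": round(toner_pct, 1),
            "pagesPrinted": pages_total,
            "mode": "Normal",
        })

# One full drain cycle, generated instantly instead of over weeks
events = list(fast_forward_cycle())
print(len(events), events[-1])
```

A real run would sleep between yields; dropping the delay is exactly what lets several drain cycles be simulated within minutes.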
How does it all work?
Here are the basic events flowing through the system.
Connecting the printer to a network (or starting the web emulator) triggers a loop that generates a “Connect” event every couple of seconds. This event updates the date of last connection in the CRM (via the CRM Event Hub and CRM Event Processor). This provides the basic knowledge that the printer is online and operational.
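The Connect heartbeat loop can be sketched as an endless event generator. The payload fields below are illustrative assumptions; the article does not spell out the real message shape:

```python
# Sketch of the "Connect" heartbeat the printer (or emulator) starts
# after joining the network. Field names are assumptions.
import itertools
import json
import time

def connect_events(device_id, clock=time.time):
    """Yield an endless stream of Connect events; the real loop sleeps
    a couple of seconds between sends (omitted here for testability)."""
    for seq in itertools.count():
        yield json.dumps({
            "event": "Connect",
            "deviceId": device_id,
            "seq": seq,
            "sentAt": clock(),
        })

# Take the first three heartbeats that would update "last connection"
# in the CRM via the CRM Event Hub and CRM Event Processor.
beats = list(itertools.islice(connect_events("printer-42"), 3))
print(len(beats))
```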
During printing, the printer sends an event with information about its status – toner level, pages printed, and printing mode. This data is stored in the SQL database via one of the Stream Analytics outputs. At the same time, another output connected to the CRM Event Hub synchronizes the data to the CRM (via the CRM Event Processor).
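The key point here is the fan-out: one Status event feeds two Stream Analytics outputs at once. A minimal in-memory sketch, with plain lists standing in for the SQL table and the CRM Event Hub:

```python
# In-memory sketch of the dual Stream Analytics outputs: every Status
# event is routed both to the SQL audit store and to the CRM channel.
# The sinks are plain lists standing in for the real Azure components.
sql_rows, crm_queue = [], []

def route_status(event):
    sql_rows.append(event)   # output 1: Azure SQL database (event log)
    crm_queue.append(event)  # output 2: CRM Event Hub -> CRM Event Processor

route_status({
    "event": "Status",
    "tonerLevel": 62.5,
    "pagesPrinted": 1480,
    "mode": "Normal",
})
print(len(sql_rows), len(crm_queue))
```

In the real system this duplication is declared in the Stream Analytics query configuration, not written as application code.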
The refill event means that a technician has installed a new toner in the printer. The message is handled similarly to the Connect message. The refill date is used by Machine Learning in the forecasting calculation. The information is delivered to Stream Analytics via the Printer Event Processor and Printer Event Hub.
This is one of the C2D commands that alter the state of the printer. It sets the printing mode, which can be Normal or Eco. The decision whether the mode should be changed is made within Stream Analytics and is based on the current toner level. When the level is too low, Eco mode is activated to extend toner lifetime. This message is sent to the Printer Control Hub via one of the Stream Analytics outputs. Then it is handled by the Printer Control Processor and sent to IoT Hub, which routes the message to the particular printer based on the device ID provided in the message header. The emulators can easily consume such a message (and indicate they received it), as we have full control over the code that runs on them. In a real-world scenario, it would be useful to add a physical gateway component between printers and IoT Hub to translate the custom message (usually JSON) into a format understood by the printer (usually an SNMP message).
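The mode-switch decision itself is simple threshold logic. A sketch of what the Stream Analytics rule amounts to; the 20% threshold, command name, and message shape are all assumptions for illustration:

```python
# Sketch of the Eco-mode decision: when the reported toner level falls
# below a threshold, emit a C2D command switching the printer to Eco.
# The threshold value and command fields are illustrative assumptions.
ECO_THRESHOLD_PCT = 20.0

def mode_command(status):
    """Return a C2D command dict, or None if no mode change is needed."""
    if status["tonerLevel"] < ECO_THRESHOLD_PCT and status["mode"] != "Eco":
        return {
            "command": "SetPrintingMode",
            "deviceId": status["deviceId"],  # used by IoT Hub for routing
            "mode": "Eco",
        }
    return None

cmd = mode_command({"deviceId": "printer-42", "tonerLevel": 12.0, "mode": "Normal"})
print(cmd)
```

In production this condition lives in the Stream Analytics query, and the resulting message travels through the Printer Control Hub, the Printer Control Processor, and finally IoT Hub.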
Additionally, it is possible to generate a “Quality Update” message on demand, directly from the web dashboard. The message is then added to the Printer Control Hub. From there, the message flow is identical to the one generated by Stream Analytics. This shows that the modular architecture of the back-end makes the system easy to extend, which is achieved through multiple separate communication channels and multiple Event Hubs, each with its own responsibility.
We faced a number of issues and challenges along the way. Whether it was implementation, configuration, or deployment – there were always unusual cases to solve. The Azure cloud is a powerful and interesting environment, yet some components turned out to be challenging to work with.
A major difficulty was the distributed nature of the whole system – automating testing, provisioning, and deployment is much more complex than for a standard three-layer web application deployed on a single server. Azure Resource Manager to the rescue, one might say. Unfortunately, not all components are easily exported to ARM templates; some require searching for existing templates on the Internet and modifying them to fit current needs. Another problem was authentication when using Azure PowerShell scripts. The new set of cmdlets (those including Rm in the name) requires the user to sign in manually through a popup window. There is a workaround: create and store a user profile in a separate file. However, such a profile expires after some time and must be re-created.
Implementation is another story. The system seems to contain really simple business logic and deals with just a few entities, yet the number of components is relatively high. This is the price of asynchronous processing and extensibility. When writing new code, the asynchronous nature of the system must always be considered. Many messages are processed simultaneously in various applications and databases. There are no global transactions – they need to be implemented separately (e.g. using a Redis cache).
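One practical consequence of having no global transactions is that every processor must tolerate redelivered messages. A common mitigation is an idempotency check keyed by message ID; the sketch below uses an in-memory set as a stand-in for the shared cache (e.g. Redis) mentioned above:

```python
# Idempotent message handling sketch: without global transactions, an
# event hub may redeliver a message, so side effects are guarded by a
# message-ID check. A set stands in for a shared cache such as Redis.
processed_ids = set()
audit_log = []

def handle_once(message):
    """Process a message at most once, even if the hub redelivers it."""
    if message["id"] in processed_ids:
        return False  # duplicate delivery: skip side effects
    processed_ids.add(message["id"])
    audit_log.append(message["body"])  # the actual side effect
    return True

handle_once({"id": "m-1", "body": "Refill"})
handle_once({"id": "m-1", "body": "Refill"})  # redelivered duplicate
print(len(audit_log))
```

In a distributed deployment the set would have to live in shared storage so that any Web Job instance can see which IDs were already handled.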
Looking back, all the above challenges actually make such projects fun, motivating, and enlightening, inspiring the team to reach for new tools and technologies.