IoTReady - From Metal To Alerts With AWS IoT, ElasticSearch 7.9 and Kibana
AWS IoT and ElasticSearch with Kibana is a classic combination for quick, scalable visualisations, analytics and alerts. This stack has only gotten better in 2020.
"This post continues from our earlier post on AWS IoT, Timestream and Quicksight. We replace Timestream and Quicksight with ElasticSearch and Kibana.
ElasticSearch and Kibana are, probably, the most commonly used stack for data analysis and visualisation. ElasticSearch is a notoriously difficult software to host and manage yourself. The managed service from AWS has long been one of the more reliable options along side the managed cloud from Elastic.co, the creators of ElasticSearch.
A lot of necessary features like authentication and alerts were only available in the enterprise edition. Until, that is, the release of Open Distro For ElasticSearch by AWS. While causing a great deal of controversy and ethical dilemma, ODFE adds authentication, access control, alerts, anomaly detection and, now, notebooks to the open source version release.
This makes ElasticSearch and Kibana:
- more complete than Timestream & QuickSight with alerts and notebooks in the same UI
- more capable than Timestream & Grafana with anomaly detection and notebooks
- more extensible than using Timestream alone with the ability to ingest all kinds of data - not just timeseries.
What are we going to build?
- Use everything we built up to step 3 of our previous post.
- Set up an AWS IoT rule to route shadow updates to ElasticSearch
- Use Kibana to create visualisations and alerts
Create ElasticSearch Instance
We will use the AWS ElasticSearch service that sets up ElasticSearch and Kibana with ODFE. Since, this is just a demo we will set up a single t2.small node with 10GB disk size under Deployment type = Development and testing. We will use version 7.9 of ElasticSearch which is the latest on the service as of this writing.
"AWS seems to be tracking the releases much better these days. The latest release on Elastic.co is 7.10.1.
CRITICAL SECURITY NOTE
Make sure you read through the notes on network configuration and fine-grained access control for ODFE (not available for t2.small). For a production system, you should use an instance larger than t2.small to support encryption at rest and, by extension, access control.
I am using Public access here for the purposes of the demo as this instance will be live only for a couple of hours. Google ElasticSearch breaches for the ever increasing counts of exposed servers.
Create Index and Mapping
An index is the highest logical level within ElasticSearch. Think of these as similar to tables in a database.
- You need an index before ingesting data.
- You do not need data columns and types to be defined before ingestion.
- You need to define mappings for specific columns that you want to transform during ingestion.
- You need types and columns defined before analysis or visualisation.
Creating an index needs an HTTP PUT as shown below:
Note that we are specifying in the mapping that our timestamp field should be interpreted by ElasticSearch as a date (really, datetime) field.
That’s it, our ElasticSearch instance is ready to receive data.
Persisting Shadow Updates
Go to AWS IoT Core -> Act -> Rules to get started. We will create a new rule SendShadowsToElasticSearch. Like before, we will add a filter and an action.
We will need add back the timestamp field and we will also need the device_id. So, our SQL query will now look like this:
SELECTstate.reported.cpu_usage as cpu_usage,state.reported.cpu_freq as cpu_freq,state.reported.cpu_temp as cpu_temp,state.reported.ram_usage as ram_usage,state.reported.ram_total as ram_total,state.reported.timestamp as timestamp,clientid() as device_idFROM '$aws/things/+/shadow/update'
ElasticSearch is a first class citizen in the AWS IoT actions list. So, let’s select that action and configure it:
Note that we:
- Selected the ElasticSearch instance we created in the previous step
- Granted access to AWS IoT via an IAM rule
- Added an id using the built-in newuuid() function
- Specified our index
- Specified a type _doc for our data which is the default built-in type for new documents
As before, we will enable the CloudWatch action in case of errors.
Our rule will finally look like this:
Querying and Visualisation Using Kibana
Let’s start our simulators again and hop over to the Kibana to start analysing our data. The Kibana URL will be visible in your AWS ElasticSearch Console and should just be https://ES_DOMAIN/_plugin/kibana/.
Before we can query the data, we need to create an index pattern. Kibana should prompt you to create an index pattern as soon as you enter. If you missed that, go to Stack Management --> Index Patterns from the left sidebar.
Click on Create index pattern and walk through the wizard. Since your index has been receiving some data, the columns will already be detected. Ensure that you select timestamp as your date field.
With the index pattern in place, go to Discover in the sidebar to see your data streaming in.
If you do see similar output, we can continue to visualisations. If you don’t,
- Check the Cloudwatch Logs for errors
- Verify that your SQL syntax is correct - especially the topic
- Ensure your Rule action has the right index and an appropriate IAM Role
- Verify that your Device Shadow is getting updated by going over to AWS IoT -> Things -> my_iot_device_1 -> Shadow
- Look for errors if any on the terminal where you are running the script.
Visualisations & Dashboard
Once we start receiving data, visualisations within Kibana are easy to set up. Go to Visualisations in the sidebar and create a line chart for each of our metrics. Specifically,
- Select the index aws_iot_demo as data source
- For the Y-axis, select average as the aggregation and the metric, e.g. cpu_usage under field. Give a custom label for the axis if you want.
- To enable timeseries visualisation, add an X-axis bucket with aggregation Date Histogram, field timestamp and Auto interval.
- To enable distinction by device_id, add another bucket, this time with sub-aggregation Terms, field device_id.keyword and the default Order By options.
. device_id.keyword indicates that ElasticSearch is indexing the strings stored in the device_id column
Click on Update and your chart should look similar to this:
Create visualisations for the other metrics and add them to a Dashboard from the sidebar. Your dashboard should look something like this:
Unlike QuickSight, Kibana dashboards can be embedded on websites as snapshots.
Alerts in Kibana are independent of visualisations, unlike Grafana. They are created in two stages: monitors and triggers.
Monitors are queries that run on a schedule. Something like the one shown below. Queries can be defined using the UI/Visual Graph, via an extraction query or even, excitingly, anomaly detection (more on this in the next section).
Triggers compare the value returned by the monitor with specific thresholds. If the threshold is crossed, triggers can notify certain endpoints. At the moment, the built-in destinations are limited to Amazon SNS, Amazon Chime, Slack and Custom webhook. We will create a trigger without a destination so we can see the alerts in the UI.
Once an alert has been triggered, it will be activated and will show up on the Alerts Dashboard. Active alerts can be Acknowledged to let the system know that an operator is looking into the issue. Once the metric/query is no longer in breach of the threshold, the alert is Completed. For a detailed description, check the docs.
Anomaly detection is a relatively new feature in ODFE and is getting frequent updates. As of release 7.9 on AWS, you can set up
- detectors that query all or some portion of your data at regular intervals, and
- models that monitor specific fields or custom queries for anomalies.
Kibana notebooks are an experimental feature that let you build Jupyter-like notebooks with visualisations and descriptive text. It must be stressed that these are nowhere close to the functionality of Jupyter, which of course include the power of a complete programming langauge like Python. However, they are a start and we are excited to see how they progress
ElasticSearch and Kibana are large, commendable feats of engineering that power everything from our search engines to our on-demand taxi services. The stack’s versatility is clear from its suitability for time-series data too.
When comparing QuickSight, Grafana and ElasticSearch+Kibana, we would recommend:
Timestream + QuickSight if you
- prefer costs that more linearly scale with use
- prefer using AWS in-house offerings
- have data in multiple data sources that you need to analyse without duplication
- don’t need to visualise geospatial data
Timestream + Grafana if you
- need timeseries charts and alerts
- need gorgeous dashboards with easy (time) filtering
- don’t need to create complex BI style analyses across multiple data sources
- don’t need anomaly detection
ElasticSearch + Kibana if you
- need flexibility in the type of charts and analyses
- need anomaly detection
- need to index and search text data too
- don’t mind duplicating your data into ElasticSearch
- don’t mind paying fixed monthly costs for your instance
- have the ability to secure and monitor your ES instance
Ideas, questions or corrections?
Write to us at firstname.lastname@example.org
Tej holds a PhD in Wireless Hardware and has built and sold patents covering innovations in RFID and microwave processing. He has been building IoT products with enterprises, large and small, for nearly 15 years.