The internet of things (IoT) is one of the fastest-growing markets in a blazing technology marketplace. The ability to incorporate and analyze data from devices can be both a blessing and a curse. Data formats change at the speed of iterations, and the data volumes are growing with no end in sight.
If there’s one place IoT data can be managed and controlled, it’s in streams. Data streams process data as it is transmitted, combining IoT middleware with advanced machine learning and artificial intelligence.
However, not all streaming technologies are created equal. Streams, from their earliest inception, have been designed to enable quick and efficient data movement, supporting a wide variety of use cases — from healthcare alerts to remote maintenance and smart homes. To accomplish these diverse use cases, streams must support a wide variety of data types and be prepared to support existing and emerging industry standards.
Today, as this market matures and finds greater application and acceptance of streaming data, the business and technical requirements are expanding. At a minimum, data streams must:
• Deliver real-time analytics.
• Include integrated data management, including data lineage.
• Provide real-time actions based on data.
• Perform real-time anomaly detection.
• Support all types of data, including ordered and unordered datasets.
• Carry critical tagging information with the streaming data.
• Support independent data locale.
• Embed data security.
• Offer add-on capabilities.
• Deliver high speed and throughput.
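To make a few of these requirements concrete — carrying tagging information with the data and performing real-time anomaly detection — here is a minimal sketch. The record shape and the rolling-window check are hypothetical illustrations, not any particular streaming platform's API:

```python
from dataclasses import dataclass, field
import statistics
import time

@dataclass
class StreamRecord:
    """Hypothetical IoT stream record: tagging metadata travels with the payload."""
    device_id: str
    value: float
    timestamp: float = field(default_factory=time.time)
    tags: dict = field(default_factory=dict)  # lineage, locale, security labels

class RollingAnomalyDetector:
    """Flags values more than k standard deviations from a rolling-window mean."""
    def __init__(self, window: int = 50, k: float = 3.0):
        self.window, self.k = window, k
        self.history: list[float] = []

    def is_anomaly(self, record: StreamRecord) -> bool:
        self.history.append(record.value)
        self.history = self.history[-self.window:]  # keep only the recent window
        if len(self.history) < 10:                  # too little data to judge
            return False
        mean = statistics.fmean(self.history)
        stdev = statistics.stdev(self.history)
        return stdev > 0 and abs(record.value - mean) > self.k * stdev
```

In a real deployment this check would run inside the stream processor itself, so a flagged record can trigger an action before it ever lands in storage.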
The requirements above are table stakes for streaming technologies. Beyond these obvious requirements, I suggest three tips to help you adopt streaming technologies:
1. Build it for decision-making. Analytics has traditionally been a post-processing function, but not in the world of streaming. Streams extend analytics beyond providing critical data at critical times. Instead, the analytical model for streams relies on problem determination and action. This means streaming analytics can trigger alerts, orchestrate a call and provide real-time data to legacy applications for immediate business improvements.
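The "problem determination and action" model above can be sketched as a simple rule router: each rule pairs a condition with an action such as raising an alert, orchestrating a call, or feeding a legacy application. The function and rule shape here are illustrative assumptions, not a specific product's interface:

```python
def route_actions(record, rules):
    """Hypothetical action router for streaming analytics.

    Each rule is a (name, predicate, action) triple; every rule whose
    predicate matches the record fires its action. Returns the names
    of the rules that fired, for auditing.
    """
    fired = []
    for name, predicate, action in rules:
        if predicate(record):
            action(record)  # e.g. send an alert, call a service, feed a legacy app
            fired.append(name)
    return fired
```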
For all of this to work, streams must support the latest compute models, from GPUs to CPUs, especially given the increasing need to compute physically closer to where the data originates. Finally, as compute and storage become more specialized, streaming technology will be required to support reference architectures that focus on queries or inferencing, consisting of APIs and procedures for advanced analytics.
2. Understand your business case. Traditionally, we brought data back to a central data warehouse or series of data lakes, where it was then analyzed. A host of technologies were created to support this structure, including Hadoop, data cubing solutions and multidimensional databases. The advent of IoT and streaming solutions changes this paradigm completely. Today, applications at the edge are built around inference models, allowing logical work to be performed closer to where the data is generated. No longer does data have to be backhauled to a central repository or to the cloud. IoT data can be scanned in real time to determine anomalies or provide data trends. Data that is noise can be stored locally and deleted at the source, representing dramatic network, compute and storage reductions.
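The edge-side savings described above come down to one decision per reading: forward it upstream or handle it at the source. A minimal sketch of that scan, with an assumed threshold standing in for a trained inference model:

```python
def edge_filter(readings, threshold):
    """Hypothetical edge-side scan.

    Forwards only out-of-band readings to the central stream; in-band
    'noise' is counted and handled locally (stored or deleted at the
    source) instead of being backhauled to a repository or the cloud.
    """
    forwarded, dropped = [], 0
    for value in readings:
        if abs(value) > threshold:
            forwarded.append(value)  # anomaly or trend signal: send upstream
        else:
            dropped += 1             # noise: resolved at the edge
    return forwarded, dropped
```

Even this toy version shows the economics: if most readings are in-band, the network, compute, and storage load on the central system shrinks in proportion.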
The right streaming technology can recognize anomalies in your data with built-in logic, fast and accurate problem determination, and root cause analysis. Identifying a precise problem area without finger-pointing can help engineers resolve problems in record time, reducing costs and increasing customer satisfaction. Streaming solutions also should have the flexibility to establish a separate stream, providing predetermined formats of these anomalies for ingestion by machine learning or AI training models — automating analysis for highly complex and challenging problem areas.
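The separate anomaly stream described above can be sketched as a simple branch: every record continues on the main stream, while anomalies are additionally emitted in a predetermined format ready for ML or AI training ingestion. The detector callback and JSON payload shape are assumptions for illustration:

```python
import json

def branch_anomalies(records, detector):
    """Hypothetical stream split.

    Yields ('main', record) for every record, and for records the
    detector flags, an extra ('anomaly', payload) item whose JSON
    payload is shaped for ingestion by a training pipeline.
    """
    for rec in records:
        yield "main", rec
        if detector(rec):
            yield "anomaly", json.dumps({"value": rec, "label": "anomaly"})
```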
3. Support for the real world. Streaming technologies must support configuration with minimal programming and be highly adaptive to many environments. Support for unordered data processing is the norm today. However, there’s an increasingly critical requirement to support ordered data within streams; for machine learning and AI models, this is vitally important. Ordered stream processing must be inherent in the streaming platform. Knowing and supporting ordered streams can allow for greater accuracy in modeling, and it can dramatically improve problem determination and enable powerful control over the predictable sequencing of data.
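One common way platforms restore order to a stream is watermark-style buffering: out-of-order events are held briefly and emitted in timestamp order once the watermark (the latest timestamp seen, minus an allowed lag) has passed them. This is a simplified sketch of that general technique, not any specific engine's implementation; a real system would also define a policy for events arriving after the watermark:

```python
import heapq

def reorder(events, max_lag):
    """Hypothetical watermark-based reordering of (timestamp, value) events.

    Buffers events in a min-heap and emits them in timestamp order once
    the watermark has advanced past them; remaining events are flushed
    in order when the stream ends.
    """
    buffer, watermark = [], float("-inf")
    for ts, value in events:
        heapq.heappush(buffer, (ts, value))
        watermark = max(watermark, ts - max_lag)
        while buffer and buffer[0][0] <= watermark:
            yield heapq.heappop(buffer)
    while buffer:  # end of stream: drain whatever is still buffered
        yield heapq.heappop(buffer)
```

The trade-off is latency for correctness: a larger `max_lag` tolerates more disorder but delays emission, which is exactly the sequencing control ordered streams give to downstream models.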
The bottom line is that data streams represent the next great leap forward in converging data across the complex environments of cloud, hybrid cloud and edge computing. Data streams are here today, and with the advent of IoT, they are the wave of the future.