Machine to Machine – The Internet of Things – It’s about the Data
by Tim Hardy
There is now no question that we are beginning to use the technology that enables us to connect the ‘things’ that surround us, to collect data from these ‘devices’, and to use that data to improve the information on which we base decisions. This might be a “pay-as-you-drive” insurance policy based on tracking speed, braking and cornering style, the time of day you tend to drive and the types of road you travel on. “Smart Meters” now provide real-time or near real-time sensors, power outage notification, and power quality monitoring as well as plain meter reading. Shipping containers can not only be tracked but also monitored for security through sophisticated blending of sensor feeds. Buildings (data centers, for example) can provide data not just on temperature and humidity for environmental efficiency control, but also on vibrations to support other uses such as disk failure prevention. All this information can be used to improve processes such as maintenance and supply chain operations.
Growth of these devices is significant, and this means more software needs writing to run on them, more capability needs programming into them, and more data needs processing and analyzing.
One of the fastest growing areas of health-care is remote patient monitoring, especially for chronic disease management. A remote sensor gathers source data (blood pressure, heart rate, blood levels, sleep patterns), which is collected at the data center, then analyzed and acted upon. In this scenario, reaction time is important, and ‘transactions’ cannot be lost.
As a result, we are left questioning:
- “How do we efficiently develop software to run billions of devices?”
- “How do we process the volumes of both structured and unstructured data, and analyze it effectively?”
- “How do we architect the solution to be supportable and cost-effective?”
Any ISV or organization developing solutions for the ‘Internet of Things’ needs to consider the device, writing software for the device, connecting the device, maintaining the device and the software on it, processing, analyzing, and securing the data, presenting the resulting information and integrating it to operational systems, and of course ensuring the whole system is resilient, and key data is not lost. Currently the approaches taken have grown from specialist providers, and every solution is different. In architectural terms, these solutions are typically not scalable – the software is often specialist or proprietary, and the supporting systems have not been designed with today’s volumes in mind. This means expensive development, skills shortages, and challenging service levels.
Eric Goodness of Gartner suggests in his blog that the following are key solution areas that need consideration today: asset tracking and location, asset monitoring and control, logistics control and optimization, asset serviceability, security monitoring, energy demand response, environmental monitoring, payment processing, notification and advertising, and regulatory compliance monitoring.
Typically these are seen in an industry context, for example: Government (Smart Cities), Utilities (Smart Metering), Transportation (Telematics, Security), Healthcare (Remote Patient, Equipment Monitoring), Construction/FM (Smart Buildings), Banking (POS).
However, despite these wide ranging uses and deployment patterns, the solutions are essentially the same. This means we can address the issues of scaling the development and the deployment by using a common architecture, a common programming model and a common deployment.
The Sensor Device: Embedded Java
FACT: The Java platform is the leading choice for more than 9 million developers, making it a massive ecosystem of tools, books, code, libraries, and applications running across a wide range of hardware and operating system platforms.
By using Java in sensor devices, the cost of development and ongoing support can be significantly reduced, as it is highly productive and developer skills are widespread. It is architected to support a full range of devices and platforms, ensuring cross-platform development and reuse. Java technology enables highly functional, reliable, portable and secure solutions for small to large embedded devices, which helps to accelerate time to market.
Competitive advantage is gained by focusing not on the underlying technology, but on delivering value-added services. Embedded Java ensures secure connections with other applications and back-end databases. Special features of the Java Virtual Machine (JVM) have been optimized for embedded use, so they will work with a wide range of devices, including many specialized systems, such as healthcare or meters. The Java platform is pervasive in the embedded space. Today, Java technology is already present in 5 billion SIMs and smart cards, 3 billion mobile handsets, 80 million TV devices, including every Blu-ray player shipped, and many other embedded solutions from printers and bank machines to e-book readers and cars.
Java offers industry-leading reliability, performance, throughput, security and cross-platform support. Java's model of write once, run anywhere is very powerful for the diverse embedded space, as it avoids sensor device platform/vendor lock-in.
In particular, the Oracle Java Micro Edition (ME) Embedded Client is an application run-time that builds on the popular Java Platform, Micro Edition (Java ME) specification and has been designed for speed and efficiency on devices with limited processing power and memory, such as e-book readers, Blu-ray Disc players, Voice over IP telephones, televisions, set-top boxes, printers, residential gateways and more. Java ME provides device manufacturers with the full power of the Java language, a comprehensive set of APIs and industry-leading security. Through its compatibility with the Java Platform, Standard Edition (Java SE), the Java ME Embedded Client inherits familiar Java features and benefits from a rich development ecosystem that enables Java developers to hit the ground running. An extensive range of tools, such as the NetBeans IDE, provides sophisticated power for creating and debugging applications.
For very small devices there is Java Card. Originally designed for use primarily within smart cards, Java Card has evolved into a more general purpose platform that allows multiple Java-based applications to run securely on devices with very limited footprints. This is ideal for smart meters, or perhaps even smaller solutions such as digital tachograph cards, driver's licenses or biometric identity cards. Java also allows for embedded intelligence in devices that can be used, for instance, to perform direct adjustments in patient care, or to analyze feeds from multiple sensors on the device to help avoid false positive monitoring readings.
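As a rough illustration of the kind of on-device intelligence described above, the sketch below raises an alert only when a majority of sensors agree that a threshold has been exceeded, so that a single faulty reading does not trigger a false positive. The class name, the threshold value and the majority rule are all assumptions made for illustration; they are not part of Java ME, Java Card or any Oracle API.

```java
/** Hypothetical on-device check: alert only when most sensors agree. */
final class SensorVote {

    /** Returns true when strictly more than half of the readings exceed the threshold. */
    static boolean shouldAlert(double[] readings, double threshold) {
        int over = 0;
        for (double reading : readings) {
            if (reading > threshold) {
                over++;
            }
        }
        // An empty array never alerts; otherwise require a strict majority.
        return readings.length > 0 && over * 2 > readings.length;
    }

    public static void main(String[] args) {
        // One sensor spikes while two others read normally: no alert.
        System.out.println(shouldAlert(new double[] {182.0, 121.0, 119.0}, 140.0));
        // Two of three sensors agree the value is high: alert.
        System.out.println(shouldAlert(new double[] {150.0, 148.0, 119.0}, 140.0));
    }
}
```

A plain array and loop are used deliberately, since small embedded profiles may not offer the richer collection and stream libraries of Java SE.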
Because embedded solutions require long-term support, it is important that a software platform have a support commitment that reaches well into the future, something that is assured with Java.
Java in the sensing device helps ensure systems can be built efficiently, using the same language that can be used throughout the end-to-end solution, with device portability and ongoing support, regardless of what that device might be.
To see how Java can work on very small sensor devices, Oracle Labs created the SPOT project. An Oracle Sun SPOT device is a small, wireless, battery powered experimental platform. It is programmed in Java, allowing programmers to create projects that used to require specialized embedded system development skills. The hardware platform includes a range of built-in sensors as well as the ability to easily interface to external devices.
It is available on the Oracle Store: https://shop.oracle.com/pls/ostore/f?p=dstore:product:3212184388460931::NO:RP,6:P6_LPI,P6_PPI:77098491099690831171217
Embedded Database and Synchronization Support
Sensor Devices are also data processing devices. This requires some form of data management and storage. This might be so data is stored then sent intermittently, or to process the data intelligently on the device. When we start looking at the requirements for zero maintenance and synchronization with master data (for perhaps rules based processing), it is clear that a database management system is required rather than just raw data storage.
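The "store, then send intermittently" pattern mentioned above can be sketched as a small bounded buffer. This is a minimal in-memory illustration only: the class and method names are hypothetical, and because nothing here survives a power loss, it also shows why a transactional embedded database is the better fit when readings must not be lost.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical store-and-forward buffer for an intermittently connected device. */
final class StoreAndForward {
    private final Deque<String> pending = new ArrayDeque<>();
    private final int capacity;

    StoreAndForward(int capacity) {
        this.capacity = capacity;
    }

    /** Queues a reading locally; returns false when the buffer is full so the caller can react. */
    boolean record(String reading) {
        if (pending.size() >= capacity) {
            return false;
        }
        pending.addLast(reading);
        return true;
    }

    /** Called when a connection is available: returns everything queued, oldest first, and empties the buffer. */
    List<String> drain() {
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }

    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        StoreAndForward buffer = new StoreAndForward(100);
        buffer.record("bp=120/80");
        buffer.record("hr=72");
        // Connection is back: a real device would transmit the batch; here it is just printed.
        System.out.println(buffer.drain());
    }
}
```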
The industry-leading open source, embeddable storage engine is Berkeley DB. Berkeley DB delivers the same robust data storage features as traditional relational database systems, such as ACID transactions and recovery, locking, and multi-process and multi-threaded access for high concurrency. Berkeley DB can manage databases in memory, on disk or both. It stores data as opaque byte arrays in key/value pairs indexed by one of the available access methods. Create, read, update and delete (CRUD) operations on these key/value pairs are done using the BDB get/put or cursor position-based APIs. Berkeley DB also supports the Java Collections and Java Direct Persistence Layer APIs. It is a small library, less than 1 MB, easily installed and configured along with your application. Berkeley DB was designed to operate in a completely unattended fashion, so all administrative functions are controlled programmatically. Berkeley DB is proven in millions of deployments, ranging from mission-critical, carrier-class applications to desktop and mobile device applications. It supports a wide variety of programming languages and operating system platforms: languages include C, C++, Java, Perl, Python, PHP, Tcl and Ruby, among others, and operating systems include Oracle Linux, Windows, BSD UNIX, Oracle Solaris, Mac OS X, VxWorks and any POSIX-compliant operating system. This makes it ideal for local storage to support M2M/IOT deployments.
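To make the data model concrete, the following sketch imitates an ordered key/value store over opaque byte arrays, with get/put/delete operations and a cursor-style positioned lookup. It is a stand-in built on a standard Java `TreeMap`, not the Berkeley DB API: the class name, method names and the unsigned-byte key ordering are assumptions chosen for illustration, and it offers none of Berkeley DB's transactions, recovery or persistence.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeMap;

/** Stand-in for an ordered key/value store over opaque byte arrays (not the Berkeley DB API). */
final class ByteStore {
    // Keys are kept sorted lexicographically as unsigned bytes, similar in spirit to a B-tree index.
    private final TreeMap<byte[], byte[]> index = new TreeMap<>(Arrays::compareUnsigned);

    /** Creates or updates the value stored under the key. */
    void put(byte[] key, byte[] value) {
        index.put(key.clone(), value.clone());
    }

    /** Returns a copy of the stored value, or null when the key is absent. */
    byte[] get(byte[] key) {
        byte[] value = index.get(key);
        return value == null ? null : value.clone();
    }

    /** Removes the key; returns true if it was present. */
    boolean delete(byte[] key) {
        return index.remove(key) != null;
    }

    /** Cursor-style positioning: the first key at or after the given position, or null if none. */
    byte[] seek(byte[] from) {
        byte[] key = index.ceilingKey(from);
        return key == null ? null : key.clone();
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteStore store = new ByteStore();
        store.put(utf8("meter:001"), utf8("12.0"));
        System.out.println(new String(store.get(utf8("meter:001")), StandardCharsets.UTF_8));
    }
}
```

The store treats both keys and values as uninterpreted bytes; any structure (such as the `meter:001` naming used here) is purely the application's convention.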
For Java devices, as discussed above, Oracle Berkeley DB Java Edition is an open source, embeddable, transactional storage engine written entirely in Java. It takes full advantage of the Java environment to simplify development and deployment. The architecture of Oracle Berkeley DB Java Edition supports very high performance and concurrency for both read-intensive and write-intensive workloads. Depending on your needs, choose between Berkeley DB Java Edition's Direct Persistence Layer (DPL) and Persistent Collections API, or simply store key/value pairs of arbitrary data. Additionally, the Database Mobile Server enables embedded data stored in devices running Berkeley Database to be synchronized with Data Center RDBMS products, and manages the provisioning and life-cycle of on-device applications. Berkeley Database with Database Mobile Server and embedded Java is a comprehensive, secure device-to-cloud smart data solution answering the challenges of the rapidly emerging Internet of Things.
The Data Center: Data Management and Analysis
FACT: The US could easily produce a billion meter reads in an hour.
If we use Java in the sensor devices (see above), then connectivity is simple. We can control the feed and the structure to create an end-to-end two-way system. However, there may be legacy devices. In the Utilities industry (“Smart Grid”), for example, an ever-increasing volume of data will come from an ever-expanding list of devices, and a “Gateway” of some nature is required to deal with the different ways the devices manage events such as Meter Status Check, On-Demand Read, Meter Commissioning, Meter Decommissioning or Exception Management.
Complex Event Processing
Processing the data coming from the sensing devices requires technology that can support significant volume and apply business rules to manage the data. This is mostly structured data, and a system is likely to need to process many millions of events per second while maintaining very low and predictable latencies. This is the territory of ‘Complex Event Processing’ (CEP), which allows applications to filter, query, and perform pattern matching operations on streams of data using a declarative, SQL-like language. CEP technology is being used in the power industry to make utilities more efficient by allowing them to react instantaneously to changes in demand for electricity, in the credit card industry to detect potentially fraudulent transactions as they occur in real time, and in capital markets for applications like order routing and algorithmic trading. It is now seen as a key component in processing M2M/IOT data feeds. The resulting data is then fed into a Data Warehouse (RDBMS) for analysis and operational action. For data that needs inclusion in the CEP rules, an in-memory data caching/grid approach, such as Oracle Coherence, should be used to provide fast access to frequently used data.
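The flavor of a CEP rule can be sketched in plain Java. The example below implements one simple sliding-window pattern, flagging a key (a card number, say) that produces more than a given number of events within a time window. A real CEP engine would express such a rule declaratively and at far higher throughput; the class name, parameters and the assumption of non-decreasing timestamps per key are all illustrative and do not correspond to any specific product's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sliding-window rule: flag a key that produces too many events within a time window. */
final class BurstDetector {
    private final int maxEvents;
    private final long windowMillis;
    private final Map<String, Deque<Long>> recent = new HashMap<>();

    BurstDetector(int maxEvents, long windowMillis) {
        this.maxEvents = maxEvents;
        this.windowMillis = windowMillis;
    }

    /**
     * Feeds one event and returns true when the key now has more than maxEvents
     * events inside the window. Assumes timestamps arrive in non-decreasing order per key.
     */
    boolean onEvent(String key, long timestampMillis) {
        Deque<Long> times = recent.computeIfAbsent(key, k -> new ArrayDeque<>());
        times.addLast(timestampMillis);
        // Evict events that have slid out of the window.
        while (!times.isEmpty() && times.peekFirst() <= timestampMillis - windowMillis) {
            times.removeFirst();
        }
        return times.size() > maxEvents;
    }

    public static void main(String[] args) {
        // Allow at most 2 events per card in any 60-second window.
        BurstDetector detector = new BurstDetector(2, 60_000L);
        detector.onEvent("card-A", 0L);
        detector.onEvent("card-A", 10_000L);
        System.out.println(detector.onEvent("card-A", 20_000L)); // third event inside the window
    }
}
```

Keeping only the timestamps still inside the window bounds the memory used per key, which matters when the rule runs continuously over a high-volume stream.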
Unstructured data from devices that are not controlled in the way that dedicated-function meters or monitoring devices are creates another problem, as it cannot be processed in a structured, rule-based way in CEP logic. This data, perhaps from billions of mobile devices, is increasingly being collected into databases designed for the purpose, such as a “NoSQL” database. It might then be organized and ‘reduced’ in size using a solution such as Hadoop for filtering, transforming and sorting, and then integrated into a full Data Warehouse with transactional and other data to give a full analytics view enabling operational action.
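The filter, transform and sort steps described above can be shown in miniature with standard Java streams. This is not Hadoop code: it runs on a single machine and only mimics the shape of a map/reduce job. The input format ("deviceId,value" text lines), the class name and the choice of averaging as the reduction are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical single-machine imitation of a filter / map / reduce pass over raw device lines. */
final class ReadingReducer {

    /** Input lines are assumed to look like "deviceId,value"; anything else is filtered out. */
    static Map<String, Double> averageByDevice(List<String> lines) {
        return lines.stream()
                .map(line -> line.split(","))                                  // transform: split into fields
                .filter(parts -> parts.length == 2 && isNumber(parts[1]))      // filter: drop malformed lines
                .collect(Collectors.groupingBy(
                        parts -> parts[0].trim(),                              // key: device id
                        TreeMap::new,                                          // sort: keys in ascending order
                        Collectors.averagingDouble(parts -> Double.parseDouble(parts[1]))));  // reduce: mean
    }

    private static boolean isNumber(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("m2,10.0", "m1,4.0", "garbage", "m1,6.0", "m3,abc");
        System.out.println(averageByDevice(raw));
    }
}
```

The output is much smaller than the input, one entry per device rather than one per reading, which is the sense in which the data is ‘reduced’ before being loaded into the warehouse.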
Analysis and Action
Structured and unstructured data can then be combined to provide a complete view, enabling an ISV to build solutions that provide actionable business insight. Typically an analytic approach, such as the open source R statistical language, is used on the Data Warehouse, combined with Data Mining algorithms and techniques. This can provide, for example, advanced fraud identification, insight into buying patterns, or more simply predictive preventative maintenance programs. Oracle Advanced Analytics, for example, includes Oracle R Enterprise and Oracle Data Mining inside the Oracle RDBMS. This enables analytic scaling as data volume increases by bringing the algorithms to where the data resides, in the database. It accelerates data analyst productivity while maintaining data security, and lowers the overall TCO for data analysis by eliminating data movement and shortening the time it takes to transform "raw data" into "actionable information". Spatial data (such as geo-location for tracking and telematics) can be combined for real-time action. Tools are also needed to use the data efficiently, deliver dashboards for reporting, analysis, modeling and forecasting, drive actions in real time, integrate with a portal, and provide appropriate security and authorization management. As well as combining transactional data in the data warehouse, there will be a need for direct integration with transactional systems such as finance systems, CRM systems, logistics and asset management systems. This means a SOA approach, introducing a Service Bus to the architecture.
Private Cloud Platform
All these components need to be accessible from anywhere, although geographic/regional requirements need to be considered. Since service levels and security need careful control, an M2M Private Cloud platform is key to effective solution delivery. This platform needs connectivity, complex event processing, a data warehouse, big data technologies, a service bus, portal technologies, security management capabilities, sophisticated reporting and analytics, and significant scalability of processing and storage. Of course it all needs ‘assembling’ and ‘managing’. For example, Oracle provides all these technologies, and has engineered them to work together to achieve optimal performance levels, drive down costs and accelerate innovation. Oracle Exadata Database Machine provides a scalable database platform; Oracle Exalogic Elastic Cloud is optimized for running Java, Oracle Fusion Middleware and Oracle applications with high throughput and low total cost of ownership; Oracle Big Data Appliance provides a complete big data solution; and Oracle Exalytics Business Intelligence Appliance provides a complete in-memory real-time analytics and reporting platform. These are managed end-to-end with Oracle Enterprise Manager (operations, configuration, cloud, diagnostics etc.), with end-to-end data security covering authentication, authorization, encryption and separation of duties.
These systems are a complete cloud platform, with all of the hardware and software pre-integrated to provide high performance and easy manageability. The Oracle Linux Unbreakable Enterprise Kernel is used; it delivers the best overall Oracle Linux performance available today and provides numerous features in the areas of hardware fault management, data integrity and diagnostics, including detecting and logging hardware errors before they impact the OS or application, and automatic isolation of defective CPUs and memory. Taking advantage of these enhancements requires no changes to existing Oracle Linux applications. The optimizations provide up to 60% higher workload and 50% reduced latency, which is key for a scalable, resilient M2M platform. Oracle Linux is built from the Red Hat Enterprise Linux source code, which in turn derives from Fedora.
Oracle Exalogic Elastic Cloud is ideal for deploying Java applications since it has been extensively optimized, tested and certified to run WebLogic Server, the world’s #1 application server. In addition, Oracle Enterprise Manager gives you simplified and consolidated management of all hardware, software, storage and networking components of the system. This helps reduce the TCO and ensure rapid delivery of M2M application capability.
There is significant opportunity to obtain value from the data that can be collected from the “Internet of Things”. However the growth of this opportunity is such that scale impacts architecture choices significantly. This means considering the whole solution from Client to Data Center and back again. What language do you use to ensure commonality and portability across all the platforms and elements in that system? How do you assemble all the components needed to process and analyze the data that results? How do you maintain the logic on the sensor device? How do you support the entire solution and provide appropriate service levels? How do you maintain service levels and charge for the use of the solution? How do you maintain security of the data from client through to analysis?
Oracle can provide the complete M2M/IOT platform, and this provides significant architectural simplicity that enables TCO reduction. For the client sensor devices, the value of embedded systems is increasingly driven by software, and Java’s platform independence, high level of functionality and security, mature tool chain, connectivity and scalability, and massive ecosystem put it at the top of the list. For extremely resource-constrained devices, from 50KB to 500KB/1MB, Java Card is the most compact VM and brings dedicated security functionality. Java ME brings comprehensive device and network functionality access to small embedded devices (intelligent modules, meters…) and mass market mobile phones. It runs on devices starting with a sub-1MB Java footprint and minimal CPU power. For the Data Center, Oracle’s Engineered Systems provide all the components needed to deliver the functionality, security, extreme scalability and operational efficiency required by any organization building an M2M/IOT service.