Distributed Computing for IoT: Data Management in a Fog Computing Environment

Gerry Christensen

Edge Computing, Mobile Edge Computing, and Fog Computing

Ubiquitously connected smart devices (wearable computing, smart metering, smart home, smart city, connected vehicles, and large-scale wireless sensor networks) are fast becoming a dominant factor in computing. The evolution of ICT has taken us from mainframes to PCs and back to the Cloud (i.e., from centralized to distributed to centralized computing), but distributed computing will clearly play an increasingly important role in the digital economy. Edge Computing encompasses a number of technologies, including Mobile Edge Computing, Fog Computing, and Cloudlets, to mention a few. In this article, we take a quick look at some of the Fog Computing and Data Management issues and challenges.

 

Mobile Edge Computing

Mobile cellular operators are planning for Mobile Edge Computing (MEC), which provides Cloud computing capabilities and an IT service environment at the edge of the cellular network. It is a concept developed by ETSI (European Telecommunications Standards Institute) that aims to bring computational power into the mobile RAN (radio access network) to promote virtualization of software at the radio edge. MEC brings virtualized applications much closer to mobile users, ensuring network flexibility, economy, and scalability for an improved user experience.

 

Fog Computing

Fog Computing is the extension of Cloud Computing to the edge of the network. While Cloud Computing operates at the upper layer and is mainly about centralized computing, Fog Computing operates at the edge layers and decentralizes the workload, mainly at the access points.

In a Fog Computing environment, a considerable amount of processing may occur in a data hub on a smart mobile device or on the edge of the network in a smart router or other gateway device. This distributed approach is rising in popularity due to the Internet of Things (IoT) and the immense amount of data that sensors generate.

 

Fog Computing and Data Management

IoT is going to be a big driver for distributed (Fog) computing. It is simply inefficient to transmit all the data that a bundle of sensors generates to the Cloud for processing and analysis: doing so requires a great deal of bandwidth, and all the back-and-forth communication between the sensors and the Cloud can adversely impact performance. IoT will create enormous amounts of data, so there is a need for distributed intelligence and so-called fast Big Data processing. Companies like ParStream (acquired by Cisco) recognize this and are building solutions to support event stream processing (ESP) and fast processing.
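
To make the bandwidth argument concrete, here is a minimal sketch of a fog node reducing a window of raw readings to one compact summary before anything is sent to the Cloud. The function name, the summary fields, and the threshold are assumptions made for illustration; they are not taken from ParStream or any other product.

```python
from statistics import mean


def summarize_window(readings, alert_threshold):
    """Reduce a window of raw sensor readings to one compact summary.

    `readings` is a list of numeric samples collected at the fog node.
    `alert_threshold` is a hypothetical limit; samples above it are kept
    verbatim so the Cloud can still inspect anomalies.
    """
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
        "anomalies": [r for r in readings if r > alert_threshold],
    }


# 60 invented temperature samples, one of them abnormal.
window = [20.0] * 59 + [95.0]
summary = summarize_window(window, alert_threshold=80.0)
print(summary)
```

In this sketch the node would forward a single small record instead of 60 samples, while the one abnormal reading still reaches the Cloud unchanged.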

Another example is InfiniFlux, which searches and statistically analyzes billions of stored records while inserting millions of records per second. Like ParStream, InfiniFlux can collect, store, and analyze data in real time and is suitable for IoT projects, telecom messaging platforms, and fraud detection systems (FDS).

 

[Figure: Fog Computing Architecture]

 

The above figure illustrates the notion that some data is pre-processed and potentially used in real time, whereas other data is stored or even archived for much later use in a more centralized Cloud infrastructure or platform environment.

 

Data Management in Fog Computing

Every IoT deployment is unique. However, four basic stages are common to just about every IoT application: data collection, data transmission, data assessment, and response to the available information. Successful data management is therefore very important to the success of IoT.
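
The four stages can be sketched as four small functions chained together. Everything here is hypothetical: the sensor is simulated, the "network" is an in-memory list, and the limit and the response text are invented for illustration.

```python
def collect(sensor):
    """Stage 1, data collection: read one sample from a simulated sensor."""
    return {"sensor_id": sensor["id"], "value": sensor["read"]()}


def transmit(sample, channel):
    """Stage 2, data transmission: an in-memory list stands in for the network."""
    channel.append(sample)


def assess(sample, limit):
    """Stage 3, data assessment: decide whether the sample needs attention."""
    return "alert" if sample["value"] > limit else "ok"


def respond(sample, verdict):
    """Stage 4, response: act on the assessment."""
    if verdict == "alert":
        return f"shut valve near {sample['sensor_id']}"
    return "no action"


channel = []
sensor = {"id": "pressure-7", "read": lambda: 9.4}
transmit(collect(sensor), channel)
for sample in channel:
    verdict = assess(sample, limit=8.0)
    print(respond(sample, verdict))
```

In a real deployment each stage would typically run on a different device or service; keeping them as separate functions makes it easy to see where data management decisions (what to keep, what to forward) would be made.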

Just as IoT has unique network requirements, it also has unique data management requirements.

Data Management for IoT can be viewed as a two-part system: an Online/Real-time Front-end (e.g., distributed nodes) and an Off-line Back-end (centralized Cloud storage). The Online/Real-time portion of the system is concerned with data management for distributed objects/assets/devices and their associated sensors. This portion raises issues pertaining to the need for “fast data” and for distributed intelligence to deal with this data.

The Front-end also passes data and results from the objects/devices/sensors to the Back-end, either as proactive pushes or as responses to queries. This frequent communication between Front-end and Back-end is termed Online. The Back-end is storage-intensive: it stores select data produced from disparate sources and supports in-depth queries and analysis over the long term, as well as data archival needs.
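
One way to picture the two-part system is the sketch below. The class names, the bounded buffer, and the push rule are assumptions made for illustration, not a description of any particular platform: the Front-end node keeps only recent data, proactively pushes select records, and answers queries, while the Back-end simply accumulates what it receives.

```python
from collections import deque


class BackEnd:
    """Stands in for centralized Cloud storage: keeps whatever it is sent."""

    def __init__(self):
        self.archive = []

    def store(self, record):
        self.archive.append(record)


class FrontEndNode:
    """Stands in for a distributed node close to the sensors."""

    def __init__(self, back_end, capacity, push_if):
        self.recent = deque(maxlen=capacity)  # bounded real-time buffer
        self.back_end = back_end
        self.push_if = push_if  # rule selecting which records to push

    def ingest(self, record):
        self.recent.append(record)
        if self.push_if(record):  # proactive push of select data
            self.back_end.store(record)

    def query_latest(self, n):
        """Answer a query with the newest n records still held locally."""
        return list(self.recent)[-n:]


cloud = BackEnd()
node = FrontEndNode(cloud, capacity=3, push_if=lambda r: r["value"] > 50)
for value in [10, 72, 15, 20, 90]:
    node.ingest({"value": value})

print(node.query_latest(2))
print(cloud.archive)
```

The bounded `deque` reflects the idea that the Front-end is not storage-intensive: old readings fall out of the buffer, and only the records selected by the push rule reach long-term storage.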

Another challenge for IoT Data is simply Data Integration. Data from different sources (sensors, contextual data, social media feeds, etc.) must be put into context. Unless the semantics of the data are carried in the data itself, rather than being part of the application, data integration can pose a monumental problem. This is a problem that Big Data Analytics has an opportunity to solve.
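
As a small illustration of semantics travelling with the data, the sketch below assumes each record names its own quantity and unit, so records from different sources can be normalized without source-specific logic in the application. The field names and sources are invented for the example.

```python
def to_celsius(record):
    """Normalize a self-describing temperature record to degrees Celsius."""
    value, unit = record["value"], record["unit"]
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32) * 5 / 9
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unknown unit: {unit}")


# Two hypothetical sources reporting the same quantity in different units.
records = [
    {"source": "rooftop-sensor", "quantity": "temperature", "value": 68.0, "unit": "F"},
    {"source": "weather-feed", "quantity": "temperature", "value": 20.0, "unit": "C"},
]

integrated = [
    {"source": r["source"], "temperature_c": to_celsius(r)} for r in records
]
print(integrated)
```

If the unit lived only in the application code for each source, adding a third source would mean changing the application; here it only requires that the new records describe themselves.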

There will also be a need for advanced Data Virtualization techniques for IoT Data. Data virtualization is any approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data, such as how it is formatted or where it is physically located. An example of a leading company in this area is Cisco, whose Data Virtualization offering is a data integration software solution that makes it easy to abstract and view data, regardless of where it resides. With such an integrated data platform, a business can query various types of data across the network as if it were in a single place.
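
The idea can be sketched in a few lines. This is a toy illustration, not a model of Cisco's product: the class names are hypothetical, and two inline strings stand in for data held in different formats and places. The application queries one view and never sees how each source is formatted.

```python
import csv
import io
import json


class CsvSource:
    """Wraps data that happens to be stored as CSV text."""

    def __init__(self, text):
        self.text = text

    def rows(self):
        yield from csv.DictReader(io.StringIO(self.text))


class JsonSource:
    """Wraps data that happens to be stored as a JSON array of objects."""

    def __init__(self, text):
        self.text = text

    def rows(self):
        yield from json.loads(self.text)


class VirtualView:
    """One query interface over sources with different formats."""

    def __init__(self, *sources):
        self.sources = sources

    def select(self, predicate):
        return [row for s in self.sources for row in s.rows() if predicate(row)]


view = VirtualView(
    CsvSource("device,status\nmeter-1,ok\nmeter-2,fault\n"),
    JsonSource('[{"device": "cam-9", "status": "fault"}]'),
)
faults = view.select(lambda row: row["status"] == "fault")
print([row["device"] for row in faults])
```

Adding another kind of source only requires a new class with a `rows()` method; the query code stays unchanged.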

There are also data infrastructure issues to consider with IoT Data. Three important DB/infrastructure issues to consider for IoT Data Management are:

  • Hybrid Database Support: An IoT database needs the flexibility to handle semi-structured, unstructured, geo-spatial, and traditional relational data, so that these varied types of data can co-exist within a single database.
  • Embedded Deployment Database: IoT databases often need to be embeddable for processing and compressing data and for transmitting it over and between networks. Good features to have are little or no configuration at run-time, self-tuning, and automatic recovery from failure.
  • Cloud Migration: IoT networks can store and process data in scalable, flexible Cloud infrastructure. The platform can be accessed using web-based interfaces and API calls.
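
The first two points can be illustrated with SQLite through Python's standard `sqlite3` module, used here only as a convenient stand-in for an embeddable, zero-configuration database rather than as a recommendation for IoT workloads. The table layout and sample rows are invented: relational columns, simple coordinates, and a JSON payload sit side by side in one table.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # embedded: no server, no configuration
conn.execute(
    """CREATE TABLE readings (
           device_id TEXT NOT NULL,   -- traditional relational column
           lat REAL, lon REAL,        -- simple geo-spatial coordinates
           payload TEXT               -- semi-structured JSON document
       )"""
)
rows = [
    ("meter-1", 47.61, -122.33, json.dumps({"kwh": 3.2, "tags": ["home"]})),
    ("cam-9", 40.71, -74.01, json.dumps({"motion": True})),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)

# Relational filter on coordinates, then decode the flexible payload in Python.
west = conn.execute(
    "SELECT device_id, payload FROM readings WHERE lon < ?", (-100,)
).fetchall()
decoded = [(device, json.loads(payload)) for device, payload in west]
print(decoded)
```

Storing the payload as JSON text keeps the sketch portable; a production system would more likely rely on native JSON and spatial indexing features of whichever database is chosen.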

Companies that solve these issues will be in a position to realize considerable revenue as many service providers’ data management/governance systems are not prepared to handle both the volume and special needs of IoT Data.

Editor’s Note: Gerry Christensen, Principal Eogogics Faculty specializing in IoT and related topics and a well-known author and consultant, has 25+ years of experience in planning, engineering, product management, and business development for telecommunications networks, applications, and services. He also heads up Mind Commerce, which develops market intelligence research reports on emerging telecommunications and digital technologies and trends, including IoT, MEC, Fog Computing, and related topics. These reports, along with Eogogics’ own digital publications, are available from the Eogogics Store, which features about two dozen publications on IoT-related topics, including this one: Broadband Application and Service Optimization: Mobile Edge Computing (MEC) and Fog Computing. For our full repertoire of report titles related to a given topic, just look for it using the Search Product window in the upper right of the Store page. We can also undertake custom research on a topic of particular interest for which you don’t see an off-the-shelf report.