Internet of Everything Advancement Study in Data Science and Knowledge Analytic Streams

Recent Internet of Everything (IoE) advancements in data elicitation and digital storage technology lead to a large heterogeneous data depository, in which the IoE data are stored in a column-oriented relational framework. The main purposes of this research are to design and explore data models, frameworks, architectures, and algorithms for network-centric data, mainly IoE data, to accomplish data science and knowledge analytic tasks for intellectual domain applications. Storage incompatibilities in the relational structure of a multi-objective IoE database create threats to data integrity and consistency. A large-scale IoE database holds a huge number of rows but only a limited number of columns, so a column-oriented relational framework greatly improves its performance in terms of data depository and access management. Knowledge analytics is a major part of data science; it is a never-ending process because both technological and business requirements keep changing. The beauty of analytics is that two data scientists facing the same problem may come up with two different new solutions. In this work, I discuss the overall data science and knowledge analytic streams for effective IoE database management and knowledge discovery.

Keywords: column-oriented database, IoE database, knowledge analytics, data depository, data science


I. INTRODUCTION
The IoE database management system is defined as a database platform that deals especially with large-scale structured, unstructured, and semi-structured data. In almost all IoE applications, huge amounts of data are dumped into storage so that knowledge analytic functions can later explore the potential insights. For example, the IoE objects (such as sensors) used throughout an aircraft generate log data at a rate of 20 TB per hour and dump it into the storage of the black box. The required data are then accessed from the black box as and when further exploration is desired. Digital storage technology has developed tremendously to accommodate large data scales; however, data access and knowledge analytics speeds have not improved enough to perform those operations in a timely manner. Time-critical data access management and predictive knowledge analytics are therefore still open challenges for largely unstructured IoE data.
The IoE has an important influence on the big data landscape. The key point in the evolution of IoE data science is that every IoE object has an identifier and connects to other objects. With trillions of such connections potentially producing massive volumes of data (IoE big data), the competence of current data science and knowledge analytics mechanisms is going to be challenged. The evolving IoE network connects people, processes, places, and things to the internet for communication. IoE objects comprise both physical and logical things. The logical things include processes, frameworks, applications, software, and programs; the physical things include people, places, physical entities, and devices. The data of such physical and logical things constitute a comprehensive IoE database, in which structured, semi-structured, and unstructured data are available [1].
In an IoE database, ERP and CRM data are considered structured data, XML data are normally considered semi-structured data, and email documents, social web content, PDF, Word, and rich text documents are considered unstructured data. It is reported that around 80% of the data in an IoE database are unstructured and have no predefined data model. Such unstructured data consist of text, graphics, video, and symbols. Spatio-temporal databases, which hold facts or events with timestamps, are also part of the IoE database. The rapid increase of IoE big data applications progressively leads to problems of data volume, velocity, variety, and value. Analyzing and inferring cognitive values (knowledge) from a large-scale IoE database in real time becomes more challenging day by day as the volume and variety of data associated with numerous IoE applications grow. Such IoE knowledge analytics and inference face a number of real-time problems, such as managing heterogeneous knowledge, transforming varied data into knowledge, transforming knowledge into actions, transforming actions into cognitive decisions, and tuning the cognitive decisions to coordinate IoE-motivated applications [2].
The convergence of statistical and computational learning mechanisms has been researched to deal with data science and knowledge analytic problems. Data science and knowledge analytic tools are also used to analyze and explore various operational tasks associated with IoE big data applications, such as data transformation and analysis, data mining, knowledge discovery, semantic knowledge exploration, structural analysis, and many more. Machine learning techniques are implemented in many areas of knowledge discovery and semantic knowledge analytics to explore application intelligence. In almost all IoE big data applications, the huge amount of data dumped into storage is highly redundant and unsuitable for data analysis, modelling, information transformation, knowledge production, and decision generation. A survey conducted by ParStream shows that 94% of the organizations surveyed face challenges in IoE big data elicitation and analytics, and that 70% of organizations think IoE big data analytics helps them make better and more meaningful decisions [3], [18].
The main purposes of this research are to design and explore data models, frameworks, architectures, and algorithms for network-centric data, mainly IoE data, to accomplish data science and knowledge analytic tasks for intellectual domain applications. Data science aims at knowledge analytic frameworks and algorithms that build and organize knowledge and insights, transforming real-world domain applications into intellectual domain applications.
Data science and knowledge analytic tasks are gaining popularity across various intellectual domain applications. The main aim is to obtain insights from large-scale network-centric data, such as IoE data, that can be used to produce intelligence for the applications. In current intellectual domain applications, the network-centric data are highly unstructured and ambiguous, which creates research challenges in inferring the potential knowledge. The survey reveals that, for IoE big data applications, time to insight is slow, quality of insight is poor, and cost of insight is high. Intellectual domain applications, on the other hand, require low-cost, high-quality, real-time frameworks and algorithms to transform their data at scale into goldmines of cognitive value. Such cognitive values are utilized as knowledge and insights that create worth for the intellectual domain applications.
The rest of this paper is organized as follows. Section II discusses related work on IoE data storage, access, and analytics systems. Section III discusses the column-oriented relational framework. Section IV presents the analysis and discussion of the IoE data science and knowledge analytics context. Finally, Section V concludes the paper.

II. RELATED STUDIES
Here we highlight the ongoing evolution of data science and knowledge analytics (DSA) through its correlated operational functions and tasks. In 1960, Peter Naur used data science as a substitute for computer science. In 1974, Naur used data science to mean data processing methods for numerous applications [4]. In 1977, Tukey suggested data science as exploratory data analysis [5], [23]. From 1989 to 1996, more tasks and terms were included in data science, i.e., data classification, data mining, and knowledge discovery [6]. From 1997 to 2001, statistical computing was included as a part of data science. In 2005, Thomas H. Davenport et al. introduced the use of analytics and fact-based decision making in data science [7], [26]. In 2010, Hilary Mason and Chris Wiggins introduced the term machine learning in data science. In 2011, Harlan Harris discussed several data science techniques, such as statistics and machine learning, and data interpretation, classification, and visualization [8], [26]. From 2012 to date, data science has progressively integrated with several new technologies, such as IoE, big data, clouds, deep learning, extreme learning machine (ELM), and many more emerging technologies [23], [24], [25].
If we analyze the prospects of data science, we observe that it tends toward delivering both big data processing and knowledge analytics, which are its most challenging aspects given the growing data dimension and diversity of numerous IoE-driven intellectual domain applications. The data architecture of IoE relies on several NoSQL databases on Hadoop-like platforms for batch processing of large-scale data, which consumes considerable time; real-time or near-real-time data processing, management, and knowledge analytics are much more demanding tasks. This matters because current business intelligence platforms need timely knowledge and insights to transform their business data into cognitive decisive goldmines, generating revenue while minimizing potential upcoming business risks. The NoSQL databases are not designed to execute knowledge analytic tasks, which are the common minimal requirements for IoE-driven intellectual domain applications.

III. COLUMN ORIENTED RELATIONAL FRAMEWORK
The IoE database is a spatio-temporal database that consists of event data related to timestamps and geographical locations. A physical data value is true for a specific time and location, but not universally true for all locations and timestamps. For example, for a specific location L001 and timestamp T001, the IoE data value "environmental temperature = 40 degrees Celsius" is true. A large-scale IoE database holds a huge number of rows but only a limited number of columns, so a column-oriented relational framework greatly improves its performance in terms of data depository and access management. Consider an IoE relational database "R" having the following physical schema.
In the relation "R" (Table 1), under the two functional dependencies (FDs), the IoE object ID and Timestamp together act as a composite primary key for "R". Also, over the geographical coordinate system, the Location ID can be uniquely identified through the IoE object ID. R001, R002, … are the row identifiers, and uniqueness is maintained for each row of relation "R".
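The column-oriented layout and the two key constraints described above can be illustrated with a minimal sketch. It is not the framework's actual implementation: the column names (`object_id`, `timestamp`, `location_id`, `temp_c`), the helper functions, and all sample rows other than the L001/T001/40 degree example are assumptions made only for illustration.

```python
# Hypothetical sample rows of relation R:
# (row_id, object_id, timestamp, location_id, temp_c)
ROWS = [
    ("R001", "O001", "T001", "L001", 40.0),
    ("R002", "O001", "T002", "L001", 41.5),
    ("R003", "O002", "T001", "L002", 38.0),
]
COLUMNS = ("row_id", "object_id", "timestamp", "location_id", "temp_c")


def to_column_store(rows, columns):
    """Pivot row tuples into one list per column (a toy column-oriented layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def holds_fd(store, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in the store."""
    seen = {}
    row_count = len(store[lhs[0]])
    for i in range(row_count):
        key = tuple(store[c][i] for c in lhs)
        value = tuple(store[c][i] for c in rhs)
        # setdefault records the first value seen for a key and returns it
        # on later visits, so any mismatch means the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


def mean_of_column(store, name):
    """Aggregate a single column without reading any other column."""
    values = store[name]
    return sum(values) / len(values)


store = to_column_store(ROWS, COLUMNS)

# (object_id, timestamp) determines the remaining attributes (composite key).
print(holds_fd(store, ("object_id", "timestamp"), ("location_id", "temp_c")))
# object_id alone determines location_id.
print(holds_fd(store, ("object_id",), ("location_id",)))
# An analytic query touches only the column it needs.
print(mean_of_column(store, "temp_c"))
```

In this sketch, an aggregate such as the mean temperature reads one list rather than every row tuple, which is the access pattern a column-oriented framework is meant to favor when the table has many rows and few columns.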
A multi-objective IoE object is embedded with numerous sensors or sensing objects that produce heterogeneous physical data in different formats and structures. This leads to data elicitation and classification hazards, and in turn to predictive knowledge analytic hazards.

IV. CONTEXT OF DATA SCIENCE AND ANALYTICS
Science implies the gaining of knowledge through systematic study, so data science might imply a focus on data and, by extension, on statistics: the systematic study of the organization, properties, and analysis of data and its role in inference, including our confidence in that inference [9]. The promise of data science is that if data from a system can be recorded and understood, then this understanding (knowledge and inferences) can potentially be used to improve the system [10]. Data science is the extraction of knowledge from large volumes of structured or unstructured data; it is a continuation of the fields of data mining and predictive analytics, also known as knowledge discovery and data mining [11].
Data science involves various tasks that can be explored and integrated to design applications for the numerous intellectual domains associated with the physical world. A correlated structure of data science is represented in Figure 1. Statistical computing and visualization are important tasks of data science; they include data manipulation and cleaning, importing and exporting data, managing missing values, working with data frames, lists, and matrices, writing functions, and using packages. Efficient programming practices and methods of summarizing and visualizing data are emphasized throughout the data science environment. Cognitive computing is another important concern of data science, in which a new class of computational problems is built to address complex problem situations through a self-learning process. Cognitive computing of big data exploits the power of several diverse disciplines, such as mathematics, statistics, data science, and computational science, to build intelligence and insights for intellectual domain applications.
Knowledge discovery in data (KDD) integrates data mining with data analytics, which enables the use of data science in numerous domain applications, such as the business intelligence domain [12]. Machine learning becomes extremely important and useful in the data science environment, not only to deal with huge amounts of data and extract knowledge from them but also to spread IoE big data analytics more widely across all levels of an organization. Domain analysis is an important concern that helps to analyze the key problem scenarios of an application domain associated with the physical world. It integrates the data domain with intellectual domain applications, such as business, healthcare, and industry. Knowledge re-engineering, analytics, and inference are progressive concerns of data science: they re-engineer a superseded knowledge base into a renovated knowledge base system that may ensure higher operational efficiency by making the knowledge base useful and operative. Knowledge analytics is a major part of data science that studies historical data to research potential trends, analyzes the effects of decisions and events, evaluates the performance of complex problem scenarios, and aims to improve value by gaining knowledge and insights [13]. It is the science of logical analysis that uses mathematics, statistics, computational intelligence, and other analytic tools to discover potential knowledge and insights in a large-scale data science environment.
The Internet of Everything is the network of physical objects or "things" embedded with electronics, software, sensors, and connectivity that enable them to achieve greater value and service [14].
An IoE object may be any logical or physical thing associated with real-world entities. The logical things comprise processes, frameworks, software, apps, etc., and the physical things consist of people, places, devices, etc., as described in Figure 2. There are three main requirements for constructing an IoE object: a physical IP system for unique object identity, radio transceivers for communication (using a protocol stack), and a sensing unit for sensing data from the physical environment. The main aim is for the object to be easily placed into any real-world entity associated with IoE applications.
IoE data science considers the data of everything: place, process, device, and people. The IoE data science environment requires the following connectivity among IoE objects: any-place connectivity, any-process connectivity, any-people connectivity, anything connectivity, and any-time connectivity. Together, these constitute comprehensive IoE vicinities.
Such connectivity leads to a large heterogeneous data depository with incompatibilities among database frameworks. These incompatibilities, in name, scale, structure, and level of abstraction, create challenges for the IoE data science environment in executing the analytic process. With the rapid increase of IoE-based intellectual applications, the networks of billions of IoE objects constitute IoE data science vicinities, from which huge volumes of structured, semi-structured, and unstructured IoE big data are produced in real time.
Several hazards are associated with IoE data science: managing data dimensions; managing data diversity; managing dynamic data streams; managing data biases, noise, and abnormality; managing data correctness and accuracy for application use; and managing data longevity.
Several hazards are also associated with knowledge analytics: managing heterogeneous knowledge; transforming data into knowledge; transforming knowledge into actions; transforming actions into cognitive decisions; and tuning the IoE knowledge base to regulate numerous intellectual applications, such as industrial, healthcare, and business applications.
In this context, we discuss some relevant studies on frameworks intended for diverse domain applications. A framework is a real or conceptual structure intended to serve as a support or guide for modelling progressive IoE-DSA functions for intellectual domain applications; the intended functions are of cognitive, conceptual, theoretical, analytical, and logical varieties. In our dissertation, we consider three intellectual domain applications: the industrial, healthcare, and business intelligence domains. For these applications, some innovative problem requirements are identified, analyzed in terms of proposed architectures, algorithms, functional exploration, structural analysis, mathematical analysis, implementation analysis, and computational analysis, and modelled into operationally feasible application frameworks.
In the context of knowledge analytics for the industrial domain, we consider a sensor environment case in which different works are analyzed and implemented through diverse data science and knowledge analytic (DSA) approaches, i.e., KDS-NN, KDS-GA, and KDS-DM [16], [17]. Dinesh Kumar et al. propose an implementation of the KDS-NN approach that uses the back-propagation algorithm to execute data filtration operations at the gateway level of a sensor environment [18]. Khanna and Liu propose a DSA implementation through the KDS-GA approach, which describes a genetic-approach-based pattern identification and activity monitoring application [19]. A number of other studies refer to the KDS-DM approach, which involves several DSA operations, such as data calibration, clustering, replication, reduction, elicitation, and cleaning, and pattern identification, extraction, modelling, and mapping, in and around the sensor environment [20][21][22][23][24][25]. Those works consider a small data scale to discover big values. For a large-scale industrial automation application, however, we consider prospective knowledge discovery and management of IoE big data, and study some relevant works that address the IoE big data platform [26]. The work in [27] emphasizes the design of an IoE data management reference framework that can perform the following DSA operations: data cleaning, storage, access analysis, and distribution. Furthermore, the work in [28] discusses additional DSA operations at the semantic level, such as semantic analysis and semantic derivation for knowledge discovery and intelligent decision making.
In the context of knowledge analytics for the elderly healthcare domain, several systems and devices are studied to model an operationally feasible framework for elderly activity supervision [29]. The works in [30][31][32][33][34][35] analyze the DSA functions of a cognitive IoE device or cognitive sensor that assists the elderly in regulating smart home appliances. Furthermore, the functional operations and implementations of several systems and devices that play major roles in DSA operations for the elderly are analyzed, such as abnormal activity detection, fall detection, online activity monitoring, and emergency situation detection [36].
In the context of knowledge analytics and re-analytics for customer-end and enterprise-end business intelligence (BI) applications, the functions of several DSA operations and BI applications are analyzed to design various intellectual BI frameworks [37]. The work in [38] proposes a modern manufacturing service system that uses IoE and cloud computing technology for storage, analytics, and other DSA operations. We also analyze the work in [39], which focuses on an e-commerce service application for a BI process monitoring system. To implement analytics in a credit card fraud detection service, the work in [40] emphasizes a Bayesian learning system as the mechanism for executing the DSA operations of that service.
A number of innovative DSA operational analyses are considered for diverse BI service applications, such as a product life cycle management (PLM) service that uses a closed-loop PLM framework [41], a transport logistics service that implements an IoE-based ontology framework [42], and a supply chain management service that uses a cognitive smart logistics framework [43].
This broad review of three diverse domain applications gives an overview of the numerous DSA operations that can be implemented in IoE data science vicinities.
In a business automation environment, IoE data science regulates several smart management tasks, such as material logistics management, supply chain management, product life cycle management, compliance service workflow interoperation and management, proactive prediction of business security strategy, and more.
IoE data science regulates a large-scale automated industry by generating real-time tactical and operational decisions and cognitive actuations. It can thus be used effectively in many industrial applications to regulate sensitive parameters, for example through machine load and distribution analysis, machine reliability analysis, and industrial safety analysis and monitoring [44], [45]. With the advancement of industries and IoE data science, real-time knowledge analytic frameworks have been considered in many contexts to automate industrial processes that involve a high degree of risk.
Therefore, based on risk quantification, we classify real-time data science applications into three categories: business-critical applications, e.g., IoE applications in business intelligence monitoring; mission-critical applications, e.g., IoE applications in habitat monitoring, smart city monitoring, and smart home monitoring; and safety-critical sensor applications, e.g., IoE applications in industrial automation, healthcare automation, and elderly activity supervision. Among the three categories, the highest degree of risk is measured in safety-critical IoE applications [46].
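The three-way classification above can be written down as a small lookup, shown here only as an illustrative sketch. The application identifiers, the `RiskCategory` enum, and the helper functions are hypothetical names introduced for this example. The text states only that the safety-critical category carries the highest risk, so no ordering between the other two categories is assumed.

```python
from enum import Enum


class RiskCategory(Enum):
    BUSINESS_CRITICAL = "business-critical"
    MISSION_CRITICAL = "mission-critical"
    SAFETY_CRITICAL = "safety-critical"


# Example applications named in the text, keyed by hypothetical identifiers.
APPLICATION_CATEGORY = {
    "business_intelligence_monitoring": RiskCategory.BUSINESS_CRITICAL,
    "habitat_monitoring": RiskCategory.MISSION_CRITICAL,
    "smart_city_monitoring": RiskCategory.MISSION_CRITICAL,
    "smart_home_monitoring": RiskCategory.MISSION_CRITICAL,
    "industrial_automation": RiskCategory.SAFETY_CRITICAL,
    "healthcare_automation": RiskCategory.SAFETY_CRITICAL,
    "elderly_activity_supervision": RiskCategory.SAFETY_CRITICAL,
}


def classify(application):
    """Look up the risk category of a known application (KeyError if unknown)."""
    return APPLICATION_CATEGORY[application]


def is_highest_risk(application):
    """Only the safety-critical category is stated to carry the highest risk."""
    return classify(application) is RiskCategory.SAFETY_CRITICAL


print(classify("smart_city_monitoring").value)
print(is_highest_risk("elderly_activity_supervision"))
```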
IoE data science also regulates the data of wearable and non-wearable computing devices and generates intelligence through analytic frameworks to create a smart environment that monitors several activities. These include human activity supervision; automated coordination of devices according to human activity in a smart-home-like environment; and monitoring of traffic congestion, social activities, environmental pollution, water pollution, citizen compliance, waste management, intelligent transportation, and other activities and services in a smart-city-like environment [47][48][49][50].
In the convergence of IoE and DSA, several emerging technologies are progressively integrated. In our work, we use the data of everything for the Internet of Everything to analyze case-based problem scenarios and model them in the applications. If we examine the emerging technologies, we observe that several are at the innovation trigger stage. Several convergence technologies, such as data science and analytics, cloud, IoT, IoE, computational learning, and range-based natural language query, are at the peak of inflated expectations.

V. CONCLUSION AND FUTURE SCOPE
This discussion explored progressive data science and knowledge analytic operations. In this work, I discussed data science and knowledge analytic mechanisms for analyzing and exploring different operational tasks, such as data transformation and analysis, data mining, knowledge discovery, semantic knowledge exploration, structural analysis, and many more. Machine learning techniques are implemented in many areas of knowledge discovery and semantic knowledge analytics to explore application intelligence. The main aim is to obtain insights from large-scale IoE data that can be used to produce intelligence for an application. Future work includes further data science and knowledge analytic frameworks and applications with the potential to discover big value from disparate data sources irrespective of data scale, because managing and mining a large-scale disparate IoE database, along with its data science and knowledge analytic operations, is considerably more challenging.

ACKNOWLEDGMENT
The author thanks the Post Graduate Teaching & Research Dept. at the School of Computing, Debre Berhan University, Ethiopia, for supporting this research.