Friday, May 9, 2014

Enterprise Information Architecture



To understand the data landscape, we have to decompose it into parts that are simple to understand and work with. I classify it into layers. Each layer is defined either by the way its data is structured or by the processes used to manipulate the data in that layer.

Raw Data Sources Layer


These sources fall into three types:

1 – Databases of the business applications that run the daily business and capture all business transactions, such as CRM, Billing, and Ticketing. This type of data is always structured.

2 – Logs of monitoring systems, such as network traffic, call center, and system logs. This type of data is normally unstructured.

3 – External sources, such as social media and competition data.

Data Integration Layer


In this layer, raw data from the different sources is integrated and transformed into a unified logical data model. Data quality measures are applied here to keep the data accurate and consistent. The data integration layer is a processing-only layer; it stores no data.
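
As a rough illustration of this idea, the sketch below maps rows from two sources onto one unified record type and applies a simple quality rule. Everything in it is an assumption made for the example: the `UnifiedCustomer` model, the source field names, and the quality rule are hypothetical and not part of any specific product. The integration step is written as a generator to mirror the point that this layer processes data without storing it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class UnifiedCustomer:
    """Hypothetical unified logical model for a customer record."""
    customer_id: str
    name: str
    email: str
    source: str


def from_crm(row: dict) -> UnifiedCustomer:
    # Assumed CRM field names: "id", "full_name", "email".
    return UnifiedCustomer(str(row["id"]), row["full_name"].strip(),
                           row["email"].strip().lower(), "crm")


def from_billing(row: dict) -> UnifiedCustomer:
    # Assumed billing field names: "cust_no", "name", "mail".
    return UnifiedCustomer(str(row["cust_no"]), row["name"].strip(),
                           row["mail"].strip().lower(), "billing")


def is_valid(rec: UnifiedCustomer) -> bool:
    # Illustrative data quality rule: a mandatory key and a plausible e-mail.
    return bool(rec.customer_id) and "@" in rec.email


def integrate(crm_rows: Iterable[dict],
              billing_rows: Iterable[dict]) -> Iterator[UnifiedCustomer]:
    # Records are transformed and passed on one by one; nothing is kept here.
    for row in crm_rows:
        rec = from_crm(row)
        if is_valid(rec):
            yield rec
    for row in billing_rows:
        rec = from_billing(row)
        if is_valid(rec):
            yield rec
```

The output of `integrate` would be consumed by the stores in the next layer; records that fail the quality rule are simply dropped in this sketch, whereas a real pipeline would more likely route them to a rejection report.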

Core Data Layer


This is the most important layer and represents the corporate core data reservoir. It holds the data after it has been processed in the previous layer and makes it ready for subsequent computing.

This layer consists of three systems:



1 – Master Data Management (MDM)

This system holds the unified and integrated version of the main entities in the business. The customer profile is the first entity to go into this system. MDM consolidates the customer profile and information from the different systems that hold customer data (partially or in full), giving the decision maker a holistic view of the customer that is free of the constraints of any operational system. It provides a single, consistent version of customer data.

Product is the second entity that is normally included in the MDM.


2 – Big Data

This is a Hadoop platform used for massively parallel processing of structured and unstructured data.


3 – Enterprise Data Warehouse (EDW)

This system is the custodian of the unified logical data model, which unifies all data from the database sources and makes it ready for the subsequent layers.
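
To make the Master Data Management idea above more concrete, here is a minimal sketch of how partial customer records from several systems could be merged into one consolidated profile. The source names, the `SOURCE_PRIORITY` ordering, and the rule "highest-priority non-empty value wins" are assumptions chosen for the example; real MDM tools offer much richer matching and survivorship rules.

```python
from typing import Dict, Iterable

# Assumed source priority: earlier sources win when values conflict.
SOURCE_PRIORITY = ["crm", "billing", "ticketing"]


def consolidate(records: Iterable[dict]) -> Dict[str, dict]:
    """Merge partial customer records into one profile per customer_id.

    Each record is expected to carry "customer_id" and "source" keys;
    a source that is not listed in SOURCE_PRIORITY raises ValueError.
    """
    ordered = sorted(records, key=lambda r: SOURCE_PRIORITY.index(r["source"]))
    golden: Dict[str, dict] = {}
    for rec in ordered:
        profile = golden.setdefault(rec["customer_id"], {})
        for field, value in rec.items():
            if field in ("customer_id", "source"):
                continue
            # Keep the value from the highest-priority source and
            # fill remaining gaps from lower-priority sources.
            if value not in (None, "") and field not in profile:
                profile[field] = value
    return golden
```

For instance, if the CRM holds a customer's name and phone number while billing holds the address, the consolidated profile would contain all three attributes under a single customer key.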


Computational Layer



This layer is where intensive computing techniques are applied to the data held in the core data layer. It includes the following techniques:

1 – Multidimensional analysis, which is the basis for BI cubes

2 – Statistical Analysis

3 – Data Mining

4 – Real-Time Analytics (processing data streams as they are generated)
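
As an illustration of the first technique, multidimensional analysis boils down to aggregating a measure along a chosen set of dimensions. The sketch below shows that roll-up in plain Python; the `roll_up` helper and the sample `sales` facts are hypothetical and only meant to show the shape of the computation, not how a BI cube engine is actually implemented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def roll_up(facts: Iterable[dict], dimensions: Sequence[str],
            measure: str) -> Dict[Tuple, float]:
    """Sum a measure grouped by the chosen dimensions."""
    totals: Dict[Tuple, float] = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


# Made-up fact rows with three dimensions and one measure.
sales = [
    {"region": "North", "product": "Voice", "month": "2014-04", "revenue": 100.0},
    {"region": "North", "product": "Data", "month": "2014-04", "revenue": 250.0},
    {"region": "South", "product": "Data", "month": "2014-04", "revenue": 150.0},
    {"region": "North", "product": "Data", "month": "2014-05", "revenue": 300.0},
]

by_region = roll_up(sales, ["region"], "revenue")
by_region_product = roll_up(sales, ["region", "product"], "revenue")
```

Choosing a different list of dimensions gives a different view of the same facts, which is what a cube precomputes so that users can switch between views quickly.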

Presentation and Visualization Layer



This is the delivery layer, which presents the outcomes of all the preceding layers to the business stakeholders. It consists of the following:

1 – Reporting system, which presents detailed data reports to support daily business users.

2 – Analysis, which is typically BI cubes with advanced visualization tools.

3 – Dashboards, which help management review business performance quickly.

4 – Predictive models, which are the outcome of statistical analysis and data mining. They can be used by business stakeholders for planning purposes, or by other systems that apply these models in automated decision making.

Governance Layer


This layer spans the data across its entire lifecycle, whether it is in data stores or in processing layers. It controls how the data is accessed and stored, and also how and why it is processed.
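
One way to picture this control is as a policy check that every access request passes through, covering who is asking, which data set they want, and for what purpose. The sketch below is only an assumption of how such a check could look: the `Policy` class, the role names, and the data set names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Policy:
    """Hypothetical governance policy with a simple audit trail."""
    # Allowed (role, dataset, purpose) combinations.
    allowed: Set[Tuple[str, str, str]]
    # Every request is recorded, whether it was granted or not.
    audit_log: List[Tuple[str, str, str, str, bool]] = field(default_factory=list)

    def check(self, user: str, role: str, dataset: str, purpose: str) -> bool:
        granted = (role, dataset, purpose) in self.allowed
        self.audit_log.append((user, role, dataset, purpose, granted))
        return granted
```

Recording the purpose alongside the user and data set reflects the point that governance covers why data is processed, not only who can reach it.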