To understand the data landscape, we have to decompose it into parts that are simple to understand and work with. I have classified it into layers. Each layer describes either how the data is structured or the processes used to manipulate the data in that layer.
Raw Data Sources Layer
These sources fall into three categories:
1 – Databases of the business applications that
run the daily business and capture all business transactions, such as CRM, Billing,
and Ticketing. This type of data is
always structured.
2 – Logs of monitoring systems, such as network
traffic, call center, and system logs. This type of data is normally
unstructured.
3 – External sources, such as social media and competitor
data.
Data Integration Layer
In this layer, raw data from the different sources
is integrated and transformed into a unified logical data model. Data
quality measures are applied to the data in this layer to keep it accurate and
consistent. The data integration layer is a processing-only layer that stores
no data.
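
As a rough illustration of this layer, here is a minimal Python sketch that maps records from two source systems onto one unified model and applies simple quality rules. The source names, field names, and rules are all hypothetical and not taken from any particular product. It is written as a generator to reflect that the layer only processes data and stores nothing.

```python
from typing import Dict, Iterable, Iterator, Optional

# Hypothetical mappings from source-specific field names to the unified model.
FIELD_MAPS = {
    "crm": {"cust_id": "customer_id", "full_name": "name", "mail": "email"},
    "billing": {"customer_no": "customer_id", "holder": "name", "email_addr": "email"},
}


def to_unified(source: str, record: dict) -> dict:
    """Rename a source record's fields to the unified model's field names."""
    mapping = FIELD_MAPS[source]
    unified = {target: record.get(src) for src, target in mapping.items()}
    unified["source"] = source  # keep lineage so later layers know the origin
    return unified


def apply_quality_rules(record: dict) -> Optional[dict]:
    """Illustrative quality rules: trim text, require an id, normalize email."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if not cleaned.get("customer_id"):
        return None  # reject records that cannot be identified
    if cleaned.get("email"):
        cleaned["email"] = cleaned["email"].lower()
    return cleaned


def integrate(batches: Dict[str, Iterable[dict]]) -> Iterator[dict]:
    """Stream unified, cleaned records; nothing is persisted in this layer."""
    for source, records in batches.items():
        for record in records:
            cleaned = apply_quality_rules(to_unified(source, record))
            if cleaned is not None:
                yield cleaned
```

Calling `list(integrate({"crm": [...], "billing": [...]}))` would hand the unified records to whichever core data system comes next.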
Core Data Layer
This is the most important layer and
represents the corporate core data reservoir. It holds the data after
it has been processed in the previous layer and makes it ready for subsequent
computation.
This layer consists of three systems:
1 – Master Data Management (MDM)
This system holds the unified and
integrated version of the main business entities. The customer profile is usually the
first entity to be placed in this system. MDM can consolidate customer profiles and
information from the different systems that hold customer data (partially or
fully), giving decision makers a holistic view of the customer
that is free of any operational system's constraints. It provides a single,
consistent version of customer data.
Product is the second entity that is normally
included in the MDM.
2 – Big Data
This is a Hadoop platform used for
massively parallel processing of structured and unstructured data.
3 – Enterprise Data Warehouse (EDW)
This system is the custodian of the unified
logical data model that unifies all data from database sources and makes it
ready for the subsequent layers.
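
To make the MDM idea of a single customer view more concrete, here is a minimal Python sketch that merges partial customer records from several systems into one profile per customer. The source priority order and the field names are assumptions made for illustration only. Real MDM tools use far richer matching and survivorship rules.

```python
from typing import Dict, Iterable

# Hypothetical survivorship rule: when sources disagree, earlier sources win.
SOURCE_PRIORITY = ["crm", "billing", "ticketing"]


def consolidate(records: Iterable[dict]) -> Dict[str, dict]:
    """Merge partial customer records into one profile per customer_id."""
    rank = {source: i for i, source in enumerate(SOURCE_PRIORITY)}
    profiles: Dict[str, dict] = {}
    # Visit records from the most trusted source first.
    for rec in sorted(records, key=lambda r: rank[r["source"]]):
        profile = profiles.setdefault(rec["customer_id"], {})
        for field, value in rec.items():
            if field == "source" or value in (None, ""):
                continue  # skip lineage and empty values
            # Keep the first (highest-priority) value seen for each field.
            profile.setdefault(field, value)
    return profiles


records = [
    {"source": "billing", "customer_id": "C1", "name": "A. Lee", "address": "1 Main St"},
    {"source": "crm", "customer_id": "C1", "name": "Ann Lee", "email": "ann@example.com"},
]
golden = consolidate(records)
```

In this example the CRM name is kept because CRM ranks higher, while the address, which only Billing knows, fills the gap in the profile.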
Computational Layer
This layer is where intensive computing
techniques are applied to the data held in the core data layer. This includes
the following computing techniques:
1 – Multidimensional analysis, which is the
basis for BI cubes.
2 – Statistical Analysis
3 – Data Mining
4 – Real-Time Analytics (processing data
streams as they are generated)
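
As an illustration of the first technique, multidimensional analysis, here is a minimal Python sketch of a roll-up, the basic operation behind a BI cube: a measure is summed over every combination of the chosen dimensions. The fact rows and the dimension names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def roll_up(facts: Iterable[dict], dimensions: List[str], measure: str) -> Dict[Tuple, float]:
    """Sum a measure for each combination of the given dimensions."""
    totals: Dict[Tuple, float] = defaultdict(float)
    for row in facts:
        cell = tuple(row[d] for d in dimensions)  # one cell of the cube
        totals[cell] += row[measure]
    return dict(totals)


# Hypothetical fact rows, as they might come from the EDW.
facts = [
    {"region": "North", "product": "Voice", "month": "Jan", "revenue": 100.0},
    {"region": "North", "product": "Data", "month": "Jan", "revenue": 50.0},
    {"region": "South", "product": "Voice", "month": "Jan", "revenue": 70.0},
]

by_region = roll_up(facts, ["region"], "revenue")
by_region_product = roll_up(facts, ["region", "product"], "revenue")
```

Choosing fewer dimensions gives a coarser summary (roll-up), and choosing more gives a finer one (drill-down).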
Presentation and Visualization Layer
This is the delivery layer, which presents
the outcomes of all the preceding layers to business stakeholders. It
consists of the following:
1 – Reporting system, which presents
detailed data reports to support daily business users.
2 – Analysis, which typically consists of BI cubes
with advanced visualization tools.
3 – Dashboards, which help management
review business performance quickly.
4 – Predictive models, which are the outcome
of statistical analysis and data mining. They can be used by business
stakeholders for planning purposes or by other systems that apply them in
automated decision making.
Governance Layer
This layer spans the entire data lifecycle,
whether the data is in data stores or in processing layers. It controls how the data
is accessed and stored, as well as how and why it is processed.
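
One possible way to picture this control is a small policy check that looks at who is touching the data, in which layer, with what action, and for what purpose. The Python sketch below is illustrative only. The roles, layer names, and purposes are hypothetical, and real governance also covers areas such as retention, lineage, and auditing.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Policy:
    layer: str                 # where the data lives or is processed
    role: str                  # who is asking
    actions: FrozenSet[str]    # how the data may be touched
    purposes: FrozenSet[str]   # why it may be touched


# Hypothetical policies; a real catalogue would be far larger.
POLICIES = [
    Policy("core", "analyst", frozenset({"read"}), frozenset({"reporting", "analysis"})),
    Policy("core", "etl_service", frozenset({"read", "write"}), frozenset({"integration"})),
]


def is_allowed(role: str, layer: str, action: str, purpose: str,
               policies: Iterable[Policy] = POLICIES) -> bool:
    """Allow a request only if some policy covers its role, layer, action and purpose."""
    return any(
        p.role == role and p.layer == layer
        and action in p.actions and purpose in p.purposes
        for p in policies
    )
```

With these policies, an analyst may read core data for reporting, but may not write to it or read it for an undeclared purpose.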