History of Data Warehousing
- Data warehouses extend the transformation of data into information.
- In the 1990s, executives became less concerned with day-to-day business operations and more concerned with overall business functions.
- Data warehouses provided the ability to support decision making without disrupting day-to-day operations.
Data Warehouse Fundamentals
# Data warehouse
= A logical collection of information
gathered from many different operational databases that
supports business analysis activities and decision-making tasks.
- Primary purpose: to aggregate information throughout an organization into a single repository for decision-making purposes.
# Database vs. Data Warehouse
~ Data warehouse
- Stores information from multiple databases or applications, plus external information such as industry information.
- Enables cross-functional analysis, industry analysis, and market analysis, all from a single repository.
- Supports online analytical processing (OLAP).
~ Database
- Stores information for a single application.
# Extraction, transformation and loading (ETL)
= A process that extracts information
from internal and external databases, transforms the information using a
common set of enterprise definitions and loads the information into a
data warehouse.
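The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources (`sales_db`, `external_feed`), their field names, the `sales_fact` table, and the chosen "enterprise definition" (upper-case customer name, amount in dollars, ISO date) are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source data; names and values are invented for illustration.
sales_db = [
    {"cust": "ACME corp", "amount_usd": "1200.50", "date": "2024-01-15"},
    {"cust": "Globex", "amount_usd": "300", "date": "2024-01-16"},
]
external_feed = [
    {"customer_name": "acme corp", "total_cents": 45000, "day": "2024-01-17"},
]

def extract():
    """Extract: pull raw rows from each source, tagged with their origin."""
    for row in sales_db:
        yield "sales_db", row
    for row in external_feed:
        yield "external_feed", row

def transform(source, row):
    """Transform: map each source's layout onto one common definition."""
    if source == "sales_db":
        customer, amount, date = row["cust"], float(row["amount_usd"]), row["date"]
    else:
        customer, amount, date = row["customer_name"], row["total_cents"] / 100, row["day"]
    return customer.strip().upper(), amount, date

def load(conn, rows):
    """Load: write the standardized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(customer TEXT, amount_usd REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()

def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    load(conn, [transform(src, row) for src, row in extract()])
    return conn

warehouse = run_etl()
totals = warehouse.execute(
    "SELECT customer, SUM(amount_usd) FROM sales_fact "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Because both sources are mapped to the same customer spelling and currency unit before loading, the final query can total sales per customer across systems.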
# Data mart
= Contains a subset of data warehouse information.
Multidimensional Analysis and Data Mining
- Databases contain information in a series of two-dimensional tables.
- In a data warehouse or data mart, information is multidimensional: it contains layers of columns and rows.
# Dimension
= A particular attribute of information (for example, products, promotions, stores, category, region, stock price, date, time, and weather).
# Cube
= Common term for the representation of multidimensional information.
- Cube A represents store information (the layers), product information (the rows), and promotion information (the columns)
- Cube B represents a slice of information displaying promotion II for all products at all stores.
- Cube C represents a slice of information displaying promotion III for product B at store 2.
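The cube and its slices can be mimicked with nested dictionaries. The sketch below assumes three stores, three products, and three promotions, and fills the cells with made-up numbers; the names `build_cube` and `slice_by_promotion` are hypothetical helpers, not part of any OLAP product.

```python
STORES = ["Store 1", "Store 2", "Store 3"]            # layers
PRODUCTS = ["Product A", "Product B", "Product C"]    # rows
PROMOTIONS = ["Promo I", "Promo II", "Promo III"]     # columns

def build_cube():
    """Cube A: cube[store][product][promotion] -> units sold (invented values)."""
    return {
        store: {
            product: {
                promo: (s + 1) * 100 + (p + 1) * 10 + (k + 1)
                for k, promo in enumerate(PROMOTIONS)
            }
            for p, product in enumerate(PRODUCTS)
        }
        for s, store in enumerate(STORES)
    }

def slice_by_promotion(cube, promo):
    """Cube B style slice: one promotion, all products, all stores."""
    return {
        store: {product: cells[promo] for product, cells in products.items()}
        for store, products in cube.items()
    }

cube_a = build_cube()
cube_b = slice_by_promotion(cube_a, "Promo II")
# Cube C style slice: a single cell (one promotion, one product, one store).
cube_c = cube_a["Store 2"]["Product B"]["Promo III"]

print(cube_b["Store 1"])
print(cube_c)
```

Fixing one dimension yields a two-dimensional table (Cube B); fixing all three yields a single value (Cube C).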
# Data mining
= The process of analyzing data to extract information not offered by the raw data alone.
- To perform data mining, users need data-mining tools.
# Data-mining tool
= Uses a variety of techniques to find
patterns and relationships in large volumes of information and infers
rules that predict future behavior and guide decision making.
* Examples - query tools, reporting tools, multidimensional analysis tools, statistical tools, and intelligent agents.
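As one small illustration of finding relationships and inferring rules, the sketch below counts which items appear together in purchase baskets and keeps "if X then Y" rules whose confidence passes a threshold. The technique (simple pairwise co-occurrence), the basket data, and the `pair_rules` function are assumptions chosen for the example; the notes above do not prescribe any particular method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def pair_rules(baskets, min_confidence=0.6):
    """Return (lhs, rhs, confidence) rules: 'buyers of lhs also buy rhs'."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), together in pair_counts.items():
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence = share of baskets containing lhs that also contain rhs.
            confidence = together / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, round(confidence, 2)))
    return sorted(rules)

rules = pair_rules(baskets)
for lhs, rhs, confidence in rules:
    print(f"{lhs} -> {rhs} (confidence {confidence})")
```

The raw baskets never state that butter buyers also buy bread; that rule only emerges from analyzing the data, which is the point of data mining.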
Information Cleansing or Scrubbing
- A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
- Contact information in an operational system.
- Standardizing customer names from operational systems.
* Allows an organization to fix these types of inconsistencies and clean the data in the data warehouse.
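A cleansing pass of this kind might look like the following sketch. The record layout, the rule that a valid phone number has exactly ten digits, and the helper names are assumptions made for the example, not rules taken from any specific system.

```python
import re

# Hypothetical contact records as they might arrive from operational systems.
records = [
    {"name": "  JOHN  smith ", "phone": "(555) 123-4567"},
    {"name": "John Smith", "phone": "555.123.4567"},
    {"name": "mary jones", "phone": ""},
    {"name": "", "phone": "555-000-1111"},
]

def clean_name(name):
    """Fix inconsistent spacing and capitalization."""
    return " ".join(name.split()).title()

def clean_phone(phone):
    """Keep digits only; treat anything other than 10 digits as invalid (assumed rule)."""
    digits = re.sub(r"\D", "", phone)
    return digits if len(digits) == 10 else None

def cleanse(records):
    """Standardize records, discard incomplete ones, and merge exact duplicates."""
    cleaned, rejected, seen = [], [], set()
    for record in records:
        name = clean_name(record["name"])
        phone = clean_phone(record["phone"])
        if not name or phone is None:
            rejected.append(record)      # incomplete or incorrect: discard
            continue
        if (name, phone) in seen:
            continue                     # same customer after standardizing
        seen.add((name, phone))
        cleaned.append({"name": name, "phone": phone})
    return cleaned, rejected

cleaned, rejected = cleanse(records)
print(cleaned)
print(len(rejected), "records rejected")
```

The first two records describe the same customer in different formats; once standardized they collapse into one row, while the two incomplete records are set aside.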
Business Intelligence (BI)
- Information that people use to support their decision-making efforts.
- Principal BI enablers include:
* Technology
* People
* Culture