a. Operational Data (OLTP)
- Data that works
- Frequently updated and queried
- Normalized for efficient search and update
- Fragmented, with only local relevance.
- Point queries: each query accesses individual tables.
What is the salary of John?
What is the phone number of the person who is in charge of Dept. A?
How many people are rated as excellent?
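The questions above can be sketched as point queries against a single table. This is only an illustration using Python's built-in sqlite3 module; the employee table, its columns, and the sample rows are assumptions, not part of the notes.

```python
import sqlite3

# Hypothetical operational table; all names and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (name TEXT PRIMARY KEY, salary INTEGER, rating TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("John", 52000, "excellent"),
        ("Mary", 61000, "good"),
        ("Ann", 58000, "excellent"),
    ],
)

# "What is the salary of John?" touches one row of one table.
john_salary = conn.execute(
    "SELECT salary FROM employee WHERE name = ?", ("John",)
).fetchone()[0]

# "How many people are rated as excellent?" is a small count over one table.
excellent_count = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE rating = ?", ("excellent",)
).fetchone()[0]

print(john_salary, excellent_count)
```

Each query reads a small, well-defined part of one table, which is the access pattern a normalized operational schema is built for.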
b. Historical Data (OLAP)
- Data that tells
- Very infrequently updated
- Integrated data set with global relevance.
- Analytical queries that require huge amounts of aggregation.
- Performance is an issue: queries need a quick response time.
What has the trend been over the past 2 years?
What is the summary of something?
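A trend question like the one above usually becomes an aggregation over many historical rows. The following is a minimal sketch with Python's sqlite3 module; the sales table, its columns, and the figures are hypothetical.

```python
import sqlite3

# Hypothetical historical table; the years and amounts are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2022, "east", 100),
        (2022, "west", 150),
        (2023, "east", 120),
        (2023, "west", 180),
    ],
)

# Analytical query: scan all rows and aggregate them per year.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
).fetchall()

print(yearly_totals)  # [(2022, 250), (2023, 300)]
```

Unlike a point query, this one has to read every row in the table, which is why response time becomes a concern as the history grows.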
2. What is Data Warehousing?
- An infrastructure to manage historical data
- Designed to support OLAP queries that make heavy use of aggregation
- Post-retrieval processing (reporting)
4. Data Marts:
- Segments of the warehouse (OLAP) data, each covering one subject area
- A data warehouse is a collection of data marts
a. Dirty Data
- Lack of standardization
- Missing or duplicate data
- Cleaning cannot be fully automated
- Requires considerable knowledge of the data
- Data analysis
- Definition of transformation rules
- Rule verification
- Backflow: re-populate the sources with the cleaned data
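As a rough sketch of the points above, the Python below defines one transformation rule, applies it to a few dirty records, removes the resulting duplicates, and flags what still needs manual review. The records, field names, and the rule itself are assumptions made for illustration.

```python
# Hypothetical dirty records: inconsistent formats, a duplicate, a missing value.
raw_records = [
    {"name": "John Smith", "phone": "(555) 123-4567"},
    {"name": "john smith ", "phone": "555-123-4567"},
    {"name": "Mary Jones", "phone": None},
]


def standardize(record):
    """Transformation rule: trim and title-case names, keep phone digits only."""
    phone = record["phone"]
    digits = "".join(ch for ch in phone if ch.isdigit()) if phone else None
    return {"name": record["name"].strip().title(), "phone": digits}


def clean(records):
    """Apply the rule, then drop exact duplicates while keeping the order."""
    seen = set()
    cleaned_records = []
    for record in map(standardize, records):
        key = (record["name"], record["phone"])
        if key not in seen:
            seen.add(key)
            cleaned_records.append(record)
    return cleaned_records


cleaned = clean(raw_records)

# Rule verification: records the rule could not fix are left for a person.
needs_review = [r for r in cleaned if r["phone"] is None]

print(cleaned)
print(needs_review)
```

The record with the missing phone number survives the rule unchanged, which illustrates why cleaning cannot be fully automated: someone who knows the data has to decide what to do with it.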
Schema: forming an integrated schema structure from different data sources (DS), and cleaning and managing the data that comes from them.
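One way to picture schema integration is a mapping function per source that converts its rows into a shared target structure. The sketch below is in Python; the two sources, their field names, and the monthly-to-annual conversion are all hypothetical.

```python
# Two hypothetical sources with different field names and units.
hr_rows = [{"emp_name": "John", "monthly_pay": 4000}]
payroll_rows = [{"full_name": "Mary", "annual_salary": 60000}]


def from_hr(row):
    """Map an HR row to the integrated schema (monthly pay becomes annual)."""
    return {
        "name": row["emp_name"],
        "annual_salary": row["monthly_pay"] * 12,
        "source": "hr",
    }


def from_payroll(row):
    """Map a payroll row to the integrated schema (already annual)."""
    return {
        "name": row["full_name"],
        "annual_salary": row["annual_salary"],
        "source": "payroll",
    }


# Integrated data set: every row now has the same fields and the same units.
integrated = [from_hr(r) for r in hr_rows] + [from_payroll(r) for r in payroll_rows]

print(integrated)
```

Keeping a source field on each row is a design choice assumed here; it makes it possible to trace an integrated record back to the data source it came from.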