Archive
2014 in review
The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.
Here's an excerpt:
The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 330,000 times in 2014. If it were an exhibit at the Louvre Museum, it would take about 14 days for that many people to see it.
What is an ODS (Operational Data Store) and how it differs from a Data Warehouse (DW)
I see a lot of people discussing ODS and citing their own definitions and ideas about it. Some people also use the name as a synonym for a Data Warehouse or Factory Database. This makes it difficult at times to explain or justify your design to people while you are architecting a DW/BI solution.
So, I thought I would take some time to explain what an ODS actually is.
Simple definition: An Operational Data Store (ODS) is a module in the Data Warehouse that contains the latest snapshot of Operational Data. It is designed to contain atomic or low-level data with limited history for “Real Time” or “Near Real Time” (NRT) reporting on a frequent basis.
Detailed definition:
– An ODS is basically a database that serves as an interim area for a data warehouse (DW); it sits between the legacy systems environment and the DW.
– It works with a Data Warehouse (DW), but unlike a DW, an ODS does not contain static data. Instead, an ODS contains data that is dynamically and constantly updated in the course of business actions and operations.
– It is specifically designed to quickly perform simple queries on small sets of data.
– This is in contrast to a DW, which is structured to perform complex queries on large sets of data.
– As data ages in the ODS, it passes out of the ODS and into the DW environment as it is.
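The points above can be sketched in a few lines of Python. This is only a minimal illustration, assuming an in-memory SQLite database and hypothetical table names (`ods_orders`, `dw_orders`), a hypothetical integer `updated_day` column, and a hypothetical `keep_days` retention setting; none of these come from a real ODS product. It shows the ODS row being overwritten so it always holds the latest snapshot, and aged rows being moved to the DW unchanged.

```python
import sqlite3

def make_store():
    """Create hypothetical ODS and DW tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ods_orders (order_id INTEGER PRIMARY KEY, status TEXT, updated_day INTEGER)"
    )
    conn.execute(
        "CREATE TABLE dw_orders (order_id INTEGER, status TEXT, updated_day INTEGER)"
    )
    return conn

def apply_change(conn, order_id, status, day):
    """Overwrite the ODS row so it always holds the latest snapshot of the order."""
    conn.execute(
        "INSERT OR REPLACE INTO ods_orders VALUES (?, ?, ?)", (order_id, status, day)
    )

def age_out(conn, today, keep_days):
    """Move rows older than keep_days from the ODS to the DW, unchanged."""
    cutoff = today - keep_days
    conn.execute(
        "INSERT INTO dw_orders SELECT * FROM ods_orders WHERE updated_day < ?", (cutoff,)
    )
    conn.execute("DELETE FROM ods_orders WHERE updated_day < ?", (cutoff,))
    conn.commit()

conn = make_store()
apply_change(conn, 1, "NEW", 1)
apply_change(conn, 1, "SHIPPED", 2)   # same order, updated in place
apply_change(conn, 2, "NEW", 10)
age_out(conn, today=10, keep_days=5)  # cutoff is day 5, so order 1 ages out

ods_rows = conn.execute("SELECT * FROM ods_orders ORDER BY order_id").fetchall()
dw_rows = conn.execute("SELECT * FROM dw_orders ORDER BY order_id").fetchall()
```

After this runs, the ODS holds only the recent order, while the aged order sits in the DW with the last status it had in the ODS. A real DW load would usually append history rather than a single row per order; that part is left out to keep the sketch short.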
–> Where does an ODS fit in a DW/BI Architecture?
–> Classes of ODS (Types):
Bill Inmon defines 5 classes of ODS, shown in the image below:
– A Class-1 ODS simply involves direct replication of operational data (without transformations), which is very quick.
– A Class-5 ODS, by contrast, involves heavy integration and aggregation of data (highly transformed), which is a very time-consuming process.
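To make the two ends of this range concrete, here is a rough Python sketch. The source rows, field names, and the functions `load_class_1` and `load_class_5` are all hypothetical and only illustrate the idea; they are not Inmon's definitions of the classes. The first function copies operational rows as they are, while the second integrates two sources onto a common key and unit and then aggregates them.

```python
from collections import defaultdict

# Hypothetical operational rows from two source systems.
crm_rows = [{"cust": "C1", "amount": 100.0}, {"cust": "C2", "amount": 50.0}]
billing_rows = [{"customer_id": "c1", "amt_cents": 2500}]

def load_class_1(rows):
    """Class-1 style: copy operational rows as they are, with no transformation."""
    return [dict(r) for r in rows]

def load_class_5(crm, billing):
    """Class-5 style: integrate both sources onto one key and unit, then aggregate."""
    totals = defaultdict(float)
    for r in crm:
        totals[r["cust"].upper()] += r["amount"]
    for r in billing:
        # Normalize the key casing and convert cents to the CRM's currency unit.
        totals[r["customer_id"].upper()] += r["amt_cents"] / 100
    return dict(totals)

class_1 = load_class_1(crm_rows)
class_5 = load_class_5(crm_rows, billing_rows)
```

The Class-1 style load is a plain copy, which is why it can be very quick. The Class-5 style load has to reconcile keys and units across systems before it can aggregate, and that extra work is where the time goes.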