Adv DBs: Data warehouses and OLAP

Data warehouses allow for people with decision power to locate the adequate data quickly from one location that spans across multiple functional departments and is very well integrated to produce reports and in-depth analysis to make effective decisions (MUSE, 2015a). Data could be stored in n-dimensional data cubes that can be dissected, filtered through, rolled up into a dynamic application called Online analytical processing (OLAP). OLAP can be its own system or part of a data warehouse, and if it’s combined with data mining tools it creates a decision support system (DSS) to uncover/discover hidden relationships within the data (MUSE, 2015b). DSS needs both a place to store data and a way to store data.  The data warehouse doesn’t solve they “Why?” questions, but the “How?, What?, When?, Where?” and that is where OLAP helps.  We want to extract as much knowledge as possible for decision making from these systems, hence this explains why we need both in DSS to solve all questions not just a subset.  But, as aforementioned that data mining tools are also needed for a DSS.

Data Warehouses

Discovering knowledge through archives of data from multiple sources in a consolidated and integrated way is what this warehouse does best.  They are subject-oriented (organized by customers, products, sales, and not in the invoice, product sales), integrated (data from different sources in the enterprise perhaps in different formats), time-variant (varies with respect to time), and nonvolatile (new data is appended not replacing old).  Suitable applications to feed data can be mainframes, proprietary file systems, servers, internal workstations, external website data, etc., which can be used for analysis and discovering knowledge for effective data-based decision making.  Detailed data can be stored online if it can help support/supplement summarized data, so data warehouses can be technically light due to summarized data.  Summarized data, which is updated automatically as new data enters the warehouse, mainly help improve query speeds. So, where is the detailed data: offline (Connolly & Begg, 2015).  Looking into the architecture of this system:

The ODS, Operational data store, holds the data for staging into the data warehouse.  From staging the load manager performs all Extraction and Loading functions to the data into the warehouse, meanwhile, the warehouse manager performs all the Transformation functions to the data into the warehouse.  The query manager performs all the queries into the data warehouse. Metadata (definition of the data stored and its units) are used by all these processes and exist in the warehouse as well (Connolly & Begg, 2015).

 OLAP

Using analytics to answer the “Why?” questions from data that is placed in an n-dimensional aggregate view of the data that is a dynamical system, sets this apart from other query systems.  OLAP is more complex than statistical analysis on aggregated data, it’s more of a slice and dice with time series and data modeling.  OLAP servers come in four main flavors: Multidimensional OLAP (MOLAP: uses multidimensional databases where data is stored per usage), Relational OLAP (ROLAP: supports relational DBMS products with a metadata layer), Hybrid OLAP (HOLAP: certain data is ROLAP and other is in MOLAP), and Desktop OLAP (DOLAP: usually for small file extracts, data is stored in client files/systems like a laptop/desktop).

DSS, OLAP, and Data Warehouse Application

Car insurance claims DSS.  Insurance companies can use this system to analyze patterns of driving from people, what damage can or cannot occur due to an accident, and why someone might claim false damages to fix their cars or cash out.  Thus, their systems can define who, what, when, where, why and how per accident against all other accidents (they can slice and dice by state, type of accident, vehicle types involved, etc) they have processed to help them resolve if a claim is legitimate.

References: