Adv DBs: Data warehouses and OLAP

Data warehouses let decision makers quickly locate the data they need in one place that spans multiple functional departments and is integrated well enough to support reports and in-depth analysis for effective decisions (MUSE, 2015a). Data can be stored in n-dimensional data cubes that can be dissected, filtered, and rolled up through a dynamic application called online analytical processing (OLAP). OLAP can be its own system or part of a data warehouse, and when it is combined with data mining tools it creates a decision support system (DSS) that uncovers hidden relationships within the data (MUSE, 2015b). A DSS needs both a place to store data and a way to analyze it.  The data warehouse on its own answers the “How?”, “What?”, “When?”, and “Where?” questions but not the “Why?” questions, and that is where OLAP helps.  We want to extract as much knowledge as possible from these systems for decision making, which is why a DSS needs both: together they address all of the questions rather than a subset.  As noted above, a DSS also needs data mining tools.

Data Warehouses

Discovering knowledge from archives of data drawn from multiple sources, in a consolidated and integrated way, is what a data warehouse does best.  Data warehouses are subject-oriented (organized around subjects such as customers, products, and sales rather than around applications such as invoicing or product sales), integrated (data comes from different sources across the enterprise, perhaps in different formats), time-variant (data is valid only with respect to some point or period in time), and nonvolatile (new data is appended rather than replacing old data).  Sources that feed the warehouse can include mainframes, proprietary file systems, servers, internal workstations, external website data, etc., all of which can be used for analysis and knowledge discovery in support of effective data-based decision making.  Detailed data can be kept online when it supports or supplements the summarized data; otherwise it is stored offline, so the online warehouse stays comparatively light because it holds mostly summarized data.  Summarized data, which is updated automatically as new data enters the warehouse, mainly helps improve query speed (Connolly & Begg, 2015).  Looking into the architecture of this system:

The operational data store (ODS) holds the data for staging into the data warehouse.  From staging, the load manager performs all extraction and loading functions that bring the data into the warehouse, while the warehouse manager performs all transformation functions on that data.  The query manager handles all queries against the data warehouse. Metadata (definitions of the stored data and its units) is used by all of these processes and is itself stored in the warehouse (Connolly & Begg, 2015).
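The division of labor among these components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `unit` field, and the function names `load_manager`, `warehouse_manager`, and `query_manager` are hypothetical stand-ins for the components named above, not the interface of any real warehouse product.

```python
# Hypothetical operational records staged in the ODS, in mixed formats.
ods_staging = [
    {"cust": "A17", "amt": "120.50", "unit": "USD"},
    {"cust": "B02", "amt": "9900", "unit": "cents"},
]

# Metadata: describes each stored field and its unit; shared by all managers.
metadata = {"customer_id": "string", "amount": "US dollars (float)"}


def load_manager(staging):
    """Extract rows from staging and load them for the warehouse."""
    return [dict(row) for row in staging]  # copy so the ODS is left untouched


def warehouse_manager(rows):
    """Transform rows into the single format the metadata describes."""
    cleaned = []
    for row in rows:
        amount = float(row["amt"])
        if row["unit"] == "cents":
            amount /= 100.0  # normalize every amount to dollars
        cleaned.append({"customer_id": row["cust"], "amount": amount})
    return cleaned


def query_manager(warehouse, customer_id):
    """Run a user query against the warehouse: total amount for one customer."""
    return sum(r["amount"] for r in warehouse if r["customer_id"] == customer_id)


warehouse = warehouse_manager(load_manager(ods_staging))
print(query_manager(warehouse, "B02"))  # 99.0
```

Keeping the three roles as separate functions mirrors the architecture: the ODS is never modified, and only transformed rows that match the metadata reach the query side.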

OLAP

What sets OLAP apart from other query systems is that it uses analytics to answer the “Why?” questions from data held in a dynamic, n-dimensional aggregate view.  OLAP goes beyond statistical analysis of aggregated data; it is closer to slicing and dicing combined with time series analysis and data modeling.  OLAP servers come in four main flavors: Multidimensional OLAP (MOLAP: uses multidimensional databases in which data is organized according to how it is used), Relational OLAP (ROLAP: works with relational DBMS products through a metadata layer), Hybrid OLAP (HOLAP: some data is held in ROLAP and the rest in MOLAP), and Desktop OLAP (DOLAP: usually for small file extracts, with data stored in client files on a laptop or desktop).
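To make the n-dimensional view concrete, here is a minimal sketch of a three-dimensional cube held in a plain Python dictionary, with slice, dice, and roll-up written as small functions. The dimensions, the sample figures, and the function names are all made up for illustration; a real OLAP server would do this with specialized storage and indexing.

```python
from collections import defaultdict

# Hypothetical cube: (year, region, product) -> units sold.
cube = {
    (2014, "East", "widget"): 10,
    (2014, "West", "widget"): 7,
    (2014, "East", "gadget"): 4,
    (2015, "East", "widget"): 12,
    (2015, "West", "gadget"): 9,
}
DIMS = ("year", "region", "product")


def slice_cube(cells, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}


def dice_cube(cells, **allowed):
    """Dice: keep cells whose coordinates fall in the allowed sets."""
    return {
        k: v for k, v in cells.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def roll_up(cells, keep):
    """Roll up: aggregate away every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for k, v in cells.items():
        totals[tuple(k[i] for i in idx)] += v
    return dict(totals)


print(slice_cube(cube, "year", 2015))
print(dice_cube(cube, region={"East"}, product={"widget", "gadget"}))
print(roll_up(cube, ["year"]))  # {(2014,): 21, (2015,): 21}
```

Drilling down is the reverse of rolling up: calling `roll_up(cube, ["year", "region"])` instead of `roll_up(cube, ["year"])` exposes one more level of detail.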

DSS, OLAP, and Data Warehouse Application

Consider a DSS for car insurance claims.  Insurance companies can use such a system to analyze people’s driving patterns, what damage can or cannot result from a given accident, and why someone might claim false damages to get a car fixed or to cash out.  Their systems can thus establish the who, what, when, where, why, and how of each accident and compare it against all the other accidents they have processed (slicing and dicing by state, type of accident, vehicle types involved, etc.) to help determine whether a claim is legitimate.
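As a rough sketch of what comparing one claim against all other processed claims could look like, the snippet below slices a made-up claim history by state and accident type and flags a new claim for human review when its amount sits far from comparable past payouts. The records, the field layout, the `needs_review` name, and the three-standard-deviation threshold are all assumptions chosen for illustration; an actual insurer would rely on far richer data and models.

```python
from statistics import mean, stdev

# Hypothetical processed claims: (state, accident_type, vehicle_type, payout).
history = [
    ("CO", "rear-end", "sedan", 2100.0),
    ("CO", "rear-end", "sedan", 2400.0),
    ("CO", "rear-end", "suv", 2600.0),
    ("CO", "rear-end", "sedan", 1900.0),
    ("CO", "side-impact", "sedan", 5200.0),
    ("TX", "rear-end", "sedan", 2300.0),
]


def comparable_payouts(claims, state, accident_type):
    """Slice and dice: payouts of past claims matching state and accident type."""
    return [p for s, a, _, p in claims if s == state and a == accident_type]


def needs_review(claims, state, accident_type, amount, threshold=3.0):
    """Flag a claim whose amount is far from comparable past claims."""
    peers = comparable_payouts(claims, state, accident_type)
    if len(peers) < 2:
        return True  # too little history to compare against
    spread = stdev(peers)
    if spread == 0:
        return amount != peers[0]
    # Distance from the peer average, measured in standard deviations.
    return abs(amount - mean(peers)) / spread > threshold


print(needs_review(history, "CO", "rear-end", 2500.0))  # False
print(needs_review(history, "CO", "rear-end", 9000.0))  # True
```

A flag here only means the claim looks unusual next to its peers; deciding whether it is legitimate still takes the who, what, when, where, why, and how of the accident itself.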

References:

Business Intelligence: OLAP

Within a Business Intelligence (BI) program, online analytical processing (OLAP) and customer relationship management (CRM) systems are both applications that have strategic uses for the company and depend on the data warehouse: they analyze the multidimensional datasets stored in it to provide data-driven answers to queries. Both require data analytics to turn all of that multidimensional data into insightful information. OLAP can present a multidimensional view of the warehouse’s data sets because the data is mapped onto n-dimensional data cubes, where it can easily be rolled up, drilled down, sliced and diced, and pivoted (Connolly & Begg, 2014). OLAP has many applications outside of customer relationships.  It is therefore more versatile than a CRM, which takes a more targeted approach: analysis of the customer’s relationship to the company or product.  A CRM’s main goal is to analyze internal and external data stored in the data warehouse to produce insights such as a customer’s “predicted affinity to buy”, the “cost or profit” of a customer, “prediction of future customer behavior”, etc. (Ahlemeyer-Stubbe & Shirley, 2014).  Knowing a customer’s affinity towards a product, employees can then sell similar items or items suggested by a market basket analysis.
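A market basket analysis can be sketched very simply: count how often items appear together and suggest the items most often bought alongside what the customer already has. The baskets, item names, and the 0.5 confidence cutoff below are invented for illustration, and the pairwise counting is a deliberately simplified stand-in for full association rule mining.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per transaction.
baskets = [
    {"tent", "stove", "lantern"},
    {"tent", "stove"},
    {"tent", "lantern"},
    {"stove", "fuel"},
    {"tent", "stove", "fuel"},
]

# How many baskets contain each item, and each unordered pair of items.
item_counts = Counter(item for b in baskets for item in b)
pair_counts = Counter(
    pair for b in baskets for pair in combinations(sorted(b), 2)
)


def confidence(bought, suggested):
    """Share of baskets containing `bought` that also contain `suggested`."""
    pair = tuple(sorted((bought, suggested)))
    return pair_counts[pair] / item_counts[bought]


def suggestions(bought, min_confidence=0.5):
    """Items to offer a customer who bought `bought`, best first."""
    others = (i for i in item_counts if i != bought)
    scored = [(confidence(bought, i), i) for i in others]
    return [i for c, i in sorted(scored, reverse=True) if c >= min_confidence]


print(confidence("tent", "stove"))  # 0.75
print(suggestions("fuel"))          # ['stove', 'tent']
```

In this toy data, three of the four baskets with a tent also contain a stove, so a sales rep looking at a tent buyer would see the stove as the first item to offer.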

OLAP is an application that allows people to examine data in real time from different points of view to help drive more data-driven decisions (McNurlin et al., 2008).  With OLAP, computers can perform what-if analysis and support goal-based decisions using data. The key ability of OLAP systems is to help answer the “Why?” question, as well as the typical “Who?” and “What?” questions (Connolly & Begg, 2014).  Connolly and Begg (2014) further explain that OLAP is a specialized implementation of SQL. Unfortunately, the data queried is assumed to be static and unchanging.  Hence, the low volatility of a data warehouse, together with multidimensional databases, is ideal for OLAP applications.  The value of the data warehouse does not come just from storing the right kind of data, but from conducting analysis on it to answer queries that ultimately help the company make the best data-driven decisions.  According to Connolly and Begg (2014), OLAP tools have been used to study the effectiveness of marketing campaigns, product sales forecasting, and capacity planning.  However, Connolly and Begg (2014) are of the opinion that data mining tools can surpass the capabilities of OLAP tools.

CRMs, on the other hand, cover a wide range of concepts revolving around how companies capture, store, and analyze customer, vendor, and partner relationship data. Information stored in a CRM could be interactions with customers, vendors, or partners, which allows the company to gain insights from previous interactions; these can even be grouped into customer segments, used in market basket analysis, etc. (Ahlemeyer-Stubbe & Shirley, 2014). CRMs can assist in real time with making data-driven decisions with respect to a company’s customers (McNurlin, Sprague, & Bui, 2008).  The goal is to use current data to help the company build better communications and relationships with its customers, vendors, or partners.  Both internal and external company data is usually added to the data warehouse for the CRM. Through the internet, companies can learn more about their customers and noncustomers, which helps a company become more customer-centric (McNurlin et al., 2008).  McNurlin et al. (2008) describe a case study in which Wachovia Bank purchased a pay-by-use CRM system from salesforce.com.  After the system was set up within six weeks, sales reps had 30 more hours to spend selling bank services, and managers could use the data collected by the CRM to tell the sales reps which customers would have the highest yield.

References: