Analytic Vision

Resources

Business Intelligence & EPM Terminology

Jump To: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

 

A
Agent
This is a rules-driven, automated application that searches the data and sends an alert when a particular pattern is found.

Aggregations
In data warehousing, aggregations refer to summary levels of data. While this can be done to save storage space, it is typically done to improve performance on data access for end users. In a relational data warehouse there may be several levels of aggregated fact tables; however, maintenance concerns should limit a database architect from overdoing this.

Multidimensional OLAP databases are very good at handling a large variety of aggregated hierarchies. For this reason it is becoming more common to see hybrid solutions implemented to leverage most aggregations in a multidimensional database, with the lowest levels of detail left for drill-thru into relational fact tables.

Alert
This is a message that is sent automatically by a computer system when identified thresholds or events occur. The monitored data and events can be inside or outside of the company. Internal key performance metrics can be monitored for variance to budget, while outside events such as the posted price of key raw materials or news events on big clients can also trigger an alert. Alerts can typically be delivered in a variety of forms such as e-mail or pager, or can even update a database.

Analysis Services
This is the OLAP database and data mining product provided by Microsoft that comes as a component in SQL Server 2000. This OLAP database has the noted flexibility of be able to develop data marts as Relational OLAP, Multidimensional OLAP, or the combination Hybrid OLAP. This allows designers to take advantage of pre-summarized aggregations, or run in the dynamic ROLAP mode to provide near real-time data warehousing information.

Attribute
Additional information included in a dimension table that is not used in defining the hierarchical levels of the dimension.

 

B
Business Intelligence Tools
This term represents a variety of software products that provide easy-to-use reporting and analysis capabilities for business users. While there are several variations of these tools, they can be categorized as follows:

 

  • Enterprise Reporting: This is a web-based reporting solution that generates scheduled and on-demand static reports and queries that can be viewed online or automatically e-mailed to subscribed users, and can be saved in several formats, such as HTML, PDF, or Excel.
  • Online Analytical Processing OLAP: This is a web-based analytical tool that allows user to interact with the data by double-clicking and drilling down on dimensions and performing a variety of other analysis.
  • Data Mining Tools: This type of tool allows a computer to run batch jobs to perform advanced analysis algorithms to identify correlations, patterns, and trends in large data sets. Typical examples of use are grocery store basket analysis and marketing campaign customer grouping.

C

Calculated Member
This refers to computed metrics built into OLAP databases such as Microsoft Analysis Services or Hyperion Essbase. These formulas can involve components from different dimensions and are much more powerful then traditional relational database SQL calculations. While Hyperion Essbase offers a large number of pre-built functions, Analysis Services offers a scripting language called MDX that allows developers to build its functions.

Cell
A cell is a single point in a cube, which is an intersection of a single member from every dimension.

Cellset
A set of data returned from a cube with an MDX query. A cellset in the multidimensional world is equivalent to a rowset in the relational database world.

Changed Data Capture
In a database replication, changed data capture occurs when only the data that has changed since the previous replication is copied.

Changing Dimensions
This refers to anytime a level or attribute of a dimension needs to be updated. This is one of the most challenging design issues for star schema data modeling. This issue is often referred to as the handling of Slowly Changing Dimensions. There are three ways outlined by Ralph Kimball (and others) to handle this situation:

 

  • Type 1: Change the data in the dimension table record. This approach replaces the old value with the new. Although the easiest to implement, it effectively eliminates any history of the old values existence.
  • Type 2: Create new records in the dimension table with the new information. This allows for the tracking of the old and new versions of the given dimension record; however, having multiple occurrences of a given dimension member, such as a customer, can cause some relational query confusion for end users.
  • Type 3: Design extra columns in the dimension record to hold values before and after the change. While this identifies the prior value, it will not hold long-term history.

Various authors (including Ralph Kimball) also suggest the splitting of dimensions into separate dimensions when a consolidated dimension would result in a high level of dimension change. The design decision should be based on end-user needs in regards to the changing dimension elements.

 

Clickstream Data
This is data that is generated by all visitors to a web site and is primarily captured in log files. Web servers capture a large amount of data in the process of receiving requests for web pages. This data includes page served, time, source of the request, type of browser making request, etc. When analyzed, this data provides information about the behavior of individuals who are browsing the Internet. It can help businesses analyze where visitors are coming from, what type of visitors are most likely to buy certain products, what type of web pages are most attractive, etc. This information is essential for analyzing the effectiveness of Internet ad campaigns and, in general, for finding ways to improve the effectiveness of Internet commerce.

Conformed Dimension
A dimension that is attached to more than one fact table. This is the key to Ralph Kimball’s star schema design bus architecture. This ensures that there is only one version of a given dimension, such as a product dimension, and it will link to any fact tale that requires product information. This is how integrated fact tables can share metrics from each other and present one version of truth to the user across different subject areas of the business, thus creating an enterprise data warehouse.

Cube
This is the popular slang term for a Multidimensional or Hyper Cube. While most OLAP databases typically have far more dimensions then the three sides of a cube, the three-dimensional concept of a cube makes it easy to explain the intersections of various dimensions levels when first training users of an OLAP database.

 

D
Data Cleansing
There are often errors and inconsistencies in source data being brought into the data warehouse. The process of transforming the data into a complete and consistent set of information is often referred to as Data Cleansing.

Data Mart
A data mart is a database that has the same design characteristics as a data warehouse, but is focused on a single subject area of data within an enterprise. While there are varying definitions on how data marts fit into a data warehousing solution, we prefer the following:
The data warehouse gathers all the information from the various legacy systems. Specialized data marts are then created with a subset of the information in the data warehouse. These data marts are easier to use because they have only the particular information the specific user group needs. The use of several data marts also allows the querying load to be spread among several different computers. This can reduce network traffic. It is common for these data marts to be developed in OLAP cubing products such as Hyperion Essbase and Microsoft Analysis Services.

Data Migration
The movement of data from one environment to another. This happens when data is extracted from a legacy system, and is transformed and loaded into a data warehouse.

Data Mining
This is a business intelligence process that allows a computer to run batch jobs to perform advanced analysis algorithms to identify correlations, patterns, and trends in large data sets. Typical examples of use are grocery store basket analysis and marketing campaign customer grouping.

Analyzing data involves the recognition of significant patterns. Human analysts can see patterns in small data sets. Specialized data mining tools are able to find patterns in large amounts of data. These tools are also able to analyze significant relationships that exist only when several dimensions are viewed at the same time.

Data Quality Assurance
The process of checking the quality of the data being imported into the data warehouse.

Data quality assurance is one of the greatest challenges in the process of data warehousing. If the data-based knowledge generated by the data warehouse is to be trusted, the data entered into the warehouse must be complete and accurate. Data quality can be a challenge for several reasons: The data is being consolidated from a variety of legacy sources that may have differing key structures, user terminology, or un-edited values. There also might be times when all the data is not received from one of the legacy systems. This could make comparisons between time periods invalid. A significant portion of time in the development process should be set aside for setting up the data quality assurance process and implementing whatever data cleansing is needed. In a production environment, there should be a data quality report generated after each data warehouse load. There should be provision for rolling back the load (or reloading) if data quality testing indicates that the data is unacceptable.

Data Scrubbing
There are often errors and inconsistencies in source data being brought into the data warehouse. The process of transforming the data into a complete and consistent set of information is often referred to as Data Scrubbing or Data Cleansing.

Data Transformation
The modification of data as it is moved into the data warehouse. This modification process includes a variety of tasks such as Data Cleansing, denormalization of dimensions, processing calculated fields, changing data types for consistency, replacing codes with values, aggregations, etc.

Data Warehouse
A data warehouse is a centralized data repository that collects, organizes, integrates, and makes data available for the purpose of reporting and analysis – to give management the ability to access and analyze information about its business. The data is typically stored in a dimensional data model design that is optimized for ease of use and performance with business intelligence reporting and analysis tools.

Data Warehousing
The process of planning, gathering requirements, designing, developing, testing, deploying, using, managing, auditing, maintaining, and enhancing data warehouses and/or data marts.

Decision Support System (DSS)
A computer system designed to assist an organization in making decisions. The Decision Support Systems and Enterprise Information Systems of the 1980s and early 1990s were forerunners of today's Business Intelligence Tools.

Density or Dense
A key aspect of designing and optimizing multidimensional database cubes is to understand which dimensions are Dense. A Dense dimension is likely to have a high percentage of its members populated, for every row of source data loaded. Typical dense dimensions are time, scenario (actual vs. budget), and accounts.

Dimension – OLAP
A Dimension is one of the analytical perspectives that can be used to analyze the data in an OLAP cube; common examples include products, customers, and time. Dimensions contain one or more hierarchies, which have levels for drilling up and drilling down in the cube.

Dimension – Relational Table
A Dimension relational table holds the analytical perspectives used to analyze data in Fact tables; common examples include products, customers, and time. Dimension tables have a primary surrogate key that is used to join them to the fact table, and hold all attributes that describe the given dimension. There typically are more attributes in the relational dimension table then are built into the corresponding OLAP dimension.
The dimension tables in a star schema are intentionally de-normalized. The level fields and the attribute fields contain data that is duplicated in many of the records. This normally does not add significantly to the amount of storage space needed in the database, because the overall size of each dimension table is very small when compared to the size of the fact table.

Dimensionalization
The process of transforming data into a de-normalized dimension table in a star schema design.

Drill Down and Drill Up
The ability to move between levels of the hierarchy when viewing data with an OLAP browser.

  • Drill Down: Changing the view of the data to a greater level of detail.
  • Drill Up: Changing the view of the data to a higher level of aggregation.

Multidimensional analysis (OLAP) tools organize the data in two primary ways: in multiple dimensions and in hierarchies. Drilling down and drilling up allow an analyst to move down and up the hierarchies to see how the information at the various levels is related. After looking at the sales totals for a store's departments, the analyst may want to drill down to see the individual sales for each employee in one of the departments. Then the analyst may choose to drill up to view how this store's total sales compare to other stores in the same region.

DTS (Data Transformation Services)
An ETL tool provided as a part of Microsoft SQL Server. It provides a design environment for creating data transformation applications.

 

E
Enterprise Information System/Executive Information System (EIS)
A computer system designed to assist an organization in making decisions. The Decision Support Systems and Enterprise Information Systems of the 1980s and early 1990s were forerunners of today's Business Intelligence Tools.

ETL (Extract, Transform, and Load)
ETL refers to the process of getting data out of an operational data store (Extract), modifying it to be accurate, complete, and consistent (Transform), and inserting it into the data warehouse or data mart (Load).

 

F
Fact Table
In a star schema, the central table that contains the individual metrics for the given subject area is called the Fact table. The Fact table for a sales star schema is typically loaded from the invoice transactions. There are two types of fields in a fact table: 1) The surrogate foreign keys that join the fact table to the associated dimension tables in the star schema, and 2) the metrics that describe and value the given subject area of the business.

The granularity of the fact table is one of the most significant design decisions in creating a data warehouse. The fact table should be designed at the lowest level that users require to see the data. One of the design challenges is that every metric must have a value in all associated dimension tables in the star schema; this means that the grain must be consistent within the fact table.

 

G
Granularity
The level of detail of the facts stored in a data warehouse. This is the lowest level of detail that users will be able to access in the data warehouse.

 

H
Hierarchy
Organization of data into a logical tree structure. Dimensions can have one or more hierarchies. A Time dimension, for example, could have a Calendar hierarchy and a Fiscal hierarchy. Hierarchies contain levels, which organize data into a logical structure. Moving between the levels of a hierarchy is called drilling up and drilling down.

Hybrid OLAP (HOLAP)
A combined use of Relational OLAP (ROLAP) and Multidimensional OLAP (MOLAP). In HOLAP, the source data is usually stored using a ROLAP strategy and aggregations are stored using a MOLAP strategy. This combination usually results in the least amount of storage space and the fastest cube processing.

Hyper-Cube
A cube with more than three dimensions; also known as Cube and Multidimensional Cube. A cube is an object with three dimensions. A hyper-cube is a cube-like structure with more than three dimensions. In the world of OLAP, hyper-cubes are nearly always simply referred to as cubes.

 

I

 

J

 

K

 

L
Legacy System
An operational computer system that has been around for a long time. Sometimes organizations have several legacy systems that have been developed at different times by different people for a variety of purposes. The data in these systems is usually mutually incompatible and sometimes inaccurate. One of the biggest challenges of the data warehousing process is to bring data out of the various systems where it currently is located and organize it to fit together in the data warehouse.

Leaf Level
The lowest level in a dimensional hierarchy in an OLAP cube. This is typically the level that data is loaded into an OLAP cube before it is calculated.

Level
The common term used to describe each tier of a hierarchy in a dimension. For example a simple Time dimension could have levels for Year, Quarter, Month, and Day.

Linked Partition
In Hyperion Essbase, a linked partition sends users from a cell in one multidimensional database to a cell in another database. A linked partition gives users a different perspective on the data.

Local Cube
A subset of an OLAP cube that has been downloaded from the server onto someone’s PC to allow for analysis while disconnected from the network.

 

M
MDD (Multidimensional Database)
A multidimensional database stores data in a different fashion then the traditional two dimensional relational databases. A multidimensional database has several dimensions that are each made up of one or more hierarchies. While this sounds similar to a relational star schema database, the difference is that the multidimensional database pre-aggregates many levels of the combinations of hierarchies for the many intersections across all of the dimensions. While this is not necessarily the most efficient way to store data, it can provide ideal response time for users interacting with the data. Instead of waiting for an SQL statement to execute and aggregate the requested result set, the aggregated totals can often just be extracted and displayed from the multidimensional database. This performance is ideal for analyzing information on the fly.

MDX (Multidimensional Expressions)
The querying language for Microsoft OLAP cubes. MDX has some similarities to SQL, but has many unique features. The following query returns a cellset with the names of the store regions on the columns, the names of product families on the rows, and the profit displayed in the cells:

select
[Stores].[Region].Members on columns,
[Products].[Product Family].Members on rows
from SalesCube
where ([Measures].[Profit])

Measure
A numeric value stored in a fact table and in an OLAP cube.

Member
One of the data points for a level of a hierarchy of a dimension. For example each of the twelve months is typically a member of the Time dimension.

Member Property
An attribute of a level that is available for OLAP querying. In Microsoft Analysis Services you can create member properties for any level. These member properties can be referenced directly in MDX queries, and they can also be used for creating virtual dimensions.

Metadata
Metadata is control information about the data warehouse and the data in the warehouse. Technical metadata is used by the data warehouse administrator to implement and control the data movement propagation operations. Business metadata helps warehouse users find information in the data warehouse without knowing the underlying implementation of the database. Also known as data that describes the data in the warehouse.

Multidimensional Analysis
Also Known As: OLAP (On-Line Analytical Processing) This type of analysis allows the user to interact with the data by dynamically changing the queried results with convenient mouse click functions. Multidimensional data is organized hierarchically, allowing users to "drill down" for more detailed information, "drill up" to see a broader, more summarized view, and "slice and dice" to dynamically change the combinations of dimensions that are being viewed.

Multidimensional Cube – Another term used for Cube.

Multidimensional Database – See MDD.

Multidimensional Online Analytical Programming (MOLAP)
An OLAP database that stores data and aggregations in a multidimensional database structures.

 

N
Nonvolatile
Data that does not change. Data is stable in a data warehouse. More data is added, but data is never removed. This enables management to gain a consistent picture of the business. Nonvolatility is one of the original defining characteristics of a data warehouse.

Normalization
The process of organizing data in accordance with the rules of a relational database.

In a completely de-normalized database the customer name and address information would be stored every time a customer made a purchase.

In a normalized database each customer's name and address would be stored only once, in a separate table. Every purchase record would have a reference to the customer table to indicate which customer was involved.

A fully normalized database is usually the most efficient design for an On-line Transaction Processing System. A data warehouse, with its emphasis on efficient retrieval of data, often benefits from some intentional de-normalization. See the discussion of the Star Schema.

 

O
OLAP (On-line Analytical Processing)
This type of analysis allows the user to interact with the data by dynamically changing the queried results with convenient mouse click functions. Multidimensional data is organized hierarchically, allowing users to "drill down" for more detailed information, "drill up" to see a broader, more summarized view, and "slice and dice" to dynamically change the combinations of dimensions that are being viewed.

OLTP (On-line Transaction Processing)
This is the common description for an operational system. It is usually a highly normalized database that handles interactive data entry processes such as entering invoices and journal entries.

 

P
PivotTable Services
The tools that provide access to Microsoft's Analysis Services data in Microsoft Excel.

Private Dimension
In Microsoft Analysis Services, a dimension that is restricted in use to one particular cube. Shared (conformed) dimensions are very useful in creating a unified data warehousing structure. You can create a dimension once and use it in several different cubes. Private dimensions are useful in those situations where you want independent cubes. If you change a private dimension, that change only affects a single cube, whereas the change of a shared dimension can have implications for many cubes.

Processing
In Microsoft Analysis Services, the building of cube and dimension structures and cube aggregations. Cubes need to be processed before they can be browsed. For cubes, an incremental process only adds new data into the cube, while a full process reprocesses all of the data to the cube.

 

Q

 

R
Relational On-line Analytical Processing (ROLAP)
OLAP that stores data and aggregations in a relational database.

Replication
The physical copying of data from one database to another. In data warehousing replication takes place as data is moved from the on-line transaction processing system into the data warehouse. Replication also takes place if one or more data marts are being populated with data from the data warehouse. There are several software tools that have been developed to handle replication into a data warehouse. These tools give the ability to transfer data out of and into a variety of database management systems. Many of these tools also provide data transformation and data cleansing capabilities.

Heterogeneous replication occurs when the source and the target database are not the same database management system.
Data migration is the movement of data from one environment to another – as happens when data is brought from a legacy system into a data warehouse.

Bidirectional replication is the ability to copy data in both directions between two databases.

In changed data capture only the data that has been changed since the last replication is copied.

In synchronization all the data stored in the database is replicated.
Replicated Partition
In Hyperion Essbase, a replicated partition is a copy of a portion of the data from a source multidimensional database that is stored in a target multidimensional database. This provides the ability to combine data from multiple multidimensional databases into a single database.

 

S
Scale, Scalable, and Scalability
The ability of a computer system or a database to operate efficiently with larger quantities of data. Scalability is often discussed in situations when multiple processors are joined together. The system scales well (or is scalable) if doubling the number of processors also doubles the speed at which the system performs its tasks. The extra work involved in coordinating larger systems usually prevents them from being fully scalable – so that going from one to two processors would increase the total speed by less than a factor of two.

Schema
The logical organization of data in a database.

Shared Dimension
In Microsoft Analysis Services, a dimension used by more than one cube. In general, a dimension that is used by more than one cube is called a conformed dimension.

Shared Member
In Hyperion Essbase, the same member can be built into alternate hierarchies and aggregations within the same dimension. This provides an easy and efficient way to build alternate rollups. The data is only stored once, and the "shared members" are pointers to the original data point.

Slice, Slicer, Slicing
The limiting of a cellset to data for a single member from a particular dimension. Slicing in MDX is similar to filtering in a relational database. In an MDX query, the WHERE clause is called the slicer.

Slice and Dice
The ability to move between different combinations of dimensions when viewing data with an OLAP browser. Multidimensional analysis tools organize the data in two primary ways: in multiple dimensions and in hierarchies. Slicing and dicing refers to the ability to combine and recombine the dimensions to see different slices of the information.

Slowly Changing Dimensions (SCD)
A dimension that has levels or attributes that are changing on an occasional basis. See Changing Dimensions to understand the standard approaches to updating slowly changing dimensions.

SQL (Structured Query Language)
The standard language for accessing relational databases.

Snowflaking
Normalization applied to the dimension tables of a star schema. The star schema is a very simple database design, which clearly presents the multidimensional character of the data and allows for rapid querying of the data in a data warehouse. In snowflaking, some of the fields of the dimension tables are split off into separate tables. This achieves a higher level of normalization, but makes the database design more complex and can reduce the performance and ease of use for Business Intelligence Tools.

Sparsity and Density, Sparse and Dense
The degree to which the cells of a cube are filled with data. One of the primary challenges of storing multidimensional data is the degree of sparsity that is often encountered. When many dimensions are considered with a fine grain of detail, most of the cells will be empty.

Star Schema (Business Definition)
A method of organizing information in a data warehouse that allows the business information to be viewed from many perspectives. The star is a picture of the way the data is being stored. The basic factual information is in the middle of the star. The points of the star represent various perspectives from which the factual information can be viewed.

Star Schema (Technical Definition)
A database design that consists of a fact table and one or more dimension tables. Each of the dimension tables has a single field primary key which has a one-to-many relationship with a foreign key in the fact table. The star schema is an intentional simplification of the database design that would be achieved by following the standard rules of normalization. The dimension tables are often flattened, to allow for more efficient querying (see "snowflaking").

Structured Query Language (SQL)
The standard language for accessing relational databases.

Summary Tables
Tables used to store summarized or aggregated data. Summarized fact tables can be created to enhance performance in a ROLAP solution; however, creating many of these can make future modifications complicated.

Synchronization
The copying of all data in a database replication.

 

T
Time-Balance-Average
In Hyperion Essbase, this is a tag placed on a measure when the value displayed should show the average value for the time period selected. This means that if someone is viewing a Time-Balance-Average measure for quarter one, the system will automatically average the values for periods one, two, and three, instead of aggregating them.

Time-Balance-First
In Hyperion Essbase, this is a tag placed on a measure when the value displayed should show the first value for the time period selected. This means that if someone is viewing a Time-Balance-First measure for quarter one, the system will automatically show the value for period one, instead of aggregating them.

Time-Balance-Last
In Hyperion Essbase, this is a tag placed on a measure when the value displayed should only be for the most recent time period holding data. A typical example for this is a balance sheet account. If someone is looking at the balance sheet at the end of quarter one, they should see the Accounts Receivable value for period 3, and not the aggregation of all three periods.

Time-Variant Data
Data that is identified with a particular time period. Time-variant is one of the original defining characteristics of a data warehouse.

Transparent Partition
In Hyperion Essbase, a transparent partition allows users to access data from the multidimensional database source as though it were stored in the multidimensional database target. However, the data is stored at the source, which can be in another application or in another Analytic Services database or on another Analytic Server. This provides the ability to combine data from multiple multidimensional databases into a single database.

 

U

 

V
Virtual Cube
The term used in Microsoft's Analysis Services (OLAP Services) for a cube that is created from portions of one or more base cubes. A virtual cube is similar to a view in a relational database. It can be used for security purposes, giving users access to only some of the dimensions and measures. It can also be used to show data from separate cubes at the same time. Virtual cubes are much more useful when you have shared dimensions and measures that are common to all the base cubes that are used.

Virtual Dimension
The term used in Microsoft's Analysis Services (OLAP Services) for a dimension that is created from one or more member properties in another dimension.

 

W

 

X
XML (eXtensible Markup Language)
A method of sharing data between disparate data systems, without needing a direct connection between them.

XML for Analysis Services
An XML schema that can be used to communicate with a Microsoft Analysis Server.

 

Y

 

Z
 

Back to Top

Success Stories


Cree, Inc.
Integrated Planning Solution Utilizing Hyperion Planning, Capital and Workforce Planning with Oracle’s E-Business Suite

Intercontinental Hotels
Enterprise upgrade of the Hyperion Essbase & EPM platform to version 11
4944 Parkway Plaza Boulevard, Suite 450
Charlotte, NC 28217 PH: (800) 676-2313 FX: (704) 909-2339