Getting started – Requirements Gathering

Every project has a circular development path.

Building a business intelligence solution is no different. You need to get the requirements of your business users, build a prototype warehouse and a cube, conduct UAT and restart the process based on the feedback.  Let’s start with he first step Requirements gathering.

The aim of this is to find out what the business users want to use the data warehouse for, what information are they after to help them in their decision-making. Find out the use cases. Start by asking them a simple question: “what do you want to get out of the solution?” This will tell you what their aim is. Now break this down into what questions they want answered trough this solution. An example could be “what is the trend of sales for item A over the last year?” This gives clues that historical trending is required, information about sales down to the level of each item sold.

Identify all the questions that need to be answered. Note down what business group each question relates to. This will help recognise the sources of data and any points of contact. Once you feel that most of the questions have been asked, start considering one business group at a time.

Each business group will have their own specific set of requirements. Think of this as building a data cube for each business group. This is the Kimball approach that was discussed in earlier post. As an analyst building a BI solution think of all the information you need and ask those questions. Dig deeper and find any more questions that the group needs answered. We need to extract what the facts or the grain need to be.

Then find out what they would like to group the data by. This will start to define the dimensions of a mart. Find out the meta data description which will help you define the data transformation rules.

The process I have described above can be broken down into 5 steps:

  1. Define Subject area and Objective
  2. Brainstorm Business Usage Scenario
  3. List business questions for each scenario
  4. Derive the measures from the business questions
  5. Determine the dimensions and build a model

If you have any questions, please post a comment below and I will try to help you out.

BI, Business

Data Warehousing – A Process Overview

Data warehouse is a very common term that you will come across in the BI field. A data warehouse is simply a data store in its most basic definition. This data store is used for analysis and reporting and not for operational use.

The data warehouse typically has 3 layers: staging, integration and access layers. The staging layer usually contains a raw data dump from an operational database. The integration layer is where the raw data is restructured and links the data in a meaningful manner. The access layer is used for presentation and is usually for business clients.

The data warehouse stores all the business data for a corporation. This can be subdivided and stored into smaller stores known as Data Marts. Data Marts are typically subdivided by business function or logical reporting categories. If a single large Data Warehouse is implemented then this is known as centralised approach. The data is centrally stored and is accessed by all business functions. Implementing multiple data marts is a decentralised approach; as there is no single store with information from all business functions.

It’s beginning to sound like the two approaches are very different but they have a similar purpose. In fact, Ralph Kimball and Bill Inmon are the authors behind the decentralised and centralised approaches respectively. Inmon defined the idea of centralised storage and a top down approach. A top down approach starts with consideration of all the requirements at an enterprise level without breakdown by business function. Kimball defined a bottom up approach leading to decentralised storage with multiple Marts. A bottom up approach starts with individual business functions and their specific requirements. After all the marts are built these are then linked together to deliver an enterprise-wide solution.

Even though these two approaches to building a DW are different both involve 3 common stages: Integration, Analysis and Reporting. Integration and Analysis both involve manipulation of data, whereas Reporting is more about presentation of the data. It is possible to perform further analysis in the reporting layer.

The tools used for development usually support these three layers. One of the industry leading tool sets is provided by Microsoft. The SQL Server with Business Intelligence Development Studio (BIDS) supports the development of the above mentioned layers. The SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) together form the MS BI stack.

SSIS is used to get data from an operational database and transform it into a dimensional or ER data store. SSAS allows building of a multi-dimensional Cube using the new DW model and perform calculations and aggregations. SSRS then allows generating reports from the Cube with some graphics.

In the next post we will take a closer look at the dimensional model (Kimball) for data warehousing and define some basic terms.