Getting started – Requirements Gathering

Every project has a circular development path.

Building a business intelligence solution is no different. You need to get the requirements of your business users, build a prototype warehouse and a cube, conduct UAT and restart the process based on the feedback.  Let’s start with he first step Requirements gathering.

The aim of this is to find out what the business users want to use the data warehouse for, what information are they after to help them in their decision-making. Find out the use cases. Start by asking them a simple question: “what do you want to get out of the solution?” This will tell you what their aim is. Now break this down into what questions they want answered trough this solution. An example could be “what is the trend of sales for item A over the last year?” This gives clues that historical trending is required, information about sales down to the level of each item sold.

Identify all the questions that need to be answered. Note down what business group each question relates to. This will help recognise the sources of data and any points of contact. Once you feel that most of the questions have been asked, start considering one business group at a time.

Each business group will have their own specific set of requirements. Think of this as building a data cube for each business group. This is the Kimball approach that was discussed in earlier post. As an analyst building a BI solution think of all the information you need and ask those questions. Dig deeper and find any more questions that the group needs answered. We need to extract what the facts or the grain need to be.

Then find out what they would like to group the data by. This will start to define the dimensions of a mart. Find out the meta data description which will help you define the data transformation rules.

The process I have described above can be broken down into 5 steps:

  1. Define Subject area and Objective
  2. Brainstorm Business Usage Scenario
  3. List business questions for each scenario
  4. Derive the measures from the business questions
  5. Determine the dimensions and build a model

If you have any questions, please post a comment below and I will try to help you out.

BI, Technology

Dimensional Modelling

I’m sure you have come across terms like Facts, Measures, Dimensions, Attribute hierarchies, etc. We will define these terms and see how a dimensional model is built.

A GRAIN is the smallest piece of information that needs to be presented in the warehouse solution.It is the most atomic piece of information in the data warehouse. It is best to give an example: For retail systems, a grain can be the items per transaction per customer. This is the most exact information that the warehouse will be able to provide. In defining the grain we define the fact too.

A FACT or MEASURE is something of interest. Something that we need to measure to give us an indicator to base our business decisions on. It is the most atomic piece of information used for aggregations. For example, for an inventory system this would be the items in stock; so we count (aggregate) the items to give us number of items. The fact here is an individual item, we measure the number of items. Another example can be . The piece of information of interest is customer and thus we would aggregate on this to find he number of customers enrolled or number of customers who shop at a particular store.

A DIMENSION is the information that we want to group the facts by. In the case of the inventory system, we can group the items by the type of item; eg. food, furniture, electronics, clothes, etc. For the fact customers we group them by the type of membership they hold or their city of residence. A dimension holds information related to the facts we measure.

Every dimension has attributes. The attribute usually is a property of the grouping or a sub category. For the dimension of geography, possible attributes are: continent, country, state, city. This information is grouped and described as “geography”.

This should give you an idea of what the terms mean. Once you understand these terms, you are ready to translate the business requirements into a warehouse model. In the next post we will look at requirements gathering and how to get the business users to narrow down their high level objectives.

BI, Business

Data Warehousing – A Process Overview

Data warehouse is a very common term that you will come across in the BI field. A data warehouse is simply a data store in its most basic definition. This data store is used for analysis and reporting and not for operational use.

The data warehouse typically has 3 layers: staging, integration and access layers. The staging layer usually contains a raw data dump from an operational database. The integration layer is where the raw data is restructured and links the data in a meaningful manner. The access layer is used for presentation and is usually for business clients.

The data warehouse stores all the business data for a corporation. This can be subdivided and stored into smaller stores known as Data Marts. Data Marts are typically subdivided by business function or logical reporting categories. If a single large Data Warehouse is implemented then this is known as centralised approach. The data is centrally stored and is accessed by all business functions. Implementing multiple data marts is a decentralised approach; as there is no single store with information from all business functions.

It’s beginning to sound like the two approaches are very different but they have a similar purpose. In fact, Ralph Kimball and Bill Inmon are the authors behind the decentralised and centralised approaches respectively. Inmon defined the idea of centralised storage and a top down approach. A top down approach starts with consideration of all the requirements at an enterprise level without breakdown by business function. Kimball defined a bottom up approach leading to decentralised storage with multiple Marts. A bottom up approach starts with individual business functions and their specific requirements. After all the marts are built these are then linked together to deliver an enterprise-wide solution.

Even though these two approaches to building a DW are different both involve 3 common stages: Integration, Analysis and Reporting. Integration and Analysis both involve manipulation of data, whereas Reporting is more about presentation of the data. It is possible to perform further analysis in the reporting layer.

The tools used for development usually support these three layers. One of the industry leading tool sets is provided by Microsoft. The SQL Server with Business Intelligence Development Studio (BIDS) supports the development of the above mentioned layers. The SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS) and SQL Server Reporting Services (SSRS) together form the MS BI stack.

SSIS is used to get data from an operational database and transform it into a dimensional or ER data store. SSAS allows building of a multi-dimensional Cube using the new DW model and perform calculations and aggregations. SSRS then allows generating reports from the Cube with some graphics.

In the next post we will take a closer look at the dimensional model (Kimball) for data warehousing and define some basic terms.

BI, Business

What is BI & DW?

Business Intelligence refers to the techniques used in gathering, identifying, extracting and analysing business data. Business data, for example can be revenues, adoption of a product or can be data related to a process such as manufacturing.

The aim is to provide business users a set of self service tools to easily analyse the raw data for decision making. The whole field revolves around being able to make decisions swiftly by looking at historic, current and future trends.

To facilitate this, some tools are essential. Data Warehouse is one of them. Data Warehouse means exactly that, a vast data store, organised on such a way as to deliver he required data to the users on demand and with the ability for the user to build custom views and reports on top.

BI is a fast growing field in various industries such as information technology, manufacturing, financial services and even the services ( travel and retail, etc. ) industry. Business rely on understanding their customer, market trends and predicting the future trends to base heir business decisions.

In the next post we will look at the tools available to build BI solutions and explore some of the concepts.

BI, Business, featured, Technology

A new direction

I know my blog has been quiet for a while and I’ve been very busy. Now I aim to write a post on a fairly regular basis.
My topic of choice is Business Intelligence and Data Warehousing. A completely different topic to what this blog has been about.

However I will start off easy with basic concepts and hen dice deeper with more advanced topics and implementation examples.

The first post in the series will be published soon.

Keep checking or follow me on Twitter for updates.