Home Tags Posts tagged with "Datawarehouse"

Datawarehouse

A few years ago when I was searching for a free dimensional modelling tool, through some forum; I saw this fantastic Dimensional Modelling tool.  I thought I was the only one who doesn’t know about this workbook, but surprisingly many others are not familiar with this tool. It is not new; it’s been there for ages. It was something created as a resource than a tool, so buggy 🙁 Let me not over complicate this by calling a tool; it just a super simple Excel workbook with macros.

You can download the file from Kimball Group website. As Kimball Group explains, it has the annotated data model. With the above link, you can download a zipped file which has a blank model and a pre-populated model with the detailed explanation of each sheet. It is already very well documented. There are few changes I have made to file, so it works for me. Here I am just going to show, how I used it and how it could be so useful in our day to day BI development world and what are the changes I made (So I can remember my changes for next use 😉 )

As a BI Developer, our work progresses through different stages. Once we have requirements ready, next step is to create that perfect model, well it will never be perfect. Most of the times model changes a lot. But making sure we document the changes and make our fellow developers happy could be a difficult task. Not every project will have lush of a data modeller, developer gets involved in modelling many times, that’s when Kimball Modelling Spread Sheets comes handy

Kimball Modelling Tool:

In simple words, this workbook helps us to generate scripts to create the database for the data warehouse. We give all the properties of the database in the workbook and use a macro to generate the script based on the given properties. I am going to walk through by taking MDWToolkit_Datamodel_4.0_2008R2 Blank as an example. Let’s create a simple Product dimension and Sales Fact Table using this workbook.

2016-10-24_09-22-13_02.jpg

First three sheets (Home, ReadMe and ChangeLog) in the file has credits, notes and some useful information about the database we are going to create. I prefer to delete ReadMe and Change Log sheets and create my Notes and Versions sheets.

Home sheet is where we give Database name and schema name for views. I created a copy of Blank dimension and Blank Fact which already have dummy rows populated. Each sheet has four categories.

2016-10-24_21-33-24.jpg

1 – where we give the target table information

 

2 – Column Information

 

3- Target Table Columns information

 

4 – Souce Table columns Information

Workbook already includes notes. However, I would like to highlight few properties/options which I consider useful.

Table Display Name – It used for views, such as here my Table name DimProducts. Apparently Dim does not make sense to business users, so whatever I gave in Display name will become view name, and I am going to use views in my CUBE.

BizFilterLogic – Very useful when we have table level filter which we need to use while extracting data from your source in the ETL

GenerateScript – Y / N, Yes mean include this table in the script, and N mean do not

Column Display Name – Again same as Table display name used in the view creation script.

Column Display Folder – Display folder of the column, this is excellent for Folder option in SSAS CUBE. This feature could be used to exclude columns from auto generated view.  N.B. In the pre-populated model, it has value Exclude From The CUBE, but the underlined code only prohibits columns which have value exclude. So either change the macro code or alter the value in the cell.

Column Description – Great way to explain what the value of each column means. It is very nice to have a very documented database so that it would not be a black box

ETL Rules – Again another convenient option, this enables ETL developers to understand what exactly needed for a particular column. I use this column more like, if there is a filter or anything you want to filter at the source I put them under BizFilter or Table level comments and transformations I wanted to do inside SSIS will go under ETL Rules. 

All Source Fields – All these fields could be  a bit overwhelming or can be very useful. It depends on the type of data warehouse you are creating. If it is a single source than yeah, this is very helpful, but if you have several sources, then I don’t think this can be very handy. However, I like source field name, because if we have power users who are already familiar with source column names but we renamed that column with friendly names, then you can use this column as a tool tip or just for documentation purpose.

That all, by clicking Generate script button from the home sheet, we can create entire database design script. If you look at database, table and column properties, we can see extended properties at all levels

2016-10-26_01-26-09.jpg

The downloadable from Kimball site does not include extended properties for views and does not let you create the view when your table schema does not default to dbo. Description extended property do not get populated. I updated few bits of VB with my very little knowledge of VB Script. For example, adding extended properties for views which will enable us to sync descriptions using BIDS Helper. I think, so far in my life, I used VB only when I worked with this workbook, that’s why I need this blog post, so I don’t need to remember it.

Once we done with creating our original database using the workbook, you can import the database into SSDT.  You may ask, why don’t we just create in SSDT? Of course, we can, but editing in Excel is a lot easier than editing in SSDT. We can use SSDT Schema compare functionality to merge changes quite easily.  Now we have a very well-populated database

2016-10-26_01-44-20

Once all done we can quite easily create an interactive data dictionary using your favourite reporting tool.

2016-10-26_02-13-30

So if you are a consultant who works on several data warehouse projects, then this can be very helpful. You create your model one time and data dictionary, documentation, etc. For next project are all there, you just have to refresh your existing SSRS or power BI model for the data dictionary. You will know where to look when you are creating ETL or CUBE using change log sheet. You can even put more business logic in the spread sheet or whatever you want. This is better than creating several pages of Pdf document, which no one ever read.  If you are good with VB, not like me; then you can extend it much more and look into some of these known issues…

Known Issues:

  • If you want to create a foreign key to a column, and if that table does not have default schema; then the script macro created will not work
  • If you chose the first column to exclude in the view, it would anyway include it, just because that’s how macro designed.

Workbook with all the changes I have made can be download from here – https://1drv.ms/x/s!Avm7gbgZtlMlpjENQrWZX-lpoEl9. Hope it is useful, let me know what you think?

Prathy ?

2 comments
0 FacebookTwitterPinterestEmail