Intelligent BI : Rahul Agrawal's BI Blog: September 2009

Wednesday, September 30, 2009

Disable and Enable Index

Microsoft introduced the concept of disabling an Index in SQL Server 2005.

They syntax to disable the index is as follows:
ALTER INDEX indexname ON tablename DISABLE
e.g.
ALTER INDEX idx_TEST ON dbo.TEST DISABLE

However, contrary to popular belief syntax to enable the index is not
ALTER INDEX idx_TEST ON dbo.TEST ENABLE

The correct syntax to enable an Index is
ALTER INDEX idx_TEST ON dbo.TEST REBUILD

Remember the keyword REBUILD for it.

Monday, September 28, 2009

The Evolution and Future of Business Intelligence

A good article explaining the evolution of Business Intelligence taking an example of a hypothetical Telecom Company and how it emerges during the different stages of evolution.

BI Gen I : Reporting using the Enterprise Data Warehouse using slicing and dicing. Helps in Reporting at Speed-of-Thought.
BI Gen II : Giving real time Reporting capability to the Data Warehouse using Dashboards and Scorecards. Ability to analyze events as it happens.
BI Gen III : Reporting should not only be Real-Time but also be Actionable. Automated decision making.
BI Gen IV : Simulating Business Strategies based on past available information. Run hypothetical scenarios on real data and analyze results as if it were real.
BI Gen V : Optimize business using Artificial Intelligence as well as Traditional Statistics. Make it Autopilot.

Read the complete article here...
The Evolution and Future of Business Intelligence by Rob Walker

Sunday, September 27, 2009

OLAP Jokes to tickle your Funny Bone

I came across this collection of really funny OLAP Jokes.

Posting some of them..

Q: Why did the dimension take all day to take off its suit and put on a pair of jeans?
A: It was a slowly-changing dimension

Q: Why couldn't the hierarchy become a tightrope-walker?
A: It was unbalanced

Q: Why did the dimension keep itself to itself?
A: It was a private dimension.

Q: What's the porn director's favourite MDX function?
A: BottomCount

Q: What MDX function makes Previous Year Growth and Market Share blush?
A: StripCalculatedMembers

Q: How did the BI developer send his backed-up database to a colleague on the other side of London?
A: In a .cab file

And my favorite one.

Q: Was the cube straight or gay?
A: Neither, it was BI.

You can read more at the Chris Webb's BI Blog....

Friday, September 18, 2009

US Government IT Dashboard

The US Govenment in June launched a IT Dashboard to keep a track of their IT Spending. This is an extremely good example of use of Business Intelligence by a Federal Government to bring more transparency and acccountability in the way they work.

The Dashboard was developed in flat six weeks time. It is highly interactive and allows you to drill down to individual departments, their investments and presents a scorecard of their performance.

Check out the site at http://it.usaspending.gov/

Surrogate Key in Data Warehouse

Simply put, Surrogate Keys are substitution to original Primary Key of a Dimension Table. They still act as a Primary Key but are Data Warehouse generated unique Integer values.

One may argue if these are simply Substitution then why can't we simply use the original Primary Keys. Well, it has its own advantages:

What if the Source Primary Key Changes? (We agree this is an unlikely situation, but yes this can happen)
What if data is integrated from Multiple Sources? (How do we uniquely identify a row in this case. Because both system may have their own key, and the key can be repeated in both systems. Well, this is not possible if we use a GUID column, because as the name says it is Globally Unique Identifier. But we will come back to this later.)

Thus, as the Surrogate Key is generated and maintained by the Datawarehouse System and not the the Source/Transactional Sytem, we can easily overcome these issues. The Surrogate Key acts as a bridge or substitution over the Source System.

It also helps us in dealing with Dummy Values. Normally -1 or 0 is used to denote a Dummy Value. (e.g. If you receive Sales without Promotions you can put -1 as a value for "No Promotions" instead of NULL Values). While handling large queries joins between Integer values are more efficient than NULL Values.

Another important characteristic of Surrogate Keys are it should be meaningless. No Intelligence should be derived from these keys. A lot of source system has keys where there are hidden meanings in the Keys. (e.g. A source key value XXYYYZZZ1234 the first 2 characters may refer to country code, the next 3 department code and so on.) This may be good for a Transactional System but not for a Data Warehouse environment. If we come across such scenarios it is always good to bifurcate the meanings and store it into different columns using ETL. So that later they can be referred as individual columns in a SQL Query which will be more efficient.

Another great advantage of Surrogate Keys are it helps optmize space. These days a lot of Source System has their Primary Keys as AlphaNumeric Values or GUID Columns. A GUID column occupies 16 bytes and it can be anything for a AlphaNumeric value. Whereas an integer value generally uses 4 bytes. Thus, in comparison to GUID it saves 12 bytes. This may look small in isolation or when you simply see a Dimension Table. But Imagine a saving of 12 bytes over a 100 million rows Fact Table, which is pretty moderate. This means a saving of around 1 GB and we are talking only one column. What if the table has 1 Billion rows, it will save around 11 GB and if you have multiple GUID columns.. Huh!!!

Thus, we see the benefits of having a Surrogate Key over a normal Primary Key. It also goes by the following names in many environments.

Alternate Key
Non Natural Key
Meaningless Key
Integer Key
Artifical Key