Building a Data Warehouse, Part 5: Application Development Options
- Part I: When to build your data warehouse
- Part II: Building a new schema
- Part III: Location of your data warehouse
- Part IV: Extraction, Transformation, and Load
In Part I we looked at the advantages of building a data warehouse independent of cubes/a BI system and in Part II we looked at how to architect a data warehouse’s table schema. In Part III, we looked at where to put the data warehouse tables. In Part IV, we are going to look at how to populate those tables and keep them in sync with your OLTP system. Today, our last part in this series, we will take a quick look at the benefits of building the data warehouse before we need it for cubes and BI by exploring our reporting and other options.
As I said in Part I, you should plan on building your data warehouse when you architect your system up front. Doing so gives you a platform for building reports, or even application such as web sites off the aggregated data. As I mentioned in Part II, it is much easier to build a query and a report against the rolled up table than the OLTP tables.
To demonstrate, I will make a quick Pivot Table using SQL Server 2008 R2 PowerPivot for Excel (or just PowerPivot for short!). I have showed how to use PowerPivot before on this blog, however, I usually was going against a SQL Server table, SQL Azure table, or an OData feed. Today we will use a SQL Server table, but rather than build a PowerPivot against the OLTP data of Northwind, we will use our new rolled up Fact table.
To get started, I will open up PowerPivot and import data from the data warehouse I created in Part II. I will pull in the Time, Employee, and Product dimension tables as well as the Fact table.
Once the data is loaded into PowerPivot, I am going to launch a new PivotTable.
PowerPivot understands the relationships between the dimension and fact tables and places the tables in the designed shown below. I am going to drag some fields into the boxes on the PowerPivot designer to build a powerful and interactive Pivot Table. For rows I will choose the category and product hierarchy and sum on the total sales. I’ll make the columns (or pivot on this field) the month from the Time dimension to get a sum of sales by category/product by month. I will also drag in Year and Quarter in my vertical and horizontal slicers for interactive filtering. Lastly I will place the Employee field in the Report Filter pane, giving the user the ability to filter by employee.
The results look like this, I am dynamically filtering by 1997, third quarter and employee name Janet Leverling.
This is a pretty powerful interactive report build in PowerPivot using the four data warehouse tables. If there was no data warehouse, this Pivot table would have been very hard for an end user to build. Either they or a developer would have to perform joins to get the category and product hierarchy as well as more joins to get the order details and sum of the sales. In addition, the breakout and dynamic filtering by Year and Quarter, and display by month, are only possible by the DimTime table, so if there were no data warehouse tables, the user would have had to parse out those DateParts. Just about the only thing the end user could have done without assistance from a developer or sophisticated query is the employee filter (and even that would have taken some PowerPivot magic to display the employee name, unless the user did a join.)
Of course Pivot Tables are not the only thing you can create from the data warehouse tables you can create reports, ad hoc query builders, web pages, and even an Amazon style browse application. (Amazon uses its data warehouse to display inventory and OLTP to take your order.)
I hope you have enjoyed this series, enjoy your data warehousing.