Insights

Power BI Mistake #3: Why hoarding unnecessary data makes your model cluttered and slow

This is the third in our series on common Power BI mistakes.

You can find the other articles here

In this article we’re looking at a simple, easy-to-make but nevertheless important error that should be treated as a cardinal sin and avoided at all costs. That is: including unnecessary data in your model. 

Including items in your data model that will never be used in analysis or in relationships is like keeping all your old clothes that no longer fit and disappeared from style years ago. If it no longer fits or causes you embarrassment when exposed to the wider world, you would donate it to charity, so why wouldn’t you have a clear-out of all your unnecessary data too? 

How does removing unused columns improve a data model?

If only it was as simple as that jumper you bought 3 years ago taking up a hanger and a bit of space in your wardrobe, but this is data.

Unused columns will clutter your model for end users, take up unnecessary memory space, and worsen the performance of reports, so let’s have a look at how removing them makes everything better.

Reduces horizontal space

Firstly, removing unused columns will reduce the horizontal space taken up by the tables in your model. Fewer columns equals less data equals more memory saved. 

Saving memory will have a positive impact in multiple areas, from the time it takes for your visuals to load or your model to refresh, to the amount of free CPU you have on your Premium capacity node. 

These last two items are important to consider, as although report users moaning at slow-loading visuals is bad enough, it’s nothing compared to the wrath of another Power BI model owner whose model refresh has just been throttled by yours. Or the budget holder for analytical platforms who needs to magic up some cash to purchase a beefier node with more CPU.

Figure 1: Reducing horizontal space

Reduces vertical space

Secondly, removing unused columns provides the opportunity to minimise the vertical space of your model through re-aggregating fact tables. This gets rid of a chief culprit of unnecessary data: keeping a unique row ID in a fact. 

For example, imagine an order ID. Useful in a database when trying to locate a particular order, but not so much when running analysis on your best-selling products. 

So why keep it? By removing it, you can roll up your fact to a more appropriate level for analysis, such as orders per month, per product.

Figure 2: Reducing vertical space

Removes clutter

How many users will complain about having fewer items to sift through to find the desired column for their visual? I’d wager very few. Moreover, their reports will likely be more concise and structured, with fewer distracting columns of categorical or string based data to include in their charts.

Eliminating unnecessary data

In general, I would suggest only keeping columns used for analysis and table relationships, this means your keys and date fields, as well as any columns used for aggregations, calculations, categorisation or groupings. 

Here’s how to remove unnecessary, unused data from your model: 

  1. Resist all requests from end users to “show me all the data” — they only want to extract it to Excel for some navel gazing.
  2. Forensically rip through your model and assess the use case for each column, before you move into a test phase.
  3. Gather in-depth requirements from your stakeholders that tell you exactly how and at what level they plan to use your data. This will confirm the columns and granularity necessary for reporting. 

Wrapping up

Much like clearing out old garments from your wardrobe, eliminating unnecessary data from your model will free up space, provide a tidier user experience, and prevent sluggish performance.

Also, I could’ve probably made more out of the clothing and wardrobe analogy, but sometimes you just have to let things go. (See what I did there?)

Get the most from Power BI

Our Power BI experts know the platform inside-out and can help you leverage the full spectrum of its capabilities.

View Power BI Services