How have traditional data warehousing concepts changed, and how should they change, with Power BI?

Ten years ago, when I started my career as a BI engineer, the first task I was given was to study the Kimball methodology for dimensional modeling. At that time, data warehousing was a specialized domain within the IT industry, and knowing Kimball's data warehousing concepts was a must for anyone working in data and analytics. However, with Power BI becoming increasingly popular among both IT and business users in the data and analytics field, I feel that certain DW concepts have changed, or should change, under its influence.

What makes the difference?

Well, back then, almost all reports were created by the IT department, and only the IT department knew how the data was calculated and populated and which tables/fields contained what information. However, with Power BI, business users are more likely to create their own reports and perform their own analysis; the IT department only needs to provide the data. I believe this change has had a profound impact on how a data warehouse is designed and implemented within an organization.

Figure 01: How the usage of a DW has changed with Power BI

Apart from that, certain features available in tools like Power BI have invalidated some data warehousing concepts, while others require new concepts to be introduced into data warehouse design and implementation. Let's talk about how Power BI has changed DW design patterns.

Star or Snowflake

First and foremost, when designing a DW you need to decide whether to use a star schema or a snowflake schema for dimensional modeling. In traditional data warehousing, this decision is based purely on data volume and on which method is easier to implement. However, it is commonly accepted that Power BI works best with a star schema compared to a snowflake, due to how data is stored in the tabular engine and how relationships are maintained. Apart from that, a star schema is simpler than a snowflake, which makes business users' lives much easier when they perform data modeling and analysis using Power BI.

Surrogate Keys

In data warehousing, there are a couple of common ways to implement surrogate keys:

  1. Auto-incrementing integer values
  2. Auto-generated unique identifier (GUID) values
  3. Hash values of natural keys

In methods 2 and 3, surrogate keys are stored as string values, and in Power BI these columns become high-cardinality fields. Since we need to use memory in the most optimal way within Power BI, it is ideal to avoid high-cardinality columns as much as possible. Therefore, I believe that using auto-incrementing integer (or small integer) values as surrogate keys is the best method if you are using Power BI with the data warehouse.

The fact table surrogate key is another concept that has changed with Power BI. Having a high-cardinality column with no business value only consumes the precious, limited memory available within Power BI.

Accumulating/Periodic Snapshot Fact Tables

In traditional data warehousing, there are a few fact table types:

  1. Transaction Fact Tables
  2. Periodic Snapshot Fact Tables
  3. Accumulating Snapshot Fact Tables

The idea is to extract the most granular data from source systems, store it in transaction fact tables, and then create periodic and accumulating snapshot facts by grouping that data across different periods and stages. Populating these two table types requires additional ETL jobs. However, with Power BI, data is aggregated by default and time series calculations are straightforward to implement. Combined with features such as Power BI aggregations, implementing and maintaining periodic and accumulating snapshot fact tables has become a resource overkill.
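
To make this concrete, here is a minimal DAX sketch of a periodic snapshot computed on the fly; the 'Inventory' fact table and the marked 'Date' table are assumptions for illustration only.

    -- A sketch, assuming a hypothetical 'Inventory' fact table and a marked 'Date' table.
    -- CLOSINGBALANCEMONTH evaluates the expression on the last date of each month,
    -- giving a month-end snapshot at query time instead of a separate snapshot fact table.
    Closing Stock := CLOSINGBALANCEMONTH ( SUM ( Inventory[Quantity] ), 'Date'[Date] )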

Precalculated Time Intelligence Columns

Implementing calculated columns for time intelligence values such as YTD/MTD fields is a common practice in traditional data warehousing. Since time series values are precalculated, they do not incur any performance cost when business users query the DW. However, with Power BI, time calculations are easy to implement, and the VertiPaq engine is capable of calculating these values on the fly in no time. Hence, precalculated time series fields are no longer required in a data warehouse.
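
As a quick illustration, here is a hedged sketch of such measures, assuming a hypothetical 'Sales' fact table and a marked 'Date' table:

    -- Both measures are evaluated on the fly at query time,
    -- so no YTD/MTD columns need to be precalculated in the warehouse.
    Sales Amount := SUM ( Sales[Amount] )
    Sales YTD := TOTALYTD ( [Sales Amount], 'Date'[Date] )
    Sales MTD := TOTALMTD ( [Sales Amount], 'Date'[Date] )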

Table/Field Naming

In data warehousing, we used to follow certain naming conventions for tables and fields, and those conventions were only meaningful to technical users. We never worried about readability or alignment with business terms, as we knew business users were never going to touch the DW. However, things have changed with Power BI, as anyone can explore the DW now. Using prefixes or suffixes such as Dim/Fact in table names makes business users' lives hard. Moreover, camelCase or PascalCase names for tables and columns are not ideal; use readable business names that business users can easily identify. Apart from that, keeping the same names for primary keys and their corresponding foreign keys helps when modeling in Power BI, as it automatically creates relationships based on name and data type.

Date Dimension

The date dimension is the most important dimension in any data warehouse. In a DW implementation, we maintain a separate ETL process to populate and update the date dimension. However, Power BI provides multiple ways to implement a date dimension, and none of them requires manual intervention from the user. Therefore, unless there is a specific business requirement, I believe having and maintaining a date dimension is no longer mandatory if you are only using Power BI to analyze data in the DW.
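
One such option is a calculated date table written in DAX. The sketch below is a minimal example; the extra column names are my own illustrative choices:

    -- CALENDARAUTO scans the date columns in the model and generates a
    -- contiguous date range covering all of them; ADDCOLUMNS then adds the
    -- attributes a report typically slices by.
    Date =
    ADDCOLUMNS (
        CALENDARAUTO (),
        "Year", YEAR ( [Date] ),
        "Month Number", MONTH ( [Date] ),
        "Month", FORMAT ( [Date], "MMM" )
    )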

ETL Audit Fields

Having audit information such as load date/update date in fact tables is very helpful, as Power BI incremental refresh can leverage it. Apart from that, business users gain visibility into how up to date their data is, and when and by whom it was changed. Therefore, they don't need to come back to the IT team whenever there is a doubt about the accuracy of the data.

Simple Calculations

When I was implementing DWs back then, I used to store all calculated fields in fact tables, irrespective of how simple or complex those calculations were. Implementing calculated fields such as Min, Max, and Average within the ETL process and storing them in the DW is no longer required, as Power BI handles those calculations off the shelf.
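
For example, assuming a hypothetical 'Sales' table, one-line DAX measures like these replace fields that would otherwise be precalculated and stored by the ETL process:

    -- Each of these is evaluated at query time; nothing is stored in the DW.
    Min Sales := MIN ( Sales[Amount] )
    Max Sales := MAX ( Sales[Amount] )
    Average Sales := AVERAGE ( Sales[Amount] )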

Conclusion

If there is one thing in this world that never changes, it is change itself. Some DW concepts have changed, and new concepts have come into play with tools such as Power BI. As the Power BI team adds more features to the product, I believe data warehousing concepts will change even more in the near future. Well, we might not need a data warehouse at all.

I hope this post gives you some idea of how Power BI has changed some of the traditional data warehousing concepts. Thanks for reading and stay safe. Cheers!

Power BI: One for All? All for One?

This is a question I have always had, but now that composite models support DirectQuery for Power BI datasets, I thought I would write about it. I feel that with the composite models feature in Power BI, we can break Power BI development into two parts: front-end and back-end development. The question I had was about the "Power BI Developer": is it one role, or is one person performing multiple roles? Don't you think a Power BI developer wears different hats when developing a single report?

A decade ago in the software industry, a software engineer had to perform different roles: UX engineer, UI engineer, software developer, database administrator, infrastructure engineer, business analyst, and all the other roles we now find in the software development world. But things have changed over time, and we now see each person performing one specific task within a software development project. However, if you look at an organization, you will see that a Power BI developer performs all the roles I'm going to discuss in this post. I believe this one-for-all approach leads to most of the problems we see in Power BI implementations within a lot of organizations. Let's see the different roles performed by a Power BI developer today.

Power BI Back-End Developer

The role of Power BI back-end developer is the most critical one. A properly designed model can make a huge difference in an organization, as it makes end users' lives easy. A Power BI back-end developer must know:

  • Data modeling techniques such as star schema and snowflake schema.
  • The DAX language and its associated concepts, such as row context and filter context (see the sketch after this list).
  • The M query language.
  • How to use the different relationship types and how relationship directions work.
  • How the tabular engine works: the formula engine and the storage engine.
  • How the VertiPaq engine stores data.
  • Standards and best practices for data modeling.
  • How to troubleshoot performance issues.
  • How to use Power BI external tools for data modeling.
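
To illustrate the row context/filter context point from the list above, here is a minimal DAX sketch; the 'Sales' and 'Product' tables, and the relationship between them, are assumptions for illustration:

    -- Row context: SUMX iterates Sales row by row, so the expression
    -- can refer to the columns of the current row.
    Total Margin := SUMX ( Sales, Sales[Amount] - Sales[Cost] )

    -- Filter context: CALCULATE changes the filters under which a measure
    -- is evaluated; here the same total is restricted to one category
    -- (this assumes Product is related to Sales).
    Bike Margin := CALCULATE ( [Total Margin], Product[Category] = "Bikes" )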

I have seen how not having a developer with solid knowledge of all the above areas creates problems for Power BI implementations.

Problems with the one-for-all approach:

  • Badly written DAX code causing performance issues.
  • Bad data model design leading to performance and usability issues.
  • Little or no attention to standards and naming conventions within data models, making business users' lives harder.
  • Badly written M queries leading to data loading issues.
  • Incorrect use of calculated columns and datasets, leading to bigger data models and, in turn, performance issues when refreshing data.
  • A one-dataset-per-report approach, leading to complexity and strain on source systems.

Power BI Front-End Developer

It doesn't matter how nice a data model design is; business users normally don't pay much attention to it unless they see performance issues in reports. What they really care about is how nice their reports look, how easy they are to use, how quickly they can navigate between reports, dashboards, and visuals, and how easily they can obtain insights from a report. To cater to all these expectations, a Power BI front-end developer must know:

  • Proper usage of colors, font styles, and icons.
  • The different features available within Power BI, such as buttons, images, bookmarks, and drilldown/drillthrough.
  • UX concepts such as report layout.
  • Proper usage of background images, shadows, and all other UI-related properties.

When a Power BI developer doesn't have good taste in UI, it can lead to a lot of problems for a Power BI implementation, which could ultimately result in business users creating their own reports or going back to Excel.

Problems with the one-for-all approach:

  • Poorly designed dashboards/reports.
  • Bad choices of colors and fonts, leading to misinterpretation of data.
  • A bad user experience for business users.
  • Wrong use of visuals, making it hard for business users to get insights out of a report/dashboard.

Power BI Administrator

Power BI administrator is a separate role that mainly focuses on maintaining and monitoring the Power BI environment within an organization. A Power BI administrator must have good knowledge of:

  • How the different tenant settings in the admin portal work.
  • How to plan and maintain Premium capacities within the tenant.
  • The different security options available at the Power BI tenant level to keep organizational data secure.
  • How to monitor usage patterns within the tenant.

Problems with the one-for-all approach:

  • Security-related issues, as a single developer controls the entire organization's data and its distribution.
  • Data security issues, as little to no attention is given to security within the Power BI tenant.
  • Misuse of tenant resources by business users.

Business Analyst

From my experience, there is often a gap between the business user and the Power BI developer. While business users don't understand the technical details behind a report, report developers don't clearly understand the business scenarios that business users are trying to analyze. However, in most organizations, it is the developer who talks with the end user, gathers requirements for reports, and suggests different options for analyzing data. When the developer performs the role of a business analyst, he or she must:

  • Be a good listener and understand the business cases explained by the business users.
  • Suggest how different Power BI features can be used to analyze data effectively.
  • Help business users create meaningful data stories using Power BI features.

Problems with the one-for-all approach:

  • Invalid interpretations of data.
  • Invalid or meaningless storytelling within reports.
  • Wrong calculations and drillthrough/drilldown scenarios.

Conclusion:

In this post I'm not saying that an organization must have multiple people working in a Power BI team, each with only one task to do. But I believe each member of a data and analytics team should put in the time and focus to become an expert in at least one role. For example, a team can have one developer who is an expert in data modeling and another who has a good sense of UI and UX.

Again, all for one, or one for all? Have a think about it. Thanks for reading and stay safe! Cheers!

Who should get Power BI Premium per User Licenses in your organization?

With the introduction of Power BI Premium Per User (PPU) licensing, it seems that small to medium organizations are more excited about it than anyone else. The main reason is that even though they had use cases that Power BI Premium could accommodate, they simply could not afford to pay around 4,000 USD per month for it. With this new licensing, they can cater to those requirements without spending thousands of dollars.

However, this creates another problem.

Personally, I don't believe small organizations will replace all their existing Power BI Pro licenses with PPU licenses. While we still don't know how much a PPU license will cost, I believe replacing all Pro licenses with PPU would cost as much as, or more than, the minimum Premium capacity plan. Then the next question arises: who will benefit from a PPU license in your organization? What are your use cases for buying PPU licenses? That is what I try to cover in this post.

Super Users

In an organization, there are two kinds of BI users. The first category is basic users, who view and analyze reports. They normally go through reports generated for them by someone else and rarely perform much drilldown, slicing/dicing, or filtering within those reports. For them, the information provided in their reports is more than enough in most cases. On the other hand, almost every organization has super BI users. They always try to bring more data and analysis into their reports and to perform deep analysis on the data they have. You will find these super BI users especially within teams like business analysis, marketing, and accounting. These super users can benefit from Premium-only features more than anyone.

PPU features they can benefit from:

  • Bigger data models
  • AI capabilities
  • Premium-only dataflow features such as DirectQuery, linked entities, etc.
  • Deployment pipelines
  • Higher dataset refresh rates (48/day)

Power BI developers

Even in small organizations, I have seen at least a couple of dedicated developers who create reports, dashboards, and data models for regular business users. Even though Power BI is a self-service BI tool, most non-technical business users still depend on a technical user to create Power BI reports/dashboards for them. These developers do all the data preparation and data modeling within an organization.

PPU features they can benefit from:

  • Deployment pipelines
  • Premium-only dataflow features such as DirectQuery, linked entities, etc.
  • XMLA endpoint connectivity
  • Bigger data models

Users who love paper more than a screen

In most organizations, you will meet business users who still love paginated reports. They want pixel-perfect reports, mostly long tables and matrices, and they like to see all the information in one place, as they did a couple of years ago. Additionally, most organizations have use cases where they need to print reports and distribute them among certain external/internal business users.

PPU features they can benefit from:

  • Paginated reports

Business users who want real-time information

If you ask business users how frequently they want their reports/dashboards updated, 99% will say they need real-time data. In my experience, 90% of those business users do not have real-time analytical requirements, and their needs can be met by refreshing the data once or twice a day. Nevertheless, some people do need their reports/datasets updated in near real time. Those users can benefit from a PPU license.

PPU features they can benefit from:

  • Higher dataset refresh rates (48/day)
  • Automatic page refresh

Automatically refreshing dashboards

In some organizations, beautiful dashboards are displayed that refresh automatically as new data arrives in the source systems. As a consultant, creating Power BI reports/dashboards that are displayed on big screens and update automatically without any user interaction is not an uncommon requirement for me. One PPU license is enough to cater to this requirement.

PPU features this scenario can benefit from:

  • Automatic page refresh with DirectQuery

Apart from these user groups and scenarios, other users can continue to use Power BI Pro licenses. While certain Premium-only features, such as usage-based aggregation optimization, are available with PPU licensing, I don't believe they make much of a difference for most business users within a small organization. Thank you for reading and stay safe! Cheers!!!

How to create a Power BI Dataflow Programmatically

Well, you must be wondering what this is all about: why would someone want to create a Power BI Dataflow programmatically when you can do it very easily using the Power BI service? But there are scenarios where we must do a little extra to overcome certain challenges. Let me explain my scenario.

I wanted to implement a Power BI template to analyze data generated by a product. This product is a SaaS product that uses SQL Server as its back-end system. Each customer has their own database, and the database schema is almost the same for all customers.

The problem I had was that although all the databases had the same schema for the transactional tables, there was one important table with a different schema in each database: the "Customer" table. In this product, the "Customer" table is created by the product itself, and all its attributes are configurable by users. In other words, if the "Customer" table of database1 has an attribute called "EmailAddress", database2 might have the same attribute as "Email" or perhaps "EmailAddr". In this case, my Power BI template would not work across all the databases, as each database has a different schema for the same "Customer" table. That is where my Dataflow generator comes into play.

By the way, did you know that you can create a Power BI Dataflow by uploading a JSON file? Well, you can; all you need to do is go to the "Import Model" section and provide a valid JSON file.

As the first step, I needed to create a template JSON file. I did not want to write everything from scratch, so I created a Dataflow using the Power BI service and exported it as JSON. This is the JSON I'm going to modify using .NET code to cater to my requirement.

Once exported, I got a JSON file like the one below, in which I had to change the highlighted areas programmatically. The first section is the query section, which contains the source SQL statement of the Dataflow. The second section is the attribute section, which represents the name and data type of each selected attribute.

[Figure: the exported Dataflow JSON, with the query section and the attribute section highlighted]

I created the Windows Forms application shown below to generate the Customer Dataflow entity. When I want to roll out a Power BI report, all I have to do is map each database field name to the entity type, generate the JSON, and then upload it to the Power BI service.

[Figure: the Windows Forms application used to generate the Customer Dataflow entity]

In the document attribute, I changed "Source" to "Query" and added a native SQL query created dynamically based on the selected attributes. As you can see, in the select statement the attributes are separated by "#lf ,".

[Figure: the generated query section, with select-statement attributes separated by "#lf ,"]

I'm not planning to show all the code I wrote for this application here. If you think this is a valid scenario for you and my badly written code would help, comment here and I can share it with you. Once I export the modified JSON string, I can upload it to Power BI using the "Import Model" option. However, if you do not set the "allowNativeQueries" property within the JSON to "false", the Power BI service will throw an exception, as shown below, when trying to import the Dataflow. If you get this error, make sure you refresh the browser before re-importing the corrected JSON file. You can set this property manually by opening the JSON file, or within your application before importing.

[Figure: the error thrown by the Power BI service when "allowNativeQueries" is not set correctly]

As I said before, this is not a common scenario. But it can be useful when someone wants to enforce business logic on Dataflow creation or bring flex fields into a Dataflow. Again, if anyone needs the code, just comment below. Thank you for reading and stay safe!

8 Popular myths about Power BI

As part of working as a consultant, we get to talk with both business and technical users, and that is something I really enjoy about this job. One thing I noticed while talking to them is that there are certain things they believe about Power BI that are far from true, and some of those misunderstandings have led them to stop using Power BI within their organizations. In this post, I'm going to talk about these misconceptions and clarify them as much as possible.

1. Power BI can only handle up to 1GB of data volume

This is something most people get confused about, so let me clarify how data volume limits work in Power BI. If you have a Power BI Pro license, the Power BI service allows you to publish a PBIX file of at most 1 GB. It does not mean that a Power BI file cannot store more than 1 GB of data. If the PBIX file is on your computer, the only size limitation is the available memory on your machine; it can be 10 GB or 15 GB. When it comes to Power BI Premium, the maximum size of a PBIX file you can publish is 10 GB (depending on the capacity), and it can grow up to 12 GB within the service. Most importantly, remember that Power BI uses columnar data storage and hence claims to achieve up to 10x compression. Therefore, if you have 5 GB of data in the source system, I believe you should still be able to fit it into a single PBIX file within the Power BI service even if you are a Pro user.

NOTE: With the introduction of large models for Power BI Premium (still in preview), your data model can store up to 400 GB of data after compression. You still cannot upload a PBIX file larger than 10 GB; however, the model can grow up to 400 GB within the service. Thanks to Reza Rad for pointing this out.

Large Data Models in Power BI Premium: https://powerbi.microsoft.com/en-us/blog/large-models-in-power-bi-premium-public-preview/

https://docs.microsoft.com/en-us/power-bi/admin/service-premium-large-models

2. You cannot share a Power BI report with anyone outside your organization

This again is something I have heard frequently. Yes, Power BI is tightly coupled with Azure AD for authentication and authorization. However, it does not mean you can't share content with people outside your organization. Check the links below to learn how to share reports outside your organization.

Share outside the organization:  https://docs.microsoft.com/en-us/power-bi/collaborate-share/service-share-dashboards#share-a-dashboard-or-report-outside-your-organization

B2B Sharing: https://docs.microsoft.com/en-us/power-bi/admin/service-admin-azure-ad-b2b

Publish to Web: https://docs.microsoft.com/en-us/power-bi/collaborate-share/service-publish-to-web

3. You need to buy a license to use the Power BI service

Some people think that only Power BI Desktop is free, and that if you want to use the Power BI service, you need to purchase licenses. That is not completely true. You can use the Power BI service for free as long as you are not sharing your reports with others or accessing others' reports. As long as you are using the Power BI service to manage your own content, you don't need to purchase a Power BI license.

4. You need to have Office 365 to use Power BI

Well, this is a very confusing area, and some believe you need Office 365 licensing to use Power BI. That is not true. You can still use Power BI and purchase Pro licenses even if you are not an Office 365 customer. However, note that Power BI uses Azure AD for authentication and authorization. Therefore, when you use Power BI as a non-O365 user, an Azure AD tenant will be created for you behind the scenes, and you will have to manage users manually.

Licensing in Power BI: https://docs.microsoft.com/en-us/power-bi/admin/service-admin-licensing-organization

Use Power BI without O365:

https://community.powerbi.com/t5/Service/Using-Power-BI-service-without-an-Office-365-account/td-p/131287/page/2

5. Power BI is just a data visualization tool

Isn't Power BI just a visualization tool? This is something I hear from lots of business users. Trust me, it is not; it all depends on how you use it. You can obviously use Power BI to visualize data. Nevertheless, its data modeling capabilities, data sharing capabilities, Dataflows, CDS, and certified datasets make it more of a business intelligence platform than just a visualization tool.

6. Get Power BI and it will solve all your data problems

Sometimes I overhear people saying that now that they have Power BI, they don't need to worry about any data and analytical requirements. At the same time, I have seen data engineers complain that they don't get proper recognition, as business users only look at fancy Power BI reports and give all the credit to report developers and business analysts. Power BI is not a magic tool. You still need a proper data solution for your organization. Components like data lakes, data warehouses, master data management, and data quality solutions still play a huge role in catering to the data and analytical needs of an organization. Power BI is just one part of a bigger picture.

7. Import mode is always better than DirectQuery

This is something I see all the time when people develop reports. Although the default choice in Power BI is Import mode, it does not mean it is better than DirectQuery mode. Performance-wise, it may be (again, may be) better than DirectQuery, but you sacrifice certain important things such as real-time data analysis, scheduling and maintaining data refreshes, and data volume. To be honest, in most cases you might not notice a difference in report rendering time when you use DirectQuery instead of Import mode.

8. One report, one model

I wouldn't call this a myth per se; it's something most people are unaware of. When someone learns to use Power BI, they bring data into a PBIX file and create a report from it. That practice continues, and people end up creating many data models, and refresh schedules for all of those models, within the Power BI service. Most people don't use shared or certified/endorsed datasets to create Power BI reports. You can use one model to create multiple reports, and trust me, that is how you should do it.

Shared Datasets: https://docs.microsoft.com/en-us/power-bi/connect-data/service-datasets-share

Certified Datasets: https://docs.microsoft.com/en-us/power-bi/connect-data/service-datasets-certify

Well, that is it for this post. As always, if you think I have missed something or got something wrong, feel free to add it in the comments section. Thank you for reading and have a wonderful day!!

How to fit an elephant into a fridge? (Only for Power BI Users)

Well, how many of you have faced memory exceptions when loading data into a Power BI model? How many of you have worried that a Power BI file might exceed the 1 GB limit when publishing to the service? How many of you have faced problems due to time-consuming data refreshes? A few days ago, I faced all of those problems, and this post is about how I tried to overcome them as much as possible.

What makes your PBIX file big, and what makes it throw memory exceptions? It all comes down to data and calculated columns. Since Power BI uses a columnar database, the tabular engine compresses data as much as possible. It achieves optimal compression when a column has a low number of unique values (low cardinality). When the number of unique values is high (high cardinality), the column takes more space both in memory and on disk. Let us see what we can do to use this precious memory as effectively as possible.

1. Remove all unwanted columns within Power Query Editor

First things first. When you load data into your Power BI report, try to filter out all unwanted columns. Most tables and datasets have lots of fields coming from source systems, and many of those fields might not be meaningful for any analysis. You can either exclude those columns when you write your select query or remove them in the Power Query Editor before you load data into the model.

Remember that hidden columns still take space in your model. Rather than hiding columns within Power BI, consider removing them within the Power Query Editor.

2. Remove unwanted rows

This may seem obvious, but how often do we filter out unwanted rows when loading data into a model? Analyze your data, identify invalid data scenarios, historical data that is not meaningful to business users, etc., and filter those records out in your query or in the Power Query Editor.

3. Avoid high cardinality columns

A single high-cardinality column in one of your biggest tables can take 75-80% of the total space within the model. Keep an eye on text fields such as email addresses, usernames, and comments within large tables, as those are the main culprits inside your model. Try to remove them within the Power Query Editor.

Another option is to break a high-cardinality column into two low-cardinality columns. For example, if you have a column with invoice numbers like "INOP:76857", you can split it into two columns, one with "INOP:" and the other with "76857".

4. Avoid large-table-to-large-table joins inside the model

Let's say you have two large tables and need to join them within the model. For example, if you have an Order Header table and an Order Detail table, and both have millions of records, these two tables take up space within your model separately. Rather than bringing both tables into the model and then creating a relationship, you can join them within your select query and bring the result in as one dataset. This way you save the space needed for the primary and foreign keys, which are obviously high-cardinality columns.

5. Datetime/time fields

Datetime fields, as well as time fields, can easily become high-cardinality columns within a large table. To overcome that, you can break a datetime column into two columns: one for the date part and one for the time part.

Additionally, if you can compromise on the precision of that data, you can round the time field to the nearest 15 minutes or 1 hour based on your analytical needs. That way you turn a high-cardinality column into a low-cardinality one.

6. Remove/hash unique identifiers

Unique identifiers in source systems always cause issues for me. I personally don't know why people use a unique identifier (GUID) as the primary key of a table. When you have a large fact table with a GUID field, it can take up to 70-80% of the total memory used by that table. If possible, exclude it before loading the data.

Another thing you can try is hashing unique identifiers into integers. You can use a function like BINARY_CHECKSUM in SQL Server for this conversion. However, the hash function must be deterministic to avoid any data issues.

7. Avoid calculated columns

It is not only data: calculated columns also take a lot of space in your model. If you have multiple calculated columns, some of which merely act as intermediate steps for other columns, consider removing them. Replace a calculated column with a measure whenever possible; iterating DAX functions give you row context inside the measure (see the sketch below).
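
Here is a hedged sketch of that idea, assuming a hypothetical 'Sales' table with Quantity and UnitPrice columns:

    -- Instead of persisting a calculated column such as
    --   LineAmount = Sales[Quantity] * Sales[UnitPrice]
    -- an iterator measure evaluates the same row-level expression at query
    -- time, so nothing extra is stored (and compressed) in the model.
    Total Line Amount := SUMX ( Sales, Sales[Quantity] * Sales[UnitPrice] )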

8. Aggregations and Composite Model

If there is a possibility, consider using Power BI aggregations to bring only summarized data into the model and keep the detail data in the source system. That way you can keep your model small without compromising the analysis you want to perform in the report.

9. Incremental refresh

Now that the incremental refresh feature is available for Pro users, you can stop fully refreshing large transaction tables in Power BI. It reduces refresh time and avoids unwanted strain on source systems.

10. Use VertiPaq Analyzer/DAX Studio to understand your data model

VertiPaq Analyzer and DAX Studio allow you to analyze your data model and give you lots of details about each field in a table, such as data size, data type, and cardinality. With that information you can understand which tables/columns use more memory in the model and act accordingly.

VertiPaq Analyzer: https://www.sqlbi.com/tools/vertipaq-analyzer/

DAX Studio: https://daxstudio.org/

Well, that is it for this post. If you have any other ideas, please comment so that someone else can benefit from them. Thank you for reading, and cheers!!

Are you a Power BI Pro user? Let us fight for what you are missing

Among the many cool things about Power BI, one thing I really like is that, as a community, we can provide ideas to the Power BI team. Unlike with some other products, the Power BI team puts serious thought and process behind those ideas, and they try hard to make them a reality. For example, a few months ago incremental refresh was only available as a Premium feature, and due to high demand from the community, the Power BI team decided to roll it out for Pro users as well.

In this post I'm going to list the top Premium-only features (as of now) worth fighting for as a community. Obviously, my top list might not be your top list. Nevertheless, if you feel a particular feature is valuable for you, your client, or the community, please vote for it using the link given with each feature.

The motivation behind this blog post is a bit of disappointment I felt after Power BI Deployment Pipelines were announced for Premium capacity only. Personally, I believe most business users cannot afford Power BI Premium licensing, and this feature was something I was really looking forward to, as it makes my life easy as a developer and finally lets me bring CI/CD practices to Power BI. Enough said; let's go to the Premium-only feature list.

1. Paginated Reports

Paginated reports, in other words old-school SSRS reports, are something business users love and find very useful. It does not matter how fancy your dashboards or reports look; they still ask for the pixel-perfect reports they have been presenting as printed copies to the board. Since this feature is not available in Pro, I have seen some clients still maintain their existing SSRS server just to cater to that requirement.

Vote here:  https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/35959420-paginated-reports-please-make-it-available-in-pr

2. Deployment Pipelines

The very motivation behind this post. If you are a Pro user, you need to manually copy the PBIX file between different folders every time you deploy changes to QA or production. Apart from that, you need to change all connection information manually each time you do a deployment. With this feature, however, you can deploy reports/datasets with a few clicks.

Vote here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/40452829-power-bi-deployment-pipelines-preview-for-pro-us

3. Machine Learning Modules

If you have a Premium capacity, you can use the built-in AI functions within the Power Query Editor against your data. Apart from that, you can even use Azure Cognitive Services to bring the power of AI to your reports. For example, if you have a text field with user comments, you can perform sentiment analysis on those comments while loading the data. Like the idea? You know what to do.

Vote here:  https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/37338424-automl-available-for-pro-users

4. Read/Write XMLA Endpoints

With the introduction of the XMLA endpoint, Premium users can now use tools like Visual Studio and SSMS to create and modify Power BI datasets. With this, cool features we used in Analysis Services, such as translations, perspectives, and calculation groups, can now be enabled in Power BI datasets. Additionally, by providing access to the tabular database behind a Power BI dataset, the XMLA endpoint enables lots of other possibilities that are otherwise restricted in Power BI.

Learn More: https://docs.microsoft.com/en-us/power-bi/admin/service-premium-connect-tools

Vote here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/39756142-allow-xmla-access-to-pro-content

5. Incremental Refresh for Dataflows

Although the Power BI team has rolled out incremental refresh for datasets, incremental refresh is still not available for dataflows for Pro users.

Vote here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/39754699-incremental-refresh-in-dataflow-for-power-bi-pro-u

6. Change Detection

In the April 2020 release, the Power BI team announced that Premium users can now refresh a Power BI report when there is a change in its data sources (DirectQuery only). This feature lets users set up automatic report refresh only when it's actually required.

Learn more: https://powerbi.microsoft.com/en-us/blog/power-bi-desktop-april-2020-feature-summary/#_Change_detection

When I checked for an idea on this, I could not find anyone who had requested it for Pro users, maybe because the feature is just a month old. Nevertheless, if you feel you need it as a Pro user, go and create a new idea for it.

7. DirectQuery on Dataflows

DirectQuery for dataflows is, as of now, a Premium-only feature. By the time I wrote this post, I could not find an idea to make it available for Pro users. You can create one if it is important to you.

Well, like most of you, I'm also a Pro user, and I believe that as a community we can fight to get at least some of these features enabled to make our lives easier and give better solutions to our clients. Share your ideas, vote for ideas, and contribute to the growth of the product we all love. Thanks for reading and stay safe!!

Must-have (awesome) tools if you are an advanced Power BI developer

I think we can all agree that Power BI is an awesome tool. However, even awesome software can use some improvements, and that is why the Power BI team ships monthly updates with all these cool new features. For me, Power BI has two sets of users: business users and developers. Business users mainly focus on creating simple data models and analyzing/visualizing data, whereas developers (or advanced users) focus on creating complex models/reports, handling the development cycle of a project, and analyzing and improving performance. For those advanced users, the built-in capabilities of Power BI are not sufficient in certain scenarios. Third-party tools have come into play to fill that gap. In this post I am going to discuss the most widely used free third-party tools, which I believe are must-haves if you are an expert Power BI developer.

DAX Studio

My favorite!! If you write complex DAX calculations in your Power BI model, this is a must-have tool. It allows you to write, debug, and analyze your DAX queries. With IntelliSense and the DAX formatter, it lets you write queries easily and cleanly. But the coolest thing about this tool is the amount of information it provides about query execution, such as the query plan and server timings. This information allows you to analyze the performance of your queries and improve them accordingly.
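
To give a flavor, below is the kind of query you might run in DAX Studio to capture a query plan and server timings; the 'Product' and 'Sales' tables are assumed for illustration:

    -- Group sales by product category and sort by the result.
    EVALUATE
    SUMMARIZECOLUMNS (
        Product[Category],
        "Total Sales", SUM ( Sales[Amount] )
    )
    ORDER BY [Total Sales] DESC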

Download link: https://daxstudio.org/

Main Features/Functionalities:

  • Query execution plan
  • Server timing
  • DAX function panel
  • IntelliSense for DAX
  • DAX query formatting
  • Load Power BI Performance data

Tabular Editor

This is a nice tool if you want to develop or modify a Power BI model as an advanced user. If you come from a SQL Server Analysis Services background, this tool will help you bring that knowledge into Power BI development. However, be careful when you modify a Power BI model using Tabular Editor, as it was designed mainly for SSAS tabular model development and hence might not be fully compatible with Power BI. It is recommended that you save your model as a Power BI template and then open that with Tabular Editor to modify it.

Download link: https://github.com/otykier/TabularEditor/releases/tag/2.9.8

Main Functionalities:

  • Advanced Power BI model features such as calculation groups and translations
  • Viewing and organizing tables/fields/measures and relationships easily

Power BI ALM Toolkit

If you have used or heard about BISM Normalizer, this is an extension of it. This tool allows you to compare two PBIX files and shows the differences between the two models. Most importantly, it allows you to push changes from one dataset to another in a single click. If you have a Premium capacity, it can connect to the Power BI service and show the differences between the local version and the deployed version. Not only that, again if you have Premium capacity, it allows you to update the data model in the service with only the changes (metadata), without reloading data. How cool is that?

Download link: http://alm-toolkit.com/

Main Functionalities:

  • Compare two Power BI datasets for changes
  • Update one Power BI dataset from another
  • Export differences between two datasets to Excel

Power BI Helper

Power BI Helper helps you understand a Power BI model. It extracts the metadata of a Power BI model and visualizes it in a nicely organized way so that you can easily understand impact and usage. Apart from that, it can be used to document your Power BI model: if you put a description on each object inside the model, this tool can extract all the metadata and descriptions and generate documentation for the model.

Download link: https://powerbihelper.org/

Main Features:

  • Model Analyzer
  • Model Advisor
  • Modelling Advisor
  • Export metadata as a document

Other tools

VertiPaq Analyzer

More details: https://www.sqlbi.com/tools/vertipaq-analyzer/

DAX Formatter

More details: https://www.daxformatter.com/

Covering all the features of these awesome tools in a single blog post is not feasible. Therefore, I suggest you try them out yourself and see how much they help your day-to-day development tasks. I will update this post if I come across anything new. Thank you very much for reading and stay safe! Cheers!!

Export all Power BI built-in themes

Starting with the December 2019 release, Power BI allows you to export existing themes as theme files. With this, rather than creating a theme file from scratch, we can export an existing theme and modify it to our requirements. However, this has a few limitations. First, it does not allow you to export the default theme: if you select the "Default" theme, the "Export current theme" option is disabled. Any other built-in theme can be selected and exported, but this means applying each theme to the report one by one and exporting them one at a time.

In this post, I'm going to show you a quick and easy way to export all the Power BI built-in themes within a minute, including the default theme.

First, you need to enable the preview feature "Store datasets using enhanced metadata format" in Power BI.

More about new Power BI metadata format: https://docs.microsoft.com/en-us/power-bi/desktop-enhanced-dataset-metadata

To enable it, click "File" -> "Options and Settings" -> "Options" -> "Preview Features". Once you enable the preview feature, you might have to restart Power BI for it to take effect.

After restarting Power BI, create an empty Power BI template file by clicking "File" -> "Export" -> "Power BI template". You don't need to load any data or create any visuals in your Power BI file for this.

Next, go to the saved ".pbit" file and change its extension to ".zip"; in other words, if your file is named "Demo.pbit", rename it to "Demo.zip". Now unzip the file, and you will see a set of files and subfolders inside the extracted folder.

If you go to the "..\Report\StaticResources\SharedResources\BuiltInThemes" folder, you will see all the built-in theme files there as JSON files. You can open these files and customize them to your requirements.

If you want the "Default" theme, which cannot be exported from the Power BI UI, you can find it inside the "..\Report\StaticResources\SharedResources\BaseThemes" folder. "CY19SU12.json" is the default theme file; you can modify it and import it back into Power BI as your custom default theme.

I hope this short post helps you play with Power BI themes. Thank you for reading and stay safe! Cheers!!

Best practice guide for Power BI users – Part 3: Designing Reports

In this blog post I'm going to talk about how to design an effective Power BI report. While report design is a highly subjective area, I will try to generalize as much as possible and point out common mistakes I have seen in different reports. If you are new to Power BI, I also recommend having a look at the Power BI gallery. It contains lots of beautiful reports developed by the community, and you can get a good sense of how to improve your report design skills.

Power BI Gallery: https://community.powerbi.com/t5/Data-Stories-Gallery/bd-p/DataStoriesGallery

Previous related articles:

Part 1: https://asankap.wordpress.com/2020/04/18/best-practice-guide-for-power-bi-users-part-1/

Part 2: https://asankap.wordpress.com/2020/04/21/best-practice-guide-for-power-bi-users-part-2/

Characteristics of an effective report

What makes a report effective? That is a tricky question. Let's try to understand it from both the business user's and the developer's perspectives. I believe an effective report should have the following characteristics.

Accuracy

  • This is one of the key aspects of a report. A report should represent data accurately. Even when the data is accurate, improper representation can give business users an inaccurate impression. In Power BI, this relates to how filters are implemented, how interactions between visuals work, the scale of the axes, conditional formatting, etc. Small things, like showing the last data refresh date in a report, help a lot to improve its accuracy.

Clarity

  • A report should provide a clear view of the data. A business user should not have to spend a lot of time understanding how a report works. A clear report means business users do not have to perform unnecessary maximizing/minimizing of charts, zooming in/out, slicing/dicing, etc.

Consistency

  • Consistency plays a big role in report development. Once business users get used to the filters, colours, layout, and images within a report page, moving between reports and report pages is easy if the same look and feel is maintained across all of them.

Extensibility

  • A report should be able to accommodate change requests coming from business users, whether that means adding more data sources, more analysis, or more visuals. Changes should not break existing analysis and should be possible with minimal effort.

User-friendliness

  • Always remember who the end users of your report are. They are business users, not IT people. Therefore, your report should be self-explanatory and easy to navigate and analyze. Adding bookmarks, the right visualizations, and drillthrough/drilldown capabilities always helps enhance the user-friendliness of your report.

Corporate look and feel

  • Your report must match the existing corporate look and feel of the business users' organisation. You can achieve this by creating a colour theme that uses the colours of the organisation's logo as the theme's main colours. Additionally, you can request good-quality logo images and banners used across the organisation from the business users. If possible, incorporate the font styles used within the organisation into your custom Power BI theme.

Logical layout

  • The layout of a report should be logical. Remember how the human eye works when it sees something on a screen: our eyes follow an F pattern or a Z pattern when looking at the content of a web page. It is said that when the content is mostly textual it follows the F pattern, and when the content is mostly visual it follows the Z pattern. In both cases, our eyes focus on the top-left corner first. Check the figure below and spot the red areas. In your report page, the red areas should hold the most important information, such as summary values in a card or important breakdowns in a bar chart. The blue areas should hold the least important information, such as a breakdown in a pie/donut chart.
[Figure: F-pattern and Z-pattern reading flows; red areas mark the prime positions, blue areas the least prominent ones]

What are the things you need to consider when designing a Power BI report?

Project Structure

When it comes to structuring your reports, it is recommended to have one Power BI model per data mart. You can then build all reports as live connections to that data model. This way, all the calculations, formatting, and source data mashup are in one place and will be consistent across all reports. This also helps you achieve report extensibility. Refer to the figure below.

[Figure: one shared Power BI model per data mart, with multiple reports connecting live to it]

Report design

Within an organization, all reports should have a consistent look and feel and colour theme unless business users request otherwise. At the beginning of the project, the colour theme, fonts, and layout must be decided and should then be applied across all reports.

Report/sheet/visual naming

Each report must be named based on the analysis it performs. A business user must be able to understand the content of a report by looking at its name, for example "Customer Churn Analysis".

Similarly, report pages should be named after the overall analysis they perform. You can use the title of a report sheet as its name, for example "Churn Summary" or "Churn by Product Detail Analysis". Avoid using default page names such as Sheet1, Sheet2, etc.

Each visual must be named based on the analysis it performs, for example "Product Breakdown by Sales" or "Top 50 Customers by Sales". While Power BI sometimes generates a title for a visual from the fields it contains, it is always better to use a custom title that business users can easily identify.

Visual layout

It is recommended to use a report design layout from the beginning of the project, and the same layout must be used across the organization unless business users request otherwise. Placing visuals in the correct location and order is very important.

Filter layout

As a best practice, it is recommended to use the top-left or top-right corner as the filter area on the Power BI canvas. As humans read from top to bottom and left to right, place the most important slicers in the top-right corner, and let the rest run from top to bottom down the left side of the panel.

Background image

Personally, I don't recommend using a background image, as it adds unwanted complexity to the report. You can still use a background that supports the customer's branding; however, note that a background image can reduce the clarity of visuals and values. Therefore, keep the image as transparent as possible and test its clarity against the most-used themes, as end users might ask to change the theme at any point during the implementation.

Company logo

It is recommended to get the company logo from the business users rather than downloading it from the internet. Request an image with the exact size and format at the beginning of the project. Place the logo in the top-left corner of the canvas unless the client specifies otherwise.

Default filters

It is recommended to set default values for filters when possible. If filters are left in a non-selected state, they can cause performance issues and hit source systems hard, as Power BI tries to extract a large data volume every time a business user opens a report. Try to implement a dynamic year and month filter using DAX and set it as the default filter for year and month (see the sketch below).
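
One hedged way to implement such a flag, assuming a 'Date' table (the column and label names are illustrative):

    -- A calculated column that labels each date relative to today. Because it
    -- is re-evaluated on every dataset refresh, a slicer defaulted to
    -- "Current Month" keeps pointing at the latest month automatically.
    Period Flag =
    SWITCH (
        TRUE (),
        YEAR ( 'Date'[Date] ) = YEAR ( TODAY () )
            && MONTH ( 'Date'[Date] ) = MONTH ( TODAY () ), "Current Month",
        YEAR ( 'Date'[Date] ) = YEAR ( TODAY () ), "Current Year",
        "Other"
    )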

Disable unwanted interactions

Interactions are a powerful feature in Power BI. However, interactions should only be used where they are required and meaningful. I personally believe you should disable all interactions and then enable them one by one, after checking whether each provides valuable information to business users. Unwanted interactions can cause performance issues and can give business users the wrong information.

Well, that's it for this post. In the next post I will discuss best practices for version control and release management in Power BI. Thank you for reading and stay safe. Cheers!!