Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Microsoft Power BI Cookbook

You're reading from   Microsoft Power BI Cookbook Convert raw data into business insights with updated techniques, use cases, and best practices

Arrow left icon
Product type Paperback
Published in Jul 2024
Publisher Packt
ISBN-13 9781835464274
Length 598 pages
Edition 3rd Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Greg Deckler Greg Deckler
Author Profile Icon Greg Deckler
Greg Deckler
Brett Powell Brett Powell
Author Profile Icon Brett Powell
Brett Powell
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Installing and Licensing Power BI Tools 2. Accessing, Retrieving, and Transforming Data FREE CHAPTER 3. Building a Power BI Semantic Model 4. Authoring Power BI Reports 5. Working in the Power BI Service 6. Getting Serious About Date Intelligence 7. Parameterizing Power BI Solutions 8. Implementing Dynamic User-Based Visibility in Power BI 9. Applying Advanced Analytics and Custom Visuals 10. Enhancing and Optimizing Existing Power BI Solutions 11. Deploying and Distributing Power BI Content 12. Integrating Power BI with Other Applications 13. Working with Premium and Microsoft Fabric 14. Other Books You May Enjoy
15. Index

Creating Dataflows

Power BI dataflows are a feature within the Microsoft Power BI service that allows users to transform and combine data from various sources, perform data transformation and cleansing operations, and then store the results in inexpensive Azure Blob Gen2 Storage.

Dataflows also use the Power Query M language and are constructed using a web-based version of Power Query Editor. In effect, dataflows are cloud-based versions of queries that you would otherwise create in Power BI Desktop.

Getting ready

To prepare for this recipe, you will need a Power BI Pro license or trial license. See the Thinking about Licensing recipe from Chapter 1, Installing and Licensing Power BI Tools. Create a new workspace in the Power BI service. This workspace should be a Pro workspace and not a Fabric or Fabric trial workspace. Fabric workspaces include the Dataflows Gen2 capabilities, which are constructed almost identically to dataflows.

In addition, you should complete the Installing SQL Server and AdventureWorks Database and Creating an On-Premises Data Gateway recipes from Chapter 1, Installing and Licensing Power BI Tools. Ensure that your data gateway is installed and that you are signed in.

How to create a dataflow

To create a dataflow, log in to the Power BI service at https://app.powerbi.com/, navigate to the Power BI Pro workspace created in the Getting ready section of this recipe, and then follow these steps:

  1. In the upper-right corner, select New and then Dataflow, as shown in Figure 2.24:
A screenshot of a computer

Description automatically generated

Figure 2.24: Selecting a new dataflow to be created

  1. On the Start creating your dataflow page, click the Add new tables button in the Define new tables area.
  2. On the Choose a data source page, select SQL Server database.
  3. On the Connect to data source page, enter the connection settings and connection credentials for your SQL Server, as shown in Figure 2.25. If your gateway is running on the same computer as your SQL Server, you can use localhost for the server name. Otherwise, specify the name of the computer running SQL Server. Also, specify the database—in this case, AdventureWorksDW2022. Leave Connection set as Create new connection, choose your data gateway, and enter your Windows or SQL Server authentication information. When finished, click the Next button.

Figure 2.25: Data source connection information

  1. On the Choose data page, mark the checkboxes next to FactInternetSales, DimCustomer, DimDate, and DimProduct, and then click the Transform data button.
  2. A web-based version of Power Query Editor is presented. Here, you can transform your data exactly like using Power Query Editor in Power BI Desktop. In our case, we can just click the Save & close button, as no data transformations are necessary.
  3. In the Save your dataflow dialog, enter a name for your dataflow, such as AdventureWorksDW2022 Internet Sales, and click the Save button.
  4. In the upper-right corner, a notification appears. Click the Refresh now button to retrieve the data from SQL Server.

Once refreshed, the dataflow can be used within Power BI Desktop. To do so, open Power BI Desktop, click Get data in the ribbon on the Home tab, and choose Dataflows. The created dataflow and associated tables appear in the workspace within the Navigator dialog. You can select the desired tables and click the Load button.

How it works

Dataflows ingest data from source systems and store that data as entities in Microsoft Common Data Model-compliant folders. This entity data is stored within Azure Data Lake Storage Gen2. You can think of entities as tables.

A refresh schedule can be created for a dataflow in the same way that Power BI Desktop queries can be refreshed within the Power BI service. Refreshes update the data within the Azure Data Lake Storage Gen2 infrastructure. Power BI Desktop reports connecting to the dataflow pull data from the data entities stored within Azure Data Lake Storage Gen2 folders.

Using dataflows, performance can be improved for both Power BI reports as well as the source data systems. Since only the dataflow interacts with the source system versus potentially many different Power BI Desktop queries, this potentially reduces the load on the source systems. In addition, particularly if the source systems are on-premises, refreshes for Power BI reports can be improved, since network latency for retrieving data via a data gateway is removed and Power BI semantic models are refreshed from data already present in the Microsoft cloud.

There’s more...

The decision to use dataflows versus queries in Power BI Desktop depends on a number of factors, including your data preparation needs, collaboration requirements, and the scalability of your solution. Here are some considerations when choosing between a dataflow and a query in Power BI:

Consider Power BI dataflows in the following scenarios:

  • Centralized data preparation: If there are multiple reports or dashboards that require the same or similar data preparation steps, using a dataflow can centralize these transformations, ensuring consistency across various reports.
  • Reuse of data transformations: Dataflows allow the reuse of data transformation logic across different semantic models. If you find yourself applying similar transformations to the same tables in multiple semantic models, dataflows can help in creating a standardized set of transformations.
  • Collaboration: When multiple users or teams need to collaborate on data preparation, dataflows provide a shared environment. This allows for collaboration on the data preparation process, and changes made by one user are reflected for others.
  • Scalability: Dataflows are suitable for handling large volumes of data and complex data preparation scenarios. Dataflows can be optimized for performance and can be scheduled to refresh data at regular intervals.
  • Integration with Power Platform: When working with other Power Platform services like Power Apps and Power Automate, using dataflows can facilitate data integration across these services through the Common Data Model.

Consider Power BI Desktop queries in the following scenarios:

  • Single report-specific transformations: If data preparation requirements are specific to a single report and won’t be reused in other reports, using Power BI Desktop queries can be more straightforward.
  • Ad hoc data exploration: When you are in the process of exploring and analyzing data for a specific report, using Power BI Desktop queries allows for more ad hoc and interactive data exploration.
  • Lightweight transformations: For relatively simple or lightweight data transformations that don’t require extensive reuse or collaboration, using Power BI Desktop queries may be sufficient.
  • Personal projects: When working on personal projects or small-scale reporting tasks, using Power BI Desktop queries provides a more self-contained environment.

In short, dataflows are beneficial for scenarios requiring centralized and reusable data preparation logic across multiple reports or dashboards. In addition, if collaboration and performance are important, dataflows are likely a better choice. Power BI Desktop queries are more suitable for individual, report-specific data transformations, especially in scenarios where extensive reuse or collaboration is not a priority. The choice ultimately depends on the specific use case and requirements.

See also

You have been reading a chapter from
Microsoft Power BI Cookbook - Third Edition
Published in: Jul 2024
Publisher: Packt
ISBN-13: 9781835464274
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image