Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Power BI Machine Learning and OpenAI
Power BI Machine Learning and OpenAI

Power BI Machine Learning and OpenAI: Explore data through business intelligence, predictive analytics, and text generation

eBook
€20.98 €29.99
Paperback
€37.99
Subscription
Free Trial
Renews at €18.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Power BI Machine Learning and OpenAI

Requirements, Data Modeling, and Planning

You begin your journey by assessing the requirements and data for your project. The use case will be a fictional scenario, but everything will be built using real data from the Federal Aviation Administration’s (FAA) Wildlife Strike Database. The data is real, the topic can be understood by anyone, and the findings within the data are interesting and fun. According to the FAA’s website, about 47 animal strikes are reported daily by aircraft. These incidents can damage airplanes, potentially endanger passengers, and negatively impact wild animal (especially bird) populations.

For the use case, you have been assigned to provide your leadership with tools to do an interactive analysis of the FAA Wildlife Strike data, find insights about factors that influence the incidents, and also make predictions about future wildlife strike incidents and the associated costs. The primary goal of your project, predicting the future impact of FAA Wildlife Strikes, will require building some Power BI machine learning models.

Before uploading data to Power BI’s machine learning (ML) tools, you’ll need to create tables of data that will train the ML models. There is an old saying about data and analytics: “Garbage in, garbage out.” Software as a Service (SaaS) machine learning tools are easy to use, but you still need to feed them good-quality curated data. Identifying the right training data and getting it into the right format are crucial steps in an ML project.

This project will encompass data exploration, data transformation, data analysis, and additional downstream data transformations before you begin working with Power BI ML tools. You are already an experienced business intelligence (BI) professional and Power BI user, and now you are ready to take your skills to the next level with ML in Power BI!

Power BI supports connections to source data in many different formats, ranging from relational databases to unstructured sources to big flat tables of raw data. Countless books have been written about the best ways to structure and model data for different use cases. Rather than dive into the specifics of data modeling, for this book, we will begin with two simple assumptions:

  • Most of the time, a star schema design will provide the most efficient storage and query performance for business intelligence data models
  • Basic ML models, such as the ones you will build in this book, are usually created with a flattened table

Just to be clear, not every solution will follow these assumptions. Rather, these assumptions are generalizations that can provide you with a starting point as you approach the design of a new data model. Quite often, there will not be a perfect answer, and the optimal design will be dictated by the types of queries and business logic that are generated by the end consumers of the data model.

If you’ve never heard the terms star schema and flattened data before, don’t worry! The book will progress at a pace that is intended to help you learn and will also stay at a level that makes sense when you review the FAA data. Let’s browse the FAA Wildlife Strike data and decide upon the best data modeling strategy for your new project!

In this chapter, we will take the following steps so that you can understand the data, think through how it will be used, and then formulate a preliminary plan for the data model:

  • Reviewing the source data
  • Reviewing the requirements for the solution
  • Designing a preliminary data model
  • Considerations for ML

Technical requirements

For this chapter, you will need the following:

Reviewing the source data

You begin your journey by digging into the source data that you will be using for your project. Let’s get started!

Accessing the data

The source data that you will be using for this book is real data from the United States FAA. The data contains reports filed when aircraft struck wildlife. There is a website providing details, documentation, updates, and access instructions at this URL: https://wildlife.faa.gov/home. The URL (and all URLs) will also be linked from the affiliated GitHub site at https://github.com/PacktPublishing/Unleashing-Your-Data-with-Power-BI-Machine-Learning-and-OpenAI in case changes are made after this book has been published.

If you’d prefer to follow along using the finished version of the content from this chapter rather than building it all step by step, you can download the PBIT version of the file at the Packt GitHub site folder for Chapter 1: https://github.com/PacktPublishing/Unleashing-Your-Data-with-Power-BI-Machine-Learning-and-OpenAI/tree/main/Chapter-01.

Within the FAA Wildlife Strike Database website, you can navigate to this link and run basic queries against the data and familiarize yourself with the content: https://wildlife.faa.gov/search. There are also two files linked from this page that you can reference while reviewing the source data.

At the time of this book’s writing, the second heading on the web page is titled Download the FAA Wildlife Strike Database, and it has a link titled Download that allows you to download the entire historical database along with a reference file. You can download the files from the FAA site for the purposes of this book. There will also be a Power BI PBIT file containing the results of the efforts of this chapter at the GitHub repository. A PBIT file is a Power BI template that can be populated with the files that you download from the Packt GitHub site. If the FAA data ever becomes unavailable, you can still proceed with the contents from the GitHub site to recreate the contents of every chapter.

The files you will be using from the FAA are as follows:

Exploring the FAA Wildlife Strike report data

The wildlife.accdb file is in an Access file format that can be opened with many different tools including Microsoft Access, Microsoft Excel, Power BI, and many more. For the purpose of this book, you will open it using Power BI Desktop. Power BI Desktop is available as a free download at this link: https://powerbi.microsoft.com/en-us/downloads/.

  1. First, open up Power BI Desktop. Once it is open on your desktop, select the Get data drop-down menu from the ribbon and click on More… as shown in the following screenshot:
Figure 1.1 – Connecting to data with Power BI Desktop

Figure 1.1 – Connecting to data with Power BI Desktop

  1. Next, within the Get data window, select Access database and click Connect:
Figure 1.2 – Access database connector in Power BI

Figure 1.2 – Access database connector in Power BI

  1. Select the Access database file that was downloaded and unzipped from the FAA Wildlife Strike Database, named wildlife:
Figure 1.3 – The wildlife file shows up in Power BI

Figure 1.3 – The wildlife file shows up in Power BI

  1. Select the STRIKE_REPORTS table and click Transform Data:
Figure 1.4 – Preview of the data before making transformations

Figure 1.4 – Preview of the data before making transformations

  1. The Power Query window will open in Power BI Desktop with a preview of the FAA Wildlife Strike data. On the ribbon, select the View header for Data Preview, and then check the boxes for Column quality, Column distribution, and Column profile. These features will provide some insights for the data preview, that helps you explore and understand the data:
Figure 1.5 – Data Preview features in Power Query

Figure 1.5 – Data Preview features in Power Query

In Figure 1.5, notice that the first column, INDEX_NR, is highlighted. You can see that none of the values are empty, none have errors, and in Column statistics at the bottom of Figure 1.6, every value is a unique integer. The name INDEX_NR gives it away, but this column is the unique identifier for each row of data.

Let’s review another column in Power Query. Go ahead and highlight TIME_OF_DAY. As you can see in Figure 1.6, there are four distinct values and about 12% are blank. Blank values are an important consideration for this solution. Non-empty values include terms such as Day, Dawn, Dusk, and Night. What does an empty value mean? Was the field left blank by the person filing the report? Was it not entered properly into the system? You’ll revisit this topic later in the book.

Figure 1.6 – Column statistics help with understanding data

Figure 1.6 – Column statistics help with understanding data

Since there are over 100 columns in the FAA Wildlife Strike reports’ data, we won’t discuss all of them in this chapter. That being said, reviewing each and every column would be a great way to review the data for errors, empty fields, distribution of values, and more. For the purposes of this chapter, go ahead and open up the read_me.xls file that was included with the ZIP file from the FAA. The first sheet is Column Name and contains the names and descriptive data about the columns in the wildlife.accdb file. Most of the columns fall into one of the following categories:

  • Date and time fields detailing the dates, times, and years for different events related to each report
  • Descriptive information about the event such as height of contact, latitudes and longitudes, originating airports, and flight numbers
  • Descriptive information about the aircraft such as ownership, aircraft type and manufacturer, number of engines, location of engines, and so on
  • Estimates of the damage due to the strike such as costs, costs adjusted for inflation, damage location on the aircraft, and more
  • Information about the wildlife struck by the aircraft including species, size, quantities hit, and so on

Once you’ve finished browsing the report data, close the read_me.xlsx document on your desktop, and then connect to it from Power BI per the following steps. The document version used in this book can be downloaded from the Packt GitHub site here: https://github.com/PacktPublishing/Unleashing-Your-Data-with-Power-BI-Machine-Learning-and-OpenAI/tree/main/Chapter-01.

  1. Click on Excel Workbook in the left-hand panel:
Figure 1.7 – Excel Workbook is a new source of data

Figure 1.7 – Excel Workbook is a new source of data

  1. Select the read_me file from the browser and click Open:
Figure 1.8 – Excel file ready to open in Power Query

Figure 1.8 – Excel file ready to open in Power Query

  1. Tick the Aircraft Type, Engine Codes, and Engine Position boxes. Then, click OK.
Figure 1.9 – Sheets in Power Query can be individually selected

Figure 1.9 – Sheets in Power Query can be individually selected

After clicking OK and importing the three sheets, notice that Aircraft Type, Engine Codes, and Engine Position are now available in Power Query as three separate tables of data:

Figure 1.10 – Three new tables are previewed in Power Query

Figure 1.10 – Three new tables are previewed in Power Query

The three tables contain descriptive information about values that exist in the FAA Wildlife Strike reports’ data:

  • Aircraft Type: A table that maps the Aircraft Code to a description such as Airplane, Helicopter, or Glider
  • Engine Codes: Information about engine manufacturer and model numbers
  • Engine Position: Details about the location of an engine on the aircraft

For all three of these tables, you’ll notice that there are some unnecessary rows and blank values. You will address these later in the book, so there is no need to make any modifications in Power Query at this time.

Once you’ve browsed the different columns from all the different tables in Power Query, click Close & Apply to import the data into Power BI and save it, per the following diagram:

Figure 1.11 – The Close & Apply button will import data into Power BI

Figure 1.11 – The Close & Apply button will import data into Power BI

Once the data is imported, you can save your Power BI Desktop file as a .pbix file. A copy of the PBIT file named Chapter 1 Template.pbit, which can be populated with the data and then saved as a PBIX, can be found at this GitHub link: https://github.com/PacktPublishing/Unleashing-Your-Data-with-Power-BI-Machine-Learning-and-OpenAI/tree/main/Chapter-01.

Reviewing the requirements for the solution

Now that you’ve explored the FAA Wildlife Strike data, you have a better understanding of what data is available for your solution. The original assignment that you were given by your stakeholders was as follows:

  • Provide leadership with tools to do interactive analysis of the FAA Wildlife Strike data
  • Find insights about factors that influence the incidents
  • Make predictions about future wildlife strike incidents

Those requirements sound pretty vague! Now that you have a better understanding of the available data, it’s a good time to circle back with the stakeholders and clarify those requirements. You ask them questions such as the following:

  • What types of interactive analysis do you want to do?
  • Are you interested in the impact on endangered bird species?
  • Maybe you’d like to view trends due to wildlife strikes and compare airports and regions?
  • Perhaps you’d like to see the height and frequency of wildlife strikes at different times of the year?
  • Would understanding correlations to factors such as aircraft size, time of day, season, geography, and height be useful?
  • Are you interested in predicting specific risks or outcomes?

Entire books have been written about gathering requirements for data and analytics projects, and discussions on this particular topic could also be extensive. For the sake of keeping this book consumable, let’s assume that your discussions with the stakeholders led to prioritizing the following deliverables for the project:

  • Analytic report: Viewing trends over time such as number of incidents, location of incidents, height, and details such as types of aircraft and wildlife species
  • Predict damage: When a strike is reported, make a prediction as to whether there will be a cost associated with any damage
  • Predict size: When a strike is reported, make a prediction about the size of the wildlife that struck the aircraft
  • Predict height: For wildlife strikes, predict the height of the incidents

Now, you review the notes you took about the FAA Wildlife Strike data during your data exploration efforts. In doing so, you can think about how the data might match up to the use cases. Based on the requirements and your initial exploration of the data, you decide that the FAA Wildlife Strike data from the wildlife.accdb file and the tables from the read_me.xls file (Engine Codes, Aircraft Type, and Engine Position) are appropriate content to include during the initial phases of your project.

Designing a preliminary data model

Earlier in this chapter, we made two simple assumptions about data modeling:

  • Most of the time, a star schema design will provide the most efficient storage and query performance for business intelligence data models
  • Basic ML models, such as the ones you can build in this book, are usually created with a flattened table

Now that you have a grasp of the underlying data and requirements, it is time to think about the data model for your FAA Wildlife Strike data solution. Logically, you can describe your tables of data as follows:

  • STRIKE_REPORTS (from wildlife.accdb): Each row represents a report that was filed. The table of data contains both descriptive values (date, location, and type) along with values that can be summed up and averaged (height and costs).
  • Engine Codes (from read-me.xls): This contains information about the aircraft engines that can be tied to STRIKE_REPORTS.
  • Aircraft Type (from read-me.xls): This contains information about the aircraft that can be tied to STRIKE_REPORTS.
  • Engine Position (from read-me.xls): This contains information about the aircraft engine positions that can be tied to STRIKE_REPORTS.

At this point, you are faced with some data model choices. No matter what decision you make, some people might question your architecture, since there is no perfect design. Depending on how end users will use the data, the data model design may change. This book will demonstrate some of the differences in data model designs for ML models versus traditional BI designs. At a high level, there are three basic approaches you can take in Power BI:

  • Flatten: You can flatten all the data onto a single table by joining Engine Codes, Aircraft Type, and Engine Position onto STRIKE_REPORTS.
  • Star schema: You can build out a true star schema with STRIKE_REPORTS as a fact table and Engine Codes, Aircraft Type, and Engine Position as dimension tables. Some additional data from STRIKE_REPORTS would also be broken out into separate dimension tables. For example, AIRPORT_ID, AIRPORT, STATE, and FAAREGION could be separate dimension tables.
  • Hybrid design: You can build out a hybrid design using both a flattened and star schema design pattern for the sake of practicality and ease of use.

Let’s look at each of these in turn.

Flattening the data

Flattening the FAA Wildlife Strike reports’ data would require joining the Engine Codes, Aircraft Type, and Engine Position tables onto the STRIKE_REPORTS table so that everything is on one big flat table of data. The result would be something that looks like this:

Figure 1.12 – Tables on the left are combined to form a single table on the right

Figure 1.12 – Tables on the left are combined to form a single table on the right

The following table contains some, but not all, of the pros and cons of a flattened table of data:

Pros

Cons

  • Simplicity
  • No joins needed for code
  • Commonly used by data scientists
  • Can compress well with columnar databases
  • No relational data models for business users
  • Repetitive data can lead to an inefficient storage footprint
  • Limitations for queries with advanced logic
  • Less flexibility for future change and evolution of solution
  • Complex logical queries can be less efficient

Figure 1.13 – Pros and cons of a flattened table for BI

Next, let’s look at the star schema.

Star schema

A true star schema built to best practices would include relationships between Engine Codes, Aircraft Type, and Engine Position with the STRIKE_REPORTS table. It would also break off parts of the STRIKE_REPORTS table into smaller dimension tables. The following figure is a representation of the approach for a true star schema. There may be more dimension tables that would need to be broken off of the STRIKE_REPORTS table in addition to Location and Species, but this is an example of how it might look:

Figure 1.14 – Tables on the left are combined into a star schema, and some data is split off into new dimension tables

Figure 1.14 – Tables on the left are combined into a star schema, and some data is split off into new dimension tables

The following table contains some, but not all, of the pros and cons of a true star schema design:

Pros

Cons

  • Tables often line up with business logic
  • Balance of minimal data duplication and efficient queries
  • Usually expandable if the scope of the solution grows and new data is introduced
  • Traditionally considered the gold standard for BI data models
  • With modern tools, the benefits of reducing data duplication are less impactful versus older tools
  • Complicated ETL
  • Machine learning models are usually trained with flat tables of data
  • Don’t always scale well with very large data volumes having tens of billions of rows

Figure 1.15 – Pros and cons of a star schema for BI

Hybrid design

For the FAA Wildlife Strike data, combining aspects of a flattened design and a star schema is also an option. At the time of this book’s writing, the entire STRIKE_REPORTS table is fewer than 300,000 rows and has fewer than 100 columns. Only two columns contain verbose free text, so data volume is not an issue when using Power BI. For this particular use case, the differences in data storage requirements between flattened and star schema data models are minimal. With data volumes of this small size, you can design the data model to meet the needs of the solution without some of the performance concerns that would be introduced for data sources with tens of millions of rows or hundreds of columns containing free text fields. Columns of data left on a transaction table that will be used as categories are technically called degenerate dimensions. A hybrid design could look something like the following example:

Figure 1.16 – Tables from the left are enhanced and combined into a star schema with some descriptive data still in the fact table

Figure 1.16 – Tables from the left are enhanced and combined into a star schema with some descriptive data still in the fact table

The following table contains some, but not all, of the pros and cons of a hybrid design:

Pros

Cons

  • Rapid prototyping
  • Less logic in the data transformation layer
  • Flexible design
  • Possibly less performant than a star schema for traditional BI
  • Additional logical complexity for users versus a big flat table
  • Data will still need to be flattened out for machine learning
  • Not perfect for either BI or ML, but a compromise between the two

Figure 1.17 – Pros and cons of a hybrid design and considerations for additional data

Before finalizing a preliminary logical design for your FAA Wildlife Strike solution, take a step back to think about the data and the requirements. You can review the expected deliverables from earlier in the chapter, including an analytic report and predictions of damage, size, and height.

In addition to the FAA Wildlife Strike data you’ve been using, what other data might be useful for the solution? Also, what is the effort to get the data? Here are a few examples that you could research:

Additional Data Sources

Level of Effort

Date-based table of aggregations such as Month, Quarter, Season, and Holidays

Easy

Time-based table of aggregations such as hour, AM/PM, and so on

Easy

Data for flights that didn’t have a wildlife strike could provide a baseline for the percentage of flights with strikes

Difficult

Weather data that could be mapped to the date and time of wildlife strikes

Difficult

Additional data about wildlife species such as weight ranges, habitat ranges, and so on

Difficult

Figure 1.18 – Additional potential data sources for the solution

Additional flight, weather, and wildlife data could provide greater analytic and predictive value for the solution. However, adding those sources would require quite a bit of effort that exceeds the scope of your project and the length of this book. If the initial project goes well, you can circle back to these options for future enhancements.

There may be value in adding a Time table to the solution, so open up Power Query and take another look at the TIME column. Notice that 95% of the entries are empty:

Figure 1.19 – 95% of the values for TIME are empty

Figure 1.19 – 95% of the values for TIME are empty

Due to a lack of complete data, you decide to leave a Time table out of the initial build.

How about a Date table so that you can roll up data by week, month, quarter, year, holidays, weekends, and more? Looking at the INCIDENT_DATE column in Power Query, it is populated for every entry in the preview:

Figure 1.20 – INCIDENT_DATE is fully populated with date values

Figure 1.20 – INCIDENT_DATE is fully populated with date values

INCIDENT_DATE can be used as a key for connecting to a Date table containing many different date-based aggregations. You decide to pull in a Date table for the architecture. The resulting preliminary data model will now look as follows:

Figure 1.21 – A Date table is added to the preliminary data model

Figure 1.21 – A Date table is added to the preliminary data model

The Date table was not present in the source data, but in your reporting model, it will allow you to slice and dice data by day, week, month, quarter, year, weekend, and more. When you explore data in future chapters, it will add new ways to dive into and explore date-based trends. The Date table will be added in Chapter 2.

In the final section of the chapter, we’ll look at what else we need to take into account for ML.

Considerations for ML

Now that you’ve created a preliminary data model that will serve as the basis for analytic reporting in Power BI, you start thinking about a process for creating tables of data to be used with Power BI machine learning. You will need to create a single table of flattened data for each machine learning model that you train, test, and deploy.

Creating tables of data to train a machine learning model entails treating each column as a feature of the algorithm that you will be training and then using to make predictions. For example, if you wanted to create a machine learning algorithm that predicts whether something is an insect, the features (ML terminology for columns on a single table) might be [Six Legs Y/N?], [Life Form Y/N?], [Count of Eyes], and [Weight], and then a column that will be predicted, such as [Insect Y/N?]. Each row would represent something that is being evaluated for a prediction to answer the question, “Is this an insect?”

You decide to take the following approach, in the following order, so that you can do everything within Power BI:

  1. Data exploration and initial data model creation in Power BI Desktop Power Query.
  2. Analytic report created in Power BI.
  3. Feature discovery in Power BI.
  4. Create training data sets in Power Query.
  5. Move training data sets to Power BI dataflows.
  6. Train, test, deploy a Power BI machine learning model in Power BI dataflows.

This process is shown in Figure 1.22.

Figure 1.22 – All of the ETL (extract, transform, load) will happen in Power BI Power Query and Power BI dataflows

Figure 1.22 – All of the ETL (extract, transform, load) will happen in Power BI Power Query and Power BI dataflows

Power BI ML offers three different types of predictive model types. Those types, as defined in the Power BI service, are as follows:

  • A binary prediction model predicts whether an outcome will be achieved. Effectively, a prediction of “Yes” or “No” is returned.
  • General classification models predict more than two possible outcomes such as A, B, C, or D.
  • A regression model will predict a numeric value along a spectrum of possible values. For example, it will predict the costs of an event based on similar past events.

As part of your preliminary planning, you consider how these options could map to the deliverables that were prioritized by your stakeholders:

  • Analytic report: This deliverable will be a Power BI analytic report and could use some Power BI AI features, but it will not be a Power BI ML model. The analytic report will help you explore and identify the right data for Power BI machine learning models.
  • Predict damage: Predicting whether or not damage will result from a wildlife strike is a good match for a binary prediction model since the answer will have two possible outcomes: yes or no.
  • Predict size: Predicting the size of the wildlife that struck an aircraft based upon factors such as damage cost, damage location, height, time of year, and airport location will probably have multiple values that can be predicted such as Large, Medium, and Small. This requirement could be a good fit for a general classification model.
  • Predict height: This deliverable predicts the height at which wildlife strikes will happen and provides that prediction as a numeric value representing height above ground level in feet. It is likely a good fit for a regression model, which predicts numeric values.

There is no way of knowing with certainty whether the FAA Wildlife Strike data will support these specific use cases, but you won’t know until you try! Discovery is a key part of the process. First, you must identify features in the data that might have predictive value, and then train and test the machine learning models in Power BI. Only then will you know what types of predictions might be possible for your project.

Summary

In this chapter, you explored the data available for your project and reviewed subsequent options for mapping data to the requirements of your stakeholders. You reviewed data architecture options to meet both business intelligence and ML requirements in Power BI and decided upon a hybrid approach that blends a star schema design with flattened data. You also formulated a plan to explore, analyze, design, build, and deploy your solution. Finally, you decided upon three use cases for predictive ML models in Power BI.

In the next chapter, you will ingest and prep data from the FAA Wildlife Strike database using Power Query within Power BI. You’ll deep dive into data characteristics, decide what is needed for your design, and build out a flexible foundation that will support both the current project and future iterations and changes. Your approach in Power Query will support both business intelligence analytics in Power BI and predictive analytics in Power BI ML.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Take flight with Power BI machine learning and OpenAI using hands-on examples from the FAA airline data
  • Unlock the full potential of Power BI for advanced analytics using OpenAI
  • Design stunning data presentations, seamless integration of machine learning tools and technologies with OpenAI

Description

Microsoft Power BI is the ultimate solution for businesses looking to make data-driven decisions and unlock the full potential of their data. Unleashing Your Data with Power BI Machine Learning and OpenAI is designed for data scientists and BI professionals seeking to improve their existing solutions and workloads using AI. The book explains the intricacies of the subject by using a workshop-style data story for data ingestion, data modeling, analytics, and predictive analytics with Power BI machine learning. Along the way, you’ll learn about AI features, AI visuals, R/Python integration, and OpenAI integration. The workshop-style content allows you to practice all your learnings in real-life challenges and gain hands-on experience. Additionally, you’ll gain an understanding of AI/ML, step by step, with replicable examples and references. From enhancing data visualizations to building SaaS Power BI ML models, and integrating Azure OpenAI, this book will help you unlock new capabilities in Power BI. By the end of this book, you’ll be well-equipped to build ML models in Power BI, plan projects for both BI and ML, understand R/Python visuals with Power BI, and introduce OpenAI to enhance your analytics solutions.

Who is this book for?

This book is for data science and BI professionals looking to expand their skill sets into Power BI machine learning and OpenAI. This book is also useful for data scientists, data analysts, and IT professionals who want to learn how to incorporate OpenAI into Power BI for advanced experience.

What you will learn

  • Discover best practices for implementing AI and ML capabilities in Power BI using OpenAI
  • Understand how to integrate OpenAI and cognitive services into Power BI
  • Explore how to build a SaaS auto ML model within Power BI
  • Gain an understanding of R/Python integration with Power BI
  • Enhance data visualizations for ML feature discovery
  • Discover how to improve existing solutions and workloads using AI and ML capabilities in Power BI with OpenAI
  • Acquire tips and tricks for successfully using AI and ML capabilities in Power BI using OpenAI

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : May 31, 2023
Length: 308 pages
Edition : 1st
Language : English
ISBN-13 : 9781837636150
Vendor :
Microsoft
Category :
Languages :
Tools :

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : May 31, 2023
Length: 308 pages
Edition : 1st
Language : English
ISBN-13 : 9781837636150
Vendor :
Microsoft
Category :
Languages :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
€18.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
€189.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
Feature tick icon Exclusive print discounts
€264.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total 107.97 116.97 9.00 saved
The Future of Finance with ChatGPT and Power BI
€24.99 €33.99
Expert Data Modeling with Power BI, Second Edition
€44.99
Power BI Machine Learning and OpenAI
€37.99
Total 107.97 116.97 9.00 saved Stars icon
Banner background image

Table of Contents

20 Chapters
Part 1: Data Exploration and Preparation Chevron down icon Chevron up icon
Chapter 1: Requirements, Data Modeling, and Planning Chevron down icon Chevron up icon
Chapter 2: Preparing and Ingesting Data with Power Query Chevron down icon Chevron up icon
Chapter 3: Exploring Data Using Power BI and Creating a Semantic Model Chevron down icon Chevron up icon
Chapter 4: Model Data for Machine Learning in Power BI Chevron down icon Chevron up icon
Part 2: Artificial Intelligence and Machine Learning Visuals and Publishing to the Power BI Service Chevron down icon Chevron up icon
Chapter 5: Discovering Features Using Analytics and AI Visuals Chevron down icon Chevron up icon
Chapter 6: Discovering New Features Using R and Python Visuals Chevron down icon Chevron up icon
Chapter 7: Deploying Data Ingestion and Transformation Components to the Power BI Cloud Service Chevron down icon Chevron up icon
Part 3: Machine Learning in Power BI Chevron down icon Chevron up icon
Chapter 8: Building Machine Learning Models with Power BI Chevron down icon Chevron up icon
Chapter 9: Evaluating Trained and Tested ML Models Chevron down icon Chevron up icon
Chapter 10: Iterating Power BI ML models Chevron down icon Chevron up icon
Chapter 11: Applying Power BI ML Models Chevron down icon Chevron up icon
Part 4: Integrating OpenAI with Power BI Chevron down icon Chevron up icon
Chapter 12: Use Cases for OpenAI Chevron down icon Chevron up icon
Chapter 13: Using OpenAI and Azure OpenAI in Power BI Dataflows Chevron down icon Chevron up icon
Chapter 14: Project Review and Looking Forward Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.8
(9 Ratings)
5 star 77.8%
4 star 22.2%
3 star 0%
2 star 0%
1 star 0%
Filter icon Filter
Top Reviews

Filter reviews by




Vinod Sangare Aug 07, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This unique book promises to take us on an impactful learning journey at the intersection of business intelligence, machine learning, and artificial intelligence. The author has crafted an applied, hands-on approach using real-world FAA data that allows Power BI professionals to skill up on ML and AI in a practical workshop format. Rather than a technical manual on the specifics of Power BI ML, the goal is a holistic educational experience about bringing together BI, ML, and AI to prepare for the future of data analytics.This book fills an important gap by using a full project walkthrough to teach Power BI users about incorporating ML in the context of actual workflows. The inclusion of OpenAI provides a cutting-edge capstone to the journey. The complimentary GitHub repository allows readers to recreate the entire end-to-end solution along the way.The author has struck a good balance between providing beginner-friendly ML training for BI professionals using Power BI ML tools and setting appropriate expectations around production-level ML rigor. Cautioning against over-reliance on no code BI ML tools for highly accurate use cases is wise guidance.Overall, this book appears well-positioned to advance analytics skill sets for BI professionals on ML and AI while also allowing ML experts to gain valuable BI project perspective. The blend of technical training and conceptual discussion about bringing together different data disciplines sets it apart from more narrowly specialized analytics books. Following the preface, one anticipates an educational experience that will provide immediately applicable new skills combined with thought leadership about data convergence. For BI professionals looking to expand their capabilities or ML experts seeking cross-disciplinary learnings, this book represents an incredibly valuable opportunity. The author has laid out a compelling vision and roadmap in the preface for a book that should empower readers to flourish in the future of analytics.
Amazon Verified review Amazon
Bob Jun 09, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Wonderfully readable step by step guide through the most contemporary world of AI. Highlights the usage of data management to better our world. The example using aircraft safety prediction using available metrics is profoundly interesting and suggestive of an unlimited positive future for Microsoft data management methods. The author presents the material in a most readable and likeable style.
Amazon Verified review Amazon
tt0507 Aug 04, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
The book provides a comprehensive guide on using Power BI and combines them with machine learning. The books are a great way to practice implementing pipelines from machine learning to visualization. GitHub repo can be used for review and provides sufficient detail.
Amazon Verified review Amazon
JBR 2265 Jun 23, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Solid introduction to predictive analytics and an interesting use of OpenAI. I appreciated the GitHub component containing content to recreate all the lessons.
Amazon Verified review Amazon
Yiqiao Yin Sep 13, 2023
Full star icon Full star icon Full star icon Full star icon Full star icon 5
The comprehensive nature of "Unleashing Your Data with Power BI Machine Learning and OpenAI" establishes it as an indispensable resource for those at the forefront of data analysis and interpretation. Using the FAA airline data as its primary case study, the text provides an intricate exploration of the synergies between Microsoft's Power BI and OpenAI. The significance of this integration can hardly be overstated, as it has the potential to redefine the paradigms of data-driven decision-making in contemporary business environments.The book adopts a workshop-style narrative that meticulously guides readers through the multifaceted processes of data ingestion, modeling, analytics, and predictive analysis. In doing so, it unveils the intricate landscape of AI features and their utility. Particularly noteworthy is the attention given to the integration of R and Python, and the pivotal role of OpenAI. It is a testament to the authors' dedication that the text manages to strike a delicate balance between theoretical rigor and practical application.By the conclusion of this tome, readers are not merely equipped with a cursory understanding of Power BI and OpenAI; rather, they are bestowed with a profound mastery over the orchestration of BI and ML. The expertise gained extends to the nuanced implementation of R/Python visuals, positioning practitioners at the vanguard of their respective fields.This text is primarily tailored for professionals in data science, business intelligence, and IT sectors. However, its depth and scope ensure that even those outside these disciplines, but with a vested interest in advanced data methodologies, will find it a worthy addition to their libraries. In summation, "Unleashing Your Data with Power BI Machine Learning and OpenAI" is a scholarly triumph that promises to be a seminal reference for years to come.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.