Lab 1 - Accessing and Preparing Data
Lab 1 - Accessing and Preparing Data
Dashboard in a Day
Lab 1
• How to load data from Microsoft Excel and Comma-Separated Values (CSV) sources
• How to manipulate the data to prepare it for reporting
• How to prepare the tables in Power Query and load them into the model
Learning these steps will prepare you for the modeling exercises in Lab 2. Additionally, the results of this
lab will be the starting point for Lab 2.
Power BI Desktop
Power BI Desktop – Accessing Data
In this section, you will import VanArsdel’s and its competitors’ USA sales data. You will then import and
merge sales data from other countries.
USA sales data is in a CSV file located in the Usages subfolder within the Data folder (/Data/USSales).
Sales of all other countries is in the InternationalSales subfolder within the Data folder
(/Data/InternationalSales). Each country’s sales data is in a CSV file in this folder.
Product, Geography, and Manufacturer information is in a Microsoft Excel file called bi_dimensions.xlsx in
the USSales subfolder within the Data folder (/Data/USSales/).
The Manufacturer sheet has data laid out across the sheet, no column headers, several blank rows, and a
note in row seven.
The Geo sheet has the geography information. The first few rows have data details. The actual data starts
within row four.
After reviewing the bi_dimensions.xlsx file, close Microsoft Excel. We will start by connecting data from
these different sheets, and then perform data cleaning and transformation operations.
2. If you don’t already have the Power BI Desktop open, launch it now.
3. If you have not signed into the Power BI Desktop, select the Get started option.
4. Sign in using your Power BI credentials.
5. You will see the startup screen open. Select the X in the top right corner of the dialog box to close it.
6. From the ribbon, select File, then choose Options and settings. Then, select Options.
7. Within the pane to the left of the Options dialog box, select Regional Settings under Current File.
8. From the Locale drop-down, select English (United States).
9. Then, select OK to close the dialog box.
Note: Power BI Desktop has the capability to connect to 300+ data sources. The
newest source to be added is Datamarts. You will not be using Datamarts in this
class.
We are using CSV and Excel data files in this lab for simplicity. If you would like a full list of data sources,
please visit this link: https://docs.microsoft.com/en-us/power-bi/connect-data/desktop-data-sources
10. From the ribbon at the top of the screen, select the Home tab. Then, choose the Get Data drop-
down.
11. Select Text/CSV from the options list.
Note: If your folder appears empty then this likely means you forgot to unzip your
class files. Navigate to your location where you saved the class files and unzip the
files by right-clicking on the .zip file, then select Extract All.
Power BI detects the data type within each column. There are options to detect the data type based on
the first 200 rows, based on the entire dataset or to not detect the data. Since our dataset is large and it
will take time and resources to scan the complete dataset, we will leave the default option of selecting
the dataset based on the first 200 rows.
After completing your selection, you have three options – Load, Transform Data or Cancel.
• Load adds the data from the source into Power BI Desktop for you to start creating reports.
• Transform Data allows you to perform data shaping operations such as merging columns, adding
additional columns, changing data types of columns as well as bringing in additional data.
• Cancel returns you back to the main canvas.
14. Within the Sales.csv dialog window, select the Transform Data button.
15. Notice that Power BI has set the Zip field to the data type Whole Number. To ensure that the leading
zero is not dropped from Zip codes that start with zero, we will format them as Text. To do this, select
the Zip column. Then, from the ribbon, select the Home tab. From the menu at the top of the screen,
within the Transform group, select the Data Type drop-down and choose the Text option.
IMPORTANT!
Missing these last two steps can impact your experience later.
Now, let’s get the data that is in the Excel source file called bi_dimensions.xlsx.
17. From the ribbon at the top of the screen, select the Home tab, choose the New Source drop-down,
and then select Excel Workbook.
18. Browse to the DIAD folder, double-click Data, double-click the USSales folder, and then select
bi_dimensions.xlsx.
20. The Navigator dialog box lists three sheets that are within the Excel workbook. It also lists the
Product_Table which is a pre-defined Excel table.
Note: Excel Tables are differentiated from Worksheets by using different icons.
21. From the pane to the left, select the box to the left of geo. In the preview pane, notice that the first
few rows are headers and are not part of the data. We will remove them shortly.
22. From the left pane, select the box to the left of manufacturer. In the preview pane, notice that the
last couple of rows are footers and are not part of the data. We will remove them shortly.
23. From the left pane, select the box to the left of Product_Table. Notice the different icon indicates
that this data is stored in an Excel table.
25. From the Home tab of the Query Editor, select the New Source drop-down menu.
26. Select More… from the options list. The Get Data dialog box will appear.
29. Within the Folder dialog box, select the Browse… button.
30. In the Browse For Folder dialog box, navigate to the location where you unzipped the class files.
31. Open the DIAD folder.
32. Open the Data folder.
33. Select the InternationalSales folder.
34. Select OK to close the Browse for Folder dialog box.
35. Select OK to close the Folder dialog box.
The selected folder dialog box will display the list of files within the folder.
36. Select the Combine & Transform Data button at the bottom of the dialog box.
Note: The data in your file for Date accessed, Date modified, and Date created
might be different than the dates displayed in the screenshot above.
The Combine Files dialog box will open. By default, Power BI will again detect the data type based on the
first 200 rows. Notice there is an option to select various file Delimiters. The file we are working with is
Comma delimited, so let’s leave the default Delimiter option as Comma.
There is also an option to select each individual file in the folder (using Example File drop-down) to
validate the format of the files.
37. Select the OK button located at the bottom of the dialog window.
Notice that the Zip column is of the Whole Number type. Based on the first 200 rows, Power BI thinks the
Zip column consists of whole numbers. But zip code could be alpha numeric in some countries or regions
or contain leading zeros. If we do not change the data type, we will receive an error when we load the
data shortly. So, let’s change the Zip column to data type Text.
41. Select the Zip column within the InternationalSales query, and then change the Data Type to Text
using the drop-down under the Home tab.
42. The Change Column Type dialog box will open. Select the Replace Current button when prompted.
43. We do not need the Source.Name column within the InternationalSales query. Select the
Source.Name column and from the ribbon, select the Home tab. Choose the Remove Columns drop-
down, and then select Remove Columns again.
You will now see that Australia, Canada, Germany, Japan, Mexico, and Nigeria
are all selected.
• If formula bar is disabled, you can turn on the formula bar from the View ribbon. This enables you to
see the “M” code generated by each click within the ribbons.
• Select the options available within the ribbon - Home, Transform, Add Column, and View - to review
the various features available.
44. Within the Queries pane, minimize the folder called Transform Files from InternationalSales.
45. Select each query name in the Other Queries section as you rename them in the next step.
46. Navigate to Query Settings pane to the right of the screen, and then the Properties section to rename
the queries. Rename each query listed within the Queries pane to the left of the screen using the new
names listed below. You will type the new name within the Name property of the Query Settings pane
and then hit Enter on your keyboard. Notice that once the query has been named, it will also change
within the Queries pane to the left of the screen.
47. With the Product query selected from the Queries pane, select the Category column.
Notice how all the null values are filled with the appropriate Category values.
Note: The fill down operation takes a column and traverses through the values in
it to fill any null values in the next rows until it finds a new value. This process
continues on a row-by-row basis until there are no more values in that column.
52. Within the dialog box, ensure that Custom is selected in the Select or enter delimiter drop-down
menu.
53. Notice that within the text box, there is a hyphen (-). Power BI assumes we want to split by hyphen.
Remove the hyphen symbol and enter the pipe symbol (|).
Note: The pipe symbol is located in the upper right-hand corner of the keyboard
below the Backspace button.
Note: If the delimiter occurs multiple times, the Split at section provides the
option to split only once (either left most or right most) or the option to split the
column on each occurrence of the delimiter.
In this scenario, the delimiter occurs only once, therefore the Product column is
split into two columns.
55. Select the Product.1 column, and then right-click next to the column name.
56. Choose Rename… from the options menu.
57. Rename the field to Product.
61. Within the first row of the newly added Column1, enter the first Price value, 412.13, and hit Enter on
your keyboard.
Notice after you hit Enter, Power BI knows that you want to split the Price column. The formula Power BI
uses is displayed as well.
62. Double-click the column header Text After Delimiter to rename it.
63. Rename the column to MSRP and select OK to apply the changes.
Notice that the MSRP field has a Data Type of Text. The Data Type that it needs to be is Decimal. Let’s
change it.
64. Select the ABC icon to the left of the MSRP column header.
65. From the menu, select Fixed Decimal Number. Notice that all the steps we performed on the Product
query are being recorded under APPLIED STEPS in the right panel.
67. Within the first row of the newly added Column1 enter the first Currency value as USD and then hit
Enter on your keyboard.
68. Rename the column header Text Before Delimiter to now be named Currency.
69. Select Ok to apply the changes.
Notice that after you hit Enter, Power BI knows you want to split the Price column. The formula it uses is
displayed above as well.
Now that we have split the Price column into the MSRP and Currency columns, we no longer need the
original Price column. Let’s remove it.
70. From the Queries pane to the left of the screen, select the Product Query.
71. Right-click on the Price column.
75. The Remove Top Rows dialog box opens. Enter 2 in the text box since we want to remove the top
informational data row and the blank second row.
76. Then, select OK.
77. With the Geography query selected in the Queries pane, from the ribbon at the top of the screen,
select the Home tab, and then choose Use First Row as Headers.
With that step, Power BI will predict the data type of each field again. Notice that the column Zip was
changed to the Number Data Type. Let’s change it to Text again as we did earlier. If we don’t, we will see
errors when we load the data.
78. Select the 123 icon to the left of the Zip column header. From the options menu, select Text.
80. From the Queries pane, select the Manufacturer query. Notice the bottom three rows are not part of
the data. Let’s remove them.
81. From the ribbon, select the Home tab, choose the Remove Rows drop-down, and then select Remove
Bottom Rows.
82. The Remove Bottom Rows dialog box opens. Enter 3 in the Number of rows text box.
83. Then, select OK.
Notice that this transposes the data into columns. Now we need the first row to be the header.
86. From the ribbon at the top of the screen, select the Home tab, and then choose the Use First Row as
Headers button.
Also, notice that with the Query Settings pane, under APPLIED STEPS, you will see the list of
transformations and steps that have been applied. You can navigate through each change made to the
data by selecting the step. Steps can also be deleted by choosing the X that appears to the left of the step.
The properties of each step can be reviewed by selecting the gear to the right of the step.
87. Within the Queries pane to the left of the screen, select the Sales query.
88. From the ribbon at the top of the screen, select the Home tab, and then choose Append Queries.
89. The Append dialog box opens. There is an option to append Two tables or Three or more tables.
Leave Two tables selected since we are appending just two tables.
90. From the Table to append drop-down, select International Sales. Then, select OK.
You will see null values within the Country column by default for the Sales table rows because that
column did not exist for the table with USA data. We will now add the value USA as a data shaping
operation.
91. From the ribbon at the top of the screen, select the Add Column tab, and then choose Conditional
Column.
92. In the Add Conditional Column dialog box, enter the name of the column as CountryName.
93. Select Country from the Column Name drop-down menu.
94. Choose equals from the Operator drop-down menu.
95. Enter null in the Value text box.
96. Enter USA in the Output text box.
97. Select the drop-down menu under Else and then choose the Select a column option.
98. Choose Country from the column drop-down menu.
99. Then select OK.
Note: A common mistake on the previous step is that the Else may not being set correct. Please double
check your Else part of the conditional column matches the screenshot above.
100. You will see the CountryName column in the Query editor window. Notice that within the APPLIED
STEPS list, it has added to the list the action you just completed.
The original Country column containing the null values is no longer needed and can be removed from
the final table for analysis.
101. Right-click on the Country column and select Remove from the options menu.
Note: The difference between a Fixed decimal number and a Decimal number is related to the length
and precision of the decimal places. https://learn.microsoft.com/en-us/power-bi/connect-
data/desktop-data-types#number-types
When the data is refreshed, it will process through all the APPLIED STEPS that you have created.
The newly named Country column will have names for all countries, including the USA. You can validate
this by selecting the drop-down menu next to the Country column to see the unique values.
105. At first, you will only see USA data. Select the drop-down arrow to the right of the Country column
header. Select Load more to validate your data from all seven countries.
106. Select Cancel to close this filter. You do not need to apply this filter to the data.
107. Ensure that the Sales query is selected within the Queries pane to the left of the screen. Select the
drop-down to the right of the Date column header.
108. Select Date Filters and then choose In the Previous…
109. The Filter Rows dialog box opens. Enter 4 in the first text box to the right of is in the previous.
110. Select years from the drop-down menu displaying the intervals.
Note: This is saying that we want the three years prior to the current year.
111. Then, select OK.
Now that the International Sales data is appended to the Sales query, in order to avoid duplicating data
we should suppress the International Sales table from loading into the data model.
112. From the Queries pane to the left of the screen, select the International Sales query.
Note: The appropriate data from the International Sales table will load onto the Sales
table each time the model is refreshed. By removing the International Sales table, we are
preventing duplicate data from loading into the model and increasing its file size. In some
instances, storing very large amounts of data affects the data model performance.
114. You may receive a message about Possible Data Loss Warning. If so, select Continue when this
warning appears.
115. Next from the ribbon, select the View tab and then choose Query Dependencies.
This opens the Query Dependencies dialog box. The dialog box shows the source of each query and its
dependencies. For example, we see that the Sales query has a CSV file source and a dependency on the
International Sales query. This is a useful information to share knowledge with your team members.
Note: That you can zoom in and out of the Query Dependencies view as needed.
You have now successfully completed import and data shaping operations and are ready to load the data
into the Power BI Desktop data model to visualize the data.
117. From the ribbon at the top of the screen, select the File tab, then choose Close & Apply. This will
close out the power query window and apply all changes
118. Once the data has finished loading, select the File tab from the ribbon at the top of the screen.
119. Then, from the options menu to the left, select Save to save the file. Name the file as
MyFirstPowerBIModel. Save the file within the DIAD Reports (\DIAD\Reports) folder.
Here are a few more resources that will help you with your next steps with Power BI.
• Getting started: http://powerbi.com
• Power BI Desktop: https://powerbi.microsoft.com/desktop
• Power BI Mobile: https://powerbi.microsoft.com/mobile
• Community site https://community.powerbi.com/
• Power BI Getting started support page:
https://support.powerbi.com/knowledgebase/articles/430814-get-started-with-power-bi
• Support site https://support.powerbi.com/
• Feature requests https://ideas.powerbi.com/forums/265200-power-bi-ideas
• New ideas for using Power BI https://aka.ms/PBI_Comm_Ideas
• Power BI Courses http://aka.ms/pbi-create-reports
• Power Platform https://powerplatform.microsoft.com/en-us/instructor-led-training/
• Power Apps Business Apps | Microsoft Power Apps
• Power Automate Power Automate | Microsoft Power Platform
• Dataverse What is Microsoft Dataverse? - Power Apps | Microsoft Docs
The technology/functionality described in this demo/lab is provided by Microsoft Corporation for purposes of
obtaining your feedback and to provide you with a learning experience. You may only use the demo/lab to
evaluate such technology features and functionality and provide feedback to Microsoft. You may not use it
for any other purpose. You may not modify, copy, distribute, transmit, display, perform, reproduce, publish,
license, create derivative works from, transfer, or sell this demo/lab or any portion thereof.
FEEDBACK. If you give feedback about the technology features, functionality and/or concepts described in
this demo/lab to Microsoft, you give to Microsoft, without charge, the right to use, share and commercialize
your feedback in any way and for any purpose. You also give to third parties, without charge, any patent rights
needed for their products, technologies and services to use or interface with any specific parts of a Microsoft
software or service that includes the feedback. You will not give feedback that is subject to a license that
requires Microsoft to license its software or documentation to third parties because we include your feedback
in them. These rights survive this agreement.
MICROSOFT CORPORATION HEREBY DISCLAIMS ALL WARRANTIES AND CONDITIONS WITH REGARD TO
THE DEMO/LAB, INCLUDING ALL WARRANTIES AND CONDITIONS OF MERCHANTABILITY, WHETHER
EXPRESS, IMPLIED OR STATUTORY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT.
MICROSOFT DOES NOT MAKE ANY ASSURANCES OR REPRESENTATIONS WITH REGARD TO THE ACCURACY
OF THE RESULTS, OUTPUT THAT DERIVES FROM USE OF DEMO/ LAB, OR SUITABILITY OF THE INFORMATION
CONTAINED IN THE DEMO/LAB FOR ANY PURPOSE.
DISCLAIMER
This demo/lab contains only a portion of new features and enhancements in Microsoft Power BI. Some of the
features might change in future releases of the product. In this demo/lab, you will learn about some, but not
all, new features.