Chapter 11: Creating a Basic Custom Function – Learn Power Query

Chapter 11: Creating a Basic Custom Function

In Chapter 9, Working with M, we introduced the M Power Query language and learned how to to use and write the M syntax, including steps to reveal a list of functions and definitions in Power Query.

This chapter will take you through the steps to create functions manually using the M functional language in Power Query, as well as how to create a date and time column using functions.

In this chapter, we're going to cover the following main topics:

  • Creating a function manually using M
  • Creating a DateTime column using three M functions

Technical requirements

You should be familiar with the content that we covered in Chapter 9, Working with M, and Chapter 10, Examples of M Usage, before delving into the topics covered here. It goes without saying that you should be proficient at importing various data sources into Excel or Power BI and be comfortable with the Power Query interface, being able to navigate it with ease. You need an internet connection to download the relevant files from GitHub. Also, the code files for this chapter can be found at the following link:

https://github.com/PacktPublishing/Learn-Power-Query

For Code in Action video of this chapter please visit the following link: https://www.youtube.com/watch?v=arwkny-WhIk&list=PLeLcvrwLe186O_GJEZs47WaZXwZjwTN83&index=12&t=0s

Creating a function manually using M

Just like Excel, Power Query (and Power BI) has many functions that you can use to prepare or transform data. These are, of course, updated regularly by developers. When we use icons and set options in Power Query, the program works hard behind the scenes to generate the code to provide you with results to any functions you may apply to the data.

We can display a list of M functions using the #shared code in the Power Query formula bar, as explained in Chapter 9, Working with M, in the Using #shared to return library functions section. We can use this to build our data queries, but sometimes, we need to construct personalized functions to make our lives easier and less complicated. By less complicated, we mean being able to address repetitive tasks with one action instead of multiple actions or construct them so that they can be applied to many different queries or arguments. We will use a manual method of creating a custom function by constructing a step manually in the advanced editor to transform a query into a function.

In this section, we will create a function that can be applied to any file you need and that you can produce a set of data from. Imagine we work with monthly data; for instance, say we receive sales figures each month for our company's sales representatives and we need to export this data regularly to update a yearly projection. We can get Power Query to do this for us by creating a custom function. The values, of course, will be different in each monthly workbook depending on the scenario. This is because we use each scenario to create a function with one parameter to export.

Here, we will learn how to edit a query to change the file path, and then duplicate the query and work with Advanced Editor to convert code into a parameter to turn it into a custom function. We will then test and invoke the function by connecting to a folder and then display the results in a table:

Figure 11.1 – Graphical representation showing how multiple results are transformed into one output table using a single parameter

For this example, we need a Power BI file called ChoklatoFlakSales.pbix, which includes the MaySales query.

You can, however, start from scratch and import an Excel workbook named May-ChoklatoFlakSales into Power BI if you prefer. Excel workbooks have exactly the same structure and format. For this example, we have used the May-ChoklatoFlakSales.xlsx, June-ChoklatoFlakSales.xlsx, and July-ChoklatoFlakSales.xlsx workbooks.

Let's get started. We will break our scenario down into three main steps.

Changing the file path of the query to a local path

As you will be using code files for this example, the path to the exercise files needs to be updated to your local computer location. This is to ensure that when opening the existing Power BI file (or the completed exercise file, for that matter) containing the query example, the software will try and load the tables from the new path. Let's see how to go about this:

Note

You can also load the workbook directly to Power BI from the GitHub path, should you prefer to do so.

  1. Open Power BI Desktop, and then launch the file called ChoklatoFlakSales.pbix.
  2. Within the Power BI file, we can see the MaySales query already in Power BI.
  3. We need to transform our data using Power Query. Action Power Query by selecting Home | Transform Data.
  4. Click on the gear icon next to Source in the APPLIED STEPS pane.
  5. You will see that there is an error displayed on the main window as the file path is not recognized, and so the query will not load:

    Figure 11.2 – Displayed error due to incorrect file path on query settings in Power Query

  6. Click on Browse… to change the file path. Navigate to the location on the computer where the May-ChoklatoFlaksales.xlsx workbook resides. Select the file to update the path, and then click on Open. Click on the OK command to change the path:

    Figure 11.3 – Updating the file path in Power Query

  7. The query is now updated in Power Query and the query will load.

Next, let's see how to create a function from a query manually.

Creating the function manually

We now need to transform our query so that it becomes a function, after which we will assign a parameter to the query function. So, what is a parameter? Parameters were explained in Chapter 10, Examples of M Usage, in the Using parameters section.

Let's see how to transform the query:

  1. We will firstly duplicate the query called MaySales in Power Query.
  2. Right-click on the MaySales query and choose Duplicate from the drop-down menu provided.
  3. A copy of the query is now visible under the existing MaySales query, named MaySales(2).
  4. Open Advanced Editor to edit the code for the query by selecting View | Advanced Editor:

    Figure 11.4 – Accessing Advanced Editor in Power Query

  5. We now need to select part of the code to edit into a parameter so that we are able to run this code on any data source within the source folder. The following screenshot identifies the portion of the code that we will need to change via the highlighted code. The highlighted portion is currently the source path location of the data source we imported into Power BI Desktop:

    Figure 11.5 – The highlighted code that we will turn into a parameter query using Advanced Editor

  6. To make the change to the source path, we will add the following line of code above the let syntax:

    (FileBinary as binary) as table =>

    This is shown in the following screenshot:

    Figure 11.6 – The Advanced Editor view edited to include the parameter code above the let call

  7. Click on Done to change the highlighted code into a parameter query. Notice that because we edited the code, the MaySales(2) query has turned into a parameter query. We can identify the parameter query thanks to the fx icon located to the left of the query name:

Figure 11.7 – The parameter query applied to the query function

In this section, we learned how to edit a query using Advanced Editor to create a parameter query. Now, let's test the parameter function that we have created in the next section.

Testing the parameter function

Here, we need to ensure that the parameter query works for us by linking it to a source folder so that it can do its job. Remember that all the files in the folder need to be of the same file type, otherwise you may encounter an error. Let's see how we test the function:

  1. Click on Home | New Source | More… | Folder | Connect:

    Figure 11.8 – Creating a new folder connection

  2. Then, click on Browse… to find the file path you wish to connect to:

    Figure 11.9 – Selecting the folder path

  3. Click on OK to accept the folder.
  4. Then, select Transform Data to add a new query to the Power Query interface:

    Figure 11.10 – Choosing Transform Data to add the folder connection to Power Query

    We will now use the Invoke Custom Function option to apply our function to the files that we have added to the folder, as selected in the previous step. Essentially, what we are getting here is a new column with the parameter query, which will invoke the function.

  5. Click on Add Column | Invoke Custom Function.
  6. Provide a name for the new column in the New column name field. For this example, we have used SalesInvoke. Choose MaySales (2) as the function query. Leave the FileBinary option as Content, as shown:

    Figure 11.11 – Invoking a custom function

  7. Click on OK to continue.
  8. The new column, SalesInvoke, is added to the query:

    Figure 11.12 – The new column is added to the query

  9. Click on the SalesInvoke column to view the results for each of the source files:

    Figure 11.13 – Clicking on the whitespace in the last column will display the content of the file in the window below

  10. Please note that if you happen to have file formats within the source folder that are not the same as the file type you are invoking, you could encounter an error. For example, we are invoking an Excel file format in this example. If I had .csv files in this folder, or any other file type, then I would receive errors in certain cells because of the file type difference. Remove the files from the folder before invoking the function.
  11. Remove any columns you do not need for the custom function. Observe the following screenshot:

    Figure 11.14 – Removing unnecessary columns from the query

    In this example, I have removed all the columns except Name and SalesInvoke. You should now be left with the following columns in the query:

    Figure 11.15 – The columns that remain after removing them from the query

  12. The final step in this process is to append all the data from the Excel workbooks into a single workbook so that we see the May, June, and July sales in one combined result:

    Figure 11.16 – Clicking on the double composite arrows (expand) will show the append options

  13. Click on OK to view the result.

    The beauty of this process is that any new datasets added to the folder path will be appended to the table. All you need to do is refresh the data connection.

  14. Add the August-ChoklatoFlakSales.xlsx workbook to the folder path and watch the magic happen:

    Figure 11.17 – The August workbook is added to the folder

  15. Click on the Refresh All icon on the Home tab of the Power Query interface to refresh the data. You will now see that the August sales have been added to the dataset:

Figure 11.18 – Clicking on the Refresh All icon to pull in the new dataset

It is important to note that there are a number of methods to achieve the result in the preceding example. The steps you have followed so far give you a good indication of the magic that Power Query can provide when you need to update a workbook by appending datasets. It is necessary to understand this process using the manual method, as used here. This will benefit you more than clicking on a single button and not knowing how the application works behind the scenes, using functions and parameters. The shorter method uses the Combine Files icon on the Home tab:

Figure 11.19 – The Combine Files method

If you are an experienced user of M, you will be able to create the code for all of the preceding steps without using the interface as we have done in this example. I prefer to use the manual method so that I understand what I am doing more thoroughly than having to switch between code and testing all the time.

You can also use the Manage Parameters icon in the Home tab to create a binary parameter, after which you could create a reference by right-clicking on the binary file. The next step would be to create the custom function by right-clicking on the created parameter query and then selecting Create Function.

The functions you want to perform to transform data are then actioned manually, after which you would use Invoke Custom Function. So, the steps are very similar; they just depend on the method that you prefer.

Having understood this, we will move on to the next section to create a date/time column using three M functions.

Creating a date/time column using three M functions

In this section, we will learn how to create a function manually by creating a DateTime column in Power Query using three M functions, namely, List.DateTimes, Duration.TotalDays, and Time.LocalNow().

A comprehensive reference for all the date/time functions in Power Query can be found at https://docs.microsoft.com/en-us/powerquery-m/date-functions.

The great thing about this Microsoft documentation is that you can copy code directly from the website and paste it into your Power Query Advanced Editor window or the formula bar. That way, you do not need to remember the code and can save a lot of time when constructing M code. Locate the code you wish to use and then click on the Copy button at the top right of the code:

Figure 11.20 – Copying M code from the docs.microsoft.com website

Before we delve into the M date functions, let's look at a really easy method to subtract dates in Power Query without having to remember any code.

If you want to find the number of days between two dates in Power Query, you can use the subtract dates function. This function is readily available from the Add Column tab in Power Query. Perform the following steps to calculate the number of days between two dates:

  1. In the SSGDates query, we have a table listing products, as well as the Date Sold, and Date Delivered columns. We want to subtract the two dates from each other to find the number of days between the sold and delivered dates. The first step is to select the columns to subtract. The order of the date columns is important when selecting. If we select Date Sold before Date Delivered, the formula will calculate a negative number due to the Date Sold date being before Date Delivered. If you want a positive integer, then select Date Delivered before selecting the Date Sold column:

    Figure 11.21 – The SSGDates query

  2. So, first select the Date Delivered column, and then Date Sold. Click on Add Column | Date | Subtract Days:

    Figure 11.22 – Calculating the number of days between dates using the Subtract Days function

  3. The result is entered in a separate column, showing the number of days between the two dates:

    Figure 11.23 – The number of days is generated in a separate column

  4. Let's view the M code for this function. Click on View | Advanced Editor to see the code. This is what we get:

Figure 11.24 – The M code for the Subtract Days function

In the preceding example, we looked at one method to subtract days from existing date columns. For the next part of this section, we will get to know the List.DateTimes M function. This function allows you to generate a list of dates or times based on your input.

The first step is to calculate the number of days between two dates:

  1. Create a new query named NumbDays.
  2. Click on View | Advanced Editor.
  3. Change the name of the Source code text to NumbDays. This will be our M formula function name, which we can use at a later stage in our List.DateTimes formula:

    Figure 11.25 – Renaming the source code NumbDays

  4. We will use the following calculation to subtract the two dates:

    = #datetime(2019,01,01,00,00,00)-#datetime(2020,06,17,00,00,00)

  5. Click on Done to view the number of days between the two dates:

    Figure 11.26 – Subtracting the dates to obtain the number of days between them

  6. As we have a negative value for the number of days, simply switch the M code around to produce a positive value:

    = #datetime(2020,06,17,00,00,00)-#datetime(2019,01,01,00,00,00)

  7. Add a step by clicking on the fx icon in the formula bar, as follows:

    Figure 11.27 – Creating a step

  8. When we construct M code, we can type the code into the Advanced Editor window or in the formula bar. Enter the following formula into the formula bar:

    List.DateTimes(#datetime(2019, 01, 01, 00, 00, 0), 10, #duration(0, 0, 1, 0))

    This code will generate a date and time list of 10 records from January 1, 2019 with a 1-second interval:

    Figure 11.28 – The List.DateTimes M code

    Here, we have used the second operator, but you can do this for days, hours, minutes, seconds, and so on.

  9. We will edit the M code so that the list reflects every hour and remove the minute operator by entering 0:

    Figure 11.29 – The M code edited to reflect the increment by day, and not minutes

  10. Let's update the number of lines to 24 to meet the requirements for the number of hours in a full day:

    Figure 11.30 – The 24 rows in the list reflecting the number of hours in a day

  11. Go back to the NumbDays step, where you will see the number of days between the two dates. For this example, you will see 533 days between the two dates. We will use the NumbDays variable instead of the 24 days in the M code to reflect these 533 days. Enter NumbDays into the formula in place of 24. You will see an error occur immediately. The reason for this is that the variable entered is not actually a number, so we need to tweak this a bit:

    Figure 11.31 – The edited M code to reflect the NumbDays variable

  12. This is where the Duration.TotalDays M function comes into play. Go back to the NumbDays step and edit the M code to read the following:

    =Duration.TotalDays(#datetime(2020,06,17,00,00,00)-#datetime(2019,01,01,00,00,00))

    Figure 11.32 – The variable is now a number by adding the Duration.TotalDays M code

  13. If you click on the Custom1 step, you will now see that the 533 records have updated.
  14. The next step is to turn the values into a data table. Click on the To Table icon at the top left of the ribbon:

    Figure 11.33 – Creating a table from the date values

  15. Click on OK to select the defaults and convert them into a table.
  16. Sort the table into ascending order to check that it shows January 1, 2019, and then sort it into descending order. If you find that some days are missing when you have sorted it into descending order, you need to tweak the M code a little further. This is because we don't have the required 25 lines per day. We need to multiply the first step by 24 so that all the hours in the day are covered:

    Figure 11.34 – Editing the M code by multiplying by 24 to reflect the required number of hours in each day

  17. If we now move back to view the Sorted Rows step, we will see that the calendar is still incorrect as it should be counting from 00 in the code. We now need to edit the code further to include  DateTime.LocalNow, which will return the date and time of the system's current date and time:

    Figure 11.35 – The DateTime.LocalNow function added to the M code

  18. Click on the Sorted Rows step to view the change to the data:

Figure 11.36 – DateTime.LocalNow updated to the system's current date and time

In this section, we mastered working with dates in Power Query using the List.DateTimes, Duration.TotalDays, and Time.LocalNow() M functions to create a date and time column. Have a browse through the link provided at the start of the section to broaden your knowledge about the other date and time functions that you can apply to datasets. We hope that the three functions included in this topic have sparked your interest and that you are already thinking of scenarios where you could apply them to your datasets.

Summary

In this chapter, you mastered changing a file path to a different folder location on your computer, and you should now be able to create an M function manually by editing code using the Advanced Editor window to create a parameter function.

You should be able to test a parameter query and create a new column to invoke a custom function. We covered Power Query date and time functions using the List.DateTimes, Duration.TotalDays, and Time.LocalNow() M functions to create a date and time column, and now we should be able to tweak the data using all three of the functions to edit M code.

In the next chapter, we'll cover the differences between DAX and M. You will understand the differences between the two languages by going through some examples and learning to create calculated measures.