Chapter 9: Working with M – Learn Power Query

Chapter 9: Working with M

In this chapter, we will be introduced to the Power Query M language and how to use and write the syntax, including the steps to reveal a list of functions and definitions.

This chapter will look at the structure and syntax of M. All programming languages have their own specific syntax and structure, and once you master the structure of M, it becomes easier to understand. We will look at the main data types and functions and provide a walk-through demonstration of how to use each of these data types. Some of the text data types that we will cover include numeric data types, lists, records, tables, searches, and shares, before looking at how to import a .csv file using M.

In this chapter, we will cover the following main topics:

  • The beginnings of M
  • Understanding the M syntax and learning how to write M
  • Using #shared to return library functions, including function definitions, without having an internet connection or connecting to an outside data source

This chapter provides an introduction to the M language and how it is used. We will go through some basic M syntax for writing some programming lines. There will be a series of simple walkthroughs of different types of programming and we will look at how each of these types can be used.

By the end of this chapter, you will be able to look at M code and see whether there are syntax or structural problems and be able to fix them.

Technical requirements

You need an internet connection to download the relevant files from GitHub. Also, the code files for the chapter can be found at https://github.com/PacktPublishing/Learn-Power-Query/.

This chapter assumes that you already know how to open the Power Query editor and that you are comfortable with using the different commands in the editor. We also presume that you are familiar with at least one type of programming language, so you will understand the syntax and logic behind M.

For Code in Action video of this chapter please visit the following link: https://www.youtube.com/watch?v=pQ6Iyh7mxZE&list=PLeLcvrwLe186O_GJEZs47WaZXwZjwTN83&index=10&t=0s.

The beginnings of M

As we have learned so far, Power Query allows you to connect to data sources, as well as clean, transform, filter and then, finally, publish them. Beyond this, it also allows you to mash up or collate data from multiple sources. The language is called M, as Mashup is the jargon name that it started with.

Power Query has a really good interface that allows most people to use it effectively without doing too much programming. Everything that you create in Power Query is translated into M. Although we have dabbled a bit with M, we have allowed Power Query to do most of the work for us. In Chapter 8, Creating Dashboards with Power Query, we edited M and created our own code when we inserted additional columns. Without M, we would not be able to write more complex queries, transformations, and calculations. There are certain things, such as connecting to web services, that can only be done with custom M code.

The first thing I would like to say is that M is a functional language, and like every programming language, it has its own structure and syntax. Before looking at the basic syntax structure, I would like to remind you that M is case-sensitive for both variable and function names. This is very different from Excel, where you can type in lowercase or uppercase and Excel changes it to uppercase in the formula. VBA usually changes the syntax automatically. As an example, when you are using functions, Excel normally changes the first letter to uppercase. DAX, on the other hand, allows you to use either uppercase or lowercase, and you can also mix them up in your programming, although this is definitely not recommended.

Of course, M is found in more than just Power BI Desktop; it can be used in Power Query, Excel, and the Get & Transform import feature in older versions of Excel.

The next section deals with the M syntax and explains the differences between let and in, which are the basis of the M syntax.

Understanding the M syntax and learning how to write M

In this section, we are going to look at the syntax of M, explaining what it is, and then we will go through a few examples. As mentioned earlier, M is made up of individual language elements, including functions, variables, expressions, and values, which are all used to transform data.

Let's start by seeing what M looks like when we open it for the first time.

In Excel, open Power Query Editor, and then select Advanced Editor from the View tab:

Figure 9.1 – The Advanced Editor window

By default, we see the following code snippet:

let      Source = ''in      Source

There are two blocks—namely, the let expression block and the in expression block. Let's understand each one.

The let expression block contains the programming code—or, as we can call it, the procedural steps—that we use to define the query. Each one of these steps has a variable name that has been assigned an expression, and this expression is the logic for transforming the data. There can be multiple steps, and each step usually evolves using the step before it, although this is not always the case. The procedural steps do not have to follow the physical or logical order and can be mixed. This, of course, can be a complete nightmare for you or anyone else trying to work out what you are trying to do, so the accepted best practice is to keep them in the correct order. Thus, each query in M is a single Power Query let expression. This is the code that defines the datasets, and each procedural step is known as a variable.

The in expression block is an output. I know it sounds wrong, but it will return a variable value based on let.

Looking at figure 9.1, the two blocks come up automatically and all we need to do is populate them. I have entered Hello World into the let expression block, and when we run this, we get Column 1 with Hello World in it. Although this is not very impressive, it does give us an indication of how the syntax works.

Important note

There are a few other basic things to note when writing in the let expression block. For example, you need a comma (,) at the end of every line, except for the line before in. Variable names are normally one word, using camel case or an underscore (_), but if you would like a long variable name with spaces, then you need to use # followed by the variable name in quotation marks (for instance, #'This is the long variable name'). The variable names can also include special characters, such as % or &. The variable name is also the name of the applied steps found on the right-hand side of the screen.

To use comments, you can use // at the beginning of the line (the same as in JavaScript). Parentheses (()) are used to pass parameters to a function. Square brackets ([]) are used to encapsulate a set of records. Set braces ({}) are used for lists.

Most of the functions in M are part of an object class, and the syntax is as follows:

ObjectClass.Function(),

ObjectClass.Function(Parameter),

ObjectClass.Function(Parameter1, parameter2)

There are a few functions that came out in the original release of Power Query that do not follow the same object class name, such as the date. To use the date, we write #date(), and it is then followed by three parameters—year, month, and day. This will be discussed in greater detail in Chapter 10, Examples of M Usage.

The basic M syntax is relatively straightforward, provided that we follow the correct structure with let and in. The one trick to remember is to look at the type of parentheses that are used to determine whether we are dealing with variables or lists, or whether we are trying to encapsulate a set of records.

In the next section, I will go through a number of different examples of M code. Some of the code will not need data sources as the values will be generated by the code itself. Of course, using the same techniques, we will be able to use the same functions on any data source.

Using #shared to return library functions

In this section, we will look at the #shared libraries, which loads functions and enumerators in a result set. This means that we do not need any datasets as the code that we create will automatically make use of the #shared libraries. We will concentrate on creating text data types, numeric data types, lists, records, tables, searches, and shares, as well as importing a CSV file. With each of the different types, we will write the code so that we can see how it works and be able to apply it to other programs. We need to associate data with data types so that M knows whether the data is text, a number, or a string, and so on. We will briefly look at each of the most common data types and see the similarities and differences between them.

Text data types

We already created one example of this at the beginning of this chapter, Hello world. You will notice that it was not necessary for us to explicitly assign data types as the variable object as data types are assigned automatically. Referring to figure 7.1, if you look to the left-hand side of Hello world, you will see that it has a text data type (ABC) in the output window on the right. Although there is one text data type, there are many different ways in which we can use it.

One of my favorites is the Text.format() function, which allows you to insert values into a piece of text. For example, take the following command:

Text.Format('#[StudentName] has chosen #[Subject]', [StudentName ='Alice', Subject ='Maths'])

From the preceding example, the sentence would become Alice has chosen Maths. This same function can be used to pass a list instead of a record, as in the following expression:

Text.Format('The first number is #{0}, the second number is #{1}, the third number is #{2}',

{15,9,29})

This would return the following text:

The first number is 15, the second number is 9, the third number is 29.

The next data type is the number data type, which, as its name implies, deals with numbers and number formats.

Number data types

If we look at a simple number data type, we can see that when it is complete, it automatically assigns the number data type to it. Let's see how this works.

Create a new blank query and type the following:

let

    variable1 = 5,

    variable2 = 10,

    variable3 = variable1 + variable2

in

    variable3

From our comma-delimited list of variable declarations, we get an answer of 15. You will notice that I have three different variables—the first two variables have a number assigned to them, while the third variable refers to the value of the other two. If you look at the APPLIED STEPS section, the three variables have been turned into steps alongside the query. Refer to the following screenshot:

Figure 9.2 – Steps inside a query

If we wanted to change the variable name by creating a space, we can do so, but then we would have to have #'variable 3' for it to work. If you look at the preceding code, it looks like procedural code that works from the top to the bottom. The Power Query engine starts by looking at what we need for our let expression. In this case, we are looking for the value of step3. It will then look at what step 3 is, and then it will have to evaluate step 1 and step 2 as they are mentioned. If step 1 is not needed, the Power Query engine will not look at it, even though it is the first step.

One of the other things that can be really useful is that it is possible to refer to the output of other queries. We already have a query called Number that we have already created. If we wanted to call that query, we wouldn't have to have the let or in expression blocks; we can delete everything and type in Number, and the output of the Number query will be displayed.

The number data type is used very often as most of the time we work out totals, averages, and other mathematical formulas. The number data type can include integers, decimals, and currency.

We use lists in programming all the time, from numbers to objects and everything in between. The next section will look at how we can create lists, including nested lists.

Lists

Lists are ordered sets of values. To create a list in M, open the Power Query editor and type the following:

let

    fruit={'Apple','Grapes','Pear'}

in

    fruit

On the left-hand side, you will see that this is a list data type, but there is also a List Tools menu that appears. Refer to the following screenshot:

Figure 9.3 – The List Tools menu

With this menu, you can convert the list into a table, remove items, and do much more. As an output, the list we created is an ordered list; Apple will be first and Pear will be last.

It is possible to have nested lists that are separated by {},{}. For example, type the following command:

let

        fruitNested ={{'Apple','Grapes'},{'Limes','Lemons'}}

in

    fruitNested

If you click on the first list, you will see that the items in this list are at the bottom.

It is possible to do things with numbered lists that would save you time—for example, you could type numbersToTen = {1..10}, which will give you all the numbers from 1 to 10.

We can now expand on lists to create records, which are lists of fields.

Records

A record is written as a comma-delimited list of fields with the values that are associated with the various fields. They are written with square parentheses ([]). Let's see how this works.

Create a new blank query and type the following into the Advanced Editor window:

let

    record = [firstName='Warren',surname='Sparrow',title='Mr'],

in

    record

From the preceding record, we can see that we have formed a comma-delimited list of fields. There are three different fields—firstName, surname, and title—and the field names and the values associated with them are recorded after each equals sign:

Figure 9.4 – The Record Tools menu

On the left-hand side in the query, you will notice that the icon for a record is next to this query, and there is also a Record Tools menu.

Once again, we can expand on the records and use them to create table data types.

Table data types

The table data type is probably the most important structured data type. You could think of it as being made up of a combination of lists and records.

Create a new blank query and type the following into the Advanced Editor window:

let

    Source = #table({'A','B','C','D'},{{'1','2','3','4'},{'10','11','12','13'}})

in

    Source

The #table function is a function that will create a table for us. There are two parts to this. The first list is the names of the column names in a text data type. In this example, the column names are A, B, C, and D. The second list, and each additional list, creates one item in each column for each row, as follows:

Figure 9.5 – The Table data type

There is now a Table data type in the query, so we know that this has been associated with the table data type. Notice that some of the data types in the column headings use the ABC123 icon, which depicts any data type. We would need to associate a specific data type later to correct this.

There is a different way of creating a table, but it is slightly more complicated.

Try it out by typing the following:

Source = #table(type table[A=number, B=number, C=number, D=number],{{1,2,3,4},{100,101,102,103}})

In the preceding code, we created a table, but this time, the code says that the data type is a number, so we will not have the problem of it showing the ABC123 icon as this is a generic data type. If we wanted to change the data type to a number data type, we can change this to the number data type—123.

One of the things that we need to do once we have the data is search or find the specific data that we are looking for. The next section deals with how we can find relevant data.

Searching for relevant data

In the previous section, we learned that tables are made up of a combination of lists and records. So, this is important as we can find values from lists, records, and tables by using either the positional or lookup operator.

Let's see how.

Create another blank query and, in the Advanced Editor window, type the following:

let

    fruit={'Apple','Grapes','Pear'},

    numbersToTen = {1..10}

in

    numbersToTen{2}

In the preceding code, we have a variable called numbersToTen, which is all the integer numbers from 1 to 10. In the in expression block, we are asked for the number at index 2. When we run this code, we will get the value 3 because the index starts at 0, not 1. So, the value at index 0 will be 1, the value at index 1 will be 2, and our answer will be the value at index 2, which is 3.

If we changed the last line of the preceding code and we typed in fruit{0}, we would get the value of Apple as our answer. If we type in fruit{6}, we would get an error message as a result, as we do not have enough items in our list. However, if we typed in fruit{6}?, we would get an answer of null, as the question mark will say that we are out of range and returns null.

Using the {} parentheses allows us to do a search for the nth item in a list, and we can use exactly the same process to search for a record.

We can use the same function to find lists and records from a table. Create a new blank query and, in the Advanced Editor window, type the following:

let

    Source = #table({'A','B','C','D'},{{'1','2','3','4'},{'10','11','12','13'}}),

    Output = Source[A]

in

    Output

The source table has fields with the A, B, C, and D headings, and it then has numbers in the rows underneath. The preceding code is similar to what we have used before, with Output = Source[A] as the only difference. From the square bracket ([]) operator, we get an individual field from a record, but the source, in this case, is a table, so this will give us all the items in column A as a list.

If we change [A] to {A}, this will change the search to the first row and give us a result of 1, 2, 3, 4.

We can also put this together and type = Source [A]{A}, which will give us the value in the first row and first column.

Of course, there are so many different functions that are already built into M, and we have only touched on the most common ones. If you create a new blank query and type = #shared, this will give you a list of all the different functions that are available for use. It will display a record and each field is a name of a function that you can use. You will notice that these have functions as their data type. When you click on one of the functions, it will give you a preview, as well as the documentation for that function.

Up to this point, we have only used code to create the data that we needed. The next section deals with how to import a CSV file using M.

Importing a CSV file using M

One of the things that we do on a fairly regular basis is open CSV or Excel files. Although we can do this within Power Query, if we wanted to do this with M, we could do so in two different ways. The first way is to use the Advanced Editor window in the Power Query editor, and the second way is to type M into the formula bar in the Power Query editor.

When we open Advanced Editor, we can type the following:

let

    salesData = Csv.Document(File.Contents('C:\DataFiles\sales.csv'),     [Delimiter=',', Encoding=1252])

in

    salesData

If, however, you wanted to type this in directly without using the editor, then type type = Csv.Document(File.Contents('C:\DataFiles\sales.csv'),     [Delimiter=',', Encoding=1252]) into the formula bar, which will do the same thing.

This will then import the CSV file, but there are two problems with this. The first is that there is a blank column, and the second is that we need to promote the first row as headers.

To remove the column, open Advanced Editor and add the following to the next line:

RemoveCols = Table.RemoveColumns(salesData , 'Column3')

You can view this in the following screenshot:

Figure 9.6 – Removing columns

Once we have removed the columns, we can then move on to the second step, which is promoting the headers:

.

Figure 9.7 – Promoting the headers

Lastly, to promote the headers, type PromoteNames = Table.PromoteHeaders(RemoveCols, [PromoteAllScalars=true]).

This section has helped us go through the basics of how to use the M syntax with examples on how to use the most common data types and functions, including text, numbers, lists, tables, and records.

Summary

This chapter provided us with an introduction to how M is used. You have now acquired the necessary knowledge to understand how the syntax is used, and you should be able to look at some code and work out how it works. It makes life a great deal easier when you can look at code and determine what it is doing, especially if there is a problem that you need to solve.

In this chapter, we covered how M came about and how it works, and how it is a functional language that has its own structure and syntax. We made a quick comparison between M, DAX, and VBA with regard to structure and syntax, before looking at how to write M code. Although we have mostly used the Advanced Editor window in the Power Query editor, using the same code, we could type it into the formula bar; however, it is better to use the editor, as once your code goes over multiple lines, it is not always easy to spot a mistake in the formula bar.

This chapter included many working examples, going through text data types, numeric data types, lists, records, tables, searches, shares, and importing a CSV file. I find these particularly helpful, especially as each section is an entity in itself, which allows you to use a specific piece of code in other queries that you have already created. Having gone through the main data types, it should be evident just by looking at the data type in the query what it is associated with and which type of parentheses you should be using. The last section covered importing a CSV file, and although we have done this previously in this book, I felt that it was necessary for us to be able to see the code and the steps and understand what the code is doing.

The next chapter will deal with the aid of M with more in-depth scenarios and formulas.