6. The sys Schema – MySQL 8 Query Performance Tuning: A Systematic Method for Improving Execution Speeds

© Jesper Wisborg Krogh 2020
J. W. KroghMySQL 8 Query Performance Tuninghttps://doi.org/10.1007/978-1-4842-5584-1_6

6. The sys Schema

Jesper Wisborg Krogh1 
(1)
Hornsby, NSW, Australia
 

The sys schema is the brainchild of Mark Leith, who has also for long been part of the team that develops MySQL Enterprise Monitor. He started the ps_helper project to experiment with monitoring ideas and to showcase what the Performance Schema was able to do while making it simpler at the same time. The project was later renamed to the sys schema and moved into MySQL. There have since been contributions from several other people, including the author of this book.

The sys schema is available for MySQL Server 5.6 and later. In MySQL 5.7 it became part of the standard installation, so you do not need to do anything to install the sys schema or upgrade it. As of MySQL 8.0.18, the sys schema source code is part of the MySQL Server source.

The sys schema is used throughout the book for analyzing queries, locks, and more. This chapter will give the high-level overview of the sys schema including how to configure it, formatting functions, how the views work, and various helper routines.

Tip

The sys schema source code (https://github.com/mysql/mysql-server/tree/8.0/scripts/sys_schema and for older MySQL versions https://github.com/mysql/mysql-sys/) is also a useful resource to learn how to write queries against the Performance Schema.

sys Schema Configuration

The sys schema uses its own configuration system as it was originally implemented independent of MySQL Server. There are two ways to change the configuration depending on whether you want to change the setting permanently or just for the session.

The persisted configuration is stored in the sys_config table which includes the name of the variable, its value, and when the value was last set and by which user. Listing 6-1 shows the default content (the set_time will depend on when the sys schema was last installed or upgraded).
mysql> SELECT * FROM sys.sys_config\G
*************************** 1. row ***************************
variable: diagnostics.allow_i_s_tables
   value: OFF
set_time: 2019-07-13 19:19:29
  set_by: NULL
*************************** 2. row ***************************
variable: diagnostics.include_raw
   value: OFF
set_time: 2019-07-13 19:19:29
  set_by: NULL
*************************** 3. row ***************************
variable: ps_thread_trx_info.max_length
   value: 65535
set_time: 2019-07-13 19:19:29
  set_by: NULL
*************************** 4. row ***************************
variable: statement_performance_analyzer.limit
   value: 100
set_time: 2019-07-13 19:19:29
  set_by: NULL
*************************** 5. row ***************************
variable: statement_performance_analyzer.view
   value: NULL
set_time: 2019-07-13 19:19:29
  set_by: NULL
*************************** 6. row ***************************
variable: statement_truncate_len
   value: 64
set_time: 2019-07-13 19:19:29
  set_by: NULL
6 rows in set (0.0005 sec)
Listing 6-1

The sys schema persisted configuration

Currently the set_by column is always NULL unless the @sys.ignore_sys_config_triggers user variable is set to a value that evaluated to FALSE but is not NULL.

The option you are most likely to change is statement_truncate_len which specifies the maximum length the sys schema will use for statements in the formatted views (more about these later). The default of 64 was chosen to increase the probability that querying views will fit in the width of your console; however, sometimes it is too little to get enough useful information about the statement.

You can update the configuration settings by updating the value in sys_config. This will persist the change and apply immediately to all connections unless they have set their own session value (this happens implicitly when using something in the sys schema that formats statements). As sys_config is a normal InnoDB table, the change will also remain after restarting MySQL.

Alternatively, you can change the setting just for the session. This is done by taking the name of the configuration variable and prepending sys. and turning it into a user variable. Listing 6-2 shows examples both of using the sys_config table and a user variable to change the configuration of statement_truncate_len. The result is tested with the format_statement() function which is what the sys schema uses to truncate statements.
mysql> SET @query = 'SELECT * FROM world.city INNER JOIN world.city ON country.Code = city.CountryCode';
Query OK, 0 rows affected (0.0003 sec)
mysql> SELECT sys.sys_get_config(
                  'statement_truncate_len',
                  NULL
              ) AS TruncateLen\G
*************************** 1. row ***************************
TruncateLen: 64
1 row in set (0.0007 sec)
mysql> SELECT sys.format_statement(@query) AS Statement\G
*************************** 1. row ***************************
Statement: SELECT * FROM world.city INNER ... ountry.Code = city.CountryCode
1 row in set (0.0019 sec)
mysql> UPDATE sys.sys_config SET value = 48 WHERE variable = 'statement_truncate_len';
Query OK, 1 row affected (0.4966 sec)
mysql> SET @sys.statement_truncate_len = NULL;
Query OK, 0 rows affected (0.0004 sec)
mysql> SELECT sys.format_statement(@query) AS Statement\G
*************************** 1. row ***************************
Statement: SELECT * FROM world.ci ... ode = city.CountryCode
1 row in set (0.0009 sec)
mysql> SET @sys.statement_truncate_len = 96;
Query OK, 0 rows affected (0.0003 sec)
mysql> SELECT sys.format_statement(@query) AS Statement\G
*************************** 1. row ***************************
Statement: SELECT * FROM world.city INNER JOIN world.city ON country.Code = city.CountryCode
1 row in set (0.0266 sec)
Listing 6-2

Changing the sys schema configuration

First, a query is set in the @query user variable. This is purely for convenience, so it is easy to keep referencing the same query. The sys_get_config() function is used to get the current configuration value for the statement_truncate_len option. This takes into account whether the @sys.statement_trauncate_len user variable is set. The second argument provides the value to return if the provided option does not exist.

The format_statement() function is used to demonstrate formatting the statement in @query, first with the default value of 64 for statement_truncate_len, then updating sys_config to have a value of 48, and finally setting the value for the session to 96. Notice how the @sys.statement_truncate_len user variable is set to NULL after updating the sys_config table to make MySQL apply the updated setting to the session.

Note

There are a few configuration options supported by some of the sys schema features that are not in the sys_config table by default, for example, the debug option. The documentation of the sys schema objects (https://dev.mysql.com/doc/refman/en/sys-schema-reference.html) includes information on which configuration options are supported.

The format_statement() function is not the only formatting function in the sys schema, so let’s take a look at all of them.

Formatting Functions

The sys schema includes four functions to help you format the output of the queries against the Performance Schema to make the result easier to read or take up less space. Two of the functions have been deprecated in MySQL 8.0.16 as native Performance Schema functions have been added to replace them.

Table 6-1 summarizes the four functions and the new native functions that replace them for the case of format_time() and format_bytes() .
Table 6-1

sys schema formatting functions

sys Schema Function

Native Function

Description

format_bytes()

FORMAT_BYTES()

Converts a value in bytes into a string with a unit (1024-based).

format_path()

 

Takes a path to a file and replaces the data directory, temporary directory, and so on with a string representing the corresponding global variable.

format_statement()

 

Truncates a statement to at most the number of characters set by the statement_truncate_len configuration option by replacing the middle of the statement with ellipses (...).

format_time()

FORMAT_PICO_TIME()

Converts a time in picoseconds to a human-readable string.

Listing 6-3 shows an example of using the formatting functions, and for format_bytes() and format_time(), the results will be compared to the native Performance Schema functions.
mysql> SELECT sys.format_bytes(5000) AS SysBytes,
              FORMAT_BYTES(5000) AS P_SBytes\G
*************************** 1. row ***************************
SysBytes: 4.88 KiB
P_SBytes: 4.88 KiB
1 row in set, 1 warning (0.0015 sec)
Note (code 1585): This function 'format_bytes' has the same name as a native function
mysql> SELECT @@global.datadir AS DataDir,
              sys.format_path(
                  'D:\\MySQL\\Data_8.0.18\\ib_logfile0'
              ) AS LogFile0\G
*************************** 1. row ***************************
 DataDir: D:\MySQL\Data_8.0.18\
LogFile0: @@datadir\ib_logfile0
1 row in set (0.0027 sec)
mysql> SELECT sys.format_statement(
                  'SELECT * FROM world.city INNER JOIN world.city ON country.Code = city.CountryCode'
              ) AS Statement\G
*************************** 1. row ***************************
Statement: SELECT * FROM world.city INNER ... ountry.Code = city.CountryCode
1 row in set (0.0016 sec)
mysql> SELECT sys.format_time(123456789012) AS SysTime,
              FORMAT_PICO_TIME(123456789012) AS P_STime\G
*************************** 1. row ***************************
SysTime: 123.46 ms
P_STime: 123.46 ms
1 row in set (0.0006 sec)
Listing 6-3

Using the formatting functions

Notice that the use of sys.format_bytes() triggers a warning (but only the first time a connection uses it) because the sys schema function name is the same as the native function name. The format_path() function expects backslashes for path names on Microsoft Windows and forward slashes on other platforms. The result of the format_statement() function assumes the value of the statement_truncate_len option has been reset to its default value of 64.

Tip

While the sys schema implementations of format_time() and format_bytes() still exist, it is best to use the new native functions as the sys schema implementations are likely to get removed in a future version and the native functions are much faster.

These functions are not only useful on their own, they are also used by the sys schema to implement views that return formatted data. As it is in some cases necessary to work with the unformatted data, there exist two implementations of most sys schema views as you will see next.

The Views

The sys schema provides a number of views that work as predefined reports. The views mostly use the Performance Schema tables, but a few also use the Information Schema. The views are there both to make it easy to get information out of the Performance Schema and to serve as examples of how to query the Performance Schema.

As the views are ready-made reports that you can use as a database administrator or developer, they are defined with a default ordering. This means that a typical way of using the views is to do a plain SELECT * FROM <view name>, for example:
mysql> SELECT *
         FROM sys.schema_tables_with_full_table_scans\G
*************************** 1. row ***************************
    object_schema: world
      object_name: city
rows_full_scanned: 4079
          latency: 269.13 ms
*************************** 2. row ***************************
    object_schema: sys
      object_name: sys_config
rows_full_scanned: 18
          latency: 328.80 ms
2 rows in set (0.0021 sec)

The result depends on which tables have been used with a full table scan. Notice how the latencies have been formatted like with the FORMAT_PICO_TIME() or sys.format_time() function.

Most of the sys schema views exist in two forms with one having statements, paths, byte values, and timings formatted and the other returning the raw data. The formatted views are very useful if you query a view at the console and look at the data yourself, whereas the unformatted views work better if you need to process the data in a program or want to change the default sorting. The performance reports in MySQL Workbench use the unformatted views, so you can change the ordering from within the user interface.

You can distinguish between the formatted and unformatted views from the name. If a view contains formatting, there will also be an unformatted view with the same name, but with x$ prepended to the name. For example, for the schema_tables_with_full_table_scans view that was used in the previous example, the unformatted view is named x$schema_tables_with_full_table_scans:
mysql> SELECT *
         FROM sys.x$schema_tables_with_full_table_scans\G
*************************** 1. row ***************************
    object_schema: world
      object_name: city
rows_full_scanned: 4079
          latency: 269131954854
*************************** 2. row ***************************
    object_schema: sys
      object_name: sys_config
rows_full_scanned: 18
          latency: 328804286013
2 rows in set (0.0017 sec)

The last topic for the sys schema is the helper functions and procedures that are provided.

Helper Functions and Procedures

The sys schema provides several utilities that can help you when working with MySQL. These include the ability to execute dynamically created queries, manipulating lists, and more. The most important of the helper functions and procedures are summarized in Table 6-2.
Table 6-2

Helper functions and procedures in the sys schema

Routine Name

Routine Type

Description

extract_schema_from_file_name

Function

Extracts the schema name from a path for a file-per-table InnoDB tablespace file.

extract_table_from_file_name

Function

Extracts the table name from a path for a file-per-table InnoDB tablespace file.

list_add

Function

Adds an element to a list unless it already exists in the list. This is, for example, useful if you need to change the SQL mode.

list_drop

Function

Removes an element from a list.

quote_identifier

Function

Quotes an identifier (e.g., table name) with backticks (`).

version_major

Function

Returns the major version for the instance you are querying. For example, it returns 8 for 8.0.18.

version_minor

Function

Returns the minor version for the instance you are querying. For example, it returns 0 for 8.0.18.

version_patch

Function

Returns the patch release version for the instance you are querying. For example, it returns 18 for 8.0.18.

execute_prepared_stmt

Procedure

Executes a query given as a string. The query is executed using a prepared statement, and the procedure deallocates the prepared statement after the execution has completed.

table_exists

Procedure

Returns whether a table exists and if so whether it is a base table, temporary table, or a view.

Several of these utilities are also used internally in the sys schema. The most common use of the routines is in stored programs where you need to handle data and queries dynamically.

Tip

The sys schema functions and procedures come with built-in help in the form of routine comments. You can obtain the help by querying the ROUTINE_COMMENT column of the information_schema.ROUTINES view.

Summary

This chapter has provided a brief introduction to the sys schema, so you know what it is and how to use it when you see examples in later chapters. The sys schema is a useful addition that provides ready-made reports and utilities that can simplify your daily tasks and investigations. The sys schema is a system schema in MySQL 5.7 and later, so no action is required from your side to start using it.

First, the sys schema configuration was discussed. The global configuration is stored in the sys.sys_config table which can be updated, if you prefer different default values than what is provided when MySQL is installed. You can also change the configuration option for a session by setting a user variable with sys. prefixed to the name of the configuration option.

Then the sys schema formatting functions were covered with mention of the cases where native Performance Schema functions have been added as replacement for the sys schema functions. The formatting functions are also used in several of the views to help make the data easier to read for humans. For the views using the formatting functions, there is also a corresponding unformatted view with x$ prefixed to the name.

Finally, several helper functions and procedures were discussed. These can help you when you try to do work dynamically, such as executing a query generated in a stored procedure.

The next chapter is about the Information Schema.