In the previous chapter, we introduced AI, machine learning, and deep learning. We also discovered how the banking sector functions and how the use of AI can enhance banking processes. We learned the importance of banking processes being easily accessible. We also learned about a machine learning modeling approach called CRISP-DM. Overall, the chapter provided the necessary background for the application of machine learning in the banking industry to solve various business problems.
In this chapter, we will learn about time series analysis, a technique that analyzes historical data to forecast future behavior. Time series analysis works on the basis of one variable: time. It is the process of capturing data points, also known as observations, at specific time intervals. The goal of this chapter is to understand time series analysis in detail through examples and to explain how Machine-to-Machine (M2M) communication can help in implementing it. We will also cover the basic concepts of financial banking.
In this chapter, we will cover the following topics:
- Understanding time series analysis
- M2M communication
- The basic concepts of financial banking
- AI modeling techniques
- Demand forecasting using time series analysis
- Procuring commodities using neural networks on Keras
Understanding time series analysis
A time series is technically defined as the ordered sequence of values of a variable captured over a uniformly spaced time interval. Put simply, it is the method of capturing the value of a variable at specific time intervals. It can be 1 hour, 1 day, or 20 minutes. The captured values of a variable are also known as data points. Time series analysis is performed in order to understand the structure of the underlying sources that produced the data. It is also used in forecasting, feedforward control, monitoring, and feedback. The following is a list of some of the known applications of time series analysis:
- Utility studies
- Stock market analysis
- Weather forecasting
- Sales projections
- Workload scheduling
- Expenses forecasting
- Budget analysis
Time series analysis is achieved by applying various analytical methods to extract meaningful information from raw data that has been captured from various data sources. Time series analysis is also useful for producing statistics and other characteristics of data—for example, the size of data, the type of data, the frequency of data, and more. In time series analysis, the capturing of a value is done at a point of observation.
Let's try to understand this through an example. Using time series analysis, the manager of a bank branch can forecast the expenses that will occur in the upcoming year. The branch manager can do this by employing a time series machine learning model and training it on historical yearly expense records. The recorded observations can be plotted on a graph with time (each year, in this example) on the x axis and historical expenses on the y axis. Time series analysis is therefore a technique for forecasting the future values of one variable (yearly expenses, in this example) based on the values captured at successive points of another variable (in this case, time).
Let's understand this in more detail using another example. In this example, we will imagine that a bank wants to perform expense projections based on the historical data it has. The bank manager wants to forecast the expenses in the year 2020 for the branch that he manages. So, the process of forecasting the expenses will start by collecting historical expense information from the year 2000 onward. First, the bank manager will look at the expenses data for each year.
As we mentioned earlier, time series analysis is done by capturing the values of a variable. Can you guess the variable in this example? I am sure that you will have guessed it by now. The variable under observation is the total expense amount per year. Let's assume that the following is the data per year:
Total expense (in USD)
Many options are available to analyze this data and predict future expenses. The analytical methods vary in complexity. The simplest is to average the historical expenses and take the resulting value as the expected total expense for the year 2020. This is solely for the purpose of our example; you can estimate future expenses using various other mathematical and analytical methods as well. With the averaging option, the total expense forecast for the year 2020 is $20,915.
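The averaging option can be sketched in a few lines of Python. The yearly figures below are made-up placeholders, not the chapter's data, so the result will not match the $20,915 quoted above; only the method is illustrated.

```python
# Hypothetical yearly totals in USD; placeholders, not the chapter's data.
yearly_expenses = {
    2015: 19500.0,
    2016: 20250.0,
    2017: 21000.0,
    2018: 20750.0,
    2019: 22000.0,
}


def mean_forecast(history):
    """Forecast the next period as the plain average of all past values."""
    values = list(history.values())
    return sum(values) / len(values)


forecast_2020 = mean_forecast(yearly_expenses)
print(f"Forecast for 2020: ${forecast_2020:,.2f}")
```

An average ignores any upward or downward trend in the history, which is one reason the more detailed methods may predict better.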
The complex method may involve analyzing detailed expenses, predicting future values for each individual expense type, and deriving the total expenses amount based on it. This may provide a more accurate prediction than the averaging option. You can apply a more complex analytical method based on your needs. This example is provided so that you can understand how time series analysis works. The amount of historical data that we have used in this example is very limited. AI and machine learning algorithms use large amounts of data to generate predictions or results. The following is a graphical representation of this example:
In the following section, we will learn how machines can communicate with each other using a concept known as M2M communication.
M2M communication
M2M communication is extremely powerful and can dramatically improve the functions of commercial banking.
M2M communication represents the communication between two machines or devices through various channels such as physical networks, information-sharing networks, software communication, and application interfaces. The sole purpose of M2M communication is the exchange of information across two or more machines or between software running on those machines.
The concept of M2M communication assumes that no human intervention is required while exchanging information between machines. M2M communication can also take place over wireless networks. Wireless networks have made M2M communication easier and more accessible. The following list includes several common applications of M2M communication:
- Smart utility management
- Home appliances
- Healthcare device management
However, M2M communication is different from IoT. IoT uses sensors to trigger inputs, whereas M2M communication specifically refers to the interaction between two systems.
Commercial banking is a group of financial services that includes deposits, checking account services, loan services, drafts, certificates of deposit, and savings accounts for individuals and businesses. Commercial banks are the usual destination for people's banking needs. But how do banks function and make money? Commercial banks make money by earning interest on the various types of loans that they provide to their customers, for example, automobile loans, business loans, personal loans, and mortgage loans. Usually, a commercial bank specializes in one or more types of loans.
Commercial banks get their capital from various types of account services that they provide to their customers. The types of accounts include checking accounts, savings accounts, corporate accounts, money market accounts, and more. Banks utilize this capital to invest in high-return investment options such as mutual funds, stocks, and bonds. Banks have to pay interest to customers who hold accounts with them. However, the interest rate paid on deposits is far lower than the rate charged on loans.
The role of M2M communication in commercial banking
Consider an example that involves transferring money from one customer's account to another customer's account. In the past, this was a manual task that required filling in an appropriate form, submitting the form to the appropriate department where ledger entries were created, and then the amount was debited from one account and credited to the beneficiary's account.
Nowadays, this process has changed entirely. With a mobile phone, the customer can transfer funds from one account to another without any hassle. The beneficiary's account will be credited with the money within a few minutes. Incredible, isn't it? So, how did this happen? Well, M2M communication and process automation have played a major role in making this happen. It has become possible for machines (that is, computer systems, cloud-based virtual machines, and mobile devices) to connect over a wireless or wired network and transfer every piece of necessary information to another machine or software running on that machine. Nowadays, you only have to visit the bank for a few specific reasons. Customers can now even open a savings bank account or a loan account straight from their mobile devices.
The basic concepts of financial banking
Before we move full steam ahead into another example, we will first build up our knowledge of data, AI, and business techniques. If you are familiar with all of these concepts, feel free to skip this section.
Financial knowledge is a good place to start to understand how our decisions in forecasting business activities impact financial decision-making in non-financial firms. Additionally, when predicting future activities using a machine learning model, we also learn how the finance industry can prepare for this future volume of business. What financing does to help with core business activities in non-financial firms is covered in the following section.
The functions of financial markets – spot and future pricing
Financial markets, such as exchanges, play the role of markets for products to be exchanged. For example, consider commodities such as natural gas—we can either buy it directly from sellers or buy it via an exchange. As it turns out, long-running economics theories encourage you to buy the product from an exchange as much as possible if the product is standardized. Chicago Mercantile Exchange (CME) in the US could be a popular choice for commodities and, needless to say, the New York Stock Exchange (NYSE) is the market for publicly listed equities.
In this chapter, let's stick to natural gas as a product that we need. Of course, in some cases, it could be more efficient to buy it from big oil companies such as Shell—that is, if we want these physical goods from producers on a regular basis.
Within exchange markets, there are two prices: the spot price and the futures price. The spot price is what you pay to have the goods delivered now; the futures price is what you agree to now in order to receive the goods at a later date.
Choosing between a physical delivery and cash settlement
Even if a change in ownership is to take place, it could occur in two forms, that is, via physical delivery or a cash settlement. Ultimately, physical delivery or a cash settlement depends on whether we need the goods immediately or not. However, on any given trading day, we must weigh up the cost of only two options: physical delivery (cost of natural gas + cost of financing + cost of storage) as opposed to a cash settlement.
Essentially, we have four options, as presented in the following list, assuming that we need to have the physical natural gas product in 3 months' time for the generation of electricity:
- Spot, physical delivery: Finance the purchase to get the product now; store it for 3 months.
- Spot, cash settlement: Buy the product now and own it on paper. There is no need to keep the goods.
- Futures, physical delivery: Finance the purchase now to get the product in the future; get it in 3 months.
- Futures, cash settlement: Finance to buy the product in the future. 3 months later, purchase on spot from the market with physical delivery.
To weigh up the options, we require the following data:
- Storage costs should be provided by an internal source if the company owns the storage facility for natural gas. They are assumed to be rather static, for example, at around $0.12 per MMBtu. MMBtu is a unit used to measure the energy content of fuel; it stands for one million British thermal units.
- The financing cost should cover the interest expenses on the purchase cost over the storage period. It is assumed to be $0.10 per MMBtu. This figure should be supplied by the bank.
- The cost of natural gas (spot price) should be provided by the market data provider. The real-time Henry Hub spot price can be obtained from Thomson Reuters, for example, at around $2.50 per MMBtu.
- The cost of futures should be provided by CME. Data should be available on Quandl free of charge. It should be around $3 per MMBtu for a 3-month contract.
The numbers given here merely indicate the magnitude of the values. The choice could, of course, be optimized by comparing the options; however, the decision can be derived by linear algebra, and not many machine learning techniques are needed. In real life, we should not impose a machine learning solution on a problem that a nice deterministic formula can solve.
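As a minimal sketch of such a deterministic formula, the following Python snippet compares buying on spot and storing the gas against buying a 3-month futures contract, using the indicative per-MMBtu figures listed above. The function and variable names are illustrative assumptions, and real inputs would come from the internal, bank, and market data sources just described.

```python
# Indicative per-MMBtu figures quoted in the list above.
SPOT_PRICE = 2.50
FINANCING_COST = 0.10
STORAGE_COST = 0.12
FUTURES_PRICE = 3.00


def cost_buy_now_and_store(spot, financing, storage):
    """Per-unit cost of physical delivery now, held until it is needed."""
    return spot + financing + storage


carry_cost = cost_buy_now_and_store(SPOT_PRICE, FINANCING_COST, STORAGE_COST)

# Pick whichever route is cheaper per unit.
if carry_cost < FUTURES_PRICE:
    cheaper_route = "buy spot and store"
else:
    cheaper_route = "buy futures"

print(f"Spot plus carry: {carry_cost:.2f}, futures: {FUTURES_PRICE:.2f}")
print(f"Cheaper route: {cheaper_route}")
```

With these particular numbers the spot route costs about $2.72 per MMBtu against $3 for the futures contract, so no model is needed to make the choice.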
Options to hedge price risk
To avoid the price swinging outside the predefined price range of natural gas, we will need heuristic rules, such as deciding what to do at what price given a fixed target purchase quantity. Alternatively, we need rules to adjust what has already been placed, to sell or buy more.
Take the following example. If the price is beyond the acceptable range, for example, lower than $2.84 or higher than $3.95, we can choose to do one of the following:
- Pocketing the profit by writing options (selling insurance) if the price drops a lot.
- Reducing the loss by buying options (buying insurance) if the price shoots up unfavorably.
The following diagram shows the per-unit payoff from the hedged position, obtained by buying insurance against a high procurement price and selling away the benefits of a low procurement price:
Here, we have sold insurance at the extremely low price, which means that even though we should have enjoyed a lower cost of procurement, we gave that benefit away in return for an insurance premium. On the other hand, by paying a premium to buy insurance, we receive a positive payoff when the price becomes so expensive that it may eat into the profitability of the company. Determining the exact price of the insurance is called option pricing and will be addressed in Chapter 7, Sensing Market Sentiment for Algorithmic Marketing at Sell Side. For now, we assume that the price we pay for the insurance is the same as the price we earn from selling insurance.
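The shape of this payoff can be sketched in Python. The snippet assumes, for illustration only, that the insurance bought is a call option struck at $3.95 and the insurance sold is a put option struck at $2.84, with the two premiums cancelling out as stated above. The function names are hypothetical.

```python
LOWER_STRIKE = 2.84  # below this price we have sold away the benefit
UPPER_STRIKE = 3.95  # above this price the bought insurance pays out


def hedge_payoff(market_price):
    """Per-unit payoff of the hedge, with premiums assumed to net to zero."""
    bought_insurance = max(market_price - UPPER_STRIKE, 0.0)
    sold_insurance = -max(LOWER_STRIKE - market_price, 0.0)
    return bought_insurance + sold_insurance


def effective_cost(market_price):
    """Per-unit procurement cost after the hedge payoff is applied."""
    return market_price - hedge_payoff(market_price)


for price in (2.00, 3.20, 5.00):
    print(f"market {price:.2f} -> effective cost {effective_cost(price):.2f}")
```

Under these assumptions the effective procurement cost stays between the two strikes whatever the market does, which is the purpose of the hedge.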
AI modeling techniques
In the following sections, we will introduce the Autoregressive Integrated Moving Average (ARIMA), the most traditional type of forecasting model. We will also introduce a neural network model. ARIMA is a class of statistical models that is used to forecast a time series using past values. ARIMA is an acronym for the following:
- AR (autoregression): Autoregression takes previous data values as inputs to a regression equation and generates predicted values from them.
- I (integrated): ARIMA differences the observations to make the time series stationary. This is done by subtracting the observation at the previous time step from the current observation.
- MA (moving average): A model that uses the dependency between an observation and the residual errors from a moving average model applied to past observations.
Introducing the time series model – ARIMA
For this project, we will fit data into a time series model called ARIMA. ARIMA is a specific type of time series model in statistics, which is commonly used to predict data points in the future. It is parameterized by the number of autoregressive terms (p), non-seasonal differences (d), and lagged forecast-error terms (q).
This ARIMA model belongs to parametric modeling, that is, models that are fitted by known parameters. Normally, we classify this type of model as a statistical model because we need to make assumptions about what the data looks like. This is considerably different from wider machine learning models, which do not have any preset assumptions about what the data looks like.
However, in a real banking scenario, a statistical approach is still prevalent among the econometrics, quantitative finance, and risk management domains. This approach works when we have a handful of data points, for example, around 30 to 100 data points. However, when we have a wealth of data, this approach may not fare as well as other machine learning approaches.
ARIMA assumes that there is a stationary trend that we can describe. The parameters p, d, and q are each significant in their own way:
- p means the number of past period(s) affecting the current period value (for example, p = 1: Y in the current period = Y in the previous period * coefficient + constant).
- Non-seasonal difference (d) refers to the number of times the series is differenced against past periods (for example, d = 1: the difference between Y now and Y in the previous period).
- Lagged terms (q) means the number of past periods' forecast errors impacting the current period values.
Consider an example in which q = 1: Y is impacted by the error in period t - 1. Here, error refers to the difference between the actual and predicted values.
In a nutshell, ARIMA specifies how the previous period's coefficient, constant, error terms, and even predicted values impact the current predicted values. It sounds scary, but it is, in fact, very understandable.
After the model is fit, it will be asked to make a prediction, which is compared against the actual testing data. The deviation of the prediction from the testing data measures the accuracy of the model. We will use a metric called the Mean Square Error (MSE) in this chapter to determine the fitness of the model to the data.
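To make the roles of d and p concrete, the following is a hand-rolled sketch in plain Python. It is not the ARIMA implementation used by the chapter's code: it differences a series once (d = 1), fits a single autoregressive coefficient by least squares (p = 1), ignores the q term entirely, and scores the forecast with MSE. The series and all function names are illustrative assumptions.

```python
def difference(series):
    """d = 1: change between each observation and the previous one."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def fit_ar1(values):
    """p = 1: least-squares fit of values[t] = coef * values[t-1] + const."""
    x, y = values[:-1], values[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    coef = sxy / sxx if sxx else 0.0
    return coef, mean_y - coef * mean_x


def forecast(train, steps):
    """Forecast the differences, then add them back onto the last level."""
    diffs = difference(train)
    coef, const = fit_ar1(diffs)
    level, last_diff = train[-1], diffs[-1]
    predictions = []
    for _ in range(steps):
        last_diff = coef * last_diff + const
        level += last_diff
        predictions.append(level)
    return predictions


def mean_square_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly series; earliest 70% for fitting, the rest for testing.
series = [100, 104, 109, 115, 118, 124, 131, 135, 140, 148]
train, test = series[:7], series[7:]
predicted = forecast(train, len(test))
print("MSE on the held-out points:", mean_square_error(test, predicted))
```

A lower MSE on the held-out points indicates a better fit, which is the criterion used later to choose among candidate (p, d, q) combinations.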
Introducing neural networks – the secret sauce for accurately predicting demand
We may have a good data source, but we should not forget that we also need a smart algorithm. You may have read about neural networks thousands of times, but let's look at a short explanation before we use them extensively throughout the book. A neural network is an attempt by a computer to mimic how our brain works—it works by connecting different computing points/neurons with different settings.
Architecture-wise, it looks like layers of formulas. Those of you reading this book probably have some background in algebra, and can see how the outcome of interest, Y, is related to X, the variable, with b being the coefficient and c being the constant term:
Y = bX + c
Y is what we wish to predict on the left-hand side; on the right-hand side, bX + c is the form that describes how the feature (X) is related to Y. In other words, Y is the output, while X is the input. The neural network describes the relationship between the input and the output.
Suppose that Z is what we want to predict:
Z = dY + e
It seems that the formulas are linked:
Z = d(bX + c) + e
This is the simplest form of a neural network, with one input layer, one hidden layer, and one output layer. Each of the layers has one neuron (point).
There are other concepts in neural networks, such as backpropagation. This refers to the feedback mechanism that fine-tunes the neural network's parameters, which mostly connect neurons within the network (except when it is a constant parameter at the layer). It works by comparing the output at the output layer, Z (predicted), against the actual value of Z (actual). The wider the gap between actual and predicted, the more adjustment of b, c, d, and e is needed.
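The feedback loop can be sketched for this one-neuron-per-layer network in plain Python, assuming the two formulas take the form Y = bX + c and Z = dY + e and that the gap is measured as a squared error. The starting values, the learning rate, and the single training example are arbitrary illustrative choices.

```python
def forward(x, b, c, d, e):
    """Hidden neuron Y = bX + c feeds the output neuron Z = dY + e."""
    y = b * x + c
    z = d * y + e
    return y, z


def train_step(x, z_actual, params, learning_rate=0.01):
    """One backpropagation step on the squared gap (z_predicted - z_actual)**2."""
    b, c, d, e = params
    y, z_predicted = forward(x, b, c, d, e)
    gap = z_predicted - z_actual
    # Chain rule: a wider gap produces larger adjustments to every parameter.
    grad_e = 2 * gap
    grad_d = 2 * gap * y
    grad_c = 2 * gap * d
    grad_b = 2 * gap * d * x
    new_params = (
        b - learning_rate * grad_b,
        c - learning_rate * grad_c,
        d - learning_rate * grad_d,
        e - learning_rate * grad_e,
    )
    return new_params, gap ** 2


params = (0.5, 0.1, 0.5, 0.1)  # arbitrary starting values for b, c, d, e
losses = []
for _ in range(500):
    params, loss = train_step(x=1.0, z_actual=2.0, params=params)
    losses.append(loss)

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.2e}")
```

Real networks repeat the same idea over many neurons and many examples, with the framework computing the gradients automatically.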
Understanding how gaps are measured is also an important piece of knowledge—this is called metrics and will be addressed in Chapter 3, Using Features and Reinforcement Learning to Automate Bank Financing.
Neural network architecture
Architecture concerns the layers and number of neurons at each layer, as well as how the neurons are interconnected in a neural network. The input layer is represented as features. The output layer can be a single number or a series of numbers (called a vector), which generates a number ranging from 0 to 1 or a continuous value—subject to the problem domain.
For example, to understand the structure of a neural network, we can project that it will look like the following screenshot from TensorFlow Playground (https://playground.tensorflow.org/), which is the visualization of another network with the same hidden layers—three layers with a size of 6:
Using epochs for neural network training
Besides the design of the neural network, we also utilize the epoch parameter, which indicates the number of times the same set of data is fed to the neural network.
We need to increase the number of epochs if we do not have enough data to satisfy the number of parameters in the neural network. As a rule of thumb, given that we have X parameters in the neural network, we need at least X data points to be fed to the network. If we have only X/2 data points, we need to set epoch to 2 in order to make sure that X data points (all of them fed twice) are fed to the network.
Before feeding the features to the machine learning model, we will normalize input features of different magnitudes to the same magnitude. For example, the price and volume of goods are different types of numeric data. The scaling process makes sure that both of them are scaled to the same range, from 0 to 1. In classical statistical modeling, this step is very important to prevent a feature with a larger scale from dominating the prediction.
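Scaling to the 0 to 1 range is often done with min-max scaling, which the following plain-Python sketch illustrates. The price and volume figures are hypothetical, and the helper name is an assumption rather than a function from the chapter's code.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(value - lowest) / span for value in values]


# Hypothetical features on very different scales.
prices = [2.50, 3.00, 3.95]
volumes = [120000.0, 95000.0, 150000.0]

scaled_prices = min_max_scale(prices)
scaled_volumes = min_max_scale(volumes)
print(scaled_prices)
print(scaled_volumes)
```

After scaling, both columns lie in the same range, so neither dominates purely because of its units.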
Apart from data column-level scaling, we also need to pay attention to the sampling bias of the model. Normally, we will set aside a portion of the data unseen by the machine while it is training and learning on another set of data—which is called a training set. Later on, the testing set (which is the dataset kept aside) will be used to check against the prediction made by the model.
Demand forecasting using time series analysis
In this section, we will take a look at the first example of forecasting the demand for electricity consumption, and predict the energy expenses using time series analysis. We will start with a brief problem statement and define steps to solve the problem. This will give you a better understanding of how to find solutions to problems using time series analysis.
Today, electricity is a basic necessity for all of us. We use electricity and pay bills. Now, as customers, we want to analyze our electricity consumption, predict future consumption, and predict energy expenses. This is the problem that we will solve in this section.
Time series analysis is well suited to problems like the one defined in the preceding problem statement. Machine learning models need large datasets to be fed before the actual solution is derived. These large datasets are used by machine learning models to derive a pattern or identify an existing pattern that might not be visible when the data is scattered. Similarly, our first step would be to obtain a large amount of data and process it to extract meaningful information. This is going to be a three-step process. Here are the steps that we will follow:
- Downloading the data
- Preprocessing the data
- Model fitting the data
Downloading the data
Start by downloading data regarding electricity consumption and energy expenses. Even though we can download data from public websites now, in a true production environment, it is not uncommon to download data from an internal database and pass it to users as a flat file (a text file with no database structure).
You can download the files from the following paths:
- Consumption: https://www.eia.gov/opendata/qb.php?category=873&sdid=ELEC.CONS_TOT.NG-CA-98.M
- Cost: https://www.eia.gov/opendata/qb.php?category=41625&sdid=ELEC.COST.NG-CA-98.M
- Revenue: https://www.eia.gov/opendata/qb.php?category=1007&sdid=ELEC.REV.CA-RES.M
There are many ways of obtaining data, for example, using an API or robots. We will address these other methods of extracting data as we move further into the book. In Chapter 4, Mechanizing Capital Market Decisions, we'll obtain data through an API call. If we were to use a robot, for example, we could have used Beautiful Soup to parse the website or register the API. However, in this example, we simply visit the site using a browser and navigate to the download button to download the data.
Preprocessing the data
After we obtain the data, we align it together in the same time series, as the data we've downloaded can cover different periods of time. As data scientists, we strive to align our data in one single sheet of data, with all of the required data listed column by column (that is, cost, consumption, sales, and more):
Each line (or row) of the data should represent a single month. Right before we feed our data for the machine to learn the patterns, we will need to set aside some data for testing and some for learning. With the testing data, we can see whether the model predicts well on data it has not been trained on. This is a fundamental step in all ML/predictive models. We do not feed the testing dataset for ML/training. The line that calls the function is as follows:
list_flds = ['consumption_ng','avg_cost_ng']
tuple_shape_list = [(8,0,3),(12,1,3)]
list_flds = preprocess_data(f_name,list_flds)
In this program, we set aside the earliest 70% of data points as training data for the machine to learn and adapt to, while keeping the latter 30% of data points as testing data. The testing data is used only to compare against the predictions made by the model, not to fit the model.
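A chronological split of this kind can be sketched as follows. This is an illustrative stand-in, not the chapter's preprocess_data function, and the helper name and sample values are assumptions.

```python
def chronological_split(observations, train_fraction=0.7):
    """Keep the earliest observations for training and the latest for testing."""
    cut = round(len(observations) * train_fraction)
    return observations[:cut], observations[cut:]


# Ten hypothetical monthly observations, oldest first.
monthly_consumption = [310, 295, 330, 342, 361, 355, 372, 390, 384, 401]
train_set, test_set = chronological_split(monthly_consumption)
print("train:", train_set)
print("test:", test_set)
```

Splitting by position rather than at random keeps every test point later in time than every training point, so the model is never scored on months that precede what it has already seen.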
Model fitting the data
Once the data is clean, we will start training the machine to learn about the pattern. The training data will be fed to the machine as fitting. The model is like a shirt and the training data is like the body we're attempting to fit it to.
Here are the steps to fit our data into an ARIMA model:
- For each data file/field in the consolidated file, we run step 3 and step 4 (which have been marked in the code file for the following code block).
- If the Boolean variable, parameter_op, is set to True, then we will run step 5 (which is marked as well). This explores all the possible combinations of parameters in ARIMA with regard to p, d, and q, which are set as follows:
- p: Ranging from 0 to 12
- d: Ranging from 0 to 2
- q: Ranging from 0 to 3
- For each combination of the preceding values, we calculate the fitness of the data to the actual pattern and measure the error value. The combination with the lowest error value is selected as the chosen parameters of the ARIMA model.
The following is the code snippet to fine-tune the parameters used (please refer to the full code file on GitHub: https://github.com/PacktPublishing/Hands-On-Artificial-Intelligence-for-Banking):
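import time

start = time.time()
lowest_error = float('inf')  # running minimum of the error (assumed name)
lowest_order = (0,0,0)
# try every (p, d, q) combination in the ranges listed above
for p_para in range(13):
    for d_para in range(3):
        for q_para in range(4):
            order = (p_para,d_para,q_para)
            # forecasting(), dta_df, and fld come from the full code file
            error,temp_data,temp_model = forecasting(dta_df, fld, False,
                                                     order, 0.7, fld)
            # keep this combination only if it beats the best one so far
            if error < lowest_error:
                lowest_error = error
                lowest_order = order
                lowest_data = temp_data
                lowest_model = temp_model
end = time.time()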
Congratulations! We have now delivered a model that can provide volume forecasts for the future!
Procuring commodities using neural networks on Keras
In this section, we will take a look at another more complex example. As before, we will define the problem statement and then define steps to solve the problem.
In this example, we want to forecast the procurement of commodities based on historical data. The commodity that we are going to use is natural gas. In the case of natural gas, we do not have any control over its pricing because it is a hugely globalized commodity. However, we can still set up the internal procurement strategy when the pricing of the natural gas hits a certain range. The profitability ratio target will constrain the maximum pricing we can pay for the raw material to be profitable for the owners of the firm. We will track the profitability ratio, which is the ratio of cost of natural gas to sales.
Let's understand this pricing constraint with an example. In this example, we assume that for each one-dollar increase in the unit cost of natural gas (for electric power), the cost of materials to sales of the energy company will increase by 9.18% (this is based on 3 years of data):
The following table shows the weighted average for sales on an annual basis:
Here, you can see the cost of natural gas to sales from 2015 to 2017. In 2017, at an average unit weight of $3.91, the cost of natural gas to sales is at 36.45%. We assume that the average unit weight and cost to sales are in a constant relationship—averaging the values of the cost of materials rate across the years (from 2015-2017, that is, 7.66%, 9.32%, and 9.66%). We took an average of all three figures to come to a weighted average of 9.18%.
Remember, the actual number should, in practice, come from the internal accounting system, not the external US Energy Information Administration (EIA) data that is used only for the purpose of electric power.
Based on the last 3 years of data, we find that the average cost of materials to sales stood at 31.15% (the average of iii from the table), which translates into a unit cost of $3.39 per thousand cubic feet (mcf). The mcf is a standard unit of volume for natural gas, equal to a thousand cubic feet. In the upper range, the cost of materials to sales is 36.24%, with the unit cost at $3.95/mcf. At the lower range, the cost of materials to sales is 26.07%, with the unit cost at $2.84/mcf. The unit conversion details can be found on the EIA website.
Data is extracted from the preceding sales figure table: Operating Expenses/Operating Revenues = cost of materials to sales.
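The translation from a cost-of-materials-to-sales target into a unit cost can be sketched as a one-line formula. It relies on the constant relationship assumed above, namely 9.18 percentage points of the ratio per dollar of unit cost; the function name is illustrative.

```python
RATIO_POINTS_PER_DOLLAR = 9.18  # weighted average quoted in the text


def unit_cost_for_ratio(ratio_percent):
    """Unit cost of natural gas implied by a cost-of-materials-to-sales ratio."""
    return ratio_percent / RATIO_POINTS_PER_DOLLAR


average_cost = unit_cost_for_ratio(31.15)
upper_cost = unit_cost_for_ratio(36.24)
lower_cost = unit_cost_for_ratio(26.07)

print(f"average {average_cost:.2f}, range {lower_cost:.2f} to {upper_cost:.2f}")
```

The resulting lower and upper unit costs are the $2.84 and $3.95 bounds used for the procurement price range.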
After we have established the procurement plan, we then need to understand where to source the natural gas from. In real life, we would also consider how the model's insight gets executed; perhaps we would even need to build another model to make that subsequent decision. This is exactly what we covered in the business understanding earlier, on how to execute the order in exchange markets.
To complete this story, we assume that we purchase the natural gas from the exchange markets using physical deliveries whenever the price hits the target range for the quantity we forecasted.
The following data flow outlines the steps we need to take in order to prepare and generate the code to build the commodity procurement model. The first box denotes a script run on the SQLite database; the other boxes denote steps run in Python:
It generally fits into the framework of CRISP-DM, with different areas of focus throughout this book: some may focus on understanding the business, while some may focus on evaluation. The steps in the preceding diagram are covered in detail in the following sections.
Preprocessing the data (in the SQL database)
Data preprocessing means converting the data into the desired data features. We run it outside of Python coding to reduce the layers involved (that is, we interact directly with SQLite rather than using Python to interact with SQLite). Here are the steps involved in performing database operations:
- Create the SQLite database.
- Import the data as a staging table.
- Create the required table(s)—a one-time operation.
- Insert the staging table into the actual table with data type and format transformation.
- Create the view that does the feature engineering.
- Output the preprocessed view as CSV data.
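The six steps can be sketched end to end as follows. The chapter runs them directly in SQLite; Python's built-in sqlite3 module is used here only so that the sketch is self-contained. Apart from the date_d column, every table name, column name, and value is a hypothetical placeholder, and the lag feature stands in for whatever feature engineering the real view performs.

```python
import csv
import io
import sqlite3

# Step 1: create the database (in memory for this sketch).
conn = sqlite3.connect(":memory:")

# Step 2: import the raw data into a staging table, everything as text.
conn.execute("CREATE TABLE staging (month_txt TEXT, consumption_txt TEXT)")
conn.executemany(
    "INSERT INTO staging VALUES (?, ?)",
    [("2017-01", "100.5"), ("2017-02", "98.0"), ("2017-03", "105.25")],
)

# Step 3: create the typed target table (a one-time operation).
conn.execute("CREATE TABLE consumption (date_d TEXT PRIMARY KEY, consumption REAL)")

# Step 4: move staging rows across, converting formats and data types.
conn.execute(
    "INSERT INTO consumption (date_d, consumption) "
    "SELECT month_txt || '-01', CAST(consumption_txt AS REAL) FROM staging"
)

# Step 5: a view that engineers a feature (the previous month's value).
conn.execute(
    """
    CREATE VIEW consumption_features AS
    SELECT c.date_d,
           c.consumption,
           (SELECT p.consumption
              FROM consumption AS p
             WHERE p.date_d < c.date_d
             ORDER BY p.date_d DESC
             LIMIT 1) AS consumption_prev
      FROM consumption AS c
    """
)

# Step 6: output the preprocessed view as CSV text.
rows = conn.execute("SELECT * FROM consumption_features ORDER BY date_d").fetchall()
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["date_d", "consumption", "consumption_prev"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The first month has no previous value, so its lag column is empty in the CSV output; the downstream Python code would need to drop or fill such rows.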
Importing libraries and defining variables
Import libraries and define variables to make sure that the relevant functions can be used. Import all of the relevant libraries:
- pandas: This is for data storage before data is fed to the machine learning module.
- keras: This is an easy-to-use machine learning framework that runs on top of another library, which serves as its backend.
- tensorflow: This is used as the backend.
- sklearn: This is a very popular machine learning module that provides lots of data preprocessing toolkits along with some machine learning models that are easy to use. The models are not used in this example, as we wish to build up the foundation for the more extensive use of machine learning models afterward. In addition, sklearn also has metrics that appraise the performance of the models.
- matplotlib: This is the default data visualization tool.
The following code block imports all the listed libraries and defines the file names used later:
#2. Import all the libraries required
import pandas as pd
from keras.models import Model
from keras.layers import Dense, Input
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
demand_model_path = 'demand_model.h5'
f_in_name = 'consumption_ng_exp.csv'
Reading in data
The following code snippet reads in the data, picking up the CSV file generated in step 1:
#Read in data
pd_trade_history = pd.read_csv(f_in_name,header=0)
#drop the date column; recent pandas versions require axis to be passed by keyword
pd_trade_history = pd_trade_history.drop('date_d', axis=1)
Preprocessing the data (in Python)
Now we come to data preprocessing in Python. Some studies claim that data scientists spend 80% of their time on this step! It includes selecting features and target variables, validating data types, handling missing values (omitted from this example to reduce complexity), and splitting the data into a training set and a testing set. In classification problems where the target classes are imbalanced, we may need stratified sampling so that each class is represented proportionally in both sets. In this example, we set aside 20% for testing and 80% for training:
#4. Pre-processing data
#4.A: select features and target
df_X = pd_trade_history.iloc[:,:-5]
df_Y = pd_trade_history.iloc[:,-4:]
np_X = df_X.values
np_Y = df_Y.values
#4.B: Prepare training and testing set
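X_train, X_test, Y_train, Y_test = train_test_split(np_X, np_Y, test_size=0.2)
#4.C: scaling the inputted features
sc_X = StandardScaler()
#fit the scaler on the training set only
X_train_t = sc_X.fit_transform(X_train)
#reuse the training statistics on the test set; refitting here would leak test information
X_test_t = sc_X.transform(X_test)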
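A scaler should learn its statistics from the training data only and then be applied unchanged to the test data. The following is a minimal pure-Python sketch of what standardization does, written to make that point visible. The helper names fit_scaler and apply_scaler and the toy numbers are assumptions for illustration; StandardScaler itself operates on NumPy arrays:

```python
from statistics import fmean, pstdev

def fit_scaler(rows):
    """Learn the per-column mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Constant columns have zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def apply_scaler(rows, means, stds):
    """Standardize each value using statistics learned elsewhere."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Toy data: three training rows and one test row, two features each.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[4.0, 40.0]]

means, stds = fit_scaler(train)                  # statistics come from train only
train_scaled = apply_scaler(train, means, stds)
test_scaled = apply_scaler(test, means, stds)    # same statistics reused on test
print(test_scaled)
```

Had the statistics been recomputed on the single test row, every scaled value would collapse to 0, because each value would be subtracted from itself. That is an extreme case of the distortion that refitting a scaler on test data introduces.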
Training and validating the model
We train the neural network by feeding the training dataset to generate a model. The following code snippet defines the machine learning model in Keras and trains it. It builds the deep neural network model with 329 parameters:
#5. Build the model
inputs = Input(shape=(14,))
x = Dense(7, activation='relu')(inputs)
x = Dense(7, activation='relu')(x)
x = Dense(7, activation='relu')(x)
x = Dense(4, activation='relu')(x)
x = Dense(4, activation='relu')(x)
x = Dense(4, activation='relu')(x)
x = Dense(4, activation='relu')(x)
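predictions = Dense(units=4, activation='linear')(x)
#create the model object from the input and output tensors
demand_model = Model(inputs=inputs, outputs=predictions)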
demand_model.compile(loss='mse', optimizer='adam', metrics=['mae'])
demand_model.fit(X_train_t,Y_train, epochs=7000, validation_split=0.2)
Y_pred = demand_model.predict(X_test_t)
#convert the numpy arrays to dataframes for visualization
pd_Y_test = pd.DataFrame(Y_test)
pd_Y_pred = pd.DataFrame(Y_pred)
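The figure of 329 parameters can be checked by hand: each fully connected layer contributes one weight per input-output pair plus one bias per output unit. The short sketch below does this arithmetic for the layer sizes used in the model; the helper name dense_param_count is hypothetical:

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

# 14 inputs, Dense layers of 7, 7, 7, 4, 4, 4, 4 units, and a 4-unit output layer.
layer_sizes = [14, 7, 7, 7, 4, 4, 4, 4, 4]
total_params = dense_param_count(layer_sizes)
print(total_params)  # 329
```

The first layer alone accounts for 105 of the parameters (14 × 7 weights plus 7 biases), which shows how strongly the number of input features drives the size of a small network.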
Testing the model
We will compare the data points set aside (20%) in step 4 against the predicted outcome based on the models trained and the features data:
##6. Test model: Measure the model accuracy
#combine both actual and prediction of test data into data
data = pd.concat([pd_Y_test,pd_Y_pred], axis=1)
#name the columns so that they can be referenced below
data.columns = ['actual1','actual2','actual3','actual4',
                'predicted1','predicted2','predicted3','predicted4']
error1 = mean_squared_error(data['actual1'],data['predicted1'])
print('Test MSE 1: %.3f' % error1)
error2 = mean_squared_error(data['actual2'],data['predicted2'])
print('Test MSE 2: %.3f' % error2)
error3 = mean_squared_error(data['actual3'],data['predicted3'])
print('Test MSE 3: %.3f' % error3)
error4 = mean_squared_error(data['actual4'],data['predicted4'])
print('Test MSE 4: %.3f' % error4)
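To make the metric concrete, the following sketch computes the mean squared error by hand for one target column. The numbers are made up for illustration and are not results from the model; the function name mse is hypothetical:

```python
def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up quantities for three test months.
actual = [100.0, 120.0, 90.0]
predicted = [110.0, 115.0, 95.0]

error = mse(actual, predicted)   # (100 + 25 + 25) / 3
rmse = error ** 0.5              # back in the same units as the quantities
print('Test MSE: %.3f' % error)
print('Test RMSE: %.3f' % rmse)
```

Because the errors are squared, a single large miss (the 10-unit error here) dominates the score. Taking the square root gives a figure in the original units, which is often easier to discuss with the procurement team.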
Visualizing the test result
This step allows us to cross-check the metrics that represent the performance of the model—the MSE:
#7. Visualize the prediction accuracy
This will result in the following plot:
Generating the model for production
The model that was trained and tested in steps 5 and 6 will be output as a file for the production system to run on unseen production data. We will output two files—one for scaling the input features and another one for the neural network:
#8. Output the models
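The reason for saving the scaling step alongside the network is that production data must be scaled with exactly the statistics learned during training. The sketch below illustrates only that idea, using the standard-library pickle module on a plain dictionary. The file name scaler_stats.pkl, the dictionary layout, and the helper names are assumptions for illustration; in the real pipeline, the fitted scaler object would be serialized instead, and the Keras network would be written separately with its own save method to the path held in demand_model_path:

```python
import os
import pickle
import tempfile

# Hypothetical scaling statistics learned during training.
scaler_stats = {"means": [2.0, 20.0], "stds": [0.5, 5.0]}

def save_stats(stats, path):
    """Training side: persist the preprocessing statistics."""
    with open(path, "wb") as f:
        pickle.dump(stats, f)

def load_stats(path):
    """Production side: restore the statistics saved at training time."""
    with open(path, "rb") as f:
        return pickle.load(f)

def scale_row(row, stats):
    """Scale one unseen row with the restored training statistics."""
    return [(v - m) / s for v, m, s in zip(row, stats["means"], stats["stds"])]

with tempfile.TemporaryDirectory() as tmp_dir:
    stats_path = os.path.join(tmp_dir, "scaler_stats.pkl")
    save_stats(scaler_stats, stats_path)
    restored = load_stats(stats_path)

production_row = [3.0, 10.0]
scaled_row = scale_row(production_row, restored)
print(scaled_row)
```

Only load pickle files from a source you trust, because unpickling can execute arbitrary code.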
Congratulations! We have now delivered a model that can be used at the operational level to identify the quantity to order for this month's demand, next month, and the month after. The following diagram shows the steps in the training versus the deployment of machine learning models:
However, we are not going to cover deployment right now. We will keep this in mind and address the topic as we progress in this book. We will explore wrapping the AI production solution as an API in Chapter 8, Building Personal Wealth Advisers with Bank APIs.
In this chapter, you learned about time series analysis, M2M communication, and the benefits of time series analysis for commercial banking. We also looked at two useful examples by defining the problem statement and deriving the solution step by step. We also learned about the basic concepts of time series analysis and a few techniques, such as ARIMA.
In the next chapter, we will explore reinforcement learning. Reinforcement learning is an area of machine learning in which an agent learns, through trial and error, to take the actions that maximize its reward in a particular situation. We will also look at how to automate decision-making in banking using reinforcement learning. Exciting, isn't it?