Automated Portfolio Management Using Treynor-Black Model and ResNet – Hands-On Artificial Intelligence for Banking

Automated Portfolio Management Using Treynor-Black Model and ResNet

In the previous chapter, we covered the basic concepts of investment banking. We also learned about the concepts of Mergers and Acquisitions (M&A) and Initial Public Offering (IPO). We examined the clustering model, which is a modeling technique of AI. We looked at detailed steps and examples to solve the problem with auto syndication. We implemented an example that identified acquirers and targets. So, the previous two chapters were intended for the issuers on the securities side of investment banking.

In this chapter, we will look at the dynamics of investors. Investors drive investment behavior strategically. The issuance of equity or debt can be done in either of two ways—via the primary market or the secondary market. The role of the primary market is to issue new securities on behalf of companies, the government, or other groups in order to receive financing by debt or equity-oriented securities. The role of the secondary market is to facilitate interested parties with the buying or selling of previously issued securities. The role of portfolio managers is to make smarter decisions based on the price movement of the securities to increase the amount of profit for their customers. The portfolio manager tries to understand the needs of investors and places money behind those investments that generate the maximum return.

In this chapter, we will cover the following topics:

  • Financial concepts
  • The Markowitz mean-variance model
  • The Treynor-Black model
  • Portfolio construction using the Treynor-Black model
  • Trend prediction

Financial concepts

In this section, we will explore various financial concepts. For an in-depth survey of the domain knowledge, you are encouraged to refer to the syllabus of the Chartered Financial Analyst (CFA) program.

Alpha and beta returns in the capital asset pricing model

According to the capital asset pricing model (CAPM), investment return equals the risk-free rate + alpha + beta * market return + noise (with a mean of zero). Alpha is the return earned through the superior performance of the firm or investor, while beta measures the riskiness of the asset relative to the overall market return. Beta is high when the investment is riskier than the market average. Noise is the random movement, or luck, that has a long-term return of zero.

The asset management industry, especially professional investment managers, commonly charges clients based on alpha. That explains why people pay so much attention to alpha.

Realized and unrealized investment returns

Investment returns (gain) can be realized or unrealized. Realized return is the return that is actualized and pocketed. Unrealized return is the return we would have pocketed today if we had sold the assets for money.

Investment policy statements

The investment industry works on investing on behalf of the asset owner. As an asset manager, it is our fiduciary duty to advise and invest on behalf of the client. So far in this book, we have tried to understand the investment needs of investors by looking at behavioral/trading data. However, the key data is, in fact, the investment policy statement (IPS) that the investors will establish.

An IPS contains the return objective, risk appetite, and constraints laid down by the investor. The return objective and risk appetite are both variables that we can define quantitatively. Return can be defined as the annual return net of the inflation rate. If the target return is 1% and the inflation rate is 1%, the value of the capital is preserved, because it grows at the same rate as the price level of goods. In the long run, the purchasing power we put into the portfolio remains the same because the value grows in line with the price level.

Each of these variables can be expressed mathematically as follows:

  • Return objective: This return objective is called capital preservation. The 1% return rate is called the nominal rate of return. After deducting the inflation rate from the nominal rate, we obtain the real rate of return:

Real rate of return = Nominal rate of return - Inflation rate

  • Risk appetite: Risk appetite can be defined as the volatility of a return. We normally define it as the standard deviation of the return over the period:

Risk = Standard deviation of return = sqrt(sum((r_t - mean(r))^2) / (n - 1))
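As a minimal sketch of these two IPS variables, the following Python snippet computes a real rate of return and a return volatility. The function names and the sample return series are hypothetical and are used only for illustration; real return is approximated here by simple subtraction.

```python
import statistics
from typing import Sequence


def real_return(nominal_rate: float, inflation_rate: float) -> float:
    """Real rate of return: the nominal rate net of inflation (simple subtraction)."""
    return nominal_rate - inflation_rate


def volatility(returns: Sequence[float]) -> float:
    """Risk measured as the sample standard deviation of a return series."""
    return statistics.stdev(returns)


# Capital preservation: a 1% nominal return with 1% inflation gives a 0% real return.
capital_preservation = real_return(0.01, 0.01)

# Hypothetical annual returns, used only to illustrate the volatility calculation.
sample_returns = [0.04, -0.02, 0.06, 0.00]
sample_volatility = volatility(sample_returns)
```

A return objective in an IPS would then be compared against `real_return`, and the risk appetite against `volatility`.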

The choice of risk appetite is subjective: some people like the ups and downs for the excitement that accompanies them. While some prefer to read this book while sitting on the sofa (pretty boring, isn't it?), some prefer to read it seated on a chair at the table. Some people prefer to work in a boring 9 to 5 with steady pay, while others prefer the excitement of a start-up, with the hope of getting rich quickly and risking the possibility of failure.

That said, boring does not mean there is a lower risk of being laid off, and an exciting job does not imply a high risk of losing out on the job. There are obvious cases where we have an exciting job, high potential, and stability. That's exactly the target result of asset allocation from this portfolio management process.

Given that this book concerns the practical aspects of work, for more details, I recommend Managing Investment Portfolios: A Dynamic Process by the CFA Institute. Our objective here is to define the necessary parameters to execute the machine learning program in Python.

The challenge, in the age of AI, is how to bring this policy to life in the form of code that a machine would understand. Indeed, the investment community has the task of digitizing investment policy.

Recent advancements in blockchain promise smart contracts, which are based on the assumption that certain statements could be digitized as logic. If a contract could be coded as a smart contract on the blockchain for execution, then why not an IPS? Let's assume that the investment policy is codified for the rest of this chapter.

Asset class

Portfolio management is the process of allocating capital to various investment assets based on the characteristics of the asset class or risk factors. We'll begin by focusing on asset class allocation. An asset class is defined as a group of assets that bear similar characteristics. It actually sounds quite similar to the outcome of a clustering model.

To blend in our knowledge of finance, asset classes normally refer to equity, bonds, the money market, and alternative investments. Alternative investments can be subdivided into real estate, private equity, hedge funds, and commodities. Equity refers to the equity shares issued in the publicly traded market, while bonds refer to the debt issued by companies. The money market refers to short-term debt that has a duration of between one day and one year. It differs from bonds in that the money market is highly liquid (heavily traded with a fairly priced market), whereas, in bond issues, the market can either be very illiquid or dominated by certain investors. Bonds typically refer to debt that has a longer duration, such as 10 years of maturity or beyond. Of course, this can include anything above 1 year, typically called notes.

Players in the investment industry

Investors play a central role in the finance industry. It is, however, equally important to know the other major players—investment managers (who manage the investor's money), brokers who are referred to as the sell side (typically, an investment bank or securities firm), and consultants and advisers who provide specialized advice to investors on how to choose investment managers. Custodians refer to the party that looks after the settlement and the administrative aspects of any investment transactions and filings with the exchange markets.

If the investment managers are from institutions, they are referred to as institutional investors, whereas someone who is acting on their own is called an individual investor. Institutional investors have fiduciary duties to the beneficiary owners of the investment money. Those beneficiaries are the real customers of the investment managers. For instance, in the case of Duke Energy, the ultimate beneficiaries could be the employees of Duke Energy. In between, it could either be the treasurer who manages the fund as the investment manager, or it could be the outsourced investment fund managers who are the chosen investment managers.

On the sales side of the industry, the fund could be for institutional investors, individual investors, or retail distribution via banks or insurance companies. In the case of retail distribution, the responsibility of fitting the investment to the needs of the owners lies with the distributors. While it is the institutional investors or individual investors that deal directly with the investment managers, it's the investment managers or consultants who are responsible for the matching.

Benchmark – the baseline of comparison

A benchmark is used by the investment portfolio to define what average market return it should be measured against. It could refer to the market return or the beta in the CAPM. Anything above the average is called alpha. In this chapter's example, we assume that the global equity Exchange-Traded Fund (ETF) is the market benchmark.

If we were to construct the world's market benchmark for an investor in the world's assets, we could analytically create such an index that weighted the various indices or baskets of returns.

Investors are return-seeking

A study by the Bank for International Settlements shows that investors exhibit return-chasing behavior. This means that one of the key principles of investment is to follow the market (return). It also surely means that we will always be slower than the market if we solely let returns drive our allocation decisions. So, in the world of AI, there could be two ways to improve:

  • Being very quick to follow the return trend with a super-fast machine
  • Predicting the market better than the crowd

An ETF promises to do the former, provided that we do the allocation to the ETF quickly enough—which, in turn, defeats the purpose of following the markets as there are so many different kinds of ETFs. It is only possible when we invest in a true market representative—for example, a major market index ETF; otherwise, we are still going back to the same challenge of trying to allocate to the right securities/investments to generate alpha (beating the market).

In most trading books, the authors hide their winning strategies, which makes it difficult to understand what a strategy actually is. To remain really practical, we are going to work on a losing strategy, which you can improve on. This, at least, shows you end-to-end strategy development and gives you a full view of how trading works.

Trend following fund

After allocating assets to the fund manager, let's dig deeper into the fund being invested in. If we were the ETF, one of our key needs would be to track the underlying securities. For example, if the fund's mandate is to track the performance of a basket of securities given a set of rules, we can simply buy and hold the underlying assets until redemption (that is, until the investors withdraw their money).

However, if we try to predict the pricing movement in advance and act accordingly, there is a chance that we win more than just the benchmark—this is what we refer to as alpha.

Using technical analysis as a means to generate alpha

A school of thought in trading believes in the trends exhibited in the pricing of securities. This is called technical analysis. It assumes that past pricing movements can predict future movement. The following graph shows the trend in the price of securities over a certain period of time:

At a very high level, we see that the securities price moves in trend, but the length of the trend is not always the same. There has been a wealth of studies on how to read the patterns seen on the pricing movement across time. But isn't this a computer vision challenge? As opposed to us hand-picking countless numbers of features, should we leave it to a computer to read the graphs and learn how to chart trend lines?

In terms of types of patterns, a good place to start is Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications (New York Institute of Finance). For information on exact data processing to detect patterns, please refer to Advances in Financial Machine Learning. Here, Dr. Prado really takes it to another level by giving you an insight into the work that is done before the data is fed to the machine.

Trading decisions – strategy

A trading strategy refers to the considerations and actions to be taken on trading activities. So, in this chapter, rather than withholding the strategy, I will show you an actual strategy that turned out to be a losing one. For a real trader, disclosing a winning trading strategy will kill it because people can trade against it. For example, an opposing trader can sell when you are expected to buy, and vice versa.

The brief strategy we present here does not generate positive alpha compared to a simple buy-and-hold strategy with the same asset. However, I will indicate ways to improve it.

To learn more about the comparative behaviors of traders and bankers, please refer to Compensating Financial Experts, published in The Journal of Finance.

Understanding the Markowitz mean-variance model

The objective of portfolio management is to minimize risk while attaining the target return. For a specific investor, the target return and risk tolerance are captured from the IPS, and the behavior of the assets is estimated from historical returns. Typical portfolio optimization models used in the industry include the Markowitz mean-variance model and the Treynor-Black model.

The economist Harry Markowitz introduced mean-variance analysis, which is also known as Modern Portfolio Theory (MPT), in 1952. He was later awarded the Nobel Prize in Economics for his theory.

The mean-variance model is a framework for assembling asset portfolios so that a return can be maximized for a given risk level. It is an extension of investment diversification. Investment diversification is an idea that suggests investors should invest in different kinds of financial assets. Investment diversification is less risky in comparison to investing in only one type of asset.

Investors choose the asset allocation that maximizes return for a given level of risk, where risk is measured by the variance of return. While investing in assets, the risk to reward ratio becomes a critical decision-making factor. The risk to reward ratio is calculated as the ratio of expected returns to possible losses. Risk arises from the difference between the expected return and the actual return. The key challenge is the calculation of the variance of return on the target portfolio. For example, the portfolio could be 40% equity and 60% bonds, or an even more complex asset class allocation that includes real estate, commodities, and more. To come up with the variance of return for 40% equity and 60% bonds, we first need to calculate the variance of return for equities and bonds separately. At the same time, we also have to consider the covariance between equities and bonds, that is, whether their returns move together in the same direction or in completely different directions.
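To make the 40% equity and 60% bonds case concrete, here is a small sketch of the standard two-asset portfolio variance formula. The volatilities and the correlation are hypothetical figures chosen for illustration, not estimates from market data.

```python
import math


def two_asset_variance(w1, var1, w2, var2, cov12):
    """Portfolio variance: w1^2*var1 + w2^2*var2 + 2*w1*w2*cov12."""
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2 * w1 * w2 * cov12


# Hypothetical inputs: 20% equity volatility, 5% bond volatility, 0.2 correlation.
equity_vol, bond_vol, correlation = 0.20, 0.05, 0.2

# Covariance is the correlation scaled by both volatilities.
covariance = correlation * equity_vol * bond_vol

portfolio_variance = two_asset_variance(
    0.4, equity_vol ** 2, 0.6, bond_vol ** 2, covariance
)
portfolio_vol = math.sqrt(portfolio_variance)
```

Because the correlation is below 1, `portfolio_vol` comes out lower than the weighted average of the two volatilities, which is the diversification benefit described above.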

For detailed insights into how the asset and wealth management industry is shaping up, please refer to PwC, 2017, Asset Management 2020: Taking stock, Asset & Wealth Management Insights.

Imagine a team of people (where each person represents one asset class) working together on a task to deliver a return. The work of a portfolio manager is to determine who has more say and who has less say within the group (asset allocation work). This depends on the productivity (return) and fluctuation of the person's performance: some people exhibit extreme levels of performance, while some are pretty stable in terms of their productivity (variance). We also need to know the interaction between individual team members—this interaction also has to consider how each of them complements or amplifies each other's productivity (correlation). Some team members show strong chemistry between two of them to deliver extremely good results (positive correlation), some work at different times of the day—one is a night owl and the other a morning person—each with different productive hours (negative correlation), and some don't really have any consistent pattern of similarity or dissimilarity (zero correlation).

The following diagram shows the covariance matrix between pairs of assets (i and j). The diagonal line in gray shows each security's return variance, and the remaining cells show the inter-security return covariance. The cells in black are not needed, as they mirror the values on the opposite side of the diagonal line. For a mere 20 securities, we will have 190 covariance values to estimate, in addition to the 20 variances:
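The count of 190 can be checked with a few lines of Python. The helper name is hypothetical; it simply counts the diagonal cells and the cells on one side of the diagonal.

```python
def covariance_matrix_parameters(n_assets):
    """Return (number of variances, number of distinct covariances) for n assets."""
    variances = n_assets
    # Only one triangle of the symmetric matrix needs to be estimated.
    covariances = n_assets * (n_assets - 1) // 2
    return variances, covariances


variances, covariances = covariance_matrix_parameters(20)
```

The number of covariances grows roughly with the square of the number of securities, which is why the estimation burden rises so quickly.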

To illustrate this problem further, let's assume that security 20 is not liquid, and we cannot reliably estimate its covariance with another security. This degrades the quality of its covariance estimates with all of the other 19 securities. The problems with this model in real-life applications are as follows:

  • Some assets do not have enough data points for us to calculate their correlation with others (for example, new team members to the team).
  • In financial markets, the correlation between assets changes dynamically, and it is hard to forecast a forward-looking correlation.
  • The correlation is not linear.

This model works on public equity with efficient pricing and lots of data points for modeling. But it does not work on non-liquid assets—such as private equity in start-ups or emerging market securities or bonds, where we don't have full visibility of the pricing and many are often reconstructed analytically.

One specific type of correlation in risk could be credit risk—in good times, the correlation of risk across assets is low; whereas, in crisis, the correlation spikes and moves in a similar direction. Please refer to Credit Risk Pricing, Measurement and Management, by Duffie D. and Singleton, K.J., for an example on default correlation.

Some treasurers in established firms are responsible for managing their own pension money. We assume that treasurers need to handle the target asset allocation for the pension fund. We will take the index return data for each of the asset classes. We will use Quandl's subscribed data on the ETF as the data source.

An ETF refers to funds that can be bought and sold on a public exchange such as the New York Stock Exchange (NYSE). It is a fund because it invests in many more underlying securities such as stocks or bonds. It is becoming more popular as it allows investors to focus on the themes that the funds are investing in, rather than individual stocks. For example, we can have a strong view about the strength of the US economy by buying the fund that invests in the largest 500 stocks of the US.

Exploring the Treynor-Black model

Due to the instability of the Markowitz mean-variance model in managing problems associated with multi-asset class portfolios, the Treynor-Black model was established. Treynor-Black's model fits the modern portfolio allocation approach where there are certain portfolios that are active and others that are passive. Here, passive refers to an investment that follows the market rate of return—not to beat the market average return but to closely follow the market return.

An active portfolio refers to the portfolio of investments in which we seek to deliver an above-market average return. The lower the market return for the market's level of risk, the higher the share of total capital that we allocate to the active portfolio. So, why take more risk if the market return is good enough? The Treynor-Black model seeks to allocate more weight to the asset that delivers a higher return/risk level out of the total risk/return level of the active portfolio.

Introducing ResNet – the convolutional neural network for pattern recognition

What is specific about applying the computer vision type of neural network is that we can use the following hidden layers. In our example, we will use ResNet implemented in Keras as an example to illustrate these ideas. We will also showcase an approach to improve performance—however, you are expected to dig deeper into hyperparameter tuning.

The convolutional layer is like taking a subset of the image that is being inputted. In technical analysis, it's like having a sliding window to calculate a statistical value. Each type of sliding window is trained to detect a certain pattern such as upward, downward, and flat lines. In neural network terminology, each type is called a filter. For each type of filter, there are numbers of windows to fully run (or slide) through the input image; the number is represented by the number of neurons in the layer.

To explain the terminology, let's take an input image of size 3×3 and a kernel shape of 2×2. Our actual input in the coding example is larger than this size.

The input image is a 3×3 image with a black line (represented by 3 pixels) crossing diagonally from the bottom-left corner to the upper-right corner. It shows a stock price that is moving upward by one pixel every day:

The shape of the sliding windows is called a kernel. A kernel is a function that can transform an inputted matrix/vector into another form, as shown in the following diagram:

For illustration, we assume the kernel size to be 2×2, stride = 1, and zero padding unless specified.

The numbers in the following diagram show the sequence of the kernel movement. Each movement will be represented by one neuron at the convolution layer:

The following diagram shows the 2×2 kernel, and the kernel moves 4 times (that is, 4 neurons are required):

  • Kernel shape: As the kernel moves (we call it slides), it may or may not cover the same input pixel, which makes the blue darker as we want to show which pixels are covered more than once:

The kernel shape =2×2, and it takes 4 moves to cover the full image:

The kernel shape =1×1, and it takes 9 moves to cover the full image.

  • Stride: This shows how many pixels need to be moved toward the right and downward as it advances:

Here, stride = 1, and it takes 4 moves to cover the image. Note that every time there will be overlapping pixels covered:

Here, stride = 2, and it takes 4 moves to cover the image. Note that every time there will not be any overlapping pixels covered by the filter.

  • Padding: This shows how many white pixels are surrounding the inputted image:

The following diagram shows zero padding:

Here, padding = 1, which allows the edge cell on the side to be covered by a different neuron.
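The kernel, stride, and padding terms can be tied together in a short pure-Python sketch. It is not the Keras implementation; the function names and the 2×2 filter values are hypothetical, and, as in most deep learning libraries, the kernel is applied without flipping.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Number of kernel positions along one axis of an n-pixel input."""
    return (n + 2 * padding - kernel) // stride + 1


def convolve2d(image, kernel, stride=1, padding=0):
    """Slide a square kernel over a square image and return the feature map."""
    n, k = len(image), len(kernel)
    # Surround the image with `padding` rings of zero-valued pixels.
    size = n + 2 * padding
    padded = [[0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            padded[r + padding][c + padding] = image[r][c]
    steps = output_size(n, k, stride, padding)
    feature_map = []
    for i in range(steps):
        row = []
        for j in range(steps):
            top, left = i * stride, j * stride
            # Each kernel position corresponds to one neuron in the layer.
            row.append(sum(padded[top + a][left + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        feature_map.append(row)
    return feature_map


# 3x3 input: a line rising from the bottom-left to the upper-right corner.
image = [[0, 0, 1],
         [0, 1, 0],
         [1, 0, 0]]

# Hypothetical 2x2 filter that responds to an upward-sloping segment.
upward_kernel = [[0, 1],
                 [1, 0]]

# Stride 1 and no padding: the kernel visits 2 x 2 = 4 positions.
feature_map = convolve2d(image, upward_kernel)
```

With a 1×1 kernel, `output_size(3, 1)` gives 3 positions per axis, that is, 9 moves in total, matching the count given earlier.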

Pooling layer

The pooling layer is quite self-explanatory—it is meant for pooling the results from the input. Imagine that after the convolutional layer, for each type of filter, there will be a number of outputs—for example, four. Can we reduce this to one variable instead of four output variables? Pooling can play the role of compressing this information. For example, taking the maximum of the four outputs (max pooling) or averaging the four outputs (average pooling). Visually, the meaning of the pooling layer is to blur the image or calculate moving average trends.
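A minimal sketch of the two pooling choices mentioned above, applied to four hypothetical outputs of a single filter:

```python
def max_pool(values):
    """Keep only the strongest response among the pooled outputs."""
    return max(values)


def average_pool(values):
    """Smooth the pooled outputs into their mean, like a moving average."""
    return sum(values) / len(values)


# Hypothetical outputs of one filter at its four window positions.
filter_outputs = [0.0, 2.0, 2.0, 0.0]
strongest_signal = max_pool(filter_outputs)
average_signal = average_pool(filter_outputs)
```

Either way, four output variables are compressed into one.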

ReLU activation layer

For finance professionals, the Rectified Linear Unit (ReLU) layer is like a call option payoff: once a certain threshold is exceeded, the output value changes linearly with the input. Its significance is to reduce the noise in the pricing to ensure that only a strong market trend is considered.
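The call option analogy can be written out directly. In this sketch, the strike and price values are hypothetical; the point is only that the payoff of a call at expiry has the same shape as ReLU applied to the price minus the strike.

```python
def relu(x):
    """Rectified linear unit: zero below the threshold, linear above it."""
    return max(0.0, x)


def call_payoff(spot, strike):
    """Payoff of a call option at expiry, expressed through ReLU."""
    return relu(spot - strike)


# Hypothetical values: signals below the threshold are zeroed out as noise.
weak_signal = relu(-1.5)
strong_signal = relu(2.0)
in_the_money = call_payoff(105.0, 100.0)
out_of_the_money = call_payoff(95.0, 100.0)
```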


Softmax layer

Softmax is a super-charged version of the logistic regression model that we touched on in the earlier chapters of this book. It supports multiple predicted outcomes, as opposed to the single binary outcome in the logistic regression case. In our case, we wish to identify what the pricing would be on the next day.
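The relationship between softmax and logistic regression can be sketched as follows. The three outcome labels and the raw scores are hypothetical; the chapter's actual output classes are not shown here.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function used in binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical raw scores for three next-day outcomes.
labels = ["down", "flat", "up"]
probabilities = softmax([0.2, 0.5, 1.8])
predicted = labels[probabilities.index(max(probabilities))]

# With only two classes, softmax reduces to the logistic function.
two_class = softmax([1.3, 0.0])[0]
```

The last line illustrates why softmax can be read as the multi-outcome generalization of the binary case.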

Portfolio construction using the Treynor-Black model

Let's say we are given 10 days of pricing data, and the work of technical analysis is to draw the lines on the right to make sense of the trend in order to generate the next day's pricing for the 11th day. This is clearly the kind of task that a convolutional neural network can tackle.

In practice, the time unit we are looking at could be 100 ms or 10 ms instead of 1 day, but the principle is the same:

Let's continue with the Duke Energy example. In this hypothetical case, we assume that we are the treasurer running the pension fund plan of Duke Energy with a total asset size of 15 billion USD with a defined contribution plan. Presumably, we know what our IPS is in digital format:

  • Target return = 5% of real return (that means deducting the inflation of goods)
  • Risk = return volatility equals 10%
  • Constraints: No electricity utility companies, to avoid investing in peers/competitors

Please note that this is a hypothetical example. No inference should be made about the actual company.
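One possible way to hold such an IPS in digital format is a small immutable data structure. This is only a sketch: the class name, field names, and sector label are assumptions made for illustration and do not come from the chapter's code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InvestmentPolicyStatement:
    """Hypothetical container for the three IPS elements listed above."""
    target_real_return: float          # return net of inflation
    max_volatility: float              # risk expressed as return volatility
    excluded_sectors: Tuple[str, ...]  # constraints on the investable universe


ips = InvestmentPolicyStatement(
    target_real_return=0.05,
    max_volatility=0.10,
    excluded_sectors=("electric utilities",),
)


def is_allowed(sector: str, policy: InvestmentPolicyStatement) -> bool:
    """Check an asset's sector against the IPS constraints."""
    return sector.lower() not in policy.excluded_sectors
```

A screening step could then call `is_allowed` for each candidate asset before any allocation is computed.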

Using the IPS, we will first illustrate how to allocate the fund to the various asset classes as the first example. Then, in the second example, we will look at the trend following strategy to enable the investment managers to follow the market, given the recent trend in passive investment.


We have created two separate Python files because the asset parameters should be independent of how the asset is allocated. There are a total of four steps for this. The two main steps (the files) are as follows:

We will download and estimate the asset parameters and generate the target asset allocation:

  1. To download and estimate the asset parameters, we will import libraries and key variable values. Then, we will define functions to download data for each of the assets, the market return, the risk-free rate, the asset return, and the parameters.
  2. To generate the target asset allocation, we will import libraries and key variable values, find out the weight of the securities in the active portfolio, and find out the weight of the active portfolio in the total portfolio.

As we progress through the chapters, we will try to illustrate the use of traditional databases via this example, rather than creating a data dump without database storage. This follows the point we made earlier that a structured database (a SQL database) works perfectly with securities' pricing data, because the data is structured. We are using SQLite, which is a lightweight database. It is only meant to illustrate to the finance professional how databases come into play in our use case. For an actual IT implementation, of course, we can use one of the many enterprise-grade databases that are both secure and fast.

Downloading price data on an asset in scope

Individual assets and market assets used in this example are all ETFs. Data is downloaded from Quandl using free and paid subscriptions, including the risk-free data represented by US Treasury bills and the market return represented by the global equity ETF. After we download the data, which is end-of-day data, we also need to define what we refer to as the price. In our example, we take the midpoint between the daily high and the daily low as the price for the day.

The steps are as follows:

  1. Import the necessary libraries; sqlite3 is newly introduced in this chapter. This shows how a SQL database could be used for trading data use cases. We will use a lightweight SQL database, called SQLite, which stores the database as a single file:

#1. Import libraries and key variable values
import quandl
import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn.metrics import r2_score
import sqlite3
import math
import os
#not needed when using database
import pickle

#API Key

#dates variables for all the download

#db file

  2. Define the function to download data for each of the assets:

#2. Define function to download data for each of the asset

Without Python, you can also directly access the file via a tool such as an SQLite viewer or a plugin for the Chrome browser.

The function downloads the price data of any given ticker in the SHARADAR database from Quandl and then calculates the return series of the ticker on a daily basis.
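The download itself depends on a Quandl subscription, so the following sketch covers only the calculation that comes after it: the midpoint price and the daily return series. The function names and the sample high/low bars are hypothetical, and the use of simple (rather than log) returns is an assumption.

```python
def mid_price(high, low):
    """Price for the day: the midpoint between the daily high and low."""
    return (high + low) / 2


def daily_returns(prices):
    """Simple day-over-day returns from a list of prices."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


# Hypothetical end-of-day bars as (high, low) pairs.
bars = [(101.0, 99.0), (103.0, 101.0), (102.0, 98.0)]

prices = [mid_price(high, low) for high, low in bars]
returns = daily_returns(prices)
```

Note that the return series is one element shorter than the price series, because the first day has no previous price to compare against.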

Calculating the risk-free rate and defining the market

In our example, we take the US 3-month Treasury bill as the proxy for the risk-free rate of return. In the investment world, US government debt is conventionally treated as risk-free, on the assumption that the government will not default. Any return we earn above the risk-free rate is the return we get by taking more risk.

The market, as a whole, can be represented by the return from all of the investment assets around the world—this is easy in theory, but in reality, it is really hard to define. The most challenging part is generating this market return on a regular basis so that it could be used in the next step. We will take a shortcut and use an ETF to represent the market return:

#3. Market Return

Given a ticker as the market proxy, run the preceding function:

#4. Risk Free Rate
#day count

#risk free rate

# override return of market

The risk-free rate is rather complex. By convention, the industry uses the 3-month Treasury bill. To obtain the risk-free rate for the whole period, we take around 10 years of data.

However, we also need to handle the annualization of the interest rate. By convention, the year of a 3-month Treasury bill is counted as 360 days, and interest accrues on every calendar day.
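One way to read the day-count convention is sketched below: each annualized quote is divided by 360 to obtain a daily rate, and the daily rates are compounded over the period. This is an assumption about the calculation, since the chapter's own code for this step is not shown; the function names and the sample quotes are hypothetical.

```python
def daily_rate(annual_rate, day_count=360):
    """Convert an annualized quote into a daily rate using a 360-day year."""
    return annual_rate / day_count


def period_return(annual_rates, day_count=360):
    """Compound the daily rates implied by a series of annualized quotes."""
    growth = 1.0
    for rate in annual_rates:
        growth *= 1.0 + daily_rate(rate, day_count)
    return growth - 1.0


# Hypothetical quotes: a 3.6% annualized rate observed on ten consecutive days.
quotes = [0.036] * 10
ten_day_risk_free = period_return(quotes)
```

The same function could be run over roughly 10 years of daily quotes to obtain the risk-free return for the whole period.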

Calculating the alpha, beta, and variance of error of each asset type

After understanding what the risk-free return rate and market return are, our next task is to find out the alpha, beta, and variance of error by regressing the asset's return against the market return:

Investment Return = Risk-Free Rate + Alpha + Beta * market return + noise (Variance of error)
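The regression can be sketched with ordinary least squares in plain Python (the chapter's own code imports scikit-learn for this step). The function name and the sample data are hypothetical, and the inputs are assumed to be returns in excess of the risk-free rate so that the intercept can be read as alpha.

```python
def fit_capm(market_returns, asset_returns):
    """Least-squares fit of asset = alpha + beta * market + error.

    Returns (alpha, beta, variance of the regression error).
    """
    n = len(market_returns)
    mean_x = sum(market_returns) / n
    mean_y = sum(asset_returns) / n
    sxx = sum((x - mean_x) ** 2 for x in market_returns)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(market_returns, asset_returns))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [y - (alpha + beta * x)
                 for x, y in zip(market_returns, asset_returns)]
    # Two parameters were estimated, so divide by n - 2 degrees of freedom.
    resid_var = sum(e * e for e in residuals) / (n - 2)
    return alpha, beta, resid_var


# Hypothetical daily excess returns for the market and one sector ETF.
market = [0.010, -0.005, 0.007, 0.002, -0.012]
sector = [0.013, -0.004, 0.010, 0.001, -0.015]
alpha, beta, resid_var = fit_capm(market, sector)
```

These three numbers per asset are the parameters that the Treynor-Black allocation consumes later on.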

After performing this calculation, we will keep the data in a SQLite database for retrieval later on.

I believe that, in the future, start-up robo-advisors will focus on ETF/smart beta, that is, the allocation across sectors to generate a return against the market. Therefore, in this example, I have chosen ETF tickers.

We will run a linear regression of the sector ETF against the market benchmark. However, the days on which we have a price quotation for the ETF and for the market could differ; therefore, we will regress only on days when both the sector ETF and the market ETF have a price, using the inner join command in SQL.

Inner join implicitly requires that the index of both the sector ETF and the market benchmark have to be the same before joining. The index of the dataset refers to the date of return:
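The effect of the inner join can be seen with an in-memory SQLite database. The table names, column names, and return values below are hypothetical; only dates present in both tables survive the join.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained (no file is created).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sector_return (date TEXT PRIMARY KEY, ret REAL)")
conn.execute("CREATE TABLE market_return (date TEXT PRIMARY KEY, ret REAL)")

# Hypothetical data: the two series are quoted on partly different days.
conn.executemany(
    "INSERT INTO sector_return VALUES (?, ?)",
    [("2019-01-02", 0.010), ("2019-01-03", -0.004), ("2019-01-04", 0.006)],
)
conn.executemany(
    "INSERT INTO market_return VALUES (?, ?)",
    [("2019-01-02", 0.008), ("2019-01-04", 0.005), ("2019-01-07", 0.002)],
)

# Keep only the dates on which both the sector and the market have a return.
rows = conn.execute(
    """
    SELECT s.date, s.ret, m.ret
    FROM sector_return AS s
    INNER JOIN market_return AS m ON s.date = m.date
    ORDER BY s.date
    """
).fetchall()
conn.close()
```

The resulting `rows` hold aligned pairs of returns that can be passed straight to the regression.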

#5. Asset Return and parameters
#list of stocks for selection in the active portfolio

#connect to the database and reset it every time with drop indicator

#write out the risk free and market parameters

#loop through the tickers
for tkr in list_tkr:
    #calculate the CAPM:
    #download data for the ticker

    #make sure the ticker we select has market data

    #linear regression

    #obtain the result and write out the parameters

Calculating the optimal portfolio allocation

We are now at the second major process, that is, working out the portfolio allocation. First, we will calculate the active portfolio size and the weights of the different assets within the active portfolio. The steps are as follows:

  1. Import all of the relevant libraries:

#1. Import libraries and key variable values
import sqlite3
import datetime

#create a table to store weight

  2. Calculate the active portfolio's parameters and compare them against the market performance to find out how much weight from the total portfolio we should allocate to the active portfolio:

#2. Find out the weight of the securities in the active portfolio
#total alpha/variance of the active securities

#insert into the table the weight of each active securities

The weight of the active portfolio is solved by aggregating the parameters for the securities (sector ETF) that belong to the active portfolio.

  3. Then, compare the return/risk ratio of the active portfolio with that of the market: the stronger the active portfolio performs, the more weight it gets in the total portfolio:
#3. Find out the weight of the active portfolio in the total portfolio
#calculate the parameters of the active portfolio

#read back the risk free and market parameters

#calculate the weight of active portfolio

#display the result
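The two weighting steps can be sketched with the standard textbook Treynor-Black formulas. This is an illustration under stated assumptions, not the book's code: the function names and the input figures are hypothetical, and each security is described by the alpha, beta, and error variance from its regression.

```python
def active_weights(alphas, resid_vars):
    """Weight of each security inside the active portfolio.

    Each security is weighted by alpha / residual variance, scaled to sum to 1.
    """
    ratios = [a / v for a, v in zip(alphas, resid_vars)]
    total = sum(ratios)
    return [r / total for r in ratios]


def active_portfolio_weight(alphas, betas, resid_vars, mkt_excess_ret, mkt_var):
    """Weight of the active portfolio inside the total portfolio."""
    w = active_weights(alphas, resid_vars)
    alpha_a = sum(wi * a for wi, a in zip(w, alphas))
    beta_a = sum(wi * b for wi, b in zip(w, betas))
    resid_var_a = sum(wi ** 2 * v for wi, v in zip(w, resid_vars))
    # Return/risk of the active portfolio relative to that of the market.
    w0 = (alpha_a / resid_var_a) / (mkt_excess_ret / mkt_var)
    # Adjust for the active portfolio's beta differing from 1.
    w_active = w0 / (1 + (1 - beta_a) * w0)
    return w, w_active


# Hypothetical inputs for two sector ETFs.
weights, w_active = active_portfolio_weight(
    alphas=[0.02, 0.01], betas=[1.2, 0.8], resid_vars=[0.04, 0.02],
    mkt_excess_ret=0.06, mkt_var=0.04)
print(weights, w_active)
```

With these inputs both ETFs have the same alpha-to-variance ratio, so each takes half of the active portfolio, and the active portfolio takes about two-thirds of the total. The rest goes to the market portfolio.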

After the optimal portfolio is obtained, the next step is to split your capital between the optimal portfolio and the risk-free asset according to the return and risk requirements of the IPS.

The following two constraints need to be satisfied:

  • % Optimal portfolio x Return by Optimal portfolio + (1-%Optimal portfolio) x Return by risk-free asset >= Return required by IPS
  • % Optimal portfolio x Risk by Optimal portfolio <= Risk required by IPS
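The two constraints above bound the share of the optimal portfolio from below (return) and from above (risk). A minimal sketch of that check follows; the function name and the example figures are hypothetical.

```python
def feasible_allocation(ret_opt, risk_opt, risk_free, ret_required, risk_max):
    """Range of optimal-portfolio shares that satisfies both IPS constraints.

    Returns (lower, upper) as fractions of total capital, or None if no share
    between 0 and 1 meets the return and risk requirements together.
    """
    if ret_opt <= risk_free or risk_opt <= 0:
        return None  # the optimal portfolio must beat the risk-free asset
    # Return constraint: w * ret_opt + (1 - w) * risk_free >= ret_required
    lower = max(0.0, (ret_required - risk_free) / (ret_opt - risk_free))
    # Risk constraint: w * risk_opt <= risk_max
    upper = min(1.0, risk_max / risk_opt)
    return (lower, upper) if lower <= upper else None


# Hypothetical IPS: at least 6% return with at most 15% risk.
allocation = feasible_allocation(ret_opt=0.10, risk_opt=0.20, risk_free=0.02,
                                 ret_required=0.06, risk_max=0.15)
print(allocation)
```

Any share inside the returned range satisfies the IPS. A result of `None` means that the required return cannot be reached within the permitted risk.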

Congratulations! You have learned how to allocate capital to different investment assets to yield the optimal return and risk level. In the next section, we will look at an example of how to predict the trend of a security. This will help investors to make wise investment decisions.

Predicting the trend of a security

In the preceding example, we played the role of a trader who followed the portfolio allocation set by the treasurer. Assuming that our job is to follow the securities required by the treasurer, the trader's profit and loss hinges on how well we can buy low and sell high. We take the daily pricing history of securities as the data to build our model. In the following section, we will demonstrate how to predict the trend before making buy decisions for assets.


There are two major processes: one for model development and another for model backtesting. Together, the two processes comprise eight steps. Real-time deployment is not included here; however, it is very similar to model backtesting. The following diagram illustrates the flow of the process:

Loading, converting, and storing data

In this step, we will load the data, convert it into an image array, and then store it in the HDF5 file format. First, we load the data from Quandl as a data frame, and then convert it into an array that plots the data as presented earlier. In our example, we simplify the problem by plotting only one end-of-day data point per day. We take only the midpoint between the day's high and low, without considering transaction volume.

To plot the price on an array with fixed dimensions, we develop a function for the y axis (the price) that fits the maximum and minimum values into the fixed dimension and scales the data points in between accordingly. This is called normalization. On the x axis, each day is represented by one point, where the far left is the earliest day and the far right is the latest day of a given window size. Every price point has the same color value: 255 for showing it in pictures, or 1 for feeding it to the neural network.

The same treatment is applied to the target variable, which is only the next day's price plotted using the same y scale. If the next day's price is actually higher than the maximum or lower than the minimum, we force it to take the current maximum or minimum point.
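The plotting described above can be sketched as follows. This is an illustration under assumptions, not the book's code: the function name, the image height, and the choice of putting the highest price in row 0 are all hypothetical.

```python
import numpy as np


def window_to_image(prices, next_price, height=32):
    """Plot one price window as a (height, width) array plus a target column.

    Each day sets exactly one pixel to 1. The next-day price is clipped to the
    window's minimum and maximum so that it fits on the same y scale.
    Returns (None, None) for a flat window, which has nothing to plot.
    """
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    if hi == lo:
        return None, None

    def to_row(price):
        price = min(max(price, lo), hi)  # force into the current range
        level = int(round((price - lo) / (hi - lo) * (height - 1)))
        return height - 1 - level        # assumption: row 0 is the top

    image = np.zeros((height, len(prices)))
    for x, price in enumerate(prices):   # far left is the earliest day
        image[to_row(price), x] = 1.0
    target = np.zeros(height)
    target[to_row(next_price)] = 1.0
    return image, target


image, target = window_to_image([10, 12, 11, 14], next_price=20, height=5)
print(image)
print(target)
```

In this example the next-day price of 20 lies above the window maximum of 14, so the target lands on the top row, the same row as the last day of the window.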

After the array is prepared, we stack the arrays for the specified duration, with every single day represented by one chart that shows the past X days, X being the window size. When we have finished stacking, we put the whole array into an HDF5 file. HDF5 is a hierarchical file format designed for large numerical arrays, and it allows slices of an array to be read or written without loading the whole file into memory.

Define the libraries and variables. We have defined a list of tickers to go through for the download step:

#1. Import libraries and key variable values

import quandl
import plotly
import plotly.graph_objs as go
import numpy as np

from datetime import datetime
try:
    import Image
except ImportError:
    from PIL import Image
import os
import h5py

#dates parameters
#quandl setting
#parameters for the image generation
#create path for the output dataset
#ticker lists
#generate png file for each of the input or not
#generate interactive plot of the ticker stock price or not

Define the function that fits stock prices of a variable range into an image of fixed height and width. It returns the rescaled column of values along with the scaling factor:

  • Pixel value = (price value – minimum value of the column) x number of pixels per value
  • Number of pixels per value = total number of pixels / (maximum value of the column - minimum value of the column)

The code is as follows:

#2. Define the function to rescale the stock price according to the min and max values

#input_X is a series of price
#output_X is a series of price expressed in pixel
def rescale(input_X, pixel, min_x,max_x):
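A possible body for such a function is sketched below under a different name, since the original body is not shown. It assumes that the scaling factor is the number of pixels per unit of price and that pixel values are rounded to whole numbers.

```python
def rescale_sketch(input_X, pixel, min_x, max_x):
    """Rescale prices to pixel positions between 0 and `pixel`.

    input_X : iterable of prices
    pixel   : total number of pixels available on the y axis
    Returns the list of pixel values and the scaling factor.
    The caller is assumed to skip flat windows, where max_x == min_x.
    """
    scale = pixel / (max_x - min_x)  # number of pixels per unit of price
    output_X = [round((x - min_x) * scale) for x in input_X]
    return output_X, scale


pixels, scale = rescale_sketch([10, 12, 14], pixel=4, min_x=10, max_x=14)
print(pixels, scale)
```

Note that the maximum price maps to `pixel` itself, so an image array indexed from 0 needs `pixel + 1` rows, or the caller can pass the height minus one.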

Ticker by ticker, we will download the data and convert it into an input image and a target result for machine learning in the next step. The most technical aspect of this code relates to saving the HDF5 file. The file is divided into datasets, and each dataset stores an array of data. One specific feature of a dataset is that its size is fixed once it is defined at creation. Additionally, it is not meant to be dynamically updated, although this is technically possible.

Colored images are stored in three channels—red, green, and blue—each channel is a matrix where each pixel ranges from the value of 0 to 255. However, in our example, we will only use one channel for black and white pictures. Before we store the image to HDF5, all the numbers are divided by 255 so that the input variables are in a range between 0 and 1 for neural network training later.

To give you a real feeling for the data, we also provide an interactive chart feature (using plotly). This was also turned off to improve speed. However, for a first-time user of the code, it is recommended that you try it out to see the data being downloaded.

Please refer to image processing texts for an in-depth discussion—my favorite is Feature Extraction & Image Processing for Computer Vision by Nixon M.S. and Aguado, A.S., as it focuses a lot on extracting features that we need as opposed to just laying out the theoretical background.

However, the downside is that this book's code is not in Python—which is an acceptable challenge given that learning the principle is more important than code that evolves:

#3. Go through the tickers
for tkr in tkr_list:
#if the ticker has been downloaded, skip the ticker and go for the next
#download and create dataset
#sort the date from ascending to descending...
#charting interactive chart for viewing the data
#calculate mid price of the day
#remove the file if there is one
#remove the file if there is one
#create dataset within the HDF5 file
#now we create the dataset with a fixed size to fit all the data, it could also be created to fit fixed batches

#loop through the dates
for i in range(num_img):
#create min and max values for the mid price plot within a given
#in case of low liquidity ETF which has the same price, no graph be
#draw the dot on the x, y axis of the input image array
#output the image for visualization
#draw the dot on the target image for training
#stack up for a numpy for Image Recognition

Setting up the neural network

Following the ResNet code from the Keras examples, we do not alter the network design. We take both version 1 and version 2 while disabling batch normalization, given that every data point has the same color and the y axis is normalized for a given window size, so there is little significance in normalizing it further.

Batch normalization harmonizes the values seen in the network within the current batch of records. It works well if the images we plot contain different colors. However, we have already normalized the price at each data point on its y axis. The code is left unchanged for now because we will need it when we feed in data with different scales and distributions.

Loading the data to the neural network for training

We retrieve the data from the HDF5 file created earlier and feed it into the network that was set up in the previous step. Normally, the data is split into training, testing, and validation sets. However, in our case, we take all of the data as the training and testing sets at the same time. The validation set can be another stock, given that we are training only a general intelligence to observe the technical movement of stocks.

We feed the network with batch normalization and a certain epoch number. This step takes the most time.

During training, we keep a log of the performance for visualization later:

#1. Import libraries and key variable values
#2. Define functions
def lr_schedule(epoch):
def resnet_layer(inputs,
def resnet_v1(input_shape, depth, num_classes=10):
def resnet_v2(input_shape, depth, num_classes=10):
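As an illustration of what a function such as `lr_schedule` typically does, the sketch below reduces the learning rate in steps as training progresses. The breakpoints and factors are assumptions modelled on the Keras ResNet example for CIFAR-10 and are not necessarily the values used in this chapter.

```python
def lr_schedule_sketch(epoch):
    """Step-wise learning rate decay (illustrative breakpoints).

    Starts at 1e-3 and shrinks after epochs 80, 120, 160 and 180.
    """
    lr = 1e-3
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    return lr


for epoch in (0, 100, 200):
    print(epoch, lr_schedule_sketch(epoch))
```

A function of this shape is normally passed to a learning rate scheduler callback, so that the optimizer picks up the new rate at the start of each epoch.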

Please refer to for an explanation on the design, and refer to the Keras documentation for further implementation details:

What the code essentially does is create two neural network designs with different structures. Given that our data source provides a sizable data input, readers should experience better performance with version 2:

#3. Execute the model training
# Computed depth from supplied model parameter n

# Model name, depth and version

# create list of batches to shuffle the data

#check if the prev step is completed before starting

#decide if we should load a model or not

#loop through the tickers

#load dataset saved in the previous preparation step

#start if both file exists:

#calculate number of batches
#do it at the first one

# Input image dimensions.
# Prepare model saving directory.

# Prepare callbacks for model saving and for learning rate

# loop over batches

# Run training, without data augmentation.

#when model training finished for the ticker, create a file to indicate its completion

# Score trained model.
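Two of the steps in the outline above are simple enough to sketch on their own: computing the network depth from the parameter n, and creating a shuffled list of batches. The depth formulas follow the convention of the Keras ResNet example (6n + 2 for version 1, 9n + 2 for version 2); the helper names and the batch sizes are hypothetical.

```python
import random


def resnet_depth(n, version):
    """Network depth computed from the model parameter n."""
    if version == 1:
        return 6 * n + 2
    if version == 2:
        return 9 * n + 2
    raise ValueError("version must be 1 or 2")


def shuffled_batches(num_records, batch_size, seed=0):
    """List of (start, end) record ranges in shuffled order."""
    starts = list(range(0, num_records, batch_size))
    random.Random(seed).shuffle(starts)
    return [(s, min(s + batch_size, num_records)) for s in starts]


n, version = 3, 2
depth = resnet_depth(n, version)
model_type = "ResNet%dv%d" % (depth, version)
batches = shuffled_batches(num_records=10, batch_size=4)
print(model_type, batches)
```

Shuffling the batch order rather than the records keeps each read from the HDF5 dataset contiguous, which is one reasonable design choice when the data does not fit in memory.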

Saving and fine-tuning the neural network

The network is saved at the end. We did not fine-tune our model at all in this example. Fine-tuning has to do with hyperparameter tuning, which means tuning every parameter that we have set in the network so far. I would recommend that you look at Machine Learning Yearning by Andrew Ng. Although this step is not implemented in this example, we have illustrated it in more detail in Chapter 3, Using Features and Reinforcement Learning to Automate Bank Financing.

Loading the runtime data and running through the neural network

The network can be loaded again and run on a new dataset as a validation set. However, in our example, we take another stock to test whether this generic technical analysis machine works or not. The output of the network is the prediction of the next day's pricing.

In this program, the most important inputs are the strategy parameters. Everything starts with a single monetary value. We test three strategies: buy and hold, which is the benchmark, and two strategies that trade on different price outputs of the model.

The steps involved are as follows:

  1. Import all the necessary libraries and variables:

#1. Import libraries and key variable values
#folder path

#date range for full dataset

#Create list of dates

#API key for quandl

#Parameters for the image generation

#model path

#number of channel for the image

#strategies parameter
With ResNet v2, we have close to 1 million parameters, while we are feeding in roughly 3 million records (about 14.5 years x 200 trading days x 125 tickers, although some tickers are not liquid enough to trade).
  2. Then, define the functions to fit the price points into the image with a fixed height:
#2. Define functions
  3. Get new data and run the functions to predict the price. Load data for a ticker and prepare the data; then, run the model built from the training process:
#3. Running the test
#Get the data

#write header for the log of the strategy back-testing

#loop through the dates
#make sure both start and end dates are valid

#prepare the input data

#if no trend, then drop this data point

#stack up for a numpy for Image Recognition
#print the historical data

#make prediction
#Obtain predicted price

Generating a trading strategy from the result and performing performance analysis

For a given price prediction, we can devise different trading actions.

The objective of this loop is to measure the profit and loss of the trading strategies by relying on the prediction made by the model in the previous section.

At any given date, there will be only one price prediction, in the form of a 1D array with the probability at each price point given the scale. The various strategies (1 and 2) handle what to do with the prediction:

#calculate expected values

#Strategy Back-Testing
#Benchmark - Strategy 0 - buy and hold

#Testing of strategy1

#Testing of strategy2

#print the final result of the strategies
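The expected-value and backtesting steps can be sketched as follows. The trading rule shown (hold the asset for a day only when the expected price is above today's price) is a hypothetical stand-in, since the chapter's strategies 1 and 2 are not spelled out here; the price levels are assumed to be evenly spaced, with index 0 at the minimum.

```python
def expected_price(probs, min_price, max_price):
    """Probability-weighted price over evenly spaced price levels."""
    step = (max_price - min_price) / (len(probs) - 1)
    total = sum(probs)
    return sum(p * (min_price + i * step) for i, p in enumerate(probs)) / total


def backtest(prices, predicted_next, cash=100.0):
    """Final value of buy and hold versus a hypothetical prediction rule.

    predicted_next[i] is the price expected for day i + 1, known on day i.
    """
    buy_hold = cash * prices[-1] / prices[0]
    value = cash
    for today, tomorrow, pred in zip(prices[:-1], prices[1:], predicted_next):
        if pred > today:              # a rise is predicted: hold for one day
            value *= tomorrow / today
    return buy_hold, value


exp_price = expected_price([0.0, 0.5, 0.5], min_price=10, max_price=14)
buy_hold, strategy = backtest(prices=[100, 110, 99, 108.9],
                              predicted_next=[105, 100, 105])
print(exp_price, buy_hold, strategy)
```

In this toy run the rule avoids the one falling day and ends above the buy-and-hold benchmark. A real backtest would also need transaction costs and an out-of-time sample.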

Congratulations! You have walked through the process of price prediction using computer vision models.

In the real world, there could be more predictions made by different models, which increases the number of strategies to test. We require a benchmark to know whether these strategies outperform the normal market situation, that is, a buy-and-hold strategy. If our strategies are successful, they should outperform the market by showing a higher profit.

In strategy backtesting, we normally deploy it to an out-of-time, unseen sample.


In this chapter, we learned a number of portfolio management techniques. We combined them with AI to automate the decision-making process when buying assets. We learned about the Markowitz mean-variance model and the Treynor-Black model for portfolio construction. We also looked at an example of portfolio construction using the Treynor-Black model. We also learned how to predict trends in the trading of a security.

In the next chapter, we will look at the sell side of asset management. We will learn about sentiment analysis, algorithmic marketing for investment products, network analysis, and how to extract network relationships. We will also explore libraries such as NetworkX and tools such as Neo4j and PDFMiner.