TensorFlow and Its Application in Financial Forecasting
Today, Machine Learning (ML) and its more advanced derivative, Deep Learning (DL), are no longer confined to academic circles. Machine Learning is spreading rapidly through industry as Artificial Intelligence (AI) is put to work in a growing range of industrial applications. Within the ML technology domain, TensorFlow has become a household name. This article sheds some light on how TensorFlow helps in building Machine Learning models and how it can be used to build time series forecasting models, which are highly applicable to financial analytics.
Building Models for Analysis
Traditionally, to solve scientific or technological problems, we rely on a mathematical or computational equivalent of the underlying system, which we call a ‘model’ of the system under consideration. If we can model a system with a set of rules, it becomes easy to predict its behaviour. When you are driving a vehicle at a particular speed, you can forecast the distance you will cover in an hour by applying the model [distance = speed × time]. The variables that determine the state of the system in this case are speed and time. However, if you have to pick up three friends (not all of them known for punctuality) at different points on the way, there are likely to be waiting periods of uncertain duration. The distance model above may not work well in this case, and you may not be able to predict with reasonable accuracy the distance you will cover in the next hour. Many more independent state variables have come into the picture in the latter scenario, making the system harder to model and to predict accurately.
In the financial industry, the positioning of various loan products, mortgages, overdrafts and credit cards involves forecasting to assess the customer’s affordability and willingness to repay the debt. Even the pricing of liability products such as savings accounts, term deposits and bonds is based on treasury forecasts. To decide future lending and borrowing positions, the Treasury has to forecast the market’s appetite for lending and borrowing. All of these involve time series data in one form or another. We can use traditional regression techniques such as linear regression or polynomial regression for time series forecasting. Most of these methods fit a curve to the available data by minimizing the difference between the points on the curve and the actual data points. Such techniques are accurate only within a tolerance limit.
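As a minimal sketch of the curve fitting described above, the following Python fits a straight line by ordinary least squares and extrapolates it one period ahead. The function name fit_line and the monthly demand figures are hypothetical, made up purely for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly loan demand (illustrative numbers only).
months = [1, 2, 3, 4, 5, 6]
demand = [102.0, 104.5, 107.1, 108.9, 112.2, 114.0]

slope, intercept = fit_line(months, demand)
forecast = slope * 7 + intercept  # extrapolate to month 7
print(f"slope={slope:.3f}, intercept={intercept:.3f}, month-7 forecast={forecast:.2f}")
```

The fitted line captures the overall trend but nothing else, which is why such methods are accurate only within a tolerance limit.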
Recurrent Neural Network for Time Series Data
With Machine Learning techniques taking centre stage, more accurate predictions of time series elements have become possible. Deep Learning is a branch of ML that relies on Artificial Neural Networks (ANNs) for non-linear transformations and uses them to model the problem at hand. A specific form of ANN called the Recurrent Neural Network (RNN) is becoming more popular owing to its wide utility and its ability to handle time series data. An RNN works on the principle that each input depends on the inputs before it: it maintains a hidden state, or memory, that captures what has been seen so far. The value of the hidden state at any point in time is a function of the hidden state at the previous time step and the input at the current time step. For example, when an RNN is used to predict the next word in a sentence and someone writes “the grocery…”, it is more likely to pick the word “store” than “school”.
In an ANN, neurons are the smallest computational blocks; each applies weights and a bias to its input and transforms the result. The underlying mathematical process is simply a matrix multiplication involving the weights and bias, followed by a transformation of the output using an activation function. The calculated output is compared with the actual data, and the loss, a function of the difference between the calculated and actual values, is determined. The higher the loss, the less accurate the model. In the training process, the weights are adjusted iteratively using Stochastic Gradient Descent (SGD), an effective optimization technique, to minimize the loss function. Note that a plain network of this kind keeps no memory of earlier input values. However, keeping memory of earlier inputs is important for time series forecasting or for predicting words in a sentence. That is where the RNN makes the difference. It feeds the output of time step n-1 into the matrix multiplication at time step n, and repeats this across the sequence. It is termed ‘recurrent’ because it performs the same operation in each activation block, as shown in the figure below.
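The hidden-state recurrence can be sketched in a few lines of plain Python. This is a toy with a single scalar hidden state and hand-picked weights (w_x, w_h and b are assumed values, not trained ones) and tanh as the activation; a real RNN uses weight matrices learned from data.

```python
import math


def rnn_final_state(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit recurrent cell over a sequence and return the last hidden state."""
    h = 0.0  # the hidden state starts empty
    for x in inputs:
        # New state depends on the current input AND the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
    return h


# Same values, different order.
state_a = rnn_final_state([1.0, 0.0])
state_b = rnn_final_state([0.0, 1.0])
print(state_a, state_b)
```

The two sequences contain the same values in a different order, yet they end in different hidden states. That dependence on history is the memory referred to above.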
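To make the training loop concrete, here is a small sketch of a single neuron trained with stochastic gradient descent in plain Python. It assumes an identity activation (so the neuron is linear), a squared-error loss, and toy data generated from y = 2x + 1; the function names and the learning rate are illustrative choices, not taken from the article.

```python
import random


def mean_squared_error(samples, w, b):
    """Average squared difference between the neuron's output and the target."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def train_neuron(samples, learning_rate=0.1, epochs=200, seed=0):
    """Fit weight w and bias b with SGD, updating after every single sample."""
    rng = random.Random(seed)
    data = list(samples)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # the 'stochastic' part: visit samples in random order
        for x, y in data:
            error = (w * x + b) - y  # identity activation assumed
            # Gradient of error**2 with respect to w is 2*error*x, and to b is 2*error.
            w -= learning_rate * 2 * error * x
            b -= learning_rate * 2 * error
    return w, b


# Toy data from the known relation y = 2x + 1.
samples = [(i / 4, 2 * (i / 4) + 1) for i in range(5)]

loss_before = mean_squared_error(samples, 0.0, 0.0)
w, b = train_neuron(samples)
loss_after = mean_squared_error(samples, w, b)
print(f"w={w:.4f}, b={b:.4f}, loss {loss_before:.4f} -> {loss_after:.6f}")
```

Each update uses only the current sample, so nothing in this neuron remembers earlier inputs.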
TensorFlow – Under the Hood
TensorFlow is one of the most popular tools working in the background of many Machine Learning and Deep Learning applications. It is an open source numerical computing library originally developed by the Google Brain team. Google reportedly uses TensorFlow for search ranking, speech recognition, YouTube recommendations, language translation and many other areas. At present, several high-level libraries such as Keras and TFLearn are built on top of TensorFlow. Its architecture is flexible enough to deploy ML and DL models on CPUs, GPUs, distributed machines and mobile devices. TensorFlow serves a wide range of users, from those who want to use common models to those who want to build custom ones, and it provides APIs at different levels of abstraction to enable this. The higher-level APIs such as Keras and tf.estimator are built on top of TensorFlow’s core functionality.
TensorFlow uses data structures called tensors as its building blocks. A tensor is simply a multidimensional array: a 0-D tensor is a scalar, a 1-D tensor is a vector, a 2-D tensor is a matrix and so on. In TensorFlow’s graph mode (the default in TensorFlow 1.x), operations happen in two steps: step 1 is to build a Graph, which is a data flow of computations, and step 2 is to run a Session, which executes the operations in the graph. The Graph can be considered the backbone of TensorFlow, as it is the computational model of the underlying problem statement. It comprises multiple nodes connected to each other by edges, and each node represents an operation (abbreviated as op) to which tensors are supplied as input. The Graph is just a template and does not perform any operation by itself. To perform computation, the graph must be launched in a Session. A small piece of Python code to illustrate this concept is given below.
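The build-then-run idea can be sketched as follows. This assumes TensorFlow 2.x is installed, where eager execution is the default and the session API is available under tf.compat.v1; the constants and node names are arbitrary illustrative choices.

```python
import tensorflow as tf

# Step 1: build the graph. Nothing is computed here; we only describe the ops.
graph = tf.Graph()
with graph.as_default():
    a = tf.constant(5.0, name="a")
    b = tf.constant(3.0, name="b")
    total = tf.add(a, b, name="total")               # node: a + b
    product = tf.multiply(total, b, name="product")  # node: (a + b) * b

# Step 2: launch the graph in a session to actually execute the ops.
with tf.compat.v1.Session(graph=graph) as sess:
    result = sess.run(product)

print(result)  # (5 + 3) * 3 = 24.0
```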
Scalars are used in the example above for easy understanding. However, in practical problems, the tensors could be vectors or matrices with larger dimensions.
Time Series Forecasting Using TensorFlow
The first step in building a forecasting model is to determine its hyperparameters, such as the number of inputs at a time, the time series window considered, the number of neurons in the RNN model and the number of outputs. As mentioned in the earlier sections, a neural network based model passes through multiple states (weights and biases) over many iterations on its way to the optimal state, and the whole process is driven by data sets. In a time series, the data sets are characterized by seasonality and trend. There may be outliers, which need to be handled in the pre-processing phase. Before we construct the model, the data sets must be pre-processed, organized and split appropriately into training data and testing data (typically 80% for training and 20% for testing).
The input data set (x_data) and output data set (y_data) are first grouped into batches (X_batches and y_batches respectively) prior to training. There are straightforward functions in Python to reshape data sets. Depending on the time range for which forecasting is required, the y_data batches are time-shifted appropriately from the x_data batches. For instance, if you have historical closing stock prices and want to predict them a week ahead, the y_data batches must be constructed with a week’s shift from the x_data batches.
To build a model, there are three key steps: (a) definition of variables or placeholders with the right shape for the tensors, (b) creation of the RNN model and (c) creation of the loss function and optimization. The construct of the variable definitions and model creation in Python is given below.
Typically, the loss function for continuous variables is the mean square of the error, where the error is the difference between the actual value and the model output. One of the optimizers, the Adam optimizer, which is an extension of SGD, can be used to reduce the loss. The code construct for the same is given below.
Now the model can be launched in a session to start training for as many epochs as desired. Once the model is trained, it can be evaluated using the test data set. If testing reveals room for improvement, the model can be tuned by adjusting the hyperparameters.
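A minimal sketch of the hyperparameter choices and the 80/20 split might look like this. The hyperparameter values and the price series are hypothetical, and splitting chronologically (rather than shuffling first) is an assumption made here so that the test set lies strictly after the training set in time.

```python
# Illustrative hyperparameters; the values are assumptions, not recommendations.
hyperparameters = {
    "n_inputs": 1,     # features fed in at each time step
    "window": 20,      # time steps per training window
    "n_neurons": 100,  # units in the recurrent layer
    "n_outputs": 1,    # values predicted at each time step
}


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Hypothetical daily closing prices with a simple upward trend.
prices = [100.0 + 0.5 * day for day in range(100)]
train, test = chronological_split(prices)
print(len(train), len(test))
```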
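The time shift between input and output batches can be sketched in plain Python as below. The helper name make_batches and its parameters are hypothetical; integers stand in for closing prices so the shift is easy to see.

```python
def make_batches(series, window, shift):
    """Build input windows and target windows that start `shift` steps later."""
    n_windows = len(series) - window - shift + 1
    if n_windows <= 0:
        raise ValueError("series is too short for this window and shift")
    X_batches = [series[i:i + window] for i in range(n_windows)]
    y_batches = [series[i + shift:i + shift + window] for i in range(n_windows)]
    return X_batches, y_batches


# Integers stand in for prices: the value equals its own time index.
closing_prices = list(range(10))
X_batches, y_batches = make_batches(closing_prices, window=3, shift=2)

print(X_batches[0], "->", y_batches[0])
print(X_batches[-1], "->", y_batches[-1])
```

For a one-week-ahead forecast on daily trading data, shift would be set to the number of trading days in a week.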
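One way to sketch steps (a) and (b) in current TensorFlow 2.x is with the Keras API, which replaces explicit placeholders with an input shape declaration. This is an alternative to the placeholder-based construct the text describes, not a reproduction of it, and the sizes below are assumed for illustration.

```python
import tensorflow as tf

# Assumed hyperparameters for illustration.
n_steps = 20     # time steps per window
n_inputs = 1     # features per time step
n_neurons = 100  # units in the recurrent layer
n_outputs = 1    # predictions per time step

model = tf.keras.Sequential([
    # (a) Declare the tensor shape: (time steps, features); batch size is left open.
    tf.keras.Input(shape=(n_steps, n_inputs)),
    # (b) Recurrent layer that returns its output at every time step.
    tf.keras.layers.SimpleRNN(n_neurons, activation="relu", return_sequences=True),
    # Project each time step's output down to the number of predictions.
    tf.keras.layers.Dense(n_outputs),
])

print(model.output_shape)
```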
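To show what the loss and the optimizer do under the hood, here is a plain-Python sketch of the Adam update rule minimizing a mean squared error for a single weight. It is not TensorFlow code: the data (y = 3x), the learning rate and the step count are assumed for illustration, while beta1, beta2 and eps follow the commonly used Adam defaults.

```python
import math

# Toy data from y = 3x, so the best weight is 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def mse(w):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def adam_minimize(w=0.0, learning_rate=0.1, steps=500,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    m, v = 0.0, 0.0  # running averages of the gradient and its square
    for t in range(1, steps + 1):
        g = mse_gradient(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # correct the bias from starting at zero
        v_hat = v / (1 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w


initial_loss = mse(0.0)
w_final = adam_minimize()
final_loss = mse(w_final)
print(f"w={w_final:.4f}, loss {initial_loss:.2f} -> {final_loss:.6f}")
```

Compared with plain SGD, Adam scales each step by a running estimate of the gradient's magnitude and adds momentum through the running average of the gradient.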
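The train-then-evaluate workflow can be sketched end to end as follows. This uses the Keras API in TensorFlow 2.x, where model.fit takes the place of an explicit session loop; the synthetic sine series, layer size, learning rate and epoch count are all assumptions chosen to keep the example small.

```python
import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(0)  # make the run repeatable

# Synthetic series standing in for real financial data.
series = np.sin(np.arange(400) * 0.1).astype("float32")

# Windows of 20 steps; targets are the same windows shifted one step ahead.
n_steps, shift = 20, 1
n_windows = len(series) - n_steps - shift + 1
X = np.stack([series[i:i + n_steps] for i in range(n_windows)])[..., None]
y = np.stack([series[i + shift:i + shift + n_steps] for i in range(n_windows)])[..., None]

# Chronological 80/20 split into training and test sets.
split = int(0.8 * n_windows)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_steps, 1)),
    tf.keras.layers.SimpleRNN(32, return_sequences=True),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01), loss="mse")

# Train for a fixed number of epochs, then evaluate on unseen data.
history = model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0)
test_mse = model.evaluate(X_test, y_test, verbose=0)

print("first/last training loss:", history.history["loss"][0], history.history["loss"][-1])
print("test MSE:", test_mse)
```

If the test error is unsatisfactory, the window length, number of units, learning rate or epoch count are the natural hyperparameters to adjust.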
Apart from estimators for time series analysis, TensorFlow has many built-in estimators for regression and classification problems, such as DNNRegressor, DNNClassifier and LinearClassifier. These estimators find applications in areas such as customer analytics, credit scoring and customer segmentation.
In the Banking and Financial Services industry, forecasting of various time series elements is essential for developing product strategies and business plans. This article has attempted to explain the computational approach to time series forecasting using the very popular TensorFlow library. It has also elaborated on how the recurrent neural network acts as the backbone for modelling and analysing time series data. RNNs have shortcomings of their own, and improvements have already been made to the architecture to overcome them; the details can be found in the related literature.