Quantitative Trading and Systematic Investing

Letian Wang Blog on Quant Trading and Portfolio Management


This post discusses the Principal component analysis (PCA) dimension reduction technique and demonstrates its application in treasury yield curve analysis.

Introduction

Principal component analysis (PCA) rotates the original dataset so that the rotated dataset is orthogonal and best represents the variation in the data. It then becomes a dimension-reduction technique by keeping only the first few variables of the rotated dataset. That is, let \(X \in R^{n \times p}\) be the original dataset and \(W \in R^{p \times p}\) the rotation operator; then the new dataset is

\[ T=XW \quad \text{or} \quad T_L= XW_L \]

where \(W_L \in R^{p \times L}\) keeps the first \(L\) components of \(W\), ranked by their eigenvalues. \(W\) holds the eigenvectors of the covariance matrix \(X^TX\) and is usually obtained by performing an SVD decomposition on \(X\) directly.

\[ X=U\Sigma W^T \Rightarrow X^TX=W\Sigma^2W^T \]
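As a small sketch of the decomposition above (a toy dataset here, not the post's treasury data), the rotation \(W\) drops out of NumPy's SVD, and the eigen-relation \(X^TX = W\Sigma^2W^T\) can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X -= X.mean(axis=0)                  # center so X^T X is the (scaled) covariance

# SVD gives the principal directions directly: X = U @ diag(s) @ W.T
U, s, Wt = np.linalg.svd(X, full_matrices=False)
W = Wt.T

T = X @ W                            # full rotated dataset
L = 2
T_L = X @ W[:, :L]                   # dimension reduction: keep first L components

# eigen-relation check: X^T X = W @ diag(s^2) @ W^T
assert np.allclose(X.T @ X, W @ np.diag(s**2) @ W.T)
print(T_L.shape)                     # (200, 2)
```

The singular values come back sorted in decreasing order, so slicing the first \(L\) columns of \(W\) is exactly the "keep the components with the largest eigenvalues" step.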

In this post we apply PCA to USD treasury curves. Treasury yields are known to be correlated, and the first three principal components, namely level, spread, and fly, explain most of the curve variation. The notebook can be found here.

Read more »

This post demonstrates how to predict the stock market using the recurrent neural network (RNN) technique, specifically the Long short-term memory (LSTM) network. The implementation is in Tensorflow.

Introduction

Financial time series are time-stamped sequential data, which traditional feed-forward neural networks do not handle well. A recurrent neural network (RNN) addresses this by feeding output neurons back into the input to provide memory of previous states. This turned out to be a huge success, especially in natural language processing. Later on, the Long short-term memory (LSTM) and Gated Recurrent Unit (GRU) cells were designed to alleviate the so-called vanishing/exploding gradient issues in the back-propagation phase of RNNs.

In this post, I will build an RNN model with an LSTM or GRU cell to predict the price of the S&P 500. To be specific, a supervised machine learning model will be calibrated to predict tomorrow's S&P 500 close price based on the prices from the previous 20 business days. This is a regression problem; yet the code can easily be adapted to classification problems such as simply predicting tomorrow's market direction. The code is located on Github.
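The supervised setup just described — features from a 20-day lookback window, label the next day's close — can be sketched in plain NumPy (synthetic prices here as a stand-in; the post feeds real S&P 500 data into the Tensorflow model):

```python
import numpy as np

def make_windows(prices, lookback=20):
    """Turn a 1-D price series into (samples, lookback) feature rows
    and next-day close targets for supervised learning."""
    X, y = [], []
    for i in range(len(prices) - lookback):
        X.append(prices[i:i + lookback])   # previous 20 business days
        y.append(prices[i + lookback])     # tomorrow's close
    return np.array(X), np.array(y)

prices = np.linspace(100, 120, 121)        # placeholder for S&P 500 closes
X, y = make_windows(prices)
print(X.shape, y.shape)                    # (101, 20) (101,)
```

For the classification variant mentioned above, the target would simply become `prices[i + lookback] > prices[i + lookback - 1]` instead of the raw price.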

Read more »

This post discusses the Hidden Markov Model and how to use it to detect stock market regimes. The Markov chain transition matrix suggests the probability of staying in the bull market trend or heading for a correction.

Introduction

The Hidden Markov Model (HMM) is a Markov model with a latent state space. It is the discrete version of the Dynamic Linear Model, commonly seen in speech recognition. In quantitative trading, it has been applied to detecting latent market regimes ([2], [3]). I'll relegate the technical details to the appendix and present the intuition through an example.
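To make the transition-matrix idea concrete, here is a minimal pure-NumPy sketch with a hypothetical decoded regime path (fitting the HMM itself is left to the post): the probability of staying in the bull regime is just the empirical frequency of bull-to-bull transitions.

```python
import numpy as np

# hypothetical decoded regime path: 0 = bull market, 1 = correction
states = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0])

def transition_matrix(states, n_states=2):
    """Estimate P[i, j] = Prob(next = j | current = i) by counting transitions."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

P = transition_matrix(states)
print(P[0, 0])   # probability of staying in the bull regime: 0.75
```

Each row of `P` sums to one, and `P[0, 1]` is then the probability of heading from the bull trend into a correction.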

Read more »

This post shows how to apply the Kalman Filter in pairs trading. It updates the cointegration relationship using a Kalman Filter, and then utilizes this relationship in a mean-reversion strategy to backtest pairs-trading performance.

Introduction

In a previous post we saw the Kalman Filter and its ability to train a linear regression model online. In the last post we also covered the idea of cointegration and pairs trading. As pointed out at the end of that post, one way to avoid look-ahead bias and enable walk-forward analysis is a Bayesian online training mechanism such as the Kalman Filter. Today we'll apply this idea to pairs trading.
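A minimal sketch of the idea on a synthetic cointegrated pair (state = [hedge ratio, intercept] following a random walk; the noise variances are illustrative guesses, not the post's tuned backtest):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.cumsum(rng.standard_normal(500))          # stock 1: a random walk
y = 1.5 * x + 2.0 + rng.standard_normal(500)     # stock 2: cointegrated with stock 1

beta = np.zeros(2)                 # state estimate: [hedge ratio, intercept]
P = np.eye(2) * 1000.0             # state covariance (diffuse prior)
Q = np.eye(2) * 1e-5               # random-walk state noise
R = 1.0                            # observation noise variance

for t in range(500):
    H = np.array([x[t], 1.0])      # observation model: y_t = H @ beta + noise
    P = P + Q                      # predict step (state is a random walk)
    e = y[t] - H @ beta            # innovation
    S = H @ P @ H + R              # innovation variance
    K = P @ H / S                  # Kalman gain
    beta = beta + K * e            # update state estimate
    P = P - np.outer(K, H) @ P     # update covariance

print(beta)                        # near the true [1.5, 2.0]
```

Because the update uses only data up to time `t`, trading on the filtered hedge ratio is walk-forward by construction, which is exactly the look-ahead-bias point above.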

As usual, the backtest code for this post is located here on Github.

Read more »

This post discusses stock pairs trading, including how to identify pairs or a cointegration relationship using statistical tests, how to estimate the two-step error-correction model, and how to backtest a pairs-trading strategy in Python.

Introduction

In the last post we examined the mean-reversion statistical test and traded on a single-name time series. Oftentimes a single stock price is not mean-reverting, but we can artificially create a portfolio of stocks that is. If the portfolio contains only two stocks, this is known as pairs trading, a special form of statistical arbitrage. By combining two cointegrated stocks, we can construct a spread that is mean-reverting, even when these two stocks themselves are not. Please refer to the appendix if you want to review cointegration first.

Cointegration and correlation describe different things. For example, if two stock prices follow two straight lines with different slopes, they are positively correlated but not cointegrated, as illustrated in this quora post. The mathematical formulas are relegated to the appendix.
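A rough NumPy sketch of the two-step idea on a synthetic cointegrated pair — step one estimates the hedge ratio by OLS, step two checks that the residual spread mean-reverts via its AR(1) coefficient (the post uses proper statistical tests rather than this shortcut):

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.cumsum(rng.standard_normal(1000))     # I(1) random walk
y = 2.0 * x + rng.standard_normal(1000)      # cointegrated with x

# step 1: OLS hedge ratio, y ~ a + b * x
A = np.column_stack([np.ones_like(x), x])
a, b = np.linalg.lstsq(A, y, rcond=None)[0]
spread = y - (a + b * x)                     # residual spread

# step 2: AR(1) coefficient of the spread; well below 1 means mean reversion
phi = np.polyfit(spread[:-1], spread[1:], 1)[0]
print(b, phi)                                # hedge ratio near 2, phi well below 1
```

For non-cointegrated series the residual spread stays close to a random walk and `phi` comes out near 1, which is why the formal tests in the post examine exactly this residual.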

The backtest code for this post is located on Github.

Read more »

Introduction

This post considers time-series mean reversion rather than cross-sectional mean reversion. Time-series mean-reverting processes are widely observed in finance. As opposed to trend following, mean reversion assumes that the process tends to revert to its average level over time. This average level is usually determined by physical or economic forces such as long-term supply and demand. Prices might deviate from that long-term mean due to sentiment or short-term disturbances in the market but will eventually revert to their intrinsic value.

A continuous mean-reverting time series can be represented by an Ornstein-Uhlenbeck (OU) process, known as the Vasicek model in the interest-rate field, which is a special case of the Hull-White model with constant volatility. It is also the continuous-time analogue of the discrete-time AR(1) process. I relegate the mathematical details to the appendix.

To calibrate the OU process, there are generally two approaches based on formula (A9): least squares and maximum likelihood estimation. This document gives a good summary and comparison of the results.
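Here is a least-squares sketch that leans on the AR(1) analogue mentioned above (a small-`dt` approximation, not the exact formula (A9) from the appendix): simulate an OU path, regress \(X_{t+1}\) on \(X_t\), and back out the speed and mean.

```python
import numpy as np

# simulate an OU path dX = theta * (mu - X) dt + sigma dW (Euler scheme)
theta, mu, sigma, dt = 3.0, 1.0, 0.5, 1.0 / 252
rng = np.random.default_rng(7)
X = np.empty(50_000)
X[0] = mu
for t in range(len(X) - 1):
    X[t + 1] = X[t] + theta * (mu - X[t]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

# least squares: regress X_{t+1} = a + b * X_t + eps
b, a = np.polyfit(X[:-1], X[1:], 1)
theta_hat = (1 - b) / dt       # from b ≈ 1 - theta * dt
mu_hat = a / (1 - b)           # long-term mean
print(theta_hat, mu_hat)       # close to theta = 3 and mu = 1
```

The exact discrete-time mapping uses `b = exp(-theta * dt)`, which the linked document compares against the MLE route; at daily `dt` the two are nearly indistinguishable.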

The backtest code can be found on Github.

Read more »

This post shows how to backtest trading strategies with the quanttrader Python package. It discusses the event-driven backtest framework in general and the code structure of the quanttrader package in particular. It also shows how to run a grid-based parameter search in parallel.

Introduction

quanttrader is a low-latency, event-driven backtest and live trading system in Python. This post goes over its backtesting module. A few years ago I wrote a QuantTrading system in C#, so this one gets suffix two. There are other Python-based open-source platforms available, such as Quantopian, Backtrader, and PyAlgoTrade. This one tries to be structurally simple: data loading and strategy evaluation are moved out of the module, and what is left is essentially an event engine surrounded by a few supporting functions. Some backtest examples can be found here.
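The "event engine surrounded by a few supporting functions" shape can be illustrated with a minimal stdlib-only sketch (hypothetical event types and handlers, not quanttrader's actual classes):

```python
from collections import deque

class EventEngine:
    """Toy event-driven loop: handlers subscribe by event type,
    and events are processed strictly in arrival order."""
    def __init__(self):
        self.queue = deque()
        self.handlers = {}

    def register(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def put(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        while self.queue:
            event_type, payload = self.queue.popleft()
            for handler in self.handlers.get(event_type, []):
                handler(payload)

fills = []
engine = EventEngine()
engine.register("BAR", lambda bar: engine.put("ORDER", bar["close"]))  # strategy stub
engine.register("ORDER", fills.append)                                 # broker stub
engine.put("BAR", {"close": 101.5})
engine.run()
print(fills)   # [101.5]
```

The appeal of this structure for backtesting is that the same strategy handler can be wired to a historical-data feed or a live feed without changing the strategy code.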

Read more »

This post shows how to use the Tensorflow library to do linear regression, with either the TF low-level API or the high-level Keras API. The data is trained in batches, and stochastic gradient descent with backpropagation is used to estimate parameters such as the intercept and slope. It is a long detour compared with the previously discussed classical linear regression methods, yet it serves as a starting point into the deep learning and reinforcement learning world.

Introduction

After four posts on linear regression, we are finally at the door of deep learning. Today we will build a simple feed-forward neural network (though not a deep one) with the help of Tensorflow to solve the linear regression problem. Tensorflow is a popular open-source deep learning library; the other popular choice is PyTorch.

Instead of defining a graph and then executing it in a session, Tensorflow 2.0 offers a dynamic graph through eager execution. The code structure is completely different from 1.0, so the code is updated here on Github.

We will try two implementations, one with low-level Tensorflow API and the other with high-level Keras API.
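To show what the low-level loop boils down to without requiring Tensorflow itself, here is the same mini-batch gradient-descent logic — estimate an intercept and a slope by backpropagating the mean-squared-error gradients — sketched in plain NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 256)
y = 3.0 * x + 0.5 + 0.1 * rng.standard_normal(256)   # true slope 3, intercept 0.5

w, b, lr = 0.0, 0.0, 0.1
for epoch in range(200):
    for i in range(0, 256, 32):                      # mini-batches of 32
        xb, yb = x[i:i + 32], y[i:i + 32]
        err = w * xb + b - yb                        # residuals on this batch
        w -= lr * 2 * np.mean(err * xb)              # gradient of MSE w.r.t. slope
        b -= lr * 2 * np.mean(err)                   # gradient of MSE w.r.t. intercept

print(w, b)                                          # near 3.0 and 0.5
```

The low-level TF version replaces the two hand-written gradient lines with `tf.GradientTape`, and the Keras version hides the whole loop behind `model.fit`.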

Read more »

In this post, we examine the linear regression model in the Kalman Filter world. It assumes that the underlying states are unobservable or only partially observable, and the Kalman Filter is designed to trace the latent state evolution through observations.

Introduction

The Kalman Filter is a state-space model that assumes the system state evolves by some hidden and unobservable pattern. We can only observe some measurable features of the system, from which we try to guess its current state. An example: without opening a Bloomberg terminal, you try to guess the stock market's movement every day from the traders' mood. A graphical narrative can be found in this post.

This book provides a simple random-walk example where the hidden state is the true equilibrium stock price over some time interval, while the close price of that interval serves as an observation with errors. It then proves that the true equilibrium price is a weighted average of close prices, with the weights forming a Fibonacci sequence. Dr. Chan popularized the Kalman Filter in the online quantitative trading community with his EWA-EWC ETF pairs-trading strategy.

In this post we will continue with our simple linear regression example from the last post and follow the plain Kalman Filter logic without the help of Python packages such as PyKalman.
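In the spirit of that random-walk example, here is a scalar sketch of the plain filter logic (synthetic data, with the noise variances `Q` and `R` chosen for illustration): the hidden equilibrium price follows a random walk, and each close price is a noisy observation of it.

```python
import numpy as np

rng = np.random.default_rng(3)
Q, R = 0.01, 1.0                        # state-noise and observation-noise variances
true_price = 100 + np.cumsum(np.sqrt(Q) * rng.standard_normal(300))
closes = true_price + np.sqrt(R) * rng.standard_normal(300)

x_hat, P = closes[0], R                 # initialize from the first close
estimates = []
for z in closes:
    P = P + Q                           # predict: hidden state is a random walk
    K = P / (P + R)                     # Kalman gain
    x_hat = x_hat + K * (z - x_hat)     # update with the new close
    P = (1 - K) * P
    estimates.append(x_hat)

estimates = np.array(estimates)
# the filtered path tracks the hidden price better than the raw closes
print(np.mean((estimates - true_price)**2) < np.mean((closes - true_price)**2))
```

Unrolling the update shows each estimate is a weighted average of past closes with decaying weights, which is where the book's Fibonacci-weight result comes from.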

Read more »

This post discusses the Markov Chain Monte Carlo (MCMC) model in general and its linear regression representation in particular. MCMC is used to simulate the posterior distribution when a closed-form conjugate distribution, such as the one in the previous Bayesian linear regression post, is not available.

Introduction

In the last post we examined the Bayesian approach to linear regression. It relies on the conjugate prior assumption, which nicely sets the posterior to a Gaussian distribution. In reality, most of the time we don't have this luxury, so we rely instead on a technique called Markov Chain Monte Carlo (MCMC). One popular algorithm in this family is Metropolis–Hastings, and this is what we are looking at today. Before we proceed, I want to point out that this post is inspired by this article in R.

MCMC answers this question: if we don't know what the posterior distribution looks like, and we don't have the closed-form solution given in equation (2.5) of the last post for \(\beta_1\) and \(\Sigma_{\beta,1}\), how do we obtain the posterior distribution of \(\beta\)? Can we at least approximate it? Metropolis–Hastings provides a numerical Monte Carlo simulation method to magically draw a sample from the posterior distribution. The magic is to construct a Markov chain that converges to the given distribution as its stationary equilibrium distribution, hence the name Markov Chain Monte Carlo (MCMC).
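As a minimal illustration of the Metropolis–Hastings loop (a toy standard-normal target here, not the post's regression posterior), note that only the target density up to a constant is needed, which is exactly why the method works when the posterior's normalizing constant is unknown:

```python
import numpy as np

def log_target(x):
    return -0.5 * x**2              # log density of N(0, 1), up to a constant

rng = np.random.default_rng(0)
samples, x = [], 0.0
for _ in range(20_000):
    proposal = x + rng.normal(0, 1)                  # symmetric random-walk proposal
    # accept with probability min(1, target(proposal) / target(x))
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal
    samples.append(x)

samples = np.array(samples[5_000:])                  # drop burn-in
print(samples.mean(), samples.std())                 # near 0 and 1
```

With a symmetric proposal the Hastings correction term cancels, leaving the plain Metropolis ratio; replacing `log_target` with the regression log-posterior gives the sampler used in the post.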

Read more »