Systematic Investing


Introduction

In a previous post we reviewed the basics of mean-variance optimization (MVO) and portfolios such as minimum variance and maximum Sharpe ratio. This post continues with some popular practices in asset allocation, namely risk parity and maximum diversification, and then evaluates these allocation strategies on historical market data.

It is known that all these portfolios are special cases of MVO under certain conditions. For example, MVO reduces to minimum variance if expected returns are equal across assets, and it coincides with maximum diversification if return-risk ratios are the same across assets ([2]). Which portfolio to select depends on our information and knowledge about the market. If expected returns and risks are known with certainty, the maximum Sharpe ratio portfolio is a good choice. If expected returns are hard to estimate, knowledge of the covariance matrix can still be leveraged to construct minimum variance or risk parity portfolios. If we have no knowledge at all about the market, a naive equal-weighting portfolio is the default option; it also serves as the benchmark in our backtest.
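As a minimal sketch of these allocation rules, the snippet below computes equal, unconstrained minimum-variance, and naive (inverse-volatility) risk parity weights from a toy covariance matrix; the numbers are made up for illustration.

```python
import numpy as np

# hypothetical annualized covariance matrix for three assets
cov = np.array([[0.04, 0.006, 0.012],
                [0.006, 0.01, 0.004],
                [0.012, 0.004, 0.09]])
n = cov.shape[0]

# equal weighting: the no-information benchmark
w_equal = np.ones(n) / n

# minimum variance (unconstrained closed form): w proportional to inv(cov) @ 1
w_minvar = np.linalg.solve(cov, np.ones(n))
w_minvar /= w_minvar.sum()

# naive risk parity: weight each asset by its inverse volatility
inv_vol = 1.0 / np.sqrt(np.diag(cov))
w_rp = inv_vol / inv_vol.sum()
```

Full risk parity equalizes each asset's *contribution* to portfolio risk and generally needs a numerical solver; the inverse-volatility version above coincides with it only when correlations are equal.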

Introduction

It's been a while since the post on the Hidden Markov Model (HMM); this time let's continue to explore some other popular regime-switching models, including the Gaussian mixture model (GMM) and the Markov regime-switching model (MRSM). As in the HMM, the market regime serves as the hidden state, so all of these models are fitted with some form of the Expectation-Maximization (EM) algorithm. It's worth having a brief discussion of the EM algorithm first.
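To make EM concrete, here is a minimal NumPy sketch of EM for a two-component Gaussian mixture on synthetic daily returns, where the two components play the role of calm and turbulent regimes; all parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic returns: a calm (low-vol) block and a turbulent (high-vol) block
r = np.concatenate([rng.normal(0.001, 0.01, 700),
                    rng.normal(-0.002, 0.03, 300)])

# initial guesses for the two components
mu = np.array([0.0, 0.0])
sigma = np.array([0.005, 0.05])
pi = np.array([0.5, 0.5])        # mixture weights

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(200):
    # E-step: posterior responsibility of each component for each return
    dens = pi[None, :] * normal_pdf(r[:, None], mu[None, :], sigma[None, :])
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and volatilities from responsibilities
    nk = resp.sum(axis=0)
    pi = nk / len(r)
    mu = (resp * r[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (r[:, None] - mu[None, :]) ** 2).sum(axis=0) / nk)
```

After fitting, `resp` gives each observation's regime probabilities — the same E-step quantity that the HMM and MRSM versions compute, just without transition dynamics.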

The accompanying notebook can be found here.

Introduction

An ARMA (AutoRegressive Moving Average) model has two parts, the AR(p) part and the MA(q) part, expressed as below

\begin{aligned} X_t &= c + \epsilon_t+\sum_{i=1}^p\varphi_iX_{t-i}+\sum_{i=1}^q\theta_i\epsilon_{t-i}\\\\ \left( 1-\sum_{i=1}^p\varphi_iL^i \right)X_t&=c+\left(1+\sum_{i=1}^q\theta_iL^i \right)\epsilon_t \end{aligned}

where $$L$$ is the lag operator and $$\epsilon_t$$ is white noise. The model can be fitted by the Box-Jenkins method: use the PACF plot to identify the AR lag order $$p$$ and the ACF plot to identify the MA lag order $$q$$, or use information criteria such as AIC and BIC for model selection.

ARIMA (AutoRegressive Integrated Moving Average) generalizes ARMA by adding an integration part of order $$d$$ to handle non-stationary processes.
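To see the AR estimation in action without any library machinery, the AR coefficients can be recovered by ordinary least squares on lagged values. The sketch below simulates an AR(2) with made-up coefficients and fits it back:

```python
import numpy as np

rng = np.random.default_rng(42)
phi1, phi2 = 0.5, -0.3           # true AR(2) coefficients (illustrative)
n = 5000
x = np.zeros(n)
eps = rng.normal(0, 1, n)
for t in range(2, n):
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t]

# regress X_t on X_{t-1} and X_{t-2} (no constant: the process has zero mean)
X = np.column_stack([x[1:-1], x[:-2]])
y = x[2:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # coef should be near (0.5, -0.3)
```

In practice a packaged routine (e.g. statsmodels' ARIMA) would also handle the MA terms and the differencing order $$d$$, but the OLS fit above is what the Box-Jenkins identification step ultimately rests on.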

While ARIMA works on the price level or returns, GARCH (Generalized AutoRegressive Conditional Heteroskedasticity) models the clustering in volatility, i.e., in squared returns. It extends the ARMA structure to the variance equation.
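The volatility clustering GARCH captures is easy to see in simulation. Below is a GARCH(1,1) path, $\sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2$, with illustrative parameter values; squared returns come out autocorrelated even though raw returns are not.

```python
import numpy as np

rng = np.random.default_rng(1)
omega, alpha, beta = 1e-6, 0.08, 0.9   # illustrative GARCH(1,1) parameters
n = 10000
sigma2 = np.empty(n)
r = np.empty(n)
sigma2[0] = omega / (1 - alpha - beta)  # unconditional variance
r[0] = np.sqrt(sigma2[0]) * rng.normal()
for t in range(1, n):
    # conditional variance reacts to yesterday's shock and yesterday's variance
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * rng.normal()

# clustering shows up as positive autocorrelation in squared returns
r2 = r ** 2
acf1_sq = np.corrcoef(r2[:-1], r2[1:])[0, 1]
acf1_raw = np.corrcoef(r[:-1], r[1:])[0, 1]
```

Estimating $(\omega, \alpha, \beta)$ from data is usually done by maximum likelihood (e.g. via the `arch` package), which is the reverse of the simulation above.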

Introduction

Principal component analysis (PCA) rotates the original dataset in such a way that the rotated dataset is orthogonal and best represents the data variations. It becomes a dimension-reduction technique by keeping only the first few variables in the rotated dataset. That is, let $$X \in R^{n \times p}$$ be the original dataset and $$W \in R^{p \times p}$$ the rotation operator; then the new dataset is

$T=XW \;\; \text{or} \;\; T_L= XW_L$

where $$W_L \in R^{p \times L}$$ keeps the first $$L$$ columns of $$W$$, ranked by their eigenvalues. The columns of $$W$$ are the eigenvectors of the covariance matrix $$X^TX$$ and are usually obtained by performing an SVD on $$X$$ directly:

$X=U\Sigma W^T \Rightarrow X^TX=W\Sigma^2W^T$
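The identities above translate directly into a few lines of NumPy; synthetic data stands in for the yield curves, and the columns are centered first so that $$X^TX$$ is the (scaled) covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X = X - X.mean(axis=0)            # center columns

U, S, Wt = np.linalg.svd(X, full_matrices=False)
W = Wt.T                          # columns of W: eigenvectors of X.T @ X

L = 2
T_L = X @ W[:, :L]                # scores on the first L principal components

eig = S ** 2                      # eigenvalues of X.T @ X, in descending order
```

Note that $T_L = XW_L = U_L\Sigma_L$, so the scores can be read off the SVD factors without forming the covariance matrix at all.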

In this post we apply PCA to the USD Treasury curve. Treasury yields are known to be correlated, and the first three principal components, namely level, spread, and fly, explain most of the curve variations. The notebook can be found here.

Introduction

Financial time series are time-stamped sequential data, which traditional feed-forward neural networks don't handle well. A recurrent neural network (RNN) solves this issue by feeding output neurons back into the input to provide memories of previous states. This turned out to be a huge success, especially in natural language processing. Later on, the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) cells were designed to alleviate the vanishing/exploding gradient problems in the back-propagation phase of RNNs.

In this post, I will build an RNN model with an LSTM or GRU cell to predict the price of the S&P 500. To be specific, a supervised machine learning model will be calibrated to predict tomorrow's S&P 500 close price based on the closes from the previous 20 business days. This is a regression problem, yet the code can easily be adapted to handle classification problems such as simply predicting tomorrow's market direction. The code is located on Github.
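The supervised dataset construction — 20 past closes as features, the next close as label — is independent of the network itself and can be sketched in NumPy; a synthetic random-walk series stands in for the S&P 500 here.

```python
import numpy as np

rng = np.random.default_rng(7)
close = 2500 + np.cumsum(rng.normal(0, 10, 300))  # synthetic stand-in for closes

window = 20
# each row of X is a 20-day history; y is the close on the following day
X = np.array([close[i:i + window] for i in range(len(close) - window)])
y = close[window:]

# classification variant: label is just tomorrow's direction (up = 1, down = 0)
y_dir = (y > X[:, -1]).astype(int)
```

For the RNN, each row of `X` would be reshaped to `(window, 1)` so the cell consumes one price per time step.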

Introduction

The Hidden Markov Model (HMM) is a Markov model with a latent state space. It is the discrete version of the Dynamic Linear Model and is commonly seen in speech recognition. In quantitative trading, it has been applied to detecting latent market regimes ([2], [3]). I'll relegate the technical details to the appendix and present the intuition through an example.
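As a taste of the machinery, the forward algorithm computes the filtered regime probabilities $P(\text{state}_t \mid \text{obs}_{1..t})$; the two-state transition matrix and emission parameters below are made up for illustration.

```python
import numpy as np

# hypothetical two-regime model: calm (low vol) and turbulent (high vol)
A = np.array([[0.95, 0.05],      # transition matrix: rows = current state
              [0.10, 0.90]])
mu = np.array([0.001, -0.002])   # Gaussian emission means per regime
sigma = np.array([0.01, 0.03])   # Gaussian emission volatilities per regime
p0 = np.array([0.5, 0.5])        # initial state distribution

def emission(x):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def forward_filter(obs):
    """Normalized forward pass: P(state_t | obs_1..t) for each t."""
    alpha = p0 * emission(obs[0])
    alpha /= alpha.sum()
    out = [alpha]
    for x in obs[1:]:
        alpha = (alpha @ A) * emission(x)  # propagate, then weight by likelihood
        alpha /= alpha.sum()
        out.append(alpha)
    return np.array(out)

probs = forward_filter(np.array([0.001, 0.002, -0.05, 0.04, -0.03]))
```

A large return like -5% is far out in the calm regime's tail, so the filtered probability of the turbulent regime jumps as soon as it appears.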

Introduction

In a previous post we saw the Kalman filter and its ability to train a linear regression model online. In the last post we also saw the ideas of cointegration and pairs trading. As pointed out at the end of that post, one way to avoid look-ahead bias and enable walk-forward analysis is a Bayesian online training mechanism such as the Kalman filter. Today we'll apply this idea to pairs trading.
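The online regression at the heart of this idea — a Kalman filter tracking a time-varying hedge ratio $\beta_t$ in $y_t = \beta_t x_t + \text{noise}$ — fits in a few lines; the noise variances `q` and `r` are illustrative tuning parameters, not estimated values.

```python
import numpy as np

def kalman_hedge_ratio(x, y, q=1e-4, r=1e-2):
    """Online estimate of beta in y_t = beta_t * x_t + noise.
    q: random-walk variance of beta, r: observation noise variance (tuning knobs)."""
    beta, P = 0.0, 1.0                       # state estimate and its variance
    betas = []
    for xt, yt in zip(x, y):
        P = P + q                            # predict: beta follows a random walk
        K = P * xt / (xt * xt * P + r)       # Kalman gain
        beta = beta + K * (yt - beta * xt)   # update with the prediction error
        P = (1 - K * xt) * P
        betas.append(beta)
    return np.array(betas)

rng = np.random.default_rng(3)
x = rng.normal(1.0, 0.1, 500)
y = 2.0 * x + rng.normal(0, 0.05, 500)       # synthetic pair with true beta = 2
betas = kalman_hedge_ratio(x, y)
```

The prediction error $y_t - \beta_{t-1} x_t$ is exactly the spread innovation a pairs-trading rule would trade on, which is why the filter slots so naturally into this strategy.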

As usual, the backtest code for this post is located here on Github.

Introduction

In the last post we examined statistical tests for mean reversion and traded a single-name time series. Often a single stock price is not mean-reverting, but we can artificially create a portfolio of stocks that is. If the portfolio has only two stocks, this is known as pairs trading, a special form of statistical arbitrage. By combining two cointegrated stocks, we can construct a spread that is mean-reverting, even when the two stocks themselves are not. Please refer to the appendix if you want to check out cointegration first.

Cointegration and correlation describe different things. For example, if two stock prices follow two straight lines with different slopes, they are positively correlated but not cointegrated, as illustrated in this quora post. The mathematical formulas are relegated to the appendix.
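The distinction is easy to reproduce with synthetic data: two trending series with different slopes and independent random-walk noise are almost perfectly correlated, yet every fixed combination of them keeps wandering, whereas a pair sharing the same stochastic trend has a stationary spread.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
t = np.arange(n)

# correlated but NOT cointegrated: different slopes, independent random walks
a = 1.0 * t + np.cumsum(rng.normal(0, 1, n))
b = 2.0 * t + np.cumsum(rng.normal(0, 1, n))
corr_ab = np.corrcoef(a, b)[0, 1]     # near 1, driven by the common time trend
spread_ab = b - 2.0 * a               # still a random walk: no mean reversion

# cointegrated: y shares x's stochastic trend, so one combination is stationary
x = np.cumsum(rng.normal(0, 1, n))
y = 2.0 * x + rng.normal(0, 1, n)
spread_xy = y - 2.0 * x               # stationary noise around zero
```

In practice one would confirm stationarity of the spread with a unit-root test (e.g. ADF) rather than eyeballing its dispersion, but the contrast in spread behavior is the whole story.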

The backtest code for this post is located on Github.

Introduction

Mean-reverting processes are widely observed in finance. As opposed to trend following, mean reversion assumes that a process has a tendency to revert to its average level over time. This average level is usually determined by physical or economic forces such as long-term supply and demand. Prices may deviate from that long-term mean due to sentiment or short-term disturbances in the market but will eventually revert to their intrinsic value.

A continuous mean-reverting time series can be represented by an Ornstein-Uhlenbeck (OU) process, known as the Vasicek model in the interest-rate field, which is a special case of the Hull-White model with constant volatility. It is also the continuous-time analogue of the discrete-time AR(1) process. I relegate the mathematical details to the appendix.

To calibrate the OU process, there are generally two approaches based on formula (A9): least squares and maximum likelihood estimation. This document gives a good summary and comparison of the results.
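The least-squares route exploits the AR(1) analogue: exact sampling of $dX = \theta(\mu - X)\,dt + \sigma\,dW$ gives $X_{t+1} = aX_t + \mu(1-a) + \varepsilon_t$ with $a = e^{-\theta \Delta t}$, so a simple regression of $X_{t+1}$ on $X_t$ recovers the parameters. A sketch with made-up parameter values:

```python
import numpy as np

rng = np.random.default_rng(9)
theta, mu, sigma, dt = 5.0, 1.0, 0.5, 1 / 252   # illustrative OU parameters
n = 20000

# exact discretization: the sampled OU process is an AR(1)
a = np.exp(-theta * dt)
sd = sigma * np.sqrt((1 - a ** 2) / (2 * theta))
x = np.empty(n)
x[0] = mu
for t in range(1, n):
    x[t] = a * x[t - 1] + mu * (1 - a) + sd * rng.normal()

# least-squares calibration: regress x[t] on x[t-1], then invert the mapping
slope, intercept = np.polyfit(x[:-1], x[1:], 1)
theta_hat = -np.log(slope) / dt
mu_hat = intercept / (1 - slope)
```

The MLE approach maximizes the Gaussian transition likelihood directly; for evenly spaced data it yields essentially the same point estimates as this regression.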

The backtest codes can be found on Github.

Introduction

After four posts on linear regression, we are finally at the door of deep learning. Today we will build a simple feed-forward neural network (though not a deep one) with the help of TensorFlow to solve the linear regression problem. TensorFlow is a popular open-source deep learning library, especially after the retirement of Theano. To learn more about installing and using TensorFlow, the official website offers a lot of interesting material.

TensorFlow's structure consists of two phases: graphs and sessions. Usually you start by building a graph and then let the data (in the format of a multidimensional array, or tensor) flow through the graph, hence the name TensorFlow.
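Conceptually, the graph encodes a computation like the gradient-descent loop below, here written eagerly in plain NumPy so it runs without TensorFlow; the session would execute the same loop with the gradients derived automatically. The data and learning rate are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, size=(100, 1))  # true w = 3, b = 2

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    y_hat = w * x + b
    err = y_hat - y
    # gradients of the mean squared error with respect to w and b
    grad_w = 2 * (err * x).mean()
    grad_b = 2 * err.mean()
    w -= lr * grad_w
    b -= lr * grad_b
```

In TensorFlow, `y_hat`, the loss, and the update step become nodes in the graph, and the manual gradient lines are replaced by an optimizer node that differentiates the loss for you.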