DATA SCIENCE IN INVESTMENT MANAGEMENT

I recently left my role on a quantitative research team. I admired my colleagues and the team’s overall goal: building scalable equity products underpinned by rigorous academic research. Despite this, my disillusionment with traditional active asset management grew, and my scepticism became a problem for someone working in a building filled with budding economics graduates seeking to prove their ‘investment prowess’. The idea that swathes of teams within the industry are using the same data, the same tools and the same skillsets to generate ‘competitive advantages’ is laughable. To all the economics grads hoping to seek out ‘alpha’: if you can fool yourself, unfortunately, you can fool others.

Let me start by saying that I don’t think it’s impossible to be a successful stock picker, but history has shown that it’s simply improbable.

The Numbers

Is it hard to pick stocks and, if so, why? These are the first questions I expect anyone pursuing a career in investment management to tackle. If you’re interested in going some way towards answering them, let’s start with one fact before we move on to why it’s difficult:

“The distribution of historical returns exhibits positive skewness”

This just means we owe the incredible performance of indices worldwide, e.g. the Russell 3000, to a few ‘extreme winners’, i.e. a few star-studded stocks. From 1926–2016, only 1,092 out of 25,300 companies were responsible for the U.S. stock market’s gains.¹ From 1980–2014, 40% of the stocks in the Russell 3000 Index provided negative returns, despite the index returning 2633.12%, and the median stock underperformed the market by 54%. However, what is most interesting is that ~7% of these stocks may be classified as extreme winners (Figure 1).² It should be noted that we observe the same even if we exclude technology and biotech to avoid distortions caused by the Dot-com bubble. This observation, which is clearly a distribution exhibiting positive skewness, is intuitive for two reasons: first, losses on long positions are limited to 100%; second, upside returns are unbounded.

Figure 1

What Relevance Does This Have For Budding Active Managers?

To illustrate the effect of skewness, we start with a simple model. Let us construct an equally weighted index containing five stocks with return profiles as given by Figure 2. The distribution presented in Figure 2 quite clearly has our desired property: positive skewness. We have one clear winner, Stock E, and our equally weighted index returns 18%. Let us assume that an active portfolio manager constructs a portfolio containing either one or two stocks; there are then 15 possible portfolios (5 single-stock and 10 two-stock).

Figure 2

These portfolios have the following possible returns:

Ten portfolios will return 10%; they will fail to include our ‘star’, Stock E.

One portfolio containing one stock will earn 50% given it solely holds Stock E.

Four two-stock portfolios will earn 30%, since one of the two stocks will be Stock E.

From our basic numerical example, two thirds of the actively managed portfolios (10 of 15) underperform the index as a result of omitting Stock E. Furthermore, the median portfolio manager will only earn 10%, while the average manager’s portfolio return will equal our index’s return of 18%. This basic example points to the following: before we even consider fees and trading costs, positive skewness handicaps the median active manager. Why?

If active managers were to randomly select a subset of stocks from the index, i.e. exhibit zero prowess in stock picking, the median active manager would more than likely underperform the index.
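The arithmetic above can be checked by brute force. The sketch below is a minimal illustration, assuming (as the listed portfolio outcomes imply) that Stocks A to D each return 10% and Stock E returns 50%; the variable names are my own.

```python
from itertools import combinations
from statistics import mean, median

# Assumed return profile (in %), inferred from the outcomes listed above:
# four stocks return 10% and the single 'star', Stock E, returns 50%.
stock_returns = {"A": 10, "B": 10, "C": 10, "D": 10, "E": 50}

# Equally weighted index over all five stocks.
index_return = mean(stock_returns.values())

# Every equally weighted portfolio holding exactly 1 or 2 stocks.
portfolios = [
    combo
    for size in (1, 2)
    for combo in combinations(stock_returns, size)
]
portfolio_returns = [
    mean(stock_returns[name] for name in combo) for combo in portfolios
]

# Count the portfolios that fall short of the index.
underperformers = sum(r < index_return for r in portfolio_returns)

print(f"index return:         {index_return}%")
print(f"number of portfolios: {len(portfolios)}")
print(f"underperforming:      {underperformers}")
print(f"median manager:       {median(portfolio_returns)}%")
print(f"average manager:      {mean(portfolio_returns)}%")
```

The average across all portfolios matches the index because every stock appears equally often, while the median sits well below it because most portfolios miss Stock E.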

It’s Down To Portfolio Construction

Expanding upon this simple numerical scenario, we use a Monte Carlo model in an attempt to better understand this phenomenon. Let a stock’s price, S^i, move over time according to:

For the sake of simplicity, we assume a constant volatility, σ = 20%, for all stocks. What we are doing here is simulating a universe of stocks that exhibits the same characteristics we have seen historically. That is, we’re creating a distribution that ensures we generate a small number of both extreme winners and losers. Let each stock have a starting price of $1.00; then the price of stock i at time T is given by:
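One specification consistent with the parameters quoted in this section is geometric Brownian motion with a stock-specific drift. I state it as an assumption rather than as the exact equations used; the symbols \mu_i, \hat{\mu}, \hat{\sigma}, W and Z_i are my own notation.

```latex
% (1) Assumed dynamics: geometric Brownian motion with stock-specific drift
\frac{dS^i_t}{S^i_t} = \mu_i \, dt + \sigma \, dW^i_t,
\qquad \mu_i \sim \mathcal{N}(\hat{\mu}, \hat{\sigma}^2)

% (2) Terminal price for S^i_0 = 1, with Z_i \sim \mathcal{N}(0, 1)
S^i_T = \exp\!\left( \left( \mu_i - \tfrac{1}{2}\sigma^2 \right) T + \sigma \sqrt{T} \, Z_i \right)

% Implied median and mean of the terminal price across stocks
\operatorname{median}(S^i_T) = \exp\!\left( \left( \hat{\mu} - \tfrac{1}{2}\sigma^2 \right) T \right),
\qquad
\mathbb{E}[S^i_T] = \exp\!\left( \hat{\mu} T + \tfrac{1}{2} \hat{\sigma}^2 T^2 \right)
```

Under this form, it is the dispersion of drifts across stocks that drives the mean terminal price above the median, which is the positive skewness we are after.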

In this model, we will assume an expected index return of 50% over T = 5 years and a median return of 10%. Hence the index drift is 4% and volatility is 13%.³ To create our portfolios, we run a Monte Carlo simulation to generate 10,000 simulated stock returns (making use of equation 2). Then, randomly sampling from this set of 10,000 stocks, we create 6,000 portfolios for each size from 1 to 20 stocks. We define the probability of outperforming the index as the proportion of randomly sampled portfolios of a given size that exceed the index’s return of 50%. Plotting the probability of outperforming and underperforming for each portfolio size, we obtain Figure 3.
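The experiment can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code behind Figure 3. It assumes each stock follows geometric Brownian motion with a drift drawn from a normal distribution, a reading that reproduces the 4% and 13% quoted above. It also assumes portfolios are drawn with replacement for speed. All names are my own.

```python
import numpy as np

# Figures taken from the text.
T = 5.0                # horizon in years
SIGMA = 0.20           # volatility shared by every stock
MEAN_RETURN = 0.50     # expected index return over T
MEDIAN_RETURN = 0.10   # median stock return over T
N_STOCKS = 10_000
N_PORTFOLIOS = 6_000
SIZES = range(1, 21)

# Assumption: each stock's drift is drawn from N(mu_hat, sigma_hat^2).
# Matching the lognormal median and mean to the two targets pins down
# the drift parameters (roughly 4% and 13%).
mu_hat = np.log(1 + MEDIAN_RETURN) / T + 0.5 * SIGMA**2
sigma_hat = np.sqrt(2 * (np.log(1 + MEAN_RETURN) - mu_hat * T)) / T

rng = np.random.default_rng(seed=0)

# Terminal prices from a $1.00 start under the assumed model.
drifts = rng.normal(mu_hat, sigma_hat, size=N_STOCKS)
shocks = rng.standard_normal(N_STOCKS)
prices = np.exp((drifts - 0.5 * SIGMA**2) * T + SIGMA * np.sqrt(T) * shocks)
stock_returns = prices - 1.0

# For each size, draw equally weighted portfolios at random and record
# the share that beat the index return. Picks are drawn with replacement
# for speed (an assumption); duplicates are rare in a 10,000-stock universe.
prob_outperform = {}
for size in SIZES:
    picks = rng.integers(0, N_STOCKS, size=(N_PORTFOLIOS, size))
    portfolio_returns = stock_returns[picks].mean(axis=1)
    prob_outperform[size] = float((portfolio_returns > MEAN_RETURN).mean())

print(f"mu_hat = {mu_hat:.4f}, sigma_hat = {sigma_hat:.4f}")
for size in (1, 5, 10, 20):
    print(f"{size:>2} stocks: P(outperform) = {prob_outperform[size]:.3f}")
```

Fixing the seed makes the run repeatable; changing it should move the estimated probabilities only slightly given 6,000 portfolios per size.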

Figure 3

Figure 3 (left) shows the probability with which a randomly selected portfolio of size n outperforms or underperforms the benchmark. Figure 3 (right) shows the probability of outperformance and underperformance when we consider more extreme thresholds (70% and 30%). Quite simply, highly concentrated portfolios have both a lower probability of outperformance and a higher probability of underperformance than larger portfolios.

Traditional active managers are faced with an inherent disadvantage. The risk of considerable underperformance always dominates the possibility of index outperformance.

How Can We Overcome Positive Skewness?

It’s easy to complain about traditional active managers, especially given their historical performance. More interesting, however, is determining how portfolio construction can overcome positive skewness, which is a headwind for our fabled stock pickers (masters of the universe). It should then be apparent why quantitative equity products shouldn’t be overlooked.

Dismantling Positive Skewness

We saw that as the number of stocks was increased, the effect of skewness decreased. In an attempt to determine the number of stocks required to overcome this phenomenon, we will quantify the impact of skewness by observing the difference between the mean and median of active manager portfolio returns. These portfolios (which represent our hypothetical managers) are constructed by randomly sampling the historical returns of the S&P 500 from 1991 to 2016.

We start by assuming:

Our active managers create buy-and-hold portfolios at the start of each year

In constructing a portfolio, we randomly pick (without replacement) a set number of stocks from our S&P 500 universe for a given year. We construct 5,000 of these portfolios. We evaluate portfolio sizes of 15, 25, 35, 50, 75 and 150 stocks. In the interest of simplicity, we also make the following assumptions:

There is an equal probability of selecting any given stock from our universe of stocks

We create equally weighted portfolios

Before going further, let us consider why we are concerned with the mean-median spread. As a result of increasing skewness, which we have shown to handicap portfolio managers in our Monte Carlo model, our randomly generated portfolios will occasionally contain extreme winners. This will increase the average manager’s returns whilst the median manager will remain relatively unaffected.

Initial Hypothesis: As we increase the number of stocks in our portfolio, we will observe a tighter spread between the median and the average portfolio manager’s returns over 1991–2016.

This seems intuitive: as we increase the number of stocks in our portfolio, we increase the likelihood of including one of the few extreme winners. In the interest of clarity, I summarise the model thus far. For each year from 1991–2016, 5,000 bootstrap portfolios are selected from the S&P 500. We do this for each portfolio size (15 stocks to 150 stocks). We then calculate the average difference from 1991–2016 between the mean and the median of the portfolio returns created in the bootstrap procedure.
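The procedure summarised above can be sketched as follows. This is a minimal illustration and not the code behind the reported results. In particular, the universe here is synthetic: lognormal annual returns with arbitrary parameters stand in for the S&P 500 data, so the printed figures carry no empirical meaning. The function and variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

N_YEARS = 26            # 1991 to 2016 inclusive
N_STOCKS = 500
N_PORTFOLIOS = 5_000
SIZES = (15, 25, 35, 50, 75, 150)

# Placeholder universe: synthetic, positively skewed annual returns.
# These are NOT S&P 500 data; the lognormal parameters are arbitrary.
annual_returns = rng.lognormal(mean=0.05, sigma=0.40, size=(N_YEARS, N_STOCKS)) - 1.0


def mean_median_spread(year_returns, sizes, n_portfolios, rng):
    """Mean minus median of equally weighted random portfolio returns."""
    max_size = max(sizes)
    # Sorting random keys gives each row a random ordering of the stocks,
    # so the first k columns are k picks without replacement. For brevity,
    # a smaller portfolio reuses the first picks of a larger one.
    order = rng.random((n_portfolios, year_returns.size)).argsort(axis=1)
    picked = year_returns[order[:, :max_size]]
    spreads = {}
    for k in sizes:
        portfolio_returns = picked[:, :k].mean(axis=1)
        spreads[k] = portfolio_returns.mean() - np.median(portfolio_returns)
    return spreads


# Average the yearly spread over the whole sample period.
yearly = [
    mean_median_spread(annual_returns[y], SIZES, N_PORTFOLIOS, rng)
    for y in range(N_YEARS)
]
average_spread = {k: float(np.mean([s[k] for s in yearly])) for k in SIZES}

for k in SIZES:
    print(f"{k:>3} stocks: mean-median spread = {1e4 * average_spread[k]:6.1f} bps")
```

Swapping `annual_returns` for a real year-by-stock return matrix is the only change needed to run the same calculation on actual data, although index membership changes from year to year would need handling.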

In Figure 1, we show this spread/difference. What we observe provides some of the intuition behind some quant equity products (or a low risk active management style). We see the impact/cost of skewness decreases sharply and non-linearly as we increase the number of holdings. During this period, our results show that for portfolio managers holding 15 stocks, the bias is ~70 basis points. The bias falls below 20bps for portfolios with ~150 stocks. Although the impact is still measurable with a large number of stocks, we have dramatically reduced the cost associated with skewness as measured by the mean-median spread.

Figure 1

In an attempt to tie this in with our findings above, we consider the probability of outperformance for each sub-portfolio. Let us consider an equally weighted S&P 500; we assume this for fair comparison with our equally weighted bootstrap portfolios. If we plot the average number of sub-portfolios that outperform their equally weighted benchmark through time we obtain Figure 2.

Figure 2

We have demonstrated that as we include more and more stocks, the probability of including an extreme winner increases and thus, the median active portfolio manager has a better probability of outperforming his/her benchmark.

So How Exactly Does This Relate To Quantitative Equity Products?

To determine how this phenomenon affects factor-based quant strategies, I apply the bootstrap portfolio model used previously. In this instance, however, we will not apply an equal probability to any given stock being selected. Instead, we do the following:

We bias the selection of constituents of the S&P 500 according to their alpha scores

We set the probability of any given constituent being selected to be proportional to its alpha score, i.e. the higher the alpha score, the higher the probability of it being picked

We generate ‘alpha scores’ by combining constituent ‘z-scores’ (to be thought of simply as a ranking) for three risk-premia:

value: undervalued or cheap stocks

quality: the strength of the underlying balance sheet and management prowess

momentum: stocks doing well/poorly continue to do well/poorly

To elaborate, the higher a z-score, the better that stock ranks relative to its peers for a given factor. Therefore, by combining these z-scores, we obtain an alpha score, where the highest alpha score indicates the stock that best captures all three risk premia relative to its peers.

Furthermore, this process includes a small-cap bias by construction: the optimiser pushes our portfolios towards equal weighting in the interest of avoiding stock-specific risk. To reflect this, we equally weight the bootstrap portfolios in this experiment. Randomly sampling (without replacement), we form 1,000 portfolios for each portfolio size. The proportion of portfolios that outperformed the benchmark is the proportion with positive excess returns. The median probability of outperformance from 12/31/1991 to 12/30/2016 is shown in Figure 3 (right).
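A biased draw of this kind can be sketched as below. It is an illustration under stated assumptions, not the production process. The factor data are random placeholders, and the alpha score is taken to be a simple average of the three z-scores. Because z-scores can be negative, the scores are shifted to be positive before being normalised into probabilities. The shift is my own choice, as are all function names.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
N_STOCKS = 500

# Placeholder factor exposures; real inputs would come from fundamental
# and price data for each index constituent.
raw_factors = {
    "value": rng.standard_normal(N_STOCKS),
    "quality": rng.standard_normal(N_STOCKS),
    "momentum": rng.standard_normal(N_STOCKS),
}


def z_score(x):
    """Standardise a factor across the universe."""
    return (x - x.mean()) / x.std()


# Assumption: the alpha score is the simple average of the three z-scores.
alpha = np.mean([z_score(x) for x in raw_factors.values()], axis=0)


def selection_probabilities(alpha_scores, floor=1e-6):
    """Turn alpha scores into sampling probabilities.

    Z-scores can be negative, so they are shifted to be positive before
    normalising. The shift is an assumption; a rank-based map would also work.
    """
    shifted = alpha_scores - alpha_scores.min() + floor
    return shifted / shifted.sum()


def sample_portfolio(probabilities, size, rng):
    """Pick `size` distinct stocks, favouring those with higher alpha scores."""
    return rng.choice(probabilities.size, size=size, replace=False, p=probabilities)


probs = selection_probabilities(alpha)
portfolios = [sample_portfolio(probs, 50, rng) for _ in range(1_000)]
mean_alpha_held = float(np.mean([alpha[p].mean() for p in portfolios]))

print(f"universe mean alpha score:  {alpha.mean():.3f}")
print(f"portfolio mean alpha score: {mean_alpha_held:.3f}")
```

Each sampled portfolio is then equally weighted, so the alpha score influences only which stocks are held, not how much of each.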

Figure 3

In Figure 3 (left), we see a dramatic reduction in the “cost” of skewness as we did in our previous experiment. However, what is most interesting is that when we apply our backtested alpha scores the probability of outperforming dominates the probability of underperforming for the median hypothetical risk-premia portfolio manager. We observe that the probability of outperformance peaks at 150 stocks and the probability of underperforming is lowest at 150. That is not to say that 150 is the golden number; however, it provides us with a first step to determine a more optimal portfolio construction methodology to counter positive skewness.

Concluding Remarks

The two articles aimed to describe and illustrate an often overlooked but extremely important property of historical returns: positive skewness.

The implication of positive skewness is that overall index returns can be driven by a relatively small number of stocks and this can in turn present a significant headwind to managers of concentrated portfolios.

In the active versus passive debate the focus has generally been on fees, skill and trading costs. From the evidence presented, skewness should also be included in these discussions.

We’ve come some way to understanding the benefits of larger portfolios. Quantitative investment products, such as enhanced index or smart beta, all contain a relatively high number of stocks to counter this effect.