March 30, 2012

If popularity were decisive for risk metrics, the Sharpe ratio would tell us all that we need to know. Introduced by professor William Sharpe in 1966, the reward-to-variability ratio (as it was originally labeled) remains a hardy perennial for evaluating money managers and investment strategies. You can hardly pick up a finance journal or mutual fund study without finding a reference to this ubiquitous gauge.

But fame doesn't change the fact that the Sharpe ratio falls well short of capturing the full spectrum of risks in the money game. The same can be said of every other risk measure. Yet celebrity invites a higher level of scrutiny.



Still, the intuition behind the Sharpe ratio is no less compelling today than it was in 1966. Filtering investment returns through a risk prism is essential for analyzing track records. Performance statistics alone, if not misleading, reveal little if anything. Risk analysis is critical. But it's also complicated.

It's easier to identify weak points in a given risk metric than to offer solutions. So fairly or not, the Sharpe ratio's various flaws have been dissected in numerous studies over the years. Its leading drawback is arguably the fact that financial market returns don't follow a normal statistical distribution. The non-normal aspect of returns is commonly measured instead by skewness, a measure of a distribution's asymmetry, and kurtosis, which measures how "peaked" the distribution is and how heavy its tails are.

These things are easily calculated in Excel. But the standard Sharpe ratio, the ratio of the average risk premium to volatility (standard deviation), isn't sensitive to skewness and kurtosis. Standard deviation can be used to analyze data for any distribution curve, but it's not particularly well suited for profiling the non-normality of investment returns.
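For readers who prefer code to a spreadsheet, here is a minimal Python sketch of the three statistics just mentioned. The function names and the sample return series are hypothetical, made up for illustration. The skewness and kurtosis here are the plain population moments; Excel's SKEW and KURT apply small-sample corrections, so their output will differ slightly.

```python
from statistics import mean, pstdev, stdev

def sharpe_ratio(excess_returns):
    """Average excess return (risk premium) divided by its sample standard deviation."""
    return mean(excess_returns) / stdev(excess_returns)

def skewness(returns):
    """Third standardized moment: 0 for a symmetric distribution, negative for a long left tail."""
    m, s = mean(returns), pstdev(returns)
    return mean(((r - m) / s) ** 3 for r in returns)

def excess_kurtosis(returns):
    """Fourth standardized moment minus 3, so a normal distribution scores about 0."""
    m, s = mean(returns), pstdev(returns)
    return mean(((r - m) / s) ** 4 for r in returns) - 3.0

# Hypothetical monthly excess returns with one large loss (illustrative, not market data).
sample = [0.02, 0.01, 0.03, -0.01, 0.02, 0.01, -0.12, 0.02, 0.03, 0.01, 0.02, 0.01]

print("Sharpe ratio:   ", round(sharpe_ratio(sample), 3))
print("Skewness:       ", round(skewness(sample), 3))         # negative: long left tail
print("Excess kurtosis:", round(excess_kurtosis(sample), 3))  # positive: fatter tails than normal
```

The single bad month barely registers in the Sharpe ratio's denominator, but it shows up clearly in the negative skewness and positive excess kurtosis.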

Those who assume that returns are normally distributed, then, run the risk of underestimating the potential for big losses, particularly over short periods of time. That's because financial markets exhibit what are known as fat tails. A severe loss in financial markets occurs more often than a normal distribution predicts (see Figure 1). Professor Benoit Mandelbrot, the father of fat tails research in finance, emphasized the point in his 2004 book The (Mis)Behavior of Markets: If the stock market's returns followed a normal distribution, he wrote, equity market index swings of more than 7% "should come once every 300,000 years. In fact, the 20th century saw 48 such days."

The Sharpe ratio is a good first approximation of market risk, but a broader perspective is vital for finding the dangers that lurk. The good news is that the world is brimming with alternative risk gauges. That's also the bad news: too many choices, too little time. If you look at too many risk metrics, at some point you'll end up with statistical mush and a headache. The goal is selectively assembling a manageable mix of complementary metrics to supplement the Sharpe ratio.

That's easy in theory, as there are at least 100 risk metrics in use, most of which are intent on providing improvements over the Sharpe ratio. Where to begin? Software analytics firm StatPro offers an informative primer in the article, "How Sharp Is The Sharpe Ratio?" available at statpro.com.

In the meantime, two metrics that have attracted attention in recent years are worthy of a closer look: the "modified" Sharpe ratio and the "conditional" Sharpe ratio. Each offers an alternative methodology for computing the conventional Sharpe ratio's denominator, standard deviation. The goal is a deeper level of reality for risk measurement. They're not perfect (nothing is), but they deserve to be on the short list of complements to the original measure. To understand why, let's review the foundation behind these alternative calculations of risk: a concept known as value at risk, or VaR. Yes, that VaR.

In Search Of Reality

If you're familiar with VaR, you probably know that it's in the running for the most-maligned risk metric in financial history. Entire books have been written about its defects, including the latest addition to this library of ignominy: The Number That Killed Us by business professor Pablo Triana, who writes that VaR can "easily and severely underestimate market risk."

Much of the criticism of VaR is linked to its use (and abuse) in banking, although some of the caveats apply to money management. But the problem with VaR is less about its inherent limitations than the fact that it has been misused to skirt risk regulations. And yet even if it is flawed, it still provides a template for refining risk evaluation.

First, consider the basic VaR, which estimates a portfolio's potential for maximum loss at a given confidence level (typically set at 95%). The metric's basic calculation follows what's known as the parametric method with three inputs:

1) The portfolio's mean return ($\mu_p$)

2) The z-score of the standard normal distribution that corresponds with the chosen confidence level, which is easily computed with Excel's NORMSINV(x%) function, with x% as the chosen confidence level. This works out to roughly 1.65 for a 95% confidence level.

3) The portfolio's standard deviation ($\sigma_p$)

The estimates are plugged into the formula:

$\mathrm{VaR} = -\mu_p + 1.65 \times \sigma_p$

For example, take a portfolio with a 5% risk premium and 20% volatility, which equates to a VaR of roughly 28% at the 95% confidence level (-5% + 1.65 × 20%). In this case, VaR tells us that there's a 95% probability that the loss (the value at risk) for a $1 million portfolio won't exceed $280,000 (28% of $1 million). We can flip that around and say that there's a 5% probability that the loss will exceed $280,000.
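The arithmetic above can be checked with a few lines of Python. This is a sketch of the parametric method as described, using the standard library's NormalDist in place of Excel's NORMSINV; the function name parametric_var is our own label.

```python
from statistics import NormalDist

def parametric_var(mean_return, volatility, confidence=0.95):
    """Parametric VaR under a normal distribution: -mean + z * volatility."""
    z = NormalDist().inv_cdf(confidence)  # about 1.645 at 95%, shown rounded to 1.65 in the text
    return -mean_return + z * volatility

# The example from the text: 5% risk premium, 20% volatility, $1 million portfolio.
var_95 = parametric_var(0.05, 0.20)
dollar_loss = var_95 * 1_000_000

print(f"VaR at 95%: {var_95:.1%}")             # about 27.9%, or roughly 28%
print(f"Dollar value at risk: ${dollar_loss:,.0f}")
```

Using the unrounded multiplier gives a shade under 28%, which is why the text's figure of $280,000 is best read as an approximation.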

Like every risk metric, VaR suffers limitations, starting with the assumption that returns are normally distributed. There are several ways to adjust the calculation to capture a more realistic estimate of risk. Unfortunately, most of the tweaks are computationally challenging. One approach that's relatively straightforward, however, is modified VaR, or MVaR, which incorporates skewness and kurtosis (see the sidebar, "Modified VaR," for details).
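The sidebar isn't reproduced here, so the following is a sketch rather than a restatement of it. One widely used way to fold skewness and kurtosis into VaR is the Cornish-Fisher expansion, which adjusts the normal z-score before it's multiplied by volatility. Whether this matches the sidebar's formula term for term is an assumption, and the function name and inputs below are hypothetical.

```python
from statistics import NormalDist

def modified_var(mean_return, volatility, skew, excess_kurt, confidence=0.95):
    """Modified VaR using a Cornish-Fisher adjusted z-score (an assumed formulation)."""
    z = NormalDist().inv_cdf(1.0 - confidence)  # lower-tail z-score, about -1.645 at 95%
    z_cf = (z
            + (z ** 2 - 1.0) * skew / 6.0
            + (z ** 3 - 3.0 * z) * excess_kurt / 24.0
            - (2.0 * z ** 3 - 5.0 * z) * skew ** 2 / 36.0)
    return -(mean_return + z_cf * volatility)

# With zero skewness and zero excess kurtosis, MVaR collapses to ordinary VaR.
normal_case = modified_var(0.05, 0.20, skew=0.0, excess_kurt=0.0)

# Hypothetical non-normal inputs: negative skew and fat tails.
skewed_case = modified_var(0.05, 0.20, skew=-0.5, excess_kurt=1.0)

print(f"Normal inputs:     {normal_case:.1%}")
print(f"Non-normal inputs: {skewed_case:.1%}")  # higher than the normal case
```

In this formulation, negative skewness does most of the work at the 95% level; the kurtosis term matters more as the confidence level rises.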

MVaR makes a valiant effort at balancing parsimony with reality. Ideally, risk estimates should be easy to calculate, they should use a minimal number of variables, and they should accurately model the true distribution of asset returns. That's an impossibly high standard in finance, so you must pick your poison. How much of one ideal are you willing to sacrifice in pursuit of another? MVaR is an attempt at finding a compromise between realism and computational simplicity.

But there's no free lunch. It's probably fine for you to apply MVaR to a broad equity market index. But it's questionable when you use it with assets and funds that exhibit high levels of non-normality.

As a test of how MVaR compares with VaR and standard deviation, let's run the numbers on the stock market (S&P 500) based on monthly returns for the ten years through November 2011:

MVaR: 27.1%

VaR: 23.6%

Standard Deviation: 14.5%

MVaR tells us that equity risk is higher (and so risk-adjusted performance is lower) than it appears using VaR or standard deviation. What accounts for the MVaR metric's higher reading? The S&P 500's return distribution isn't normal. If market returns were normally distributed, the asymmetry of skewness and the higher distribution peaks of kurtosis would be irrelevant and the risk estimates of MVaR and VaR would be identical. But in the real world, the metrics differ, which tells you something about market behavior and the outcome when you embrace a higher dose of quantitative realism.

Expecting Fat Tails

There are several other methodologies that boast higher accuracy in measuring non-normality, though none are as compact and easy to estimate as MVaR. Still, whether you choose MVaR or a sturdier technique, there's another challenge to consider: tail risk in the extreme. MVaR provides an estimate of the risk that accompanies the non-normal aspect of return distributions, but only up to a given confidence level. If we're using a 95% level, the worst losses at the 5th percentile and beyond are undefined with MVaR (or VaR). And yet within that tail resides the potential for the most crippling damage: think October 1987 or the autumn of 2008. (See Figure 2.)

Ignoring tail risk courts disaster, although modeling this corner of the distribution is venturing into a gray area. One metric that takes a stab at quantifying this dark sliver is conditional VaR, or CVaR. Calculating CVaR at the 95% confidence level for normally distributed returns works out to:

$\mathrm{CVaR} = -\mu_p + 2.06 \times \sigma_p$

This version of CVaR is identical to its VaR counterpart, except for the multiplier that corresponds with the chosen confidence level. For CVaR at 95%, the multiplier works out to 2.06, which can be calculated in Excel with the function:

NORMDIST(NORMSINV(x%),0,1, FALSE)/(1-x%)

which uses x% as the chosen confidence level. Using the past 10 years of S&P 500 monthly price history and assuming a normal distribution, the CVaR formula above yields a reading of 29.7%. The comparable figures are 27.1% for MVaR and 23.6% for VaR. A loftier estimate is to be expected from CVaR, since it models the worst 5% of outcomes. Using CVaR as the denominator in the Sharpe ratio provides an estimate of risk that uses the extreme slice of losses in the tail.
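The Excel expression above translates directly into Python: the normal density at the VaR z-score, divided by the tail probability. The sketch below reproduces the 2.06 multiplier and then uses CVaR as a Sharpe-style denominator. The 5% risk premium and 20% volatility are borrowed from the earlier VaR example as illustrative inputs, not S&P 500 estimates, and the function names are our own.

```python
from statistics import NormalDist

def cvar_multiplier(confidence=0.95):
    """Normal-distribution CVaR multiplier: pdf(z) / (1 - confidence)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(confidence)  # the NORMSINV(x%) step
    return std_normal.pdf(z) / (1.0 - confidence)

def parametric_cvar(mean_return, volatility, confidence=0.95):
    """Parametric CVaR under a normal distribution: -mean + multiplier * volatility."""
    return -mean_return + cvar_multiplier(confidence) * volatility

multiplier = cvar_multiplier(0.95)        # about 2.06
cvar_95 = parametric_cvar(0.05, 0.20)     # about 36.3% for the illustrative inputs

# "Conditional" Sharpe ratio: risk premium divided by CVaR instead of standard deviation.
conditional_sharpe = 0.05 / cvar_95
standard_sharpe = 0.05 / 0.20

print(f"CVaR multiplier at 95%: {multiplier:.2f}")
print(f"Standard Sharpe: {standard_sharpe:.2f}  Conditional Sharpe: {conditional_sharpe:.2f}")
```

Because CVaR is larger than standard deviation, the conditional ratio comes out lower than the standard one for the same risk premium.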

This parametric estimate of CVaR is straightforward, but it fails to factor in the non-normal aspect of return distributions. For a higher degree of realism, an alternative approach is required. You could instead use another distribution curve that closely matches the actual return behavior in the tail. It's unclear, however, which distribution is optimal and practical for this purpose. Even when analysts agree on a replacement for a normal distribution for estimating CVaR (or VaR, for that matter), the solution can be quantitatively challenging.

For example, Morningstar's Ibbotson Associates recently outlined a process for modeling tail risk for asset allocation with a version of CVaR that relies on what's known as the truncated Lévy flight distribution. (See "Mean-Variance Versus Mean-Conditional Value-at-Risk Optimization: The Impact of Incorporating Fat Tails and Skewness into the Asset Allocation Decision," by James Xiong and Thomas Idzorek, at corporate.morningstar.com.) The authors favor a methodology that delivers a statistically robust solution for risk estimation, though it's no one's idea of a computational picnic.

A less formidable alternative is using market history as a guide for estimating CVaR. This approach allows us to sidestep the thorny issue of choosing a distribution; instead, we let market history do the heavy lifting. The past may not fully represent the future, of course, but it's an obvious place to start. In the decade ending this past November, for instance, the worst monthly losses in the 5% tail ranged from -6.6% to a crushing -20.4%. The average of those losses was -10.1%, which can be interpreted as the average expected monthly decline for the worst 5% of cases.
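The historical approach is simple enough to sketch in a few lines: sort the monthly returns, keep the worst 5%, and average them. The return series below is made up for illustration; reproducing the figures in the text would require the actual S&P 500 monthly history.

```python
from statistics import mean

def historical_cvar(returns, confidence=0.95):
    """Average of the worst (1 - confidence) share of observed returns."""
    ordered = sorted(returns)  # most negative first
    # The tiny epsilon guards against floating-point error in (1 - confidence).
    tail_size = max(1, int(len(ordered) * (1.0 - confidence) + 1e-9))
    return mean(ordered[:tail_size])

# Hypothetical 40-month sample: two bad months and 38 quiet ones.
sample_returns = [-0.20, -0.10] + [0.01] * 38

tail_loss = historical_cvar(sample_returns)  # average of the worst 5%, i.e. the worst 2 months
print(f"Average loss in the worst 5% of months: {tail_loss:.1%}")
```

With 120 monthly observations, as in a ten-year sample, the tail at the 95% level holds six months, so a single extreme month can move the average noticeably.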

Let's double check that estimate with a longer sampling of history. Using the past for risk modeling requires a sufficiently long stretch of time to include as many tail events as possible. Let's assume that the S&P 500's history since 1881 is a representative sample, courtesy of data from Yale professor Robert Shiller (see www.econ.yale.edu/~shiller). Does that make a difference? Not really. After extending the sample data back to the late 19th century, the average tail loss is -9.7%, which is comparable to the historical estimate based on the last 10 years.

What should we make of the wide divergence between the equity tail risk at roughly 10% we see in history and the CVaR metric's normally distributed parametric estimate (29.7%)? Is the historical record too low for anticipating future risk? Or is the parametric estimate too high? Perhaps we can sort this out with a third perspective by running a simple Monte Carlo test in Excel and simulating monthly returns in 10,000 trials (the equivalent of more than 800 years) based on a Student's t-distribution. This alternative to the normal distribution sports relatively fatter tails while remaining practical for applications with mean-variance optimization, according to modern finance founding father Harry Markowitz. (For details, see Chapter 27, "What Does Harry Markowitz Think?" in Frontiers of Modern Asset Allocation.) Using the Student's t-distribution (with assumptions recommended by Markowitz) in a Monte Carlo simulation tells us that the average monthly loss in the worst 5% of cases is a much harsher -43.4%. Ouch!
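A simulation along these lines can be sketched in Python with nothing beyond the standard library. The degrees of freedom, monthly mean and scale below are placeholder assumptions, not the Markowitz-recommended inputs (which aren't given here), so the output won't reproduce the -43.4% figure. The point is the mechanics: draw 10,000 fat-tailed monthly returns and average the worst 5%.

```python
import math
import random
from statistics import mean

def student_t_draw(rng, df):
    """One Student's t variate: a standard normal divided by sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi_square = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi_square / df)

def simulate_tail_loss(n_trials=10_000, df=4, monthly_mean=0.005,
                       monthly_scale=0.03, confidence=0.95, seed=42):
    """Average simulated monthly return in the worst (1 - confidence) share of trials.

    All parameter defaults are hypothetical placeholders chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = sorted(monthly_mean + monthly_scale * student_t_draw(rng, df)
                   for _ in range(n_trials))
    tail_size = max(1, int(n_trials * (1.0 - confidence) + 1e-9))
    return mean(draws[:tail_size])

fat_tailed = simulate_tail_loss(df=3)    # few degrees of freedom: heavy tails
near_normal = simulate_tail_loss(df=30)  # many degrees of freedom: close to normal

print(f"Tail loss, df=3:  {fat_tailed:.1%}")
print(f"Tail loss, df=30: {near_normal:.1%}")
```

Lowering the degrees of freedom fattens the tails and deepens the average tail loss, which is a reminder of how sensitive the result is to the assumptions fed into the simulation.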

It appears that actual history underestimates the potential for tail-risk loss. We can debate the point, but it seems reasonable to assume that the future could get a lot worse than what we've experienced. On the other hand, estimating tail risk with monthly data may be inappropriate for anticipating trouble for long-term investment horizons. For instance, consider rolling annualized ten-year periods for anticipating tail risk at the 5th percentile and beyond. With this benchmark of time, tail risk drops by roughly half of what it would be with estimates using monthly data. More precisely, the average loss for ten-year returns at or below the 5th percentile has been -4.9% since 1881. Time, it seems, can dull the tail's bite.
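To see how the longer horizon changes the picture, the monthly series can be converted into rolling ten-year annualized returns before the tail is measured. The sketch below shows only the conversion step, with a made-up constant return series as input; the function name is hypothetical.

```python
def rolling_annualized_returns(monthly_returns, years=10):
    """Annualized compound return for every rolling window of `years` years."""
    window = 12 * years
    results = []
    for start in range(len(monthly_returns) - window + 1):
        growth = 1.0
        for r in monthly_returns[start:start + window]:
            growth *= 1.0 + r  # compound the monthly returns
        results.append(growth ** (1.0 / years) - 1.0)
    return results

# Hypothetical input: 121 months at a steady 1% gives two overlapping ten-year windows.
steady = [0.01] * 121
annualized = rolling_annualized_returns(steady)

print(len(annualized))            # 2
print(f"{annualized[0]:.2%}")     # 12.68%, i.e. 1.01 ** 12 - 1
```

Averaging the worst 5% of these rolling figures, rather than the worst 5% of months, gives the long-horizon tail estimate described above.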

Measuring Risk Is Risky Too

By now, it should be obvious that the road to reasonable risk estimates is riddled with potholes. Much depends on the assumptions and model. If we run the various estimates we've examined at a higher 99% confidence level, all the risk measures rise accordingly. The lesson is that expecting one number to tell us all that we need to know is like thinking that a lone weather forecast will carry us through all four seasons.

There's no perfect measure of risk and there are many types of risk. Perspective, then, is the only game in town. Risk is multi-dimensional, which implies that our tool kit for measuring risk should be no less. The standard Sharpe ratio is still a reasonable beginning, but the analytical journey shouldn't end there. In fact, the process of estimating risk with several methods may be even more important than the numbers per se. It's one thing to quote risk values; it's far more valuable to understand how you've arrived at a given estimate, and how (and why) the underlying results vary.

Ultimately, estimating risk is about getting comfortable with what could go wrong under different scenarios. That includes thinking about the weak link in your choice of risk metrics.

"Both modified and conditional VaR are improvements on basic VaR in the sense of taking account of the non-normal nature of returns," says Shafiq Ebrahim, a researcher at the quantitative investment shop Aronson Johnson Ortiz. "But it's no panacea." For instance, these metrics don't factor in the possibility that liquidity can dry up in a crisis.

Mike Reed, a former director in Morgan Stanley's process-driven trading unit who now runs MJ Reed Investment Consulting, reminds us that any historical correlations baked into a risk model can break down during panic selling. Models are vulnerable because history isn't a perfect guide to the future. Bonds, for instance, have been in a bull market for the past 30 years and inflation has been relatively low. As a result, risk models may be underestimating the fallout for fixed-income markets if inflation spikes, he warns.

"The problem with all these measures is that they're ad hoc," adds Kent Osband, an economist and author of the recently published Pandora's Risk: Uncertainty at the Core of Finance. Do you want to measure VaR at a 95% or 99% confidence level? Every answer is arbitrary. Osband says CVaR is an improvement over VaR ("almost anything would be better than VaR"), but focusing on CVaR assumes that tail risk is more important than the rest of the distribution. Sometimes that's true, but not always, or at least not for every investor at all times.

"The main problem with VaR, or any risk measure," says Dario Cintioli, who heads up risk analysis at StatPro, "is that you're using historical data in some way." There's a conflict in choosing how much, or how little, data to analyze. If you look too far into the past, the results may be irrelevant. Yet the same problem, albeit for different reasons, can apply when reviewing recent history.

Frank Knight famously distinguished between risk and uncertainty in his classic 1921 book Risk, Uncertainty, and Profit. Statistical methods can be useful for quantifying certain aspects of market fluctuations, but that's only a piece of risk.