Data

First, we estimate the realized volatility in order to test the information content of implied volatility and to evaluate how it can help forecast realized volatility. Following Siriopoulos and Fassas [18], we compute the squared return without a mean-reversion assumption. We also use non-overlapping observations and compute the realized volatility, \(RV_m\), separately for each calendar month. This approach improves the predictive power of implied volatility and avoids overestimation of past volatility. In particular, the ex-post realized volatility during the next calendar month is calculated according to the following equation:

$$\begin{aligned} RV_m = \sqrt{\frac{365}{n_m}\sum \limits _{t = 1}^{N_m} r_{Oil,t}^2} \end{aligned}$$ (12)

where \(r_{Oil,t}\) is the crude oil return on day t, \(N_m\) is the number of trading-day return observations in month m, and \(n_m\) is the number of calendar days in month m. We annualize the volatility according to the actual 365-day counting convention, since the calculation of the volatility index is based on calendar days instead of trading days [33].
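The monthly calculation in Eq. (12) can be sketched as follows; the return series and month length are illustrative, not the paper's data:

```python
import math

def monthly_realized_vol(daily_returns, calendar_days):
    """Annualized realized volatility for one calendar month, as in Eq. (12).

    daily_returns : daily crude oil returns observed during the month
    calendar_days : number of calendar days in the month (n_m)
    """
    sum_sq = sum(r ** 2 for r in daily_returns)
    return math.sqrt(365.0 / calendar_days * sum_sq)
```

For instance, 21 trading days of 1 % moves in a 30-day month give an annualized realized volatility of roughly 16 %.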

According to the calculation method of the implied volatility index (see [33]), the volatility index recorded at the close of the last trading day of month \(m-1\) in essence represents the market forecast of the volatility of the underlying index over month m. In this analysis we use \(OVX_m\) to denote the volatility index recorded at the close of the last trading day of month m.

Moreover, the information content of OVX at relatively short time horizons (5 and 10 days) is examined as follows. By definition, the forward-looking horizon of an implied volatility index is approximately 30 calendar days, and the index is expressed in annualized terms. Therefore, we transform the annualized implied volatility index to the required 5- and 10-day intervals by the square-root-of-time rule. The forward-looking implied volatility is given in Eq. (13).

$$\begin{aligned} IV_{i,t} = \sqrt{\frac{i}{365}}\, OVX_t \end{aligned}$$ (13)

Thus \(IV_{i,t}\) is the expected volatility over the next i days.

The 5-day (10-day) forward-looking realized volatility is computed by taking the square root of the sum of the future squared returns over the \(\left[ t+1,t+5\right] \) (\(\left[ t+1,t+10\right] \)) period, as in Eq. (14).

$$\begin{aligned} RV_{i,t} = \sqrt{\sum \limits _{j = 1}^{i} r_{t + j}^2}, \quad i = 5, 10 \end{aligned}$$ (14)

Both the 5- and 10-day realized volatility series are constructed from non-overlapping data. As noted by Christensen and Prabhala [14], using realized volatility calculated from overlapping data in regression analysis may lead to strong autocorrelation in the regression residuals.
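Equations (13) and (14) can be sketched together as follows; the inputs are synthetic and purely illustrative:

```python
import math

def iv_horizon(ovx, i):
    """Scale the annualized OVX to an i-day horizon, Eq. (13)."""
    return math.sqrt(i / 365.0) * ovx

def rv_horizon(returns, t, i):
    """Forward-looking i-day realized volatility over [t+1, t+i], Eq. (14)."""
    return math.sqrt(sum(returns[t + j] ** 2 for j in range(1, i + 1)))
```

For example, an annualized OVX reading of 40 % corresponds to a 5-day expected volatility of \(\sqrt{5/365} \times 0.40\), about 4.7 %.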

Table 4 presents the descriptive statistics of the volatility and log-volatility series. According to Christensen and Prabhala [14] and Hansen [34], the log transformation brings the skewness and kurtosis of volatility data closer to those of a normal distribution. However, Fleming [35], Fleming, Ostdiek and Whaley [1] and other studies used untransformed volatility data, and Corrado and Miller [15] performed parallel regressions using both the original and the log-transformed volatility measures. Since the implied volatility index is defined in terms of the original volatility measure, the untransformed series can be interpreted directly. Our sample contains 411 non-overlapping 5-day observations, 205 non-overlapping 10-day observations and 98 non-overlapping monthly observations on realized and implied volatility, covering the period from June 2007 to July 2015. All distributions exhibit positive skewness and excess kurtosis, and the null hypothesis of the Jarque–Bera normality test is rejected, indicating that the series deviate from the normal distribution.

Table 4 Descriptive statistics

Empirical Analysis

We assess the relationship between implied volatility index and realized volatility based on a linear regression of the form:

$$\begin{aligned} R{V_m} = {\alpha _0} + {\alpha _1}I{V_{m - 1}} + {\varepsilon _m} \end{aligned}$$ (15)

Christensen and Prabhala [14] suggested three hypotheses that can be tested with Eq. 15. First, if IV contains at least some information about future realized volatility, the coefficient \(\alpha _1\) should be statistically significant against the null hypothesis \(\alpha _1=0\). Second, if IV is an unbiased estimate of realized volatility, the intercept \(\alpha _0\) should be zero and the slope \(\alpha _1\) should equal one; this joint hypothesis can be tested using an F statistic [36]. Lastly, if IV is indeed an efficient estimate, the residuals should be pure white noise and uncorrelated with any other variable.

In the first step, we use ordinary least squares to estimate the coefficients in Eq. 15 and test the three hypotheses proposed by Christensen and Prabhala [14]. The results are summarized in Table 5.

Table 5 Descriptive statistics of the OLS estimation

From the results in Table 5, we find that \(\alpha _1\) is statistically different from zero, which indicates that OVX contains information about the future realized volatility of crude oil spot prices. The null hypothesis \(\alpha _1=1\) is rejected for the 5- and 10-day data. The joint hypothesis of \(\alpha _0=0\) and \(\alpha _1=1\) is accepted for all three data sets, which suggests that OVX is an unbiased estimate of future realized volatility. However, the Durbin–Watson (DW) statistics indicate that OVX is not an efficient predictor of future realized volatility, since for all three data sets they are significantly different from two, implying autocorrelated residuals.
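A minimal sketch of the OLS step and the Durbin–Watson diagnostic, using NumPy on synthetic data (the true coefficients and noise level below are illustrative, not the paper's estimates):

```python
import numpy as np

def ols_iv_rv(iv_lag, rv):
    """OLS fit of Eq. (15): RV_m = a0 + a1 * IV_{m-1} + e_m."""
    X = np.column_stack([np.ones_like(iv_lag), iv_lag])
    coef, *_ = np.linalg.lstsq(X, rv, rcond=None)
    resid = rv - X @ coef
    return coef, resid

def durbin_watson(resid):
    """DW statistic; values far from 2 signal autocorrelated residuals."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Synthetic data generated with a0 = 0 and a1 = 1, i.e. unbiasedness holds
rng = np.random.default_rng(0)
iv = rng.uniform(0.2, 0.8, 200)
rv = iv + rng.normal(0.0, 0.01, 200)
coef, resid = ols_iv_rv(iv, rv)
```

Because the synthetic residuals are i.i.d., the DW statistic here comes out near two; the autocorrelation found in the paper's actual residuals is what motivates the Kalman filter step below.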

In the second step, we transform Eq. 15 with Kalman filter and analyze the parameters in a dynamic system:

$$\begin{aligned}&\left[ \begin{array}{c} \alpha _{0,k+1}\\ \alpha _{1,k+1} \end{array} \right] = \left[ \begin{array}{cc} 1 &{} 0\\ 0 &{} 1 \end{array} \right] \left[ \begin{array}{c} \alpha _{0,k}\\ \alpha _{1,k} \end{array} \right] + w_k \end{aligned}$$ (16)

$$\begin{aligned}&RV_{m,k} = \left[ \begin{array}{cc} 1&IV_{m-1,k} \end{array} \right] \left[ \begin{array}{c} \alpha _{0,k}\\ \alpha _{1,k} \end{array} \right] + v_k \end{aligned}$$ (17)

where \(w_k\) and \(v_k\) have the same meaning as in Eqs. 1 and 2. We use the first quarter of the observations to initialize the covariance matrices of the process noise, Q, and the measurement noise, R. The volatility series are synchronized so that the realized volatility in month m is aligned with the implied volatility observed on the last trading day of month \(m-1\).
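The filtering recursion of Eqs. (16)–(17) can be sketched as follows; the values of Q and R and the synthetic data are illustrative (the paper instead initializes Q and R from the first quarter of the observations):

```python
import numpy as np

def kalman_regression(iv_lag, rv, Q, R):
    """Kalman filter for the time-varying coefficients of Eqs. (16)-(17).

    The state [a0_k, a1_k] follows a random walk (identity transition);
    the scalar observation is RV_k = [1, IV_{k-1}] @ state + v_k.
    """
    a = np.zeros(2)            # state estimate
    P = np.eye(2)              # state covariance
    states = []
    for x, y in zip(iv_lag, rv):
        P = P + Q                          # predict (identity transition)
        H = np.array([1.0, x])             # observation vector
        S = H @ P @ H + R                  # innovation variance (scalar)
        K = P @ H / S                      # Kalman gain
        a = a + K * (y - H @ a)            # state update
        P = P - np.outer(K, H @ P)         # covariance update
        states.append(a.copy())
    return np.array(states)

# Synthetic data with constant true coefficients a0 = 0.1, a1 = 0.9
rng = np.random.default_rng(1)
iv = rng.uniform(0.2, 0.8, 500)
rv = 0.1 + 0.9 * iv + rng.normal(0.0, 0.01, 500)
states = kalman_regression(iv, rv, Q=1e-7 * np.eye(2), R=1e-4)
```

With a small Q the filter behaves like recursive least squares with slow forgetting, so the filtered coefficients settle near the true values while still being free to drift if the data-generating coefficients change.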

The results of the Kalman filter estimation are presented in Table 6.

Table 6 Descriptive statistics of the Kalman Filter estimation

The results in Table 6 indicate that the Kalman filter achieves a smaller mean squared error. Moreover, the mean values of \(\alpha _0\) are very close to zero and the mean values of \(\alpha _1\) are around one, which is very similar to the OLS estimates.

Quantification of Market Risk

In this paper, we quantify market risk based on the information content of OVX for future realized volatility. More specifically, we assess the added value of OVX-based volatility forecasts when they are used to quantify and estimate short-term market risk. We consider the widely used Value-at-Risk (VaR) framework, which provides, at a given percentage level, the most likely loss for an investor. In this framework, we use the estimated RV calculated from OVX in place of the predicted conditional standard deviation in the VaR calculation, as in Eq. 18.

$$\begin{aligned} Va{R_{t + 1|t}} = {F_\alpha }({z_t};\theta ){\sigma _{t + 1|t}} \end{aligned}$$ (18)

where the return distribution is assumed to be normal, \({F_\alpha }({z_t};\theta )\) is the relevant quantile at the \(100 \cdot \alpha \% \) level of the normal distribution, and \({\sigma _{t + 1|t}}\) is the forecast of the conditional standard deviation at time \(t+1\).
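A minimal sketch of Eq. (18) under the normality assumption, using only the Python standard library:

```python
from statistics import NormalDist

def var_forecast(sigma_next, alpha=0.05):
    """One-step-ahead VaR under normality, Eq. (18): the alpha-quantile of
    the standard normal scaled by the forecast standard deviation."""
    return NormalDist().inv_cdf(alpha) * sigma_next
```

With a 2 % forecast standard deviation, the 5 % VaR is about \(-1.645 \times 0.02 \approx -3.3\,\%\): the loss that should be exceeded on only 5 % of days if the model is correct.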

Furthermore, we introduce the RiskMetrics model, which can easily be used by market practitioners, as a competing model to evaluate the performance of the proposed model. The RiskMetrics model is defined in the following equation.

$$\begin{aligned} \sigma _{t|t - 1} = \sqrt{\alpha _0 r_{t - 1}^2 + \alpha _1 \sigma _{t - 1|t - 2}^2} \end{aligned}$$ (19)

where \(\alpha _0\) and \(\alpha _1\) are estimated from historical data.
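Equation (19) can be sketched as an EWMA recursion; the classic RiskMetrics choice \(\alpha _1 = \lambda = 0.94\) and \(\alpha _0 = 1-\lambda \) is used below purely for illustration, whereas the paper estimates the weights from historical data:

```python
import math

def riskmetrics_vol(returns, lam=0.94):
    """EWMA volatility recursion of Eq. (19) with a0 = 1 - lam, a1 = lam.
    Returns the sequence of one-step-ahead forecasts sigma_{t|t-1}."""
    var = returns[0] ** 2      # crude initialization of the variance
    vols = []
    for r in returns:
        vols.append(math.sqrt(var))           # forecast formed before r arrives
        var = lam * var + (1 - lam) * r ** 2  # update with the realized return
    return vols
```

Each forecast is produced before the corresponding return is observed, which is what makes the series usable as \(\sigma _{t|t-1}\) in the VaR formula of Eq. (18).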

In order to back-test the VaR results, we use the Kupiec LR test [37]. Given the ex-post observed returns \(\{r_{t+1}\}\) and the ex-ante forecasts \(\{VaR_t\}\), the empirical failure rate \(\hat{f}\) is the proportion of returns smaller than the corresponding VaR. If the number of violations differs considerably from \(\alpha \times 100 \, \% \) of the sample, then the accuracy of the underlying risk model is called into question. The null hypothesis \(H_0: f=\alpha \) against \(H_1: f \ne \alpha \) can be tested with the LR statistic, which takes the following form.

$$\begin{aligned} LR = - 2\ln [{(1 - \alpha )^{T - N}}{\alpha ^N}] + 2\ln [{(1 - N/T)^{T - N}}{(N/T)^N}] \end{aligned}$$ (20)

where N is the number of violations in the sample and T is the total number of observations. Under the null hypothesis, the test statistic is \(\chi ^2\) distributed with one degree of freedom.
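The test of Eq. (20) can be sketched as follows (a minimal implementation; the handling of the boundary cases N = 0 and N = T is our addition):

```python
import math

def kupiec_lr(n_violations, n_obs, alpha):
    """Kupiec unconditional-coverage LR statistic, Eq. (20);
    chi-squared with 1 degree of freedom under H0: f = alpha."""
    N, T = n_violations, n_obs
    f = N / T
    log_h0 = (T - N) * math.log(1 - alpha) + N * math.log(alpha)
    # Convention 0 * log(0) = 0 at the boundaries N = 0 and N = T
    log_h1 = ((T - N) * math.log(1 - f) if N < T else 0.0)
    log_h1 += (N * math.log(f) if N > 0 else 0.0)
    return -2.0 * log_h0 + 2.0 * log_h1
```

When the empirical failure rate matches \(\alpha \) exactly the statistic is zero; it grows as the violation count departs from \(\alpha T\) and is compared against the 3.84 critical value of \(\chi ^2_1\) at the 5 % level.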

We apply both methods to estimate the VaR of crude oil returns. The 5-day, 10-day and 1-month VaR are tested over the same period as in the previous section. These VaR forecasts \(\{VaR_t\}\), pertaining to returns defined on \(\left[ t,t+1\right] \), can then be back-tested against the observed returns \(\{r_{t+1}\}\). For back-testing, we compute the empirical failure rates and Kupiec LR tests for both the left and right quantiles at 1 %, 2.5 % and 5 %, since investors can also hold short positions in crude oil. Empirical results for the two models based on OVX (estimated with the Kalman filter and with OLS) and for RiskMetrics based on historical volatility, using time series of different frequencies, are shown in Table 7. The confidence level of the LR test is 0.05. We find that, in general, the Kalman filter based VaR model outperforms both the OLS based VaR model and the RiskMetrics model. The results of the Kalman filter with \(\alpha =95\,\%\) are illustrated in Figs. 2, 3 and 4.

Table 7 Back-testing results with different models and time frequencies

Fig. 2 5-day VaR estimates of the OVX based model with Kalman filter (\(\alpha =95\,\%\))

Fig. 3 10-day VaR estimates of the OVX based model with Kalman filter (\(\alpha =95\,\%\))