Correlation between prices or returns?

  • If you are interested in determining whether there is a correlation between the Federal Reserve Balance Sheet and PPI, would you calculate the correlation between values (prices) or period-to-period change (returns)?

    I've massaged both data sets to be of equal length and same date range and have labeled them WWW (WRESCRT) and PPP (PPIACO). Passing them into R we get the following:

    > cor(WWW, PPP)
    [1] 0.7879144
    

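    Then applying the Delt() function (from quantmod) to get period-to-period changes. The same steps are applied to WWW to produce WWW.D: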

    > PPP.d <- Delt(PPP)
    

    Then applying the na.locf() function (from zoo), where na.rm=TRUE drops the leading NA:

    > PPP.D <- na.locf(PPP.d, na.rm=TRUE)
    

    Then passing it through cor() again:

    > cor(WWW.D, PPP.D)
    [1] -0.406858
    

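    So the bottom line is that the choice matters: the correlation goes from strongly positive on values to negative on period-to-period changes.

    NOTE: To see how I created the data, view http://snipt.org/wmkpo. Warning: it needs refactoring, but the good news is that it's only 27 lines.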
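
    A minimal sketch of the comparison above in base R, using simulated stand-in series rather than the actual WRESCRT and PPIACO data. The helper pct_change() is a hypothetical name; it is assumed to match what Delt() returns by default, minus the leading NA, so no na.locf() step is needed here.

    ```r
    # Simulated stand-ins for the two series (NOT the real WRESCRT / PPIACO data).
    set.seed(1)
    n <- 200
    WWW <- 100 * cumprod(1 + rnorm(n, mean = 0.005, sd = 0.02))
    PPP <- 100 * cumprod(1 + rnorm(n, mean = 0.003, sd = 0.01))

    # Period-to-period relative change: (x[t] - x[t-1]) / x[t-1].
    # Hypothetical helper; the result has one element fewer than the input.
    pct_change <- function(x) diff(x) / head(x, -1)

    WWW.D <- pct_change(WWW)
    PPP.D <- pct_change(PPP)

    # Correlation of levels versus correlation of changes.
    cor_levels  <- cor(WWW, PPP)
    cor_changes <- cor(WWW.D, PPP.D)
    print(c(levels = cor_levels, changes = cor_changes))
    ```

    Because the simulated series are independent by construction, any sizeable correlation in levels here comes from the shared upward drift, not from a real relationship.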

    If the period-to-period change shows a correlation, you've shown a correlation between the first derivative (with respect to time) of the two quantities. That's a little different than showing correlation between the values themselves (although I'm guessing there's some relation between the two).

    Regarding your bottom line: don't forget that you now have quantitative easing, which you should treat as a structural break in the model. Using only data up to 2007, I get a correlation of only -0.1524359.
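
    One way to look for such a break is to recompute the correlation on a sub-sample. A rough sketch, assuming a Date vector aligned with the two series; the data are simulated placeholders and the 2007-12-31 cutoff is only an illustrative choice:

    ```r
    # Placeholder monthly data (NOT the real series): 12 years from Jan 2000.
    set.seed(2)
    dates <- seq(as.Date("2000-01-01"), by = "month", length.out = 144)
    x <- rnorm(144)  # stand-in for one series of changes
    y <- rnorm(144)  # stand-in for the other series of changes

    # Logical mask for observations up to the assumed break date.
    pre_break <- dates <= as.Date("2007-12-31")

    cor_full <- cor(x, y)                        # whole sample
    cor_pre  <- cor(x[pre_break], y[pre_break])  # pre-break sub-sample only
    print(c(full = cor_full, pre_break = cor_pre))
    ```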

    The blog post at http://www.portfolioprobe.com/2011/01/12/the-number-1-novice-quant-mistake/ shows (and tries to explain) why you generally want to use returns and not prices.

    Clicking the link I get "The resource you are looking for has been removed, had its name changed, or is temporarily unavailable."

  • Short answer: you want to use the correlation of returns, since you're typically interested in the returns on your portfolio rather than the absolute levels.

    Also, correlations on price series have very strange properties. If you think about a time series of prices, you could write it out as [P0,P1,P2,...,PN], or as [P0,P0+R1,P0+R1+R2,...,P0+R1+...+RN], where Ri = Pi-P(i-1). Written this way, you can see that the first return, R1, contributes to every entry after P0, whereas the last return, RN, contributes to only one. This gives the early returns more weight in the correlation of prices than they should have. See the answers in this thread for some more details.
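
    The effect described above can be sketched with a small simulation. This is an illustration under assumed settings (Gaussian returns, 500 steps, 1000 repetitions), not a result taken from the question's data. Each pair of return series is independent by construction, and the "prices" are their cumulative sums, matching the P0+R1+...+Rk decomposition with P0 = 0.

    ```r
    set.seed(42)
    n      <- 500    # observations per series
    n_sims <- 1000   # number of independent pairs

    level_cor  <- numeric(n_sims)
    return_cor <- numeric(n_sims)

    for (i in seq_len(n_sims)) {
      r1 <- rnorm(n)    # independent "returns"
      r2 <- rnorm(n)
      p1 <- cumsum(r1)  # "prices": R1 enters every element
      p2 <- cumsum(r2)
      level_cor[i]  <- cor(p1, p2)
      return_cor[i] <- cor(r1, r2)
    }

    # Typical magnitude of the sample correlation in each case.
    print(c(levels = mean(abs(level_cor)), returns = mean(abs(return_cor))))
    ```

    The correlations on returns should cluster near zero, while the correlations on cumulated levels are typically much larger in magnitude even though the series are unrelated.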

    This answer makes intuitive sense based on the premise that R1 has an inordinate influence.

    I would add that one may want to look at either correlation of returns or cointegration of prices, rather than correlation of prices.

  • It depends. Usually you would want to measure correlation between variables that are both stationary; otherwise you will always be able to measure a correlation between variables that both follow a trend, even if they are unrelated. In this case I would guess that you should use first differences.
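
    A minimal sketch of that point, using two made-up series that share nothing except that each has a linear time trend (the slopes and noise levels are arbitrary assumptions):

    ```r
    set.seed(7)
    n <- 300
    t <- seq_len(n)

    # Two unrelated series: deterministic trend plus independent noise.
    a <- 0.5 * t + rnorm(n, sd = 5)
    b <- 0.2 * t + rnorm(n, sd = 5)

    cor_levels <- cor(a, b)              # dominated by the common trend
    cor_diffs  <- cor(diff(a), diff(b))  # first differences remove the trend
    print(c(levels = cor_levels, first_differences = cor_diffs))
    ```

    The correlation of levels comes out high purely because both series trend upward; after differencing, what is left is the independent noise.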

  • There is a good explanation here. In summary, what a Pearson correlation measures is the relation between deviations from the means, which is not meaningful for prices. So you should calculate Pearson correlations using returns.
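
    For reference, the "deviations from the means" form can be written out by hand and compared with cor(). A small sketch with made-up numbers; the variable names are illustrative only:

    ```r
    set.seed(3)
    x <- rnorm(50)            # made-up series
    y <- 0.5 * x + rnorm(50)  # made-up series related to x

    # Deviations from the sample means.
    dx <- x - mean(x)
    dy <- y - mean(y)

    # Pearson correlation: sum of cross-deviations over the product
    # of the root sums of squared deviations.
    manual_cor <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

    print(c(manual = manual_cor, builtin = cor(x, y)))
    ```

    For a trending price series the sample mean is not a stable reference level, which is why deviations from it say little.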

License under CC-BY-SA with attribution


Content dated before 7/24/2021 11:53 AM
