					Forecasting and Trading Commodity
  Contract Spreads with Gaussian
             Processes

  Nicolas Chapados and Yoshua Bengio
         University of Montreal
                  and
       ApSTAT Technologies Inc.
            Approach in a Nutshell

• Commodity spreads exhibit regularities
• Use a flexible regression approach to forecast the
  complete future price trajectory of a spread
   – Gaussian Processes
   – Augmented functional representation of trajectory
• From the forecast trajectory, identify profitable
  opportunities (accounting for risk)
• Experiments with a portfolio of 30 spreads
• Profitable out-of-sample after transaction costs
                Preliminary Remarks

• Statistical learning algorithms will not make you rich
• Overfitting is a central problem in finance
   –   Only one historical trajectory
   –   Extremely low signal-to-noise ratio
   –   The economy is non-stationary
   –   Bias-variance dilemma takes an interesting form
        • If you use a long history, you reduce variance but introduce bias
        • Conversely, with a short history you have little bias but high
          variance
   – As a result, model selection is difficult

• Bayesian approaches promise (theoretically) an
  automatic control of overfitting
               Portfolio Choice:
             Conceptual Landscape
• One-Period Models
   – Classical « mean-variance » framework (Markowitz)
   – Fixed investment horizon (one month, one quarter)
   – Predict the moments of the next-period asset return
     distribution (e.g. mean and covariance matrix)
   – Quadratic programming to find optimal portfolio weights
     that maximize a utility function: best return subject to risk
     constraint
• Direct models using learning algorithms
   – Train a model (e.g. a neural network) to directly make a portfolio
     allocation decision from input variables
   – Can use a regression or classification framework
   – Training criterion: can maximize a financial utility that
     incorporates risk aversion and the effect of trading costs
               Commodity Spreads

 • Price difference between two futures contracts
 • Example, as of July 24th, 2008:
    – Closing price for « Wheat, September 2008 »: $787.75
    – Closing price for « Wheat, December 2008 »: $811.25
    – Difference (Spread) : 787.75 – 811.25 = –23.50
 • Objective: forecast these spreads
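As a minimal sketch, the arithmetic in the wheat example above can be written out in Python. The function name `calendar_spread` is illustrative and does not come from the slides.

```python
def calendar_spread(near_price: float, far_price: float) -> float:
    """Spread = price of the nearer contract minus price of the farther one."""
    return near_price - far_price


# Closing prices quoted in the example for July 24th, 2008
wheat_sep_2008 = 787.75
wheat_dec_2008 = 811.25

spread = calendar_spread(wheat_sep_2008, wheat_dec_2008)
print(spread)  # -23.5
```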
[Figure: Jul-Dec CME Lean Hogs spread, 15-year average (1991-2005), plotted by month from Aug to Jul; vertical axis from 0 to 100]
           Empirical Regularities in
            Commodity Spreads
• Soybeans Crush Spread (Simon, 1999)
   – Long-run cointegration among the constituents
   – Short-term mean reversion (5-day horizon)
   – Simple rules yield in-sample profits after transaction costs
• Petroleum Crack Spread (Girma & Paulson, 1998)
   – Seasonality at both monthly and trading-week levels
   – Out-of-sample profits after transaction costs
• Gold-Silver Spread (Liu & Chou, 2003)
• Dunis et al. (2006 a,b) study both the crack and the
  crush spreads
            Modeling Objectives

• Nonparametrically exploit seasonalities that
  occur in commodity spreads
• Concentrate on the simplest kind:
  intracommodity calendar spreads
• Fixed maturities: e.g. March–July Wheat
  – Does not require the definition of a roll schedule
  – Problem is characterized by a large number of
    separate historical time series (one per trading
    year in the historical data)
What do Gaussian Processes Buy Us?

• Rather than forecasting the distribution of the
  next-period returns, we can model the
  complete future price trajectory
• A classical approach represents $P(r_{t+1} \mid I_t)$
  – $I_t$ is the information set available at time t
  – Example, an AR(1) model: $y_{t+1} = a + b\,y_t + \varepsilon$,
    with $\varepsilon \sim N(0, \sigma^2)$
• A Gaussian Process can represent the joint
  distribution of all future prices, in
  particular $P(p_{t+\Delta} \mid I_t, \Delta)$, for $\Delta > 0$.
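To make the contrast concrete, the sketch below shows how a one-step AR(1) model has to be iterated to say anything about longer horizons, whereas a Gaussian process is asked about each horizon directly. The function name `ar1_trajectory` and all parameter values are hypothetical.

```python
import numpy as np


def ar1_trajectory(y_t, a, b, sigma, horizon):
    """Iterate y_{t+1} = a + b*y_t + e, e ~ N(0, sigma^2).

    Returns the forecast means and variances for horizons 1..horizon,
    conditional on the current value y_t.
    """
    means = np.empty(horizon)
    variances = np.empty(horizon)
    mean, var = y_t, 0.0
    for h in range(horizon):
        mean = a + b * mean            # propagate the conditional mean
        var = b**2 * var + sigma**2    # propagate the conditional variance
        means[h] = mean
        variances[h] = var
    return means, variances


# Hypothetical parameters, starting from a spread of -23.5
means, variances = ar1_trajectory(y_t=-23.5, a=0.0, b=0.9, sigma=2.0, horizon=5)
```

The forecast variance grows with the horizon and, for |b| < 1, levels off at sigma^2 / (1 - b^2).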
            Gaussian Processes

•   General tools for nonlinear regression
•   Fully Bayesian Treatment

1. Start with a prior probability distribution
   on the space of functions
2. Observe some data
3. Infer a posterior distribution, given the
   observed data (from Bayes’ rule)
Example
     Gaussian Processes — Details

• Generalization of the normal distribution
  – Multivariate normal: elements of a vector are
    related by a covariance matrix
  – Gaussian process: values of the function at two
    points are linked by a covariance function
• Analytical solution
  – Not subject to the optimization difficulties of
    neural networks — simple matrix algebra
  – Can produce a full covariance matrix between
    a set of new test points
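The "simple matrix algebra" can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the squared-exponential kernel, its hyperparameters, the noise variance, and the toy sine data are all assumptions made here.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(X, y, X_star, noise_var=1e-2):
    """Posterior mean and full covariance of the function at X_star."""
    K = sq_exp_kernel(X, X) + noise_var * np.eye(len(X))
    K_s = sq_exp_kernel(X_star, X)
    K_ss = sq_exp_kernel(X_star, X_star)
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    V = np.linalg.solve(L, K_s.T)
    mean = K_s @ alpha
    cov = K_ss - V.T @ V                            # full test covariance
    return mean, cov


# Toy data (assumed): noiseless samples of a sine curve
X = np.linspace(0.0, 5.0, 20)[:, None]
y = np.sin(X).ravel()
X_star = np.array([[1.0], [2.5], [6.0]])

mean, cov = gp_posterior(X, y, X_star)
```

The off-diagonal entries of `cov` are what a one-step model does not provide: the forecast covariance between any two test points.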
     Gaussian Processes — Details 2

• Let k(x, y) be a positive semidefinite covariance
  function (kernel)
• X: M × d matrix of training inputs
  y: M-vector of training targets
  X*: M′ × d matrix of test inputs
• Predictive distribution of test outputs at test inputs is
  normal with mean and covariance matrix given by



   – with
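The equation images on this slide did not survive extraction. For reference, the standard Gaussian process regression result takes the form below, written with the slide's X, y and X*. The noise variance $\sigma_n^2$, the symbol $\Lambda$, and the notation $K(A, B)$ for the matrix of kernel values $k(a_i, b_j)$ are assumptions made here, not recovered from the slide.

```latex
\begin{align*}
\mathrm{E}[\mathbf{y}^* \mid X, X^*, \mathbf{y}]
  &= K(X^*, X)\,\Lambda^{-1}\,\mathbf{y}, \\
\mathrm{Cov}[\mathbf{y}^* \mid X, X^*, \mathbf{y}]
  &= K(X^*, X^*) - K(X^*, X)\,\Lambda^{-1}\,K(X, X^*), \\
\Lambda &= K(X, X) + \sigma_n^2 I_M .
\end{align*}
```

Here $K(X^*, X)$ is M′ × M, $\Lambda$ is M × M, and the resulting covariance is M′ × M′.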
      Historical Data: March–July Wheat




[Figure: normalized spread price plotted against year and days to maturity]
     Inputs and Target Representation
• Time is an independent variable. Split into:
   – Current series index (e.g. trading year)
   – Operation time: time at which the forecast is made
   – Forecast horizon: # of days ahead we are forecasting
• Other inputs must be known at operation time
• Target is (normalized) spread price
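One way to picture this representation is as a table of training rows, one per (series, operation time, horizon) triple. The sketch below is a hypothetical layout; the function name `augmented_rows` and the sample prices are assumptions, and the extra inputs known at operation time are left out for brevity.

```python
import numpy as np


def augmented_rows(series_index, prices, operation_time, max_horizon):
    """Build (input, target) pairs for forecasts made at `operation_time`.

    Each input row is (series index, operation time, forecast horizon);
    the target is the price observed `horizon` days after the operation time.
    """
    inputs, targets = [], []
    for horizon in range(1, max_horizon + 1):
        target_time = operation_time + horizon
        if target_time >= len(prices):
            break  # no observation that far ahead in this series
        inputs.append((series_index, operation_time, horizon))
        targets.append(prices[target_time])
    return np.array(inputs), np.array(targets)


# Hypothetical normalized spread prices for one trading year
prices_1996 = np.array([0.10, 0.12, 0.08, 0.05, 0.07])

X, y = augmented_rows(series_index=1996, prices=prices_1996,
                      operation_time=1, max_horizon=5)
```

With five observations and an operation time of 1, only horizons 1 to 3 have targets, so `X` has three rows.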




• We are learning a model of
Example of Forecast given History
    Wheat March–July / 1996
          Forecasting Performance

• AugRQ/all-inp: Reference model
    – Inputs: augmented time representation
    – Spread price + term-structure shape
    – Economic inputs (USDA ending stocks + stock-to-
      use ratio)
• AugRQ/less-inp: Remove USDA inputs
• AugRQ/no-inp: Remove price inputs
• StdRQ/no-inp
• Linear/all-inp: Bayesian linear regression
• AR(1)
         Evaluation Methodology




• Perform comparison using a modified
  Diebold-Mariano (1995) test that accounts for
  cross-correlations between test sets.
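For orientation, the sketch below computes the plain one-step Diebold-Mariano statistic under squared-error loss. It is not the modified test used in the slides: it ignores the cross-correlations between test sets that the modification is meant to handle. The function name and the simulated forecast errors are assumptions.

```python
import numpy as np
from math import erf, sqrt


def diebold_mariano(errors_a, errors_b):
    """Plain one-step Diebold-Mariano statistic under squared-error loss.

    Returns (statistic, two-sided p-value from the normal approximation).
    A negative statistic means model A has the lower average loss.
    """
    d = np.asarray(errors_a) ** 2 - np.asarray(errors_b) ** 2  # loss differential
    n = len(d)
    stat = d.mean() / np.sqrt(d.var(ddof=0) / n)
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(stat) / sqrt(2.0))))
    return stat, p_value


# Simulated forecast errors (hypothetical): model A is the more accurate one
rng = np.random.default_rng(0)
errors_a = rng.normal(0.0, 1.0, size=500)
errors_b = rng.normal(0.0, 2.0, size=500)

stat, p_value = diebold_mariano(errors_a, errors_b)
```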
Forecasting Performance
 From Forecasts to Trading Decisions
• Use a forecast of the complete future trajectory (made
  at time t0) to find best trading opportunity
• Information Ratio-like Criterion



• Each component is obtainable from the Gaussian
  process forecast, e.g.


• Entry condition: find t1, t2 > t0 which maximize the
  IR criterion
• Exit condition: find exit time t2 which maximizes
  the IR criterion, given the current position
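The criterion's formula was an image and is missing here. A minimal sketch, assuming the criterion is the expected price change between t1 and t2 divided by the standard deviation of that change, shows how both pieces follow from a forecast mean vector and covariance matrix. The function name `best_trade` and the toy forecast are hypothetical, and transaction costs are ignored.

```python
import numpy as np


def best_trade(mean, cov):
    """Search all entry/exit pairs t1 < t2 for the largest |ratio|.

    mean: forecast mean of the price at each future time step (length T)
    cov:  T x T forecast covariance between those prices
    Returns (t1, t2, ratio); ratio > 0 suggests a long trade, < 0 a short one.
    """
    best = (None, None, 0.0)
    T = len(mean)
    for t1 in range(T - 1):
        for t2 in range(t1 + 1, T):
            expected_gain = mean[t2] - mean[t1]
            # Var[p2 - p1] = Var[p1] + Var[p2] - 2 Cov[p1, p2]
            variance = cov[t1, t1] + cov[t2, t2] - 2.0 * cov[t1, t2]
            if variance <= 0.0:
                continue  # degenerate forecast, skip this pair
            ratio = expected_gain / np.sqrt(variance)
            if abs(ratio) > abs(best[2]):
                best = (t1, t2, ratio)
    return best


# Toy forecast (assumed): unit variance, uncorrelated prices
mean = np.array([0.0, 1.0, 3.0, 2.0])
cov = np.eye(4)

t1, t2, ratio = best_trade(mean, cov)
```

The covariance term is why a full joint forecast matters: strongly correlated prices at t1 and t2 shrink the variance of their difference and raise the ratio.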
   Behavior on a Single Trading Year
       Wheat March–July / 1996
• Re-train model every 25 days
• Sequence of decisions: short – neutral – long
• Lower panel: Cumulative P&L ($)
                     Portfolio of 30 Spreads

Commodity      Maturities (short-long)
Cotton         10–12, 10–5
FeederCattle   11–3, 8–10
Gasoline       1–5
LeanHogs       12–2, 2–4, 2–6, 4–6, 7–12, 8–10, 8–12
LiveCattle     2–8
NaturalGas     6–9, 6–12
SoybeanMeal    5–9, 7–9, 7–12, 8–3, 8–12
Soybeans       1–7, 7–11, 7–9, 8–3, 8–11
Wheat          3–7, 3–9, 5–9, 5–12, 7–12

• Common Input Variables
   – Current spread price
   – Prices of first 3 near contracts
   – Normalization
• Grains and related (SBM, SB, W)
   – USDA Ending Stocks (YoY difference)
   – USDA Stocks-to-Use Ratio
• Transaction costs
   – 5 basis points per trade
   – (Each leg=separate trade)
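As a rough illustration of the transaction-cost assumption, the sketch below charges 5 basis points on each leg of a spread trade. It assumes the rate applies to each leg's price, which the slides do not spell out; the function name is hypothetical, and the prices are the wheat closing prices from the earlier spread example.

```python
COST_RATE = 5e-4  # 5 basis points per trade, as stated on the slide


def spread_trade_cost(near_price, far_price, rate=COST_RATE):
    """Cost of trading both legs of a spread once (each leg is its own trade).

    Assumption: the rate is charged on the price of each leg.
    """
    return rate * (abs(near_price) + abs(far_price))


entry_cost = spread_trade_cost(787.75, 811.25)
round_trip_cost = 2 * entry_cost  # assumes the exit happens at the same leg prices
```

Under these assumptions the cost is driven by the outright leg prices, not by the much smaller spread value.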
Portfolio Performance 1994–2007
Portfolio Performance
Correlation Matrix between
      Sub-portfolios
   Conclusions and Future Research
• Functional representation of time series
  – Make (relatively) long-term forecasts
  – Progressively-revealed information sets
  – Handle irregularly-sampled data
• Trading decisions based on IR-like criterion
• Good out-of-sample performance on a
  portfolio of 30 commodity spreads

• Limits of Gaussian processes: computation
  time grows as O(N^3) with the data size
  – Approximation methods to handle larger data sets

				