Multivariate dependence in complex systems

Shared by: Juan Agui
Posted: 4/28/2009
Auroop R. Ganguly*, Shiraj Khan**, David J. Erickson III*, Rick W. Katz***, George Ostrouchov*, Vladimir A. Protopopescu*, Sharba Bandyopadhyay****, and Sunil Saigal**

* Oak Ridge National Laboratory, Oak Ridge, TN
** University of South Florida, Tampa, FL
*** National Center for Atmospheric Research, Boulder, CO
**** Johns Hopkins University, Baltimore, MD

5th Symposium on Understanding Complex Systems
University of Illinois at Urbana-Champaign, May 16-19, 2005

OAK RIDGE NATIONAL LABORATORY, U.S. DEPARTMENT OF ENERGY

Definitions
- “Multivariate Dependence”
  - Generalized (linear or nonlinear) dependence within one or among multiple variables, in spatial, temporal or other dimensions
- “Complex Systems”
  - Nonlinear, multidimensional, multi-scale, component processes

Summary
- Correlation and Dependence
- Linear and Nonlinear Dependence
- Simulated System (Time series)
- Real Systems (Time series and Spatial)
- Extremal Dependence
- Next Steps

Dependence: Linear and Nonlinear

Correlation and Dependence
- Linear Correlation
  - Linear relationship among variables, r
  - Time series: autocorrelation and cross-correlation functions
- Nonlinear Dependence
  - Linear and nonlinear relationships, using information-theoretic concepts like mutual information
- Multiple dimensions
  - Temporal
  - Spatial
  - Spatio-temporal

Information Entropy and Mutual Information
- Information entropy: $H(X) = -\sum_i p_i \ln p_i$
- Joint entropy: $H(X,Y) = -\sum_i \sum_j p_{ij} \ln p_{ij}$
- Mutual information: $I(X;Y)$ or $I_{XY} = H(X) + H(Y) - H(X,Y)$
  - “Distance” between the joint distribution F(X;Y) and the product distribution F(X)F(Y)
  - Independence implies F(X;Y) = F(X) F(Y)
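
The entropy and mutual-information definitions above can be estimated from samples by binning. The following is a minimal sketch, assuming equal-width histogram bins and a plug-in estimator; the bin count, the function names, and the synthetic data are illustrative choices and do not come from the slides.

```python
import math
import random
from collections import Counter


def bin_labels(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Plug-in estimate of H = -sum p ln p (in nats) from discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(x, y, n_bins=8):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), evaluated on binned data."""
    bx, by = bin_labels(x, n_bins), bin_labels(y, n_bins)
    return entropy(bx) + entropy(by) - entropy(list(zip(bx, by)))


# Hypothetical demo data: one dependent pair and one independent pair.
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
y_dep = [xi + 0.5 * random.gauss(0.0, 1.0) for xi in x]
y_ind = [random.gauss(0.0, 1.0) for _ in range(5000)]

mi_dep = mutual_information(x, y_dep)
mi_ind = mutual_information(x, y_ind)
print(f"I(X;Y), dependent pair:   {mi_dep:.3f} nats")
print(f"I(X;Y), independent pair: {mi_ind:.3f} nats")
```

A binned estimate depends on the bin count and carries a small positive bias, so the independent pair gives a value close to, but not exactly, zero.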

Nonlinear dependence measure
- $I(X;Y)$ ranges from 0 to ∞; $r^2_{XY}$ from 0 to 1
- For a bivariate normal, $I_{XY} = -\tfrac{1}{2} \ln(1 - r^2_{XY})$
- Granger defined $\lambda = 1 - \exp(-2 I_{XY})$
  - This quantity acts like a “nonlinear correlation” measure that ranges from 0 to 1
- Can be extended to the multivariate case

Nonlinear dependence: Space-time
- Nonlinear dependence measure
  - Some applications in time series
  - Almost nothing for spatial statistics
  - Nothing for spatio-temporal
- Significance
  - Linear is only one of several types
  - Nonlinear is general
  - Linear may not predominate in all situations

Bounds on Predictability
- MSE from linear regression
  - $\mathrm{MSE} = (1 - r^2_{XY})\,\sigma^2_Y$
- Minimum MSE from nonlinear methods
  - Maximum bound on predictability
  - Theorem: $E\{[Y - g(X)]^2\} \ge \frac{1}{2\pi e} \exp\{2(H_Y - I_{XY})\}$
  - The left-hand side is the MSE; the right-hand side is its lower bound
  - $g(X)$ is the best possible function of X that explains or predicts Y

Simulated System: Lorenz Equations

Dynamical Time Series: Simple nonlinear system
- σ = 10, r = 28, b = 8/3
- X(0) = 1.1, Y(0) = 5.0, Z(0) = 1.1
- Courtesy: MathWorld

Lorenz Equations
- Lorenz Y vs. Lorenz X
  - r = 0.8606
  - Linear MSE (theoretical): 22.5026
  - Linear MSE (validation): 21.3181
  - λ = 0.9192
  - Nonlinear MSE (theoretical bound): 9.6031
  - Nonlinear MSE (validation with ANN): 21.2995
- Impact of noise and seasonality
  - Lower r and λ imply higher MSE

Autocorrelation: Lorenz X
[Figure: cross-correlation function r(lag) and cross-dependence function λ(lag)]

Cross-correlation: X vs. Y and X vs. Z
[Figure: panels for X vs. Y and X vs. Z; cross-correlation function r(lag) and cross-dependence function λ(lag)]
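
The simulated system and its linear measures can be sketched in a few lines. This sketch assumes the standard Lorenz form dx/dt = σ(y − x), dy/dt = x(r − z) − y, dz/dt = xy − bz, with the parameter values and initial conditions quoted on the slide. The fourth-order Runge-Kutta integrator, the step size, the series length, and all function names are illustrative assumptions, so the printed numbers are not expected to reproduce the values reported above.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # parameter values quoted on the slide


def lorenz(state):
    """Right-hand side of the Lorenz equations (standard form, assumed)."""
    x, y, z = state
    return (SIGMA * (y - x), x * (R - z) - y, x * y - B * z)


def rk4_step(state, dt):
    """One classical Runge-Kutta step (integrator choice is an assumption)."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = lorenz(state)
    k2 = lorenz(shifted(k1, 0.5 * dt))
    k3 = lorenz(shifted(k2, 0.5 * dt))
    k4 = lorenz(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(n_steps, dt=0.01, start=(1.1, 5.0, 1.1)):
    state, path = start, []
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        path.append(state)
    return path


def mean(v):
    return sum(v) / len(v)


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def cross_correlation(a, b, max_lag):
    """r(lag) between a(t) and b(t + lag) for lag = 0..max_lag."""
    return [pearson(a[:len(a) - k], b[k:]) for k in range(max_lag + 1)]


path = simulate(5000)
xs = [p[0] for p in path]
ys = [p[1] for p in path]
mx, my = mean(xs), mean(ys)
var_x = sum((v - mx) ** 2 for v in xs) / len(xs)
var_y = sum((v - my) ** 2 for v in ys) / len(ys)

r0 = pearson(xs, ys)
linear_mse_formula = (1.0 - r0 ** 2) * var_y  # MSE = (1 - r^2) * sigma_Y^2

# Cross-check: in-sample MSE of the least-squares line Y ~ a + b X.
slope = r0 * math.sqrt(var_y / var_x)
intercept = my - slope * mx
mse_fit = mean([(y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)])

# If X and Y were bivariate normal, I = -0.5 ln(1 - r^2), so lambda = r^2.
lam_gauss = 1.0 - math.exp(-2.0 * (-0.5 * math.log(1.0 - r0 ** 2)))

ccf = cross_correlation(xs, ys, 50)
print(f"r = {r0:.4f}, linear MSE = {linear_mse_formula:.4f}")
print(f"lambda under a Gaussian assumption = {lam_gauss:.4f}")
```

Under a bivariate normal assumption λ reduces to r², so a measured λ well above r², as on the slide, points to dependence that a linear fit does not capture.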

Lagged Cross-correlation: X vs. Y
[Figure: panels for X′(t) = X(t+10) vs. Y and X′(t) = X(t+25) vs. Y; cross-correlation function r(lag) and cross-dependence function λ(lag)]

Cross-correlation with Noise: X vs. Z
[Figure: panels for X vs. Z′ = Z + N(0,1) and X vs. Z′ = Z + N(0,5); cross-correlation function r(lag) and cross-dependence function λ(lag)]

Real Systems
1. Time Series (Linear & Nonlinear)
   a. Hydro-climatology
2. Spatial (Linear)
   a. Wind velocity
   b. High-resolution population
   c. Wind velocity (potentially, spatio-temporal)

Hydro-climatology
- El Niño Southern Oscillation Index
- Variability in river flows around the world

[Figure: correlation coefficient vs. year, linear (l) and nonlinear (nl) curves, with panels for the Ganges, Amazon, Parana, Nile and Congo; a further panel compares Et & ENSO, Nile & ENSO, Et & Nile, and Et, ENSO & Nile]
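
The linear and nonlinear curves compared in this part of the talk correspond to a cross-correlation function r(lag) and a cross-dependence function λ(lag), with λ = 1 − exp(−2I). The following sketch computes both on synthetic data, assuming a binned mutual-information estimate. The data, the three-step lag, the bin count, and the function names are hypothetical and are not taken from the ENSO or river-flow series.

```python
import math
import random
from collections import Counter


def bin_labels(values, n_bins):
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(x, y, n_bins=8):
    bx, by = bin_labels(x, n_bins), bin_labels(y, n_bins)
    return entropy(bx) + entropy(by) - entropy(list(zip(bx, by)))


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def dependence_functions(x, y, max_lag):
    """Return r(lag) and lambda(lag) between x(t) and y(t + lag)."""
    r, lam = [], []
    for lag in range(max_lag + 1):
        a, b = x[:len(x) - lag], y[lag:]
        r.append(pearson(a, b))
        lam.append(1.0 - math.exp(-2.0 * mutual_information(a, b)))
    return r, lam


# Hypothetical driver x and response y: y follows x**2 three steps later,
# a symmetric relation that linear correlation cannot detect.
random.seed(7)
n = 4000
x = [random.uniform(-1.0, 1.0) for _ in range(n)]
y = [(x[t - 3] ** 2 if t >= 3 else 0.0) + 0.05 * random.gauss(0.0, 1.0)
     for t in range(n)]

r, lam = dependence_functions(x, y, 6)
for lag, (rv, lv) in enumerate(zip(r, lam)):
    print(f"lag {lag}: r = {rv:+.3f}, lambda = {lv:.3f}")
```

In this constructed case r stays near zero at every lag while λ peaks at lag 3, which illustrates how a nonlinear measure can reveal a lagged relationship that the linear one misses.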

High-resolution Population
- “LandScan”: Developed at ORNL, used globally by mapping agencies and for disaster management
- 30 arc seconds for global; 3 arc seconds for USA
- Census counts allocated to higher resolutions: re-distribution model
- Input variables like proximity to roads, slopes, night-time lights, land cover, etc.
- Correlations can be directional

Directional Spatial Correlations: Autocorrelation, Aggregated LandScan USA
[Figure: directional correlation panels numbered 1 to 9; 9: identical]

Directional Spatial Correlations: Cross-correlation, LandScan USA vs. Global
[Figure: directional correlation panels numbered 1 to 9; 9: identical]

Directional Spatial Correlations: Cross-correlation, LandScan Global vs. Lights
[Figure: directional correlation panels numbered 1 to 9; 9: identical]

Wind Velocity
- U and V components
- 1013 millibars
- Entire globe, 1 degree lat-long coverage
- Note: Projections do not consider the spherical nature of the data
- Correlations can be directional

Spatial auto-correlation: U
[Figure]

Spatial auto-correlation: V
[Figure]

Spatial cross-correlation: U & V
[Figure]

Extremal Dependence
- Emerging Literature
- Ongoing Work
- Applications

Statistics of Extremes: State of the Art
- Law of large numbers and probability model
  - Regular values ~ CLT ~ Normal distribution
  - Extreme values (rate of occurrence) ~ Poisson distribution
  - Severity ~ Generalized Pareto (GP) distribution
  - Where occurrences are rare, probability models help to model and predict
- Time series extremes
  - Declustering to identify extremes / events: last decade
  - Probability models (probability of exceedance and probability given exceedance): 2000 to 2004
  - ACF-like time-lagged univariate dependence: 2003
  - Multivariate normalizations: 2004

Extremal Dependence: Multiple Time Series
- Which set of conditions in indicator variables produces extremes in a time series? $\Pr\{Y > u \mid X = x\}$
  - Poisson for occurrence of high threshold
    - Poisson parameter: $P(\lambda)$; $\lambda = f(x, s, t)$; s: space; t: time
  - Generalized Pareto distribution for $\Pr\{Y > u\}$
    - GP parameters: $GP(\theta)$; $\theta = f(x, s, t)$
- Can we develop a new measure for quantifying the dependence in extremes? $\Pr\{Y(t+\tau) > u \mid X(t) > v\}$
  - Autocorrelation-like measures exist for single series (2003)
  - Multivariate extremal transformations exist (2004)
  - Develop cross-correlation measures for multiple series

A Method
- Dependent variable, Y
  - Regional precipitation
  - 2D non-homogeneous point process
  - Fit Poisson, $P(\lambda)$, for occurrence of extremes
  - Fit Generalized Pareto, $GP(\theta)$, for extremes
- Independent variable, X
  - Ocean temperature
  - Express $\theta = \theta(x, s, t)$ and $\lambda = \lambda(x, s, t)$
  - Find {x} which trigger {Y > u} for any s, t
  - $\Pr\{Y > u \mid X = x\}$ as a function of (s, t)
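
The two fitting steps named above, a Poisson rate for the occurrence of exceedances and a Generalized Pareto distribution for their size, can be sketched for a single stationary series. This sketch makes several simplifying assumptions: the parameters are constants rather than functions of (x, s, t), no declustering is applied, and the GP is fitted by the method of moments, which is swapped in here for brevity and is not stated in the slides. The data, threshold, and function names are hypothetical.

```python
import math
import random


def fit_peaks_over_threshold(series, u, n_periods):
    """Return (Poisson rate per period, GP shape xi, GP scale).

    Method-of-moments GP fit (valid only when xi < 1/2), using
    mean = scale / (1 - xi) and var = scale^2 / ((1 - xi)^2 (1 - 2 xi)).
    """
    excess = [y - u for y in series if y > u]
    rate = len(excess) / n_periods
    m = sum(excess) / len(excess)
    var = sum((e - m) ** 2 for e in excess) / (len(excess) - 1)
    ratio = m * m / var
    xi = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return rate, xi, scale


def gp_exceedance_prob(z, xi, scale):
    """Pr{Y - u > z | Y > u} under the fitted Generalized Pareto."""
    if abs(xi) < 1e-9:
        return math.exp(-z / scale)  # exponential limit as xi -> 0
    t = 1.0 + xi * z / scale
    return t ** (-1.0 / xi) if t > 0.0 else 0.0


# Hypothetical daily series with an exponential tail (true xi = 0, scale = 1).
random.seed(11)
daily = [random.expovariate(1.0) for _ in range(20000)]
u = 2.0
n_years = len(daily) / 365.0

rate, xi, scale = fit_peaks_over_threshold(daily, u, n_years)
p = gp_exceedance_prob(1.0, xi, scale)
print(f"exceedances per year: {rate:.1f}")
print(f"GP shape xi = {xi:.3f}, scale = {scale:.3f}")
print(f"Pr(Y > u + 1 | Y > u) = {p:.3f}")
```

Letting the rate and the GP parameters depend on a covariate, as the slide proposes, would replace these constants with regression functions and would typically call for a likelihood-based fit.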

A Measure
- Time-lagged extremal associations
  - New method gives $\Pr\{Y > u \mid X = x\}$
  - Find Y > u given X > v
  - Develop CCF-like measure
- Ledford and Tawn (2003): Extremal ACF: $\Pr\{Y(t+\tau) > u \mid Y(t) > u\}$
- Heffernan & Tawn (2004): Multivariate extremal transforms: $\Pr\{Y(t+\tau) > u \mid X(t) > v\}$
- CCF-like measure
  - Time-lagged multivariate extremal dependence
  - First step to space-time extremal dependence

Applications: Climate Extremes
- Abrupt Changes in the Paleo Climate
- Science: August, 2004
- Heat Waves in the 21st century
  - Higher: intensity, frequency, duration
- Science: March, 2003
- Sudden regional change in past climate
  - 1920s
  - North latitudes
- NRC (2002): Abrupt Climate Change Panel
  - “Current use of statistics needs to be re-examined, as one cannot treat abrupt climate change in the same manner as one would treat the occurrence of a 100-year floods”

Next Steps
- Regular Dependence
  - Linear correlation for space-time
  - Nonlinear dependence in space and space-time
- Extremal dependence
  - Multivariate extremal dependence measures
  - Space and space-time
  - Linear versus nonlinear?
- Real applications

Thank you!
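
The CCF-like measure described under “A Measure”, Pr{Y(t+τ) > u | X(t) > v}, has a direct empirical counterpart: the fraction of times X exceeds v that are followed, τ steps later, by Y exceeding u. The following is a minimal sketch of that counting estimate on synthetic data. The series, the 95% thresholds, the two-step lag, and the function names are hypothetical, and the sketch does not implement the Ledford and Tawn or Heffernan and Tawn models.

```python
import random


def quantile(values, q):
    """Simple empirical quantile (no interpolation)."""
    s = sorted(values)
    return s[min(int(q * len(s)), len(s) - 1)]


def extremal_ccf(x, y, v, u, max_lag):
    """Empirical Pr{Y(t + lag) > u | X(t) > v} for lag = 0..max_lag."""
    out = []
    for lag in range(max_lag + 1):
        pairs = zip(x[:len(x) - lag], y[lag:])
        followers = [b for a, b in pairs if a > v]  # Y values after X > v
        if followers:
            out.append(sum(1 for b in followers if b > u) / len(followers))
        else:
            out.append(float("nan"))
    return out


# Hypothetical data: y tracks x with a delay of two steps plus noise.
random.seed(3)
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [(x[t - 2] if t >= 2 else 0.0) + 0.3 * random.gauss(0.0, 1.0)
     for t in range(n)]

v = quantile(x, 0.95)
u = quantile(y, 0.95)
chi = extremal_ccf(x, y, v, u, 5)
for lag, c in enumerate(chi):
    print(f"lag {lag}: Pr(Y(t+lag) > u | X(t) > v) = {c:.3f}")
```

At lags with no extremal association the estimate sits near the unconditional exceedance probability (about 0.05 here), so values well above that baseline mark the lags at which extremes in X tend to precede extremes in Y.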
