Probability and Mathematical Statistics (2008-2009)
Handout #1: Introduction

Jiandong Wang
jiandong@coe.pku.edu.cn
http://www.mech.pku.edu.cn/robot/teacher/wangjiandong/index.htm
Dept. of Industrial Engineering and Management
College of Engineering, Peking University

Outline
• What is probability?
• What is mathematical statistics?
• Why study probability and mathematical statistics?
• Applications of probability and mathematical statistics
• Course information

Learning focus:
• Understand roughly what the course covers
• Think about why this course is worth studying

What is probability?
• Webster's dictionary defines probability as "the chance that a given event will occur". In other words, probability is a numerical measure of how likely a given situation (event) is to occur.
• Example 1: The probability that it will rain the next day. Saying that there is a 30% chance of rain means we believe that, if identical conditions prevail, rain will occur the next day 3 times out of 10.
• Example 2: The toss of a fair die. The chance of one dot is 1/6.

• Experiment: the toss of a fair die
• Outcome (sample point): s = two dots
• Sample space: the set of all possible outcomes, denoted S.
  S = {one dot, two dots, ..., six dots}
• Event: a subset of the sample space
  E = {one dot, two dots, three dots}
• Probability: a number that measures how likely an event is
  – Used to quantify likelihood or chance
  – Used to represent risk or uncertainty in engineering applications
  Probability of one dot = 1/6
  Probability of E = 1/2
• Random variable: a function (mapping) that assigns a real number to each outcome.
  x(one dot) = −1, x(two dots) = −2, ..., x(six dots) = 3
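The fair-die example above (sample space, event, probability, random variable) can be written out in a few lines of Python. This is only a minimal sketch: the names `sample_space`, `probability`, `event_e` and `x` are chosen here for illustration, outcomes are labelled by their dot count, and the random variable is assumed to be the simple mapping x(k dots) = k.

```python
from fractions import Fraction

# Sample space S of one toss of a fair die (outcomes labelled by dot count).
sample_space = [1, 2, 3, 4, 5, 6]

# Fair die: every outcome is equally likely.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}


def probability(event):
    """Probability of an event, i.e. of a subset of the sample space."""
    return sum(prob[outcome] for outcome in event)


# Event E = {one dot, two dots, three dots}.
event_e = {1, 2, 3}


def x(outcome):
    """Random variable: maps each outcome to a real number (assumed x(k dots) = k)."""
    return float(outcome)


print(probability({1}))      # 1/6
print(probability(event_e))  # 1/2
```

Exact fractions are used instead of floating-point numbers so that the results match the hand-computed values 1/6 and 1/2 exactly.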
• Graphical illustration of random variables
  Outcomes: ζ1, ζ2, ..., ζ6
  Sample space: S = {ζ1, ζ2, ..., ζ6}
  x(ζ1) = 1, x(ζ2) = 2, ..., x(ζ6) = 6

• How to describe a random variable?
  – Description using probability functions
    For a discrete random variable (RV), the probability that the RV x(ζ) takes a value equal to x_k is termed the probability mass function (PMF), also called the probability distribution:
    $p(x_k) := \Pr\{x(\zeta) = x_k\}$
    Example 2: The toss of a fair die.
    $p(x_1) := \Pr\{x(\zeta) = x_1 = 1\} = 1/6$
  – Description using first- and second-order statistical averages
    Mean: the average, where $p(x_k)$ is the PMF
    $\mu_x = \sum_k x_k \, p(x_k)$
    Variance: a measure of the spread (variability) around the mean
    $\sigma_x^2 = \sum_k (x_k - \mu_x)^2 \, p(x_k)$
    Example 2: The toss of a fair die.
    $\mu_x = 3.5$, $\sigma_x^2 = 2.9167$

• Example 3: The duration (travel time) from your apartment to the library
  Owing to various unpredictable factors, the duration is a random variable, say, taking values in [5, 20] minutes. If the duration was recorded for the last 7 days, are the 7 numbers exactly the same? Based on these numbers, can we say something about tomorrow's duration? For instance, does the average value of the 7 numbers provide information?
  [Figure labels: A, B]

What is mathematical statistics?
• In most practical applications, the statistical properties (probability functions and statistical averages) of random variables are unknown. These properties must therefore be obtained by collecting and analyzing finite sets of measurements; in other words, they have to be estimated from data.
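To make the mean and variance of the fair die concrete, and to illustrate what "estimated from data" can look like, here is a small Python sketch. It assumes the identity mapping x(k dots) = k; the simulated rolls, the seed and the sample size are arbitrary choices made for illustration and are not data from the course.

```python
from fractions import Fraction
import random

# PMF of a fair die: p(x_k) = 1/6 for x_k = 1, ..., 6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

# Mean: sum over k of x_k * p(x_k).
mean = sum(xk * p for xk, p in pmf.items())

# Variance: sum over k of (x_k - mean)^2 * p(x_k).
variance = sum((xk - mean) ** 2 * p for xk, p in pmf.items())

print(mean, float(mean))                    # 7/2 3.5
print(variance, round(float(variance), 4))  # 35/12 2.9167

# When the PMF is unknown, the mean has to be estimated from observed data.
# Here the "data" are simulated rolls (seed and sample size are arbitrary).
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(10_000)]
sample_mean = sum(rolls) / len(rolls)
print(sample_mean)  # close to 3.5; the exact value depends on the seed
```

The first two results are exact properties of the PMF, while the last one is an estimate that changes from one data set to the next.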
  Mean: $\mu_x = \sum_k x_k \, p(x_k)$
  Sample mean (estimated from data): $\hat{\mu}_x = \frac{1}{N} \sum_{n=0}^{N-1} x(n)$
• Mathematical statistics is the development and application of methods to collect, analyze and interpret data (data are realizations of random variables or random processes). That is, mathematical statistics uses the methods of probability theory and mathematics to study how to collect data (obtained through experiments or observation) and how to analyze them in order to draw conclusions about the problem under study. Probability theory is the foundation of mathematical statistics, and mathematical statistics is an important application of probability theory.

Why study probability & mathematical statistics?
• If all past, present and future values of a variable are known precisely, without any uncertainty, the variable is called deterministic.
• However, in most practical situations, we cannot predict variables exactly. Such variables are called stochastic or random. The fundamental characteristic of a stochastic or random variable is that its values cannot be specified precisely beforehand. Although random variables evolve in an imprecise manner, their average properties exhibit considerable regularity.
• Random variables and random processes appear in various areas, including engineering (temperature, level, pressure), natural sciences (weather forecasting), economics (stock market), social sciences (population), medicine (electrocardiogram), etc.
• Applications of probability and mathematical statistics in the field of engineering
  – Statistical quality control
  – Detection of plant-wide oscillation
  – Noise suppression in music
  – CO soft sensor

Applications of probability & mathematical statistics
• Statistical quality control
  – Visually inspecting data to improve product quality
    Average vs. mean
    Slot depth was measured on three product parts selected from production every half hour during the first shift from 6 AM to 3 PM.
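The sampling scheme just described (three parts per half-hour sample) lends itself to a simple check of subgroup means against control limits. The Python sketch below is only an illustration: the central line and limits are the values quoted with the handout's control chart, while the slot-depth readings, the dictionary `subgroups` and the function `out_of_control` are hypothetical and invented here, since the handout's raw data are not reproduced.

```python
from statistics import mean

# Control limits quoted with the handout's control chart.
CENTRAL_LINE = 217.5
LOWER_LIMIT = 215.0
UPPER_LIMIT = 220.0

# HYPOTHETICAL slot-depth readings: three parts per half-hour sample.
# These numbers are invented for illustration; they are not the handout's data.
subgroups = {
    "06:00": [217.2, 217.9, 217.4],
    "06:30": [216.8, 217.6, 218.1],
    "07:00": [220.4, 220.9, 219.9],  # deliberately shifted upward
    "07:30": [217.0, 217.5, 218.0],
}


def out_of_control(samples):
    """Return the sample times whose subgroup mean lies outside the limits."""
    flagged = []
    for time, depths in samples.items():
        subgroup_mean = mean(depths)
        if not (LOWER_LIMIT <= subgroup_mean <= UPPER_LIMIT):
            flagged.append(time)
    return flagged


print(out_of_control(subgroups))  # ['07:00']
```

Averaging each subgroup before comparing with the limits reduces the influence of part-to-part scatter, so a flagged sample points to a shift in the process rather than to a single unusual part.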
    [Figure: control chart; labels: mean, variance]
    Central line: 217.5, lower control limit: 215.0, upper control limit: 220.0
  – The information contained in production data is analyzed to monitor in real time whether production is running normally; one example is the widely used 6-Sigma management system.
  – The newest effective tools are latent variable modeling techniques (e.g., PCA, PLS, CCA), which use subspace models in place of the statistical characteristics (e.g., variance) of one- or two-dimensional data to distinguish normal from abnormal operating conditions. This is the subject of multivariate statistical analysis.
• Detection of plant-wide oscillation
  [Figure: panels "Normal" and "Oscillating"]
  – A single oscillation may cause the whole plant to oscillate.
  – Judging whether an oscillation exists from time-trend plots is time-consuming and laborious, but the judgment is comparatively easy and fast to make from the autocovariance and the spectrum.
  [Figure: autocovariance and power spectrum]
• Noise suppression in music
  A short sound clip from the "Hallelujah" chorus (let us listen!)
  [Figure: "Hallelujah" chorus, signal of interest v(n), amplitude vs. time [sec]]
  [Figure: "Hallelujah" chorus with FM noise, measured signal, amplitude vs. time [sec]]
  See PS_Handout1_Hallelujah.m
  The noise is nearly periodic, so it can be predicted from its past samples.
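The idea that nearly periodic noise can be predicted from its past samples can be sketched with a small adaptive predictor. The Python code below is an illustrative sketch only and is not the content of PS_Handout1_Hallelujah.m: it assumes a normalized LMS filter fed with a delayed copy of the measured signal, and it uses hypothetical stand-ins (white noise for the signal of interest, a pure sinusoid for the noise). The function name `nlms_noise_predictor` and all parameter values are assumptions made here.

```python
import math
import random


def nlms_noise_predictor(measured, delay=1, taps=8, step=0.05, eps=1e-8):
    """Predict the periodic part of `measured` from its delayed past samples.

    Returns the error signal e(n) = measured(n) - prediction(n), i.e. what
    remains after the predictable (periodic) component has been removed.
    """
    weights = [0.0] * taps
    error = []
    for n in range(len(measured)):
        # Delayed input: measured(n-delay), ..., measured(n-delay-taps+1).
        u = [measured[n - delay - i] if n - delay - i >= 0 else 0.0
             for i in range(taps)]
        prediction = sum(w * ui for w, ui in zip(weights, u))
        e = measured[n] - prediction
        # Normalized LMS update of the filter weights.
        norm = eps + sum(ui * ui for ui in u)
        weights = [w + step * e * ui / norm for w, ui in zip(weights, u)]
        error.append(e)
    return error


def power(samples):
    """Average squared amplitude of a list of samples."""
    return sum(v * v for v in samples) / len(samples)


# HYPOTHETICAL stand-ins: white noise plays the role of the signal of
# interest, a sinusoid plays the role of the nearly periodic noise.
rng = random.Random(0)
n_samples = 5000
signal = [rng.gauss(0.0, 0.1) for _ in range(n_samples)]
noise = [math.sin(2 * math.pi * 0.05 * n) for n in range(n_samples)]
measured = [s + w for s, w in zip(signal, noise)]

error = nlms_noise_predictor(measured)

print(power(measured[-1000:]))  # dominated by the periodic noise
print(power(error[-1000:]))     # much smaller once the filter has adapted
```

The delay matters because the filter should only be able to predict the component that stays correlated with its own past, which is the periodic noise. Real music is itself correlated over short time spans, so a longer delay than the one-sample value assumed here would likely be needed in practice.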
  Mean square = variance + (mean)^2
  [Figure labels: delay, adaptive filter]
  [Figure: measured signal, amplitude vs. time [sec]]
  [Figure: error signal e(n), amplitude vs. time [sec]]
  See PS_Handout1_Hallelujah.m
• CO soft sensor at Celanese Canada Inc.
  – Five boilers share only one CO analyzer. What can be done?
  – Build a mathematical model that estimates (predicts) the CO value from past information and from other variables. This is an application of regression analysis.

Course outline
1. Introduction
2. Basic probability
3. Random variables & probability distributions
4. Statistical averages of random variables
5. Parameter estimation
6. Hypothesis test
7. Advanced topics: random processes
8. Course project

Probability:
• Basic probability
• Discrete RVs and their expected values
• Continuous RVs and their expected values
Mathematical statistics:
• Parameter estimation
• Hypothesis test
Advanced topics, e.g. random processes

Course books
• 10 handouts (available online at the course webpage; mainly in English, with Chinese annotations)
• Blackboard notes (please take notes)
• Textbook: 《概率论与数量统计》, 何书元, 高等教育出版社, 2006 (available at bookstores, 13.3 yuan)
• References (* major reference):
  – *Intuitive Probability and Random Processes Using MATLAB, S.M. Kay, 2005 (available at Tsinghua's library (O211 FK23, Foreign-Language Books Reading Room); an e-copy is available at Prof. Steven M. Kay's webpage: http://www.ele.uri.edu/faculty/kay.html)
  – Probability and Statistics for Engineers and Scientists, R.E. Walpole, R.H. Myers, S.L. Myers, K. Ye, 7th edition, 2002.
    (available at bookstores (69.8 yuan) and at the library (O21/P94.7/2004, Central Library))
  – Probability and Statistics for Engineering and the Sciences, J.L. Devore, 5th edition, 2000. (available at bookstores (58 yuan) and at the library (TB114/D498.5, Central Library))
  – Probability, Random Variables, and Random Processes, H. Hsu, 1997. (available at the library (O21-44/H859, Central Library))
  – Statistics Toolbox 6: User's Guide, MathWorks, March 2007 (downloadable at www.mathworks.com/support/)

Course grading
• Assignments 35%
  – TA: to be announced
  – Office hour: Tuesday 2-4 pm, Room 206, 方正大厦 (Founder Building)
• Attendance 10%
  – Quiz in the lecture
• Course project 20% (written report + oral presentation)
• Final exam 35%
  – Problems will be similar to the assignments and the examples in the handouts.
• Unless there is an acceptable excuse, late assignments and late course project reports are penalized as follows:
  – Late by 24 hours or less: the grade is reduced by 25%
  – Late by more than 24 hours: a zero grade is assigned

Course project
• Students conduct a course project based on the material covered in the course and present it both orally and in written form.
• Topics are open to your own interests but must be related to the course material. Some possible topics will be given in later lectures. For instance, you may select a topic not covered in lectures, study the related concepts, and apply the knowledge you have learned to some examples (by simulation).
• Write a report limited to 6 pages. The report should follow a format similar to a journal article. Typically, the report is organized as follows:
  – Abstract
  – Introduction
  – Concepts/theories/examples
  – Conclusion
  – References
• Give an oral presentation, which will be evaluated by your classmates.

Course survey (anonymous)
1.
Student status (3rd or 4th year, major):
2. If the course were not required, would you still like to take it? If yes, why?
3. Have you previously studied courses or materials related to probability, mathematical statistics or random processes?
4. Have you used Matlab before? Do you have access to Matlab for hands-on computer-based exercises?
5. Suggestions or questions?