PROFESSOR: OK, so good afternoon. Today it will be the law of large numbers and the central limit theorem, and a big part of it will be review for you. So first of all, just to agree on terminology, let's review some definitions. You should be familiar with this, but I wrote it down just so that we agree on the notation.

So let's say you have a random variable x. A discrete random variable is given by a probability mass function, which I will denote f sub x-- but when it's clear which random variable we're talking about, I'll just say f. So what is this? For example, our probability mass function could be f(1) equals f(-1) equals 1/3, just like that. And the expectation, our mean: the expectation of x is equal to the sum over all x of x times f(x).

A continuous random variable is given by a probability density function-- a function from the sample space to the non-negative reals, but now the integration over the domain has to be 1. Is this a sensible definition? And the expectation of y is the corresponding integral over the domain.

For independence, I will talk about independence of several random variables as well. So let's see-- for the example of three random variables, it might be the case that each pair is independent. But pairwise only means x1 and x2 are independent, x1 and x3 are independent, and x2 and x3 are independent; x1, x2, and x3 together may not be independent. So each pair is independent, but altogether, it's not independent. Yes?

AUDIENCE: The notion of independent random variables-- you went over how the probability density functions of collections of random variables, if they're mutually independent, is the product of the probability densities of the individual variables. [INAUDIBLE]

PROFESSOR: I didn't get it.

AUDIENCE: [INAUDIBLE]

PROFESSOR: Ah. So when we say that several random variables are independent, it just means: whatever collection you take, they're all independent. Do you want me to add some explanation? OK. Maybe-- yeah. We will mostly just consider mutually independent events. Other questions?
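To make the pairwise-versus-mutual distinction concrete, here is a minimal sketch in Python-- my own illustration, not an example from the lecture board-- using the classic construction X3 = X1 * X2, where every pair of the three variables is independent but the triple is not:

```python
from itertools import product

# X1, X2 are independent fair +/-1 coins; X3 = X1 * X2.
outcomes = [(x1, x2, x1 * x2) for x1, x2 in product((-1, 1), repeat=2)]
p = 1 / len(outcomes)  # each of the 4 outcomes has probability 1/4

def prob(event):
    return sum(p for w in outcomes if event(w))

# Every PAIR is independent: P(Xi=a, Xj=b) = P(Xi=a) * P(Xj=b).
for i, j in ((0, 1), (0, 2), (1, 2)):
    for a, b in product((-1, 1), repeat=2):
        joint = prob(lambda w: w[i] == a and w[j] == b)
        marg = prob(lambda w: w[i] == a) * prob(lambda w: w[j] == b)
        assert joint == marg

# But the triple fails: P(X1=1, X2=1, X3=1) = 1/4, not (1/2)^3 = 1/8.
print(prob(lambda w: w == (1, 1, 1)))  # 0.25
```

Knowing X1 and X2 determines X3 completely, so the three variables cannot be mutually independent, even though every pair checks out.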
And one of the most universal random variables, or distributions, is the normal distribution. It looks like this if it's N(0, 1), let's say: it's centered around the origin, and it's symmetrical about the origin. But there are some other distributions that you'll also see-- a few more things.

So now let's look at our purpose. We want to model a financial product or a stock-- the price of the stock-- using some random variable. What I'm trying to say here is that the normal distribution is not good enough for this. There may be several reasons, but one reason is that it doesn't take into account the order of magnitude of the price itself. Say the price was $10 here, and $50 here. If you look at a very small scale, it might be OK, because the base price doesn't change that much-- whether you model it in terms of the ratio or in an absolute way, it doesn't matter that much, and the probability distribution is very similar. But if you want to do it at a little bit more like our scale, then that's not a very good choice. A normal random variable will take negative values and positive values, so if you just take this model, what's going to happen over a long period of time is that the price is going to hit this square root of n, negative square root of n line infinitely often, and then it can go up to infinity, or it can go down to infinity eventually. That's just totally nonsense for a price. So be careful. One thing I should mention: in this model, if each increment is normally distributed, then the price at day n will still be a normal random variable, distributed like that-- but that's only the case for normal random variables.

Instead, we want the percentage change to be normally distributed: what's normally distributed is the percentage of how much the price changes daily. That is the percent. So let's try to fit this into the story. I want x to be the log normal distribution. Take y, normally distributed with mean mu and variance sigma squared-- that's your y-- and let x be the random variable whose logarithm is distributed like y. All positive values.

So using this formula, we can find the probability density function of the log normal distribution using the probability distribution of the normal. By the change of variable formula, the probability density function of x is equal to the probability density function of y at log x, times the derivative of log x, which is 1 over x. So we get 1 over x sigma square root of 2 pi, times e to the minus (log x minus mu) squared over 2 sigma squared. So the log normal distribution can also be defined as the distribution which has this probability density function.

Because log x, as a normal random variable, had mean mu, log x is centered at mu-- but when you take the exponential, it becomes skewed. Is it centered at e to the mu? Be careful: it's no longer centered there, and e to the mu does not give the mean.
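As a numerical sanity check of that change-of-variables step, here is a short sketch (assuming numpy; the values mu = 0, sigma = 0.5 are my own choice, not the lecture's). It compares the derived density against a histogram of e to the y for normal samples y:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 0.5

def lognormal_pdf(x, mu, sigma):
    # Change of variables: f_X(x) = f_Y(log x) * d(log x)/dx = f_Y(log x) / x,
    # where Y = log X is normal with mean mu and variance sigma^2.
    return (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma**2))
            / (x * sigma * np.sqrt(2 * np.pi)))

# Histogram of X = e^Y should match the formula above.
samples = np.exp(rng.normal(mu, sigma, size=500_000))
density, edges = np.histogram(samples, bins=80, range=(0.05, 5.0), density=True)
centers = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(density - lognormal_pdf(centers, mu, sigma))))  # small (noise)
```

The printed discrepancy is just sampling and binning noise; the exponentiated normal samples do follow the 1 over x sigma square root 2 pi density.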
And all of these-- normal, log normal, Poisson, exponential, and a lot more-- can be grouped into a family of distributions called the exponential family. A distribution belongs to the exponential family if there exists a theta, a vector that parametrizes the distribution, such that the probability density function for this choice of parameter theta can be written as h of x, times c of theta, times the exponential of the sum from i equals 1 to k of w_i of theta times t_i of x. So h of x and the t_i of x depend only on x, and c of theta and the w_i of theta depend only on theta. For the log normal, you will parametrize this family in terms of mu and sigma: you can let w1 of x be log x squared-- no, t1 of x be (log x) squared, and w1 of theta be minus 1 over 2 sigma squared, and so on. And with this exponential family, if you have random variables from the same exponential family, products of the density functions factor out into a very simple form. It factors out well. But that's just some technical detail. So that's all about distributions that I want to talk about.

So let's start with our first topic-- the moment-generating function. I will talk about the moment-generating function a little bit. For a random variable x, the moment-generating function at a point t is the expectation of e to the t x. You may consider t as a fixed number; for now, just consider it as ranging over the real numbers.

Before going further, first of all, why is it called the moment-generating function? It's because if you take the k-th derivative of this function at 0, it actually gives the k-th moment of your random variable. So the moment-generating function is defined in terms of the moments: because of that, we may write it as the sum from k equals 0 to infinity of t to the k over k factorial, times the k-th moment. That's like the Taylor expansion. So this moment-generating function encodes all the k-th moments of a random variable-- it contains all the statistical information of a random variable, because if you know all the derivatives, you know what the function would be. Some very interesting facts arise from this fact.

One warning: the expectation doesn't always converge, so the moment-generating function doesn't always exist. The log normal distribution does not have any moment-generating function-- for all non-zero t, it does not converge. That's why, for the log normal, the moment-generating function won't be interesting to us.

And just remember that even if two random variables have the same moments, they don't necessarily have the same distribution: there are moment sequences a1, a2, a3, and so on for which this does not hold. When I first saw it, I thought it was really interesting. You will see something about this.

On the other hand, if two random variables have the same moments, they have the same moment-generating function, and if the moment-generating functions exist, they pretty much classify your random variables. There is also a second theorem: if you have a sequence of random variables whose moment-generating functions, at each point t, converge to the value of the moment-generating function of some other random variable x, then their distributions converge to the distribution of x-- pointwise convergence of the moment-generating functions implies convergence of the distributions. So remember that theorem; this part is well known. As you can see from these two theorems, the moment-generating function, if it exists, is a really powerful tool that allows you to control the distribution. Any questions?
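As a quick check of the claim that the k-th derivative at 0 gives the k-th moment, here is a sketch using sympy-- my choice of tool, and the exponential distribution with rate 3 is my example, not one from the lecture. Its moment-generating function lam / (lam - t) and its moments k! / lam^k are standard facts:

```python
import sympy as sp

t = sp.symbols('t')
lam = 3  # rate of an Exponential(lam) random variable (assumed example)

# MGF of Exponential(lam): M(t) = lam / (lam - t), defined for t < lam.
M = lam / (lam - t)

# The k-th derivative at t = 0 should recover the k-th moment, k! / lam^k.
for k in range(1, 5):
    from_mgf = sp.diff(M, t, k).subs(t, 0)
    closed_form = sp.factorial(k) / lam**k
    print(k, from_mgf, closed_form, from_mgf == closed_form)  # always True
```

The same comparison fails for the log normal: its moments are all finite, but the defining expectation diverges for every non-zero t, so there is no M(t) to differentiate.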
So now we're talking about large-scale behavior. Our second topic is that we want to study the long-term, the large-scale behavior of random variables: when we look at this long-term behavior, or large-scale behavior, what can we say? As you might already know, two typical theorems of this type will be in this topic, and the first will be the law of large numbers.

The weak law of large numbers says that if you have independent identically-distributed random variables-- IID random variables-- x1, x2, and so on with mean mu, then 1 over n times the sum of the xi's converges to mu, the mean, in some weak sense. If you take some graduate probability course, you'll see that there are several possible ways to define convergence. For those who already saw the law of large numbers before: the name suggests there's also something called the strong law of large numbers, where the convergence is stronger than this type of convergence, but I will not go into it.

The statement is not something theoretical. Whenever you have identical independent distributions, when you take their average over a large enough number of samples, the average will be very close to the mean-- which makes sense. So if you take n to be large enough, you will more than likely have some value which is very close to the mean.

Before proving it, an example of this theorem in practice can be seen in the casino. Most games played in the casinos are designed like this: it looks like the mean is really close to 50%, but it's hidden, because they designed it so the variance is big. In blackjack, it's about 48%, 49%. The moral is: don't play blackjack. From the casino's point of view, they have enough players playing the game that the law of large numbers just makes them money; and that means as long as they have the slightest advantage, they'll be winning money, and a huge amount of money. The only problem-- because in poker, you're not playing against the casino. But from the player's point of view, if you're better than the other player, and the amount of edge you have over the other player is larger than the fee that the casino charges to you, then now you can apply the law of large numbers to yourself and win. Even if you have a tiny edge, if you can have a large enough number of trials-- if you can trade enough times using some strategy that you believe is winning over time-- then the law of large numbers will take it from there and bring you profit.

Now we'll do some estimation. Suppose there is a random variable x whose mean we do not know-- whose mean is unknown. Take independent trials x1, x2, up to xn, and use 1 over n times (x1 plus ... plus xn) as our estimator. We don't know what the real value is, but we know that the distribution of the value we will obtain is something like that, around the mean.

So let's prove the weak law. The variance of each xi is the expectation of (xi minus mu) squared-- so that is equal to sigma squared. OK. And the variance of the average is known to be sigma squared over n. Now apply Chebyshev's inequality. The reason this inequality holds is that the variance of x is defined as the expectation of (x minus mu) squared, and if you keep only the part of that expectation where the deviation is at least epsilon, then epsilon squared times the probability of deviating will be less than or equal to the variance of x. So the probability that the average minus mu exceeds epsilon in absolute value is at most sigma squared over n epsilon squared, which goes to 0 as n goes to infinity. Anyway, that's the proof of the law of large numbers.

Let's see what it gives. Say your mean is 50, and you want to know the probability that you deviate from your mean by more than 0.1-- let's say you want to be 99% sure that x bar minus mu is less than 0.1, or x bar minus 50 is less than 0.1. In that case, what you can do is: you want this bound, sigma squared over n epsilon squared, to be 0.01. It has to be 0.01. So plug in that-- plug in your variance, plug in your epsilon-- and solve for n. That does give some estimate, but I should mention that this is a very bad estimate. In practice, if you use a lot more powerful tool for estimating it, it should only take hundreds or at most thousands of trials.
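To put numbers on how loose the Chebyshev estimate is, here is a minimal sketch (numpy assumed; the variance sigma squared = 1 and the normal trial distribution are my assumptions-- the lecture doesn't pin them down). It computes the n Chebyshev demands for 99% confidence within 0.1, then shows by simulation that a far smaller n already does much better than the bound:

```python
import numpy as np

# Chebyshev on the sample mean: P(|Xbar - mu| > eps) <= sigma^2 / (n * eps^2).
sigma2, eps, delta = 1.0, 0.1, 0.01          # assumed variance, tolerance, 1 - 99%
n_cheb = int(np.ceil(sigma2 / (eps**2 * delta)))
print(n_cheb)                                 # 10000 trials demanded by the bound

# Simulate a much smaller n and compare the bound with reality.
rng = np.random.default_rng(0)
n, trials = 1000, 5000
xbar = rng.normal(50.0, np.sqrt(sigma2), size=(trials, n)).mean(axis=1)
bound = sigma2 / (n * eps**2)                 # 0.1 by Chebyshev
empirical = np.mean(np.abs(xbar - 50.0) > eps)
print(bound, empirical)                       # ~0.1 vs ~0.002: very loose
```

The gap is exactly the lecture's point: the central limit theorem, not Chebyshev, is the "more powerful tool" that brings the required sample size down to hundreds or thousands.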
So the law of large numbers tells you that the average converges to the mean. What I want to say is this: can we say anything interesting about the distribution of this average around the mean? If it doesn't look like the xi's, then what should it look like? The central limit theorem tells you how the distribution of this variable is around the mean, and it also tells a little bit about the speed of convergence in the law of large numbers. So let's do that.

Take Yn to be 1 over square root n times the sum of (xi minus mu). The reason I'm making this choice of 1 over square root n is that if you make this choice, Yn has mean 0 and variance sigma squared-- the same variance as the xi's. The central limit theorem says that the distribution of Yn converges to that of the normal distribution with mean 0 and variance sigma squared. What this means-- I'll write it down again-- is that for all x, the probability that Yn is less than or equal to x converges to the probability that the normal is less than or equal to x. This is convergence in distribution, and it's almost the weakest convergence of distributions. This is one reason the normal distribution is so universal-- because the normal distribution comes up here.

Now, the proof, assuming the moment-generating function of the xi exists. I will prove it when the moment-generating function exists. There is a hole in this argument, because the moment-generating function might not exist-- that's the glitch-- but that's just some technicality, and you at least get the spirit of what's happening. Our goal is to prove that the moment-generating functions of these Yn converge to the moment-generating function of the normal, for all t-- pointwise convergence. For fixed t, we have to prove it. I'll just write it down.

The moment-generating function of Yn is equal to the expectation of e to the t Yn. Because each of the xi's is independent, this splits into products: the sum in the exponent becomes a product of the e to the t times 1 over square root n times (xi minus mu), so it's equal to the product from 1 to n of the expectation of e to the (t over square root n) times (xi minus mu). Now, they're identically distributed, so you just have to take the n-th power of that one expectation.

Expand it. What we get is the expectation of 1, plus t over square root n times (xi minus mu), plus 1 over 2 factorial times that squared-- (t over square root n) squared times (xi minus mu) squared-- plus 1 over 3 factorial times that cubed, plus so on. By the linearity of expectation, the 1 comes out. The second term is 0, because xi has mean mu. For this term, we have 1 over 2 times t squared over n times (xi minus mu) squared: t over square root n is a constant, so I can take it out of the expectation, squared, and what remains is the expectation of (xi minus mu) squared, which is sigma squared. And then the terms after that-- because we're only interested in proving that this converges for fixed t, so we're only proving pointwise convergence-- as n goes to infinity, if n is really, really large, all these terms will be of smaller order of magnitude than 1 over n. Something like that happens, and that's happening because t is fixed; if we want something uniform in t, that's no longer true. So all of that disappears.

So the moment-generating function of Yn is (1 plus t squared sigma squared over 2n, plus smaller-order terms) to the n-th power, and that converges to e to the t squared sigma squared over 2-- exactly the moment-generating function of the normal with mean 0 and variance sigma squared. At each point t, it converges to the value of the moment-generating function of that normal random variable, and pointwise convergence of the moment-generating functions implies convergence of the distributions. So that's good. OK. Any mistakes that I made? Thank you.
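Here is a small simulation of the theorem-- a sketch assuming numpy, with uniform xi's as my choice of a decidedly non-normal starting distribution. The empirical CDF of Yn is compared with the N(0, sigma squared) CDF at a few points:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n, trials = 500, 20_000

# X_i uniform on [0, 1]: mu = 1/2, sigma^2 = 1/12 -- nothing normal about them.
x = rng.random((trials, n))
y = sqrt(n) * (x.mean(axis=1) - 0.5)   # Y_n = (1/sqrt(n)) * sum of (x_i - mu)

# P(Y_n <= t) should approach the N(0, 1/12) CDF at every point t.
sigma = sqrt(1 / 12)
for t in (-0.3, 0.0, 0.3):
    empirical = np.mean(y <= t)
    limit = 0.5 * (1 + erf(t / (sigma * sqrt(2))))
    print(t, round(float(empirical), 3), round(limit, 3))  # columns agree to ~0.01
```

That the two columns match, even though the xi's are uniform rather than normal, is the content of the theorem: the normal limit does not care what distribution you started from, only its mean and variance.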