WEBVTT
Kind: captions; language: en-us

NOTE Accuracy: 91% (HIGH)

00:00:00.700 --> 00:00:08.000
In this video we'll go over some symbols you may not be very familiar with, or you may not be

00:00:08.000 --> 00:00:17.800
entirely certain about. The first symbol is a letter, like the letter X. We use the letters X, Y, and

00:00:17.800 --> 00:00:25.300
Z, and others, to denote variables. So X can be any variable that we're working with.

NOTE Accuracy: 77% (HIGH)

00:00:26.200 --> 00:00:34.500
And when we collect data on that variable and have specific values for it, then those are indicated

00:00:34.500 --> 00:00:43.600
with the letter that signifies the variable plus an index. So X1 would be the first data point on

00:00:43.600 --> 00:00:51.300
variable X, or the first measurement on variable X. X2 would be the second, X3 would be the third,

00:00:51.300 --> 00:00:56.450
and so on. So these five things on the screen are five

NOTE Accuracy: 75% (MEDIUM)

00:00:56.450 --> 00:01:00.800
data points, five measurements of variable X.

NOTE Accuracy: 82% (HIGH)

00:01:00.900 --> 00:01:09.600
The next symbol that is useful in this context is the symbol of summation. This scary-looking thing

00:01:09.600 --> 00:01:18.900
is actually a capital Greek letter sigma, and it means summation. And the thing that

00:01:18.900 --> 00:01:27.600
we will see most of the time is something like this. This is beginning to look even scarier. So it

00:01:27.600 --> 00:01:31.600
is useful to try and parse these components and realize that it is

NOTE Accuracy: 88% (HIGH)

00:01:31.600 --> 00:01:39.700
actually a very simple and easy shorthand to display things that could otherwise get very long.

NOTE Accuracy: 84% (HIGH)

00:01:39.700 --> 00:01:47.800
So the first thing to notice is that there is the letter i, and the letter i appears here and also

00:01:47.800 --> 00:01:56.350
appears here.
And then, if you look above and below the sigma symbol, you see that there is also

00:01:56.350 --> 00:02:03.300
a number or some value down here and some value up here. Now, it turns out the first value is

00:02:03.300 --> 00:02:09.899
actually the number one, and the symbol up here is an unknown value N.

NOTE Accuracy: 89% (HIGH)

00:02:09.899 --> 00:02:17.900
It could be anything. We just need to know what it is in a given context. So, whatever N is. So what

00:02:17.900 --> 00:02:29.000
does all of this mean? Well, the idea is that i, which is our index, which is the same as

00:02:29.000 --> 00:02:37.000
this one, so our index takes on different values in sequence. So it starts with the value shown

00:02:37.000 --> 00:02:39.550
here. So, the first thing that

NOTE Accuracy: 72% (MEDIUM)

00:02:39.550 --> 00:02:47.900
this means is that i would be equal to one, and then the next thing would be that i would be equal to

00:02:47.900 --> 00:02:58.000
2. So i would take sequential values all the way up to N, however many that is. So if N is 5, i

00:02:58.000 --> 00:03:05.149
would first be 1, then 2, then it would be 3, then it would be 4, and then it would be 5.

NOTE Accuracy: 79% (HIGH)

00:03:05.149 --> 00:03:17.300
And for each value of i, then, we have whatever is after the sigma symbol. So for i equals 1 we get x1.

NOTE Accuracy: 80% (HIGH)

00:03:17.300 --> 00:03:24.399
This thing becomes whatever i is at this point.

NOTE Accuracy: 85% (HIGH)

00:03:24.399 --> 00:03:38.900
And when i equals 2, we get x2. So this becomes whatever this is at this point. And for i equals

00:03:38.900 --> 00:03:49.900
3, we get x3, and so on for the next steps until the last one, which would be xN. So, this is

00:03:49.900 --> 00:03:54.600
the first value of x. This is the second value of x.

NOTE Accuracy: 89% (HIGH)

00:03:54.600 --> 00:04:03.000
This is the third value of x.
This is the Nth, the final value of x, of our X variable. And we're

00:04:03.000 --> 00:04:10.250
not done yet, because we haven't yet used the summation. So we actually have to add these.

NOTE Accuracy: 88% (HIGH)

00:04:10.250 --> 00:04:23.800
So what this means is that this whole thing is exactly the same as this. We use this as a

00:04:23.800 --> 00:04:35.400
shorthand, instead of writing all of these. So in the case where N equals 5, i would go from one to

00:04:35.400 --> 00:04:40.700
five. And so the sum of

NOTE Accuracy: 65% (MEDIUM)

00:04:40.700 --> 00:04:55.650
xi for i equals 1 to 5 is x1 plus x2 plus x3 plus x4 plus x5. Instead of writing all of this,

00:04:55.650 --> 00:05:04.800
we write this. So we save some space, and when we're not sure what N will be in any given

00:05:04.800 --> 00:05:10.650
instance, we just use this as a shorthand in a more abstract sense to

NOTE Accuracy: 72% (MEDIUM)

00:05:10.650 --> 00:05:15.800
mean: however many they are, we just add them up.

NOTE Accuracy: 86% (HIGH)

00:05:16.400 --> 00:05:28.600
So this kind of formula appears when we have the mean. You already know that the mean is the sum of

00:05:28.600 --> 00:05:37.500
all the measured values divided by how many they are. So it's x1 plus x2 plus x3 plus however

00:05:37.500 --> 00:05:45.150
many we have, until the last one, divided by N. So you see the use of an unknown number

NOTE Accuracy: 89% (HIGH)

00:05:45.150 --> 00:05:53.350
to mean as many as we have. Instead of writing all of this, because N could be very large

00:05:53.350 --> 00:06:00.700
and then this would get very long, we just say and write this, which is exactly the same thing. So

00:06:00.700 --> 00:06:11.400
the sum of xi, where i equals 1 through N, divided by N, and these two things are the same.

NOTE Accuracy: 90% (HIGH)

00:06:11.700 --> 00:06:18.750
And actually, you'll see in books to save printing space.
They don't write

NOTE Accuracy: 88% (HIGH)

00:06:18.750 --> 00:06:28.500
i=1 to N below and above sigma. Sometimes they print it like this, which is exactly the same

00:06:28.500 --> 00:06:35.200
thing. So they save a bit of vertical space on the page, and this can go within one line. You'll see

00:06:35.200 --> 00:06:42.800
this in books very often. It means exactly the same thing. And so this is our mean, and as you've

00:06:42.800 --> 00:06:49.500
probably seen in books, this is how we represent the mean with a symbol.

NOTE Accuracy: 85% (HIGH)

00:06:49.500 --> 00:06:58.700
And this is a very general symbol. This little horizontal line above the variable

00:06:58.700 --> 00:07:07.300
name means the mean value of all the measures that we have for this variable. So x-bar, which is how

00:07:07.300 --> 00:07:12.400
this is pronounced, means the mean value of x.

NOTE Accuracy: 90% (HIGH)

00:07:12.400 --> 00:07:19.400
And by now, you will have noticed that there is a bunch of symbols that we use to represent the mean,

00:07:19.400 --> 00:07:29.150
unfortunately. So in some books, you see lowercase x-bar. In other books, you'll see capital X-bar.

00:07:29.150 --> 00:07:37.100
In other places, you'll see lowercase m. In many papers, especially papers that follow the

00:07:37.100 --> 00:07:42.549
APA format, you'll see capital M. And in some places,

NOTE Accuracy: 91% (HIGH)

00:07:42.549 --> 00:07:50.200
including some previous videos here, you'll see the lowercase letter μ. It turns out almost all of

00:07:50.200 --> 00:07:57.400
these are the same thing. So all of these symbols are used in different contexts to indicate the

00:07:57.400 --> 00:08:03.600
mean of a set of values that we actually have, which is called the sample mean.
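
NOTE
A minimal sketch, not taken from the video, of the sum and the sample mean described above.
It assumes Python and five made-up measurements; the names xs, n, total and x_bar are illustrative only.
The loop mirrors how the index i runs from 1 to N under the sigma symbol.
```python
# Five hypothetical measurements x1..x5 (made-up values for illustration).
xs = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(xs)  # N: how many measurements we have
# Sigma notation: let i run from 1 to N and add up each x_i.
# Python lists are 0-based, so x_i is stored at xs[i - 1].
total = 0.0
for i in range(1, n + 1):
    total += xs[i - 1]
# The sample mean (x-bar) is the sum divided by N.
x_bar = total / n
print(total, x_bar)
```
With these made-up values the sum is 25.0 and the mean is 5.0; Python's built-in sum(xs) gives the same total.
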
NOTE Accuracy: 89% (HIGH)

00:08:03.600 --> 00:08:13.900
In contrast, this symbol, the Greek letter μ, lowercase Greek μ, refers to the population mean,

00:08:13.900 --> 00:08:21.000
which we usually don't know. So it's still a mean value, but it's a mean value of everyone, including

00:08:21.000 --> 00:08:28.600
those we haven't measured. So to be formally correct, when we have a sample, when we measure some

00:08:28.600 --> 00:08:33.900
people on a variable, the mean value of our observations would be

NOTE Accuracy: 83% (HIGH)

00:08:33.900 --> 00:08:41.500
x-bar. And if we were writing a paper or reading about them in a paper, we would most likely see

00:08:41.500 --> 00:08:44.049
that referred to as capital M.

NOTE Accuracy: 87% (HIGH)

00:08:44.049 --> 00:08:51.750
But if we wanted to refer to the mean of the population, so the mean of everyone,

00:08:51.750 --> 00:08:59.000
including those we didn't measure, so every human, or every Norwegian, or every student, or every

00:08:59.000 --> 00:09:04.550
child, that would be the population mean μ.

NOTE Accuracy: 91% (HIGH)

00:09:04.550 --> 00:09:12.600
And we've used a sample of the population, hoping to get an estimate for that. We'll get more into

00:09:12.600 --> 00:09:19.300
that. The idea here is, we have different symbols, some of which mean the same thing and some of

00:09:19.300 --> 00:09:20.750
which don't.

NOTE Accuracy: 91% (HIGH)

00:09:20.750 --> 00:09:25.500
Now, to go back to this way of writing things.

NOTE Accuracy: 91% (HIGH)

00:09:25.500 --> 00:09:33.700
With the sigma symbol of summation, you aren't restricted to just a simple expression like

00:09:33.700 --> 00:09:34.750
this.

NOTE Accuracy: 90% (HIGH)

00:09:34.750 --> 00:09:40.350
So, the sum of xi, which is exactly this thing,

NOTE Accuracy: 68% (MEDIUM)

00:09:40.350 --> 00:09:47.300
could be the sum of something more complex in other situations.
So, you can use this symbol, for

00:09:47.300 --> 00:09:57.550
example, to mean something like that. So, the sum of all the differences of xi minus x-bar, for i=1

00:09:57.550 --> 00:10:04.600
to N. Now, I realize this may look a bit scary at first glance, but it's no more complicated than what we

00:10:04.600 --> 00:10:05.849
had before.

NOTE Accuracy: 87% (HIGH)

00:10:05.849 --> 00:10:13.900
To parse this, you just have to go step by step and think about what is going on here. So this means that

00:10:13.900 --> 00:10:21.849
we first have x1 minus the mean of all the x, and that's here,

NOTE Accuracy: 83% (HIGH)

00:10:21.849 --> 00:10:29.600
and then we have x2 minus the mean of all the x, that's here, and then x3 minus the mean, and

00:10:29.600 --> 00:10:34.200
then so on until the last x minus the mean.

NOTE Accuracy: 86% (HIGH)

00:10:34.700 --> 00:10:42.300
And then we just add them all up because of the sigma. So, all of these take on their sequential

00:10:42.300 --> 00:10:49.100
values. Whenever you see i, you replace it with whatever value i has at the moment, you write all these

00:10:49.100 --> 00:10:54.400
down, and then you add them up. Well, in fact, you never actually go and do this by hand. The

00:10:54.400 --> 00:11:04.000
computer does it for us, but this means this. And it can get even more complex if you want to add

00:11:04.000 --> 00:11:05.300
another operation. So

NOTE Accuracy: 91% (HIGH)

00:11:05.300 --> 00:11:12.700
if you want every difference to be multiplied by itself, then you just add it here, and then it means

00:11:12.700 --> 00:11:19.000
that you have to do it in each case before you add them up.
So this is a very general and very

00:11:19.000 --> 00:11:26.300
useful symbol to indicate summation of sequences that can become very long, and oftentimes you can

00:11:26.300 --> 00:11:34.000
use this symbol when we don't know exactly how many they are, to indicate a general principle, like,

00:11:34.000 --> 00:11:35.250
for example, in the mean,

NOTE Accuracy: 68% (MEDIUM)

00:11:35.250 --> 00:11:41.000
where we just have to add up however many samples we have.
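
NOTE
A minimal sketch, not taken from the video, of the two more complex sums described above:
the sum of the differences xi minus x-bar, and the same sum with every difference multiplied by itself first.
It assumes Python and made-up measurements; the names xs, deviations, sum_dev, squared and sum_sq are illustrative only.
```python
# Hypothetical measurements (made-up values for illustration).
xs = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(xs)
x_bar = sum(xs) / n  # sample mean of the measurements
# One term per value of i: x_i minus the mean.
deviations = [x - x_bar for x in xs]
# The sigma tells us to add the terms up.
sum_dev = sum(deviations)
# Extra operation: multiply every difference by itself before adding.
squared = [d * d for d in deviations]
sum_sq = sum(squared)
print(deviations, sum_dev, sum_sq)
```
For these made-up values the differences are -3, -1, -1, 0 and 5, so their sum is 0.0 and the sum of the squared differences is 36.0.
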