WEBVTT Kind: captions; language: en-us NOTE Confidence: 86% (HIGH) 00:00:00.399 --> 00:00:07.600 In this video, we will continue our exploration of probability by considering what happens when a 00:00:07.600 --> 00:00:10.050 whole class tosses coins. NOTE Confidence: 90% (HIGH) 00:00:10.050 --> 00:00:16.100 If you remember from the previous video, you were asked to toss a coin 10 times, then write down 00:00:16.100 --> 00:00:19.850 heads or tails in the order in which you got them. NOTE Confidence: 91% (HIGH) 00:00:19.850 --> 00:00:27.700 In previous years, we always got the whole class to do that, and we would count the outcomes and be 00:00:27.700 --> 00:00:36.100 able to display them. We could also calculate the expected frequency for each outcome by 00:00:36.100 --> 00:00:42.400 counting the possibilities: how many different patterns there are and how many different ways there 00:00:42.400 --> 00:00:44.750 are for each outcome to occur. NOTE Confidence: 91% (HIGH) 00:00:44.750 --> 00:00:53.000 Then we could compare the theoretically expected frequencies with the empirically observed ones: how 00:00:53.000 --> 00:00:58.200 frequently, that is, how many times, something actually happened. NOTE Confidence: 88% (HIGH) 00:00:58.700 --> 00:01:05.750 So, recall the calculation of the probabilities for two coin tosses. NOTE Confidence: 89% (HIGH) 00:01:05.750 --> 00:01:15.000 There was one way out of four possible patterns to get two heads in two tosses, because you had to 00:01:15.000 --> 00:01:23.600 get heads in both tosses. That's one out of four, or 25%, 00:01:23.600 --> 00:01:29.800 which is the same as the probability of getting no heads and two tails, and there was a 50% 00:01:29.800 --> 00:01:32.300 chance of getting one of each.
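NOTE Editor's sketch (not part of the spoken transcript): the counting argument for two tosses can be checked by listing every pattern and tallying the heads. The function name `heads_distribution` is a hypothetical helper introduced here for illustration, not something shown in the video.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Exact probability of each head count, found by listing every pattern."""
    # Every pattern of H/T of the given length is equally likely for a fair coin.
    patterns = list(product("HT", repeat=n_tosses))
    counts = Counter(pattern.count("H") for pattern in patterns)
    return {k: Fraction(counts[k], len(patterns)) for k in range(n_tosses + 1)}


if __name__ == "__main__":
    # Two tosses: four patterns (HH, HT, TH, TT).
    for heads, prob in heads_distribution(2).items():
        print(f"{heads} heads: {prob} = {float(prob):.0%}")
```

NOTE For two tosses this gives 1/4, 1/2 and 1/4 for zero, one and two heads, matching the 25%, 50% and 25% quoted above.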
NOTE Confidence: 90% (HIGH) 00:01:32.300 --> 00:01:40.400 The probabilities for 4 tosses were based on the fact that there are 16 different possible 00:01:40.400 --> 00:01:49.700 outcomes, one of which is all heads, 4 heads, and one is all tails. There are four different patterns with exactly one head, 00:01:49.700 --> 00:01:55.200 so 25% for one head, and likewise 25% for one tail. NOTE Confidence: 91% (HIGH) 00:01:55.200 --> 00:02:03.700 And we get this shape of expected probabilities for 4 coin tosses. NOTE Confidence: 91% (HIGH) 00:02:04.200 --> 00:02:14.750 And these are the actual numbers. These were observed in the class of 2018 and the class of 2019. So 00:02:14.750 --> 00:02:22.649 this is the proportion of people as a percentage: the proportion of students in class who got all 00:02:22.649 --> 00:02:25.100 tails, zero heads. NOTE Confidence: 78% (HIGH) 00:02:25.100 --> 00:02:35.100 It looks like about 11%. Then the proportion of people who got three tails and one head, and so on, in 00:02:35.100 --> 00:02:44.200 2018: two of each, three heads, four heads. And in 2019, this is the proportion of people who got 00:02:44.200 --> 00:02:47.200 all tails, all heads, NOTE Confidence: 91% (HIGH) 00:02:47.200 --> 00:02:54.500 two of each, and so on. So these are actual data taken from previous years. NOTE Confidence: 89% (HIGH) 00:02:55.100 --> 00:03:04.500 The expected frequencies, the probabilities for different outcomes when you toss a coin 6 times, 00:03:04.500 --> 00:03:06.550 are plotted here. NOTE Confidence: 91% (HIGH) 00:03:06.550 --> 00:03:15.700 So you see that the probability of getting heads once or five times is about 10% each. NOTE Confidence: 76% (HIGH) 00:03:15.700 --> 00:03:24.100 The probability of getting an equal number of heads and tails is just about 30 percent, and so on.
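NOTE Editor's sketch (not part of the spoken transcript): the number of patterns with exactly k heads in n tosses is the binomial coefficient "n choose k", so the expected proportions can be computed without listing patterns. Using `math.comb` for this is an assumption of this sketch; the video only describes counting. The name `prob_heads` is hypothetical.

```python
from math import comb


def prob_heads(n_tosses, k_heads):
    """P(exactly k_heads in n_tosses of a fair coin) = C(n, k) / 2**n."""
    return comb(n_tosses, k_heads) / 2 ** n_tosses


if __name__ == "__main__":
    for n in (4, 6):
        row = ", ".join(f"{k}: {prob_heads(n, k):.1%}" for k in range(n + 1))
        print(f"{n} tosses: {row}")
```

NOTE For six tosses, one head has probability 6/64 (about 9.4%) and three heads has probability 20/64 (about 31%), consistent with the "about 10%" and "just about 30 percent" read off the plot.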
00:03:24.100 --> 00:03:31.600 These are the expected frequencies, and these are the actual observed frequencies as proportions for 00:03:31.600 --> 00:03:39.750 the two classes that we also showed before. So in 2018, this was the proportion of people getting 00:03:39.750 --> 00:03:46.050 all tails in their first six tosses. This is not a negligible number. NOTE Confidence: 82% (HIGH) 00:03:46.050 --> 00:03:56.300 Six tails out of six, then one head out of six, then two heads, three of each, and so on for 2018, 00:03:56.300 --> 00:04:01.100 and then this for 2019. And as you can see, NOTE Confidence: 86% (HIGH) 00:04:01.300 --> 00:04:09.200 the actual observed data are not identical between the two years, but they are similar to each other 00:04:09.200 --> 00:04:17.250 and similar to the theoretically expected proportions. What about looking at all 10 coin tosses? 00:04:17.250 --> 00:04:24.200 These are the mathematically expected proportions that are calculated by counting the number of 00:04:24.200 --> 00:04:26.700 patterns in each case. NOTE Confidence: 91% (HIGH) 00:04:26.700 --> 00:04:31.200 And these are the actual observed data from the two classes. NOTE Confidence: 91% (HIGH) 00:04:31.200 --> 00:04:39.800 So if we join the observed proportions, we can see the shape more clearly. And we see that in both 00:04:39.800 --> 00:04:50.500 years we got low proportions here, low here and falling, and high proportions around 00:04:50.500 --> 00:04:54.450 the equal or nearly equal numbers of heads and tails. NOTE Confidence: 91% (HIGH) 00:04:54.450 --> 00:05:00.900 They are similar but not identical between the two years and between the observed and the theoretically 00:05:00.900 --> 00:05:11.900 expected numbers.
So as main observations based on these actual data, we can say, first of all, the 00:05:11.900 --> 00:05:19.400 theoretically expected proportions are actually approximated every year, and this was true of 00:05:19.400 --> 00:05:21.200 previous years as well. NOTE Confidence: 88% (HIGH) 00:05:21.200 --> 00:05:29.300 That is why we do this experiment every year: because it always works. We approximate the theoretically 00:05:29.300 --> 00:05:35.250 expected proportions. Approximate means you're close, but not exactly there. NOTE Confidence: 91% (HIGH) 00:05:35.250 --> 00:05:40.450 So, if you think about this for a moment, NOTE Confidence: 91% (HIGH) 00:05:40.450 --> 00:05:48.100 it could be seen as a bit strange, because there's a bunch of coins tossed by a bunch of people. Each 00:05:48.100 --> 00:05:52.549 coin toss is completely independent of every other coin toss. NOTE Confidence: 91% (HIGH) 00:05:52.549 --> 00:06:00.300 It comes up heads or tails regardless of anything. Yet in the end, if you take them all together, they 00:06:00.300 --> 00:06:06.700 resemble a mathematically defined shape. So it's like all the coins, and all the times they were 00:06:06.700 --> 00:06:15.100 tossed, conspire to produce something close to a very well defined pattern. It's as if they know 00:06:15.100 --> 00:06:21.750 about each other and communicate and say, oh, you got lots of heads so I should get fewer heads. NOTE Confidence: 91% (HIGH) 00:06:21.750 --> 00:06:29.400 That's not how it works. Okay, obviously coins don't talk to each other, and they don't remember what 00:06:29.400 --> 00:06:36.200 they came up with the previous time they were flipped. So this is not magic and this is not 00:06:36.200 --> 00:06:42.900 interaction. This is just probability. That's the whole point of going through these examples. NOTE Confidence: 82% (HIGH) 00:06:44.900 --> 00:06:53.400 The second thing to keep in mind is that different classes produce different proportions.
So, every 00:06:53.400 --> 00:06:59.700 year is different; there is variability in the actual outcome. The shape is not identical. NOTE Confidence: 86% (HIGH) 00:06:59.700 --> 00:07:06.800 So, why would we not get the same shape? This is kind of the opposite question from the previous 00:07:06.800 --> 00:07:11.000 one. Think about it: we have the same population, NOTE Confidence: 91% (HIGH) 00:07:11.000 --> 00:07:19.500 students in this class, and they do the same thing. They flip a coin the same number of times, and 00:07:19.500 --> 00:07:26.100 there is a population target, which is the mathematical curve: how we expect these coin tosses to 00:07:26.100 --> 00:07:32.450 sort out probabilistically. The outcome exhibits variation around that. NOTE Confidence: 91% (HIGH) 00:07:32.450 --> 00:07:36.900 So they approximate that, but they're not exactly that. NOTE Confidence: 91% (HIGH) 00:07:36.900 --> 00:07:48.400 And we can explore how that varies by asking the computer to play the role of different classes. And 00:07:48.400 --> 00:07:55.600 actually, this gives us a lot more flexibility in what parameters we can choose. Because coins 00:07:55.600 --> 00:08:03.400 are just random switches between heads and tails, we can ask the computer to randomly produce 00:08:03.400 --> 00:08:06.850 zeros and ones, or essentially heads and tails. NOTE Confidence: 91% (HIGH) 00:08:06.850 --> 00:08:11.600 This is exactly the same as having lots and lots of classes. NOTE Confidence: 91% (HIGH) NOTE Confidence: 89% (HIGH) 00:08:14.250 --> 00:08:21.200 We ask the computer to produce random events that are the same as coin flips from a 00:08:21.200 --> 00:08:23.600 probabilistic point of view. NOTE Confidence: 76% (HIGH) 00:08:23.600 --> 00:08:29.900 This makes it very easy to do it many, many times and watch what happens. NOTE Confidence: 91% (HIGH) 00:08:29.900 --> 00:08:39.950 And we can also observe variability when many classes do this.
How similar are they going to look? NOTE Confidence: 86% (HIGH) 00:08:39.950 --> 00:08:48.100 We can also observe the effects of different parameters: small classes, large classes, a few coin 00:08:48.100 --> 00:08:54.599 tosses for each person, lots of coin tosses for each person. How does that affect the outcome? NOTE Confidence: 91% (HIGH) 00:08:54.599 --> 00:09:02.900 You can actually do this for yourself after watching this video by going to this link and trying out 00:09:02.900 --> 00:09:12.600 different combinations. So let's start with not very many people. Just 10 people, that's just a small 00:09:12.600 --> 00:09:17.650 simulated class, and each person flips the coin 10 times. NOTE Confidence: 91% (HIGH) 00:09:17.650 --> 00:09:22.850 So this is one simulated experiment. NOTE Confidence: 84% (HIGH) 00:09:22.850 --> 00:09:28.500 This is what happened in my first actual simulation, when I ran this through the 00:09:28.500 --> 00:09:36.900 computer and asked for this number of coin flips, then counted heads and tails in each sequence 00:09:36.900 --> 00:09:43.950 of ten, and then plotted them as this bar graph. And I got NOTE Confidence: 87% (HIGH) 00:09:43.950 --> 00:09:54.200 two of the ten people having three heads, three people having four heads, and so on. Okay, then I 00:09:54.200 --> 00:10:01.800 ran this again. So it's like a different group of 10 people, each tossing a coin 10 times, 00:10:01.800 --> 00:10:04.100 and that's what happened. NOTE Confidence: 85% (HIGH) 00:10:04.100 --> 00:10:10.550 And then I did it again a third time, and that's what happened. So you see that the outcome 00:10:10.550 --> 00:10:18.800 fluctuates a lot. It doesn't look very much like the previous one or the one before that. NOTE Confidence: 91% (HIGH) 00:10:20.100 --> 00:10:31.000 What if we had more people flipping coins?
So 100 students, simulated students, tossing simulated 00:10:31.000 --> 00:10:33.200 coins 10 times each. NOTE Confidence: 91% (HIGH) 00:10:33.200 --> 00:10:37.099 This is my first simulated outcome. NOTE Confidence: 90% (HIGH) 00:10:37.099 --> 00:10:44.150 Let's have a different simulated group of a hundred people. What will happen? This is what happened. 00:10:44.150 --> 00:10:51.900 Let's have a third group of a hundred people. That's the outcome. So there's still fluctuation, but 00:10:51.900 --> 00:10:57.400 they are much more similar to one another. These groups of a hundred people are much more similar 00:10:57.400 --> 00:11:00.850 to one another than the groups of 10 people. NOTE Confidence: 85% (HIGH) 00:11:00.850 --> 00:11:10.150 What if we had a really large class with 1,000 students, each tossing a coin 10 times? NOTE Confidence: 91% (HIGH) 00:11:10.150 --> 00:11:13.500 Well, this is one outcome. NOTE Confidence: 86% (HIGH) 00:11:13.500 --> 00:11:19.200 And this is another group of 1,000 simulated persons. NOTE Confidence: 91% (HIGH) 00:11:19.200 --> 00:11:26.300 These don't change very much. So if you have lots of students, if you have a very big group doing 00:11:26.300 --> 00:11:34.100 this, then the differences between the groups become very small. The result becomes stable and 00:11:34.100 --> 00:11:41.300 hardly depends on the group anymore. If we had a class of a thousand students and asked them to toss a 00:11:41.300 --> 00:11:50.450 coin 10 times each, the results between successive years would be very, very similar. Okay. NOTE Confidence: 79% (HIGH) 00:11:50.450 --> 00:11:57.800 What if they didn't toss the coin just 10 times? Let's just go back to our starting point of an 00:11:57.800 --> 00:12:06.250 actual class of about a hundred persons tossing a coin 10 times each and simulate that once. 00:12:06.250 --> 00:12:12.200 This is 100 simulated persons tossing simulated coins 10 times each.
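NOTE Editor's sketch (not part of the spoken transcript): a minimal simulation of the class-size comparison described above, assuming a fair coin and Python's standard `random` module. The simulator linked in the video is not shown in the transcript, so the function names `simulate_class` and `max_gap_from_expected`, the seed, and the "largest gap" summary are all choices made here for illustration.

```python
import random
from math import comb


def simulate_class(n_students, n_tosses, rng):
    """Return the proportion of simulated students with each head count."""
    counts = [0] * (n_tosses + 1)
    for _ in range(n_students):
        # Each toss is an independent fair coin: heads with probability 0.5.
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        counts[heads] += 1
    return [c / n_students for c in counts]


def max_gap_from_expected(proportions):
    """Largest absolute difference between observed and expected proportions."""
    n = len(proportions) - 1
    expected = [comb(n, k) / 2 ** n for k in range(n + 1)]
    return max(abs(o - e) for o, e in zip(proportions, expected))


if __name__ == "__main__":
    rng = random.Random(2018)  # arbitrary seed so the run is repeatable
    for n_students in (10, 100, 1000):
        # Three simulated "classes" per size, 10 tosses per student.
        gaps = [max_gap_from_expected(simulate_class(n_students, 10, rng))
                for _ in range(3)]
        print(n_students, [f"{g:.3f}" for g in gaps])
```

NOTE The printed gaps should tend to shrink as the class size grows, which is the stabilisation described in the video; individual runs will still vary.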
NOTE Confidence: 76% (HIGH) 00:12:12.200 --> 00:12:18.600 What if they were to toss the coin 100 times each? This is where it's actually not very 00:12:18.600 --> 00:12:24.400 convenient to try and do this in class because of how long it would take. The computer does it very 00:12:24.400 --> 00:12:33.000 quickly. So what happens is that each of the hundred simulated persons NOTE Confidence: 89% (HIGH) 00:12:33.000 --> 00:12:44.400 tosses a coin a hundred times, and you don't see anyone having heads 20 times or 10 times or 0 00:12:44.400 --> 00:12:49.100 times or 80 times or 90 times. NOTE Confidence: 82% (HIGH) 00:12:50.000 --> 00:12:58.900 Let's go back to the 10 times to see what this means. So 100 people tossing a coin 10 times each. 00:12:59.600 --> 00:13:08.700 This is what we did in class, and we could expect to have a person every year getting nine 00:13:08.700 --> 00:13:14.050 heads or nine tails, which in itself is a bit surprising. NOTE Confidence: 91% (HIGH) 00:13:14.050 --> 00:13:19.900 We don't usually have someone with all heads or all tails in a class of a hundred. This would happen 00:13:19.900 --> 00:13:24.500 rarely, because the probability of each is less than one in a thousand (1 in 1,024). NOTE Confidence: 88% (HIGH) 00:13:24.500 --> 00:13:33.000 But you see that you can have nine heads. So this means that you have nine times more 00:13:33.000 --> 00:13:38.900 heads than tails. If you flip the coin 10 times, the ratio can be very large. You can have very 00:13:38.900 --> 00:13:44.700 different proportions. You can have eight to two, which is four times more heads than tails, or 00:13:44.700 --> 00:13:51.300 actually 9 to 1, nine times more heads than tails or nine times more tails than heads. It is conceivable, 00:13:51.300 --> 00:13:55.900 and indeed it happens, that someone in a class of a hundred has a ratio of 1 to 9.
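NOTE Editor's sketch (not part of the spoken transcript): the figures behind "one in a thousand" and "a person every year getting nine heads or nine tails" can be worked out exactly. The class size of 100 is the approximate figure quoted in the video; treating students as independent and the helper names below are assumptions of this sketch.

```python
from math import comb

N_TOSSES = 10
CLASS_SIZE = 100  # "about a hundred" in the video; an assumed round number


def prob_all_heads(n_tosses=N_TOSSES):
    """P(every toss is heads) for a fair coin: one pattern out of 2**n."""
    return 1 / 2 ** n_tosses


def prob_nine_to_one(n_tosses=N_TOSSES):
    """P(exactly nine heads or exactly nine tails) in ten tosses."""
    patterns = comb(n_tosses, 9) + comb(n_tosses, 1)  # 10 + 10 patterns
    return patterns / 2 ** n_tosses


def expected_students(class_size=CLASS_SIZE):
    """Expected number of students with a 9-to-1 split, in either direction."""
    return class_size * prob_nine_to_one()


def prob_at_least_one(class_size=CLASS_SIZE):
    """P(at least one student gets a 9-to-1 split), assuming independence."""
    return 1 - (1 - prob_nine_to_one()) ** class_size


if __name__ == "__main__":
    print(f"all heads: {prob_all_heads():.5f}")
    print(f"9-to-1 split: {prob_nine_to_one():.5f}")
    print(f"expected students per class: {expected_students():.2f}")
    print(f"chance of at least one: {prob_at_least_one():.2f}")
```

NOTE A 9-to-1 split has probability 20/1024, so a class of 100 expects about two such students, while all heads has probability 1/1024.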
NOTE Confidence: 72% (MEDIUM) 00:13:56.600 --> 00:14:01.250 But if they toss a coin a hundred times each, NOTE Confidence: 88% (HIGH) 00:14:01.250 --> 00:14:10.900 a ratio of 9 to 1 would be like having 90 heads and 10 tails. This doesn't happen. A ratio 00:14:10.900 --> 00:14:19.600 of four to one would be like flipping 80 heads and 20 tails. That doesn't happen when 00:14:19.600 --> 00:14:25.950 you have a hundred tosses. Not even 70 to 30 is observed. NOTE Confidence: 87% (HIGH) 00:14:25.950 --> 00:14:35.900 So everyone gets a ratio that is closer to 50/50. It's not exactly 50/50, but it's closer to 50/50. 00:14:35.900 --> 00:14:43.849 What if everyone tosses the coin a thousand times? Now everyone is very close to 50/50. NOTE Confidence: 91% (HIGH) 00:14:43.849 --> 00:14:52.900 So if you toss very many times, then you can expect the ratio of heads to tails to be close, 00:14:52.900 --> 00:15:01.700 very close, to the mathematically expected 50/50, which is how likely the coin is to land tails or 00:15:01.700 --> 00:15:10.750 heads. Now, don't be misled by this histogram. This is a histogram now, and the bins are very wide. NOTE Confidence: 91% (HIGH) 00:15:10.750 --> 00:15:20.100 So from 450 to 500 and 500 to 550, there are a lot of different possible outcomes here, and there 00:15:20.100 --> 00:15:29.700 is actually a shape: not all of these outcomes between 450 heads and 550 heads 00:15:29.700 --> 00:15:33.099 are equally likely, of course. NOTE Confidence: 87% (HIGH) 00:15:33.099 --> 00:15:43.650 And you can already think about this and realize that you'd be more likely to have 500 and 500 than 00:15:43.650 --> 00:15:47.050 450 and 550. NOTE Confidence: 77% (HIGH) 00:15:47.050 --> 00:15:54.700 So a shape like that is hidden by the fact that the bins of this histogram are 50 wide, which is very 00:15:54.700 --> 00:16:00.600 wide. We'll see that shape later.
But the point I want to make here is that NOTE Confidence: 89% (HIGH) 00:16:00.600 --> 00:16:07.500 when you toss many times, you expect the ratio of heads to tails to be close to the mathematically 00:16:07.500 --> 00:16:16.500 expected probability of the coin, which is 50/50. So there are two ways NOTE Confidence: 91% (HIGH) 00:16:16.500 --> 00:16:23.150 that you can get predictability from randomness in this simulated experiment. NOTE Confidence: 91% (HIGH) 00:16:23.150 --> 00:16:31.300 In general, the overall result of many random independent events is predictable. NOTE Confidence: 91% (HIGH) 00:16:31.300 --> 00:16:37.850 And actually it's predictable with known confidence, and we'll get into that later in the course. 00:16:37.850 --> 00:16:45.400 For now, we can remember the two aspects of the coin toss experiment. The first is that if you have more and 00:16:45.400 --> 00:16:53.500 more people tossing coins, then you get closer and closer to the theoretically expected curve. What 00:16:53.500 --> 00:17:00.300 is the theoretically expected curve? It is the expected frequencies of occurrence of the events that 00:17:00.300 --> 00:17:01.600 you get by just counting NOTE Confidence: 74% (MEDIUM) 00:17:01.600 --> 00:17:08.849 how many patterns are possible, and how many ways each outcome can happen. NOTE Confidence: 83% (HIGH) 00:17:08.849 --> 00:17:16.750 And the second thing is that as you get more events, so each person tosses more and more times, 00:17:16.750 --> 00:17:23.099 each person's outcome approaches the theoretically expected proportion of 50/50. Why is 00:17:23.099 --> 00:17:30.300 that theoretically expected? Because the coin is fair: equally likely to land heads or tails.
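NOTE Editor's sketch (not part of the spoken transcript): the second point, that one person's ratio moves toward 50/50 with more tosses, can be illustrated with an exact calculation of how likely a split of 70/30 or more extreme is for 10, 100 and 1,000 tosses. The 70% threshold comes from the "not even 70 to 30" remark; the function name `prob_lopsided` is hypothetical.

```python
from math import comb


def prob_lopsided(n_tosses):
    """P(a fair coin gives at least 70% heads or at least 70% tails)."""
    # Integer comparisons avoid floating-point edge cases at exactly 70%.
    extreme = sum(comb(n_tosses, k) for k in range(n_tosses + 1)
                  if 10 * k >= 7 * n_tosses or 10 * k <= 3 * n_tosses)
    return extreme / 2 ** n_tosses


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n} tosses: P(70/30 or more extreme) = {prob_lopsided(n):.3g}")
```

NOTE With 10 tosses such a split has probability 352/1024, about 34%; with 100 tosses it drops below one in a thousand, and with 1,000 tosses it is vanishingly small.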
NOTE Confidence: 89% (HIGH) 00:17:30.500 --> 00:17:40.600 And I encourage you to go to this link and try the experiment multiple times with different settings 00:17:40.600 --> 00:17:47.800 to see the kind of variability you can expect and the kind of distribution you can expect.