WEBVTT Kind: captions; language: en-us

NOTE Confidence: 85% (HIGH)

00:00:00.000 --> 00:00:09.600
In this video I want to show you how to run t-tests in jamovi. Let us first load our data set.

NOTE Confidence: 86% (HIGH)

00:00:13.300 --> 00:00:22.700
This is the same data set we have used in previous examples, and it includes data for 47 children,

NOTE Confidence: 88% (HIGH)

00:00:22.700 --> 00:00:31.299
boys and girls, some of whom have Norwegian as their home language, so these are majority home

00:00:31.299 --> 00:00:36.000
language, and others have a minority home language.

NOTE Confidence: 89% (HIGH)

00:00:36.000 --> 00:00:42.550
Some of these children participated in an intervention and the others were in the control group,

00:00:42.550 --> 00:00:52.300
and we have a number of measurements for these children, including a measure of nonverbal cognitive

00:00:52.300 --> 00:00:57.800
ability in kindergarten, that's what the k stands for, a measure of letter knowledge in

00:00:57.800 --> 00:01:04.400
kindergarten, receptive vocabulary in kindergarten, first grade and second grade,

NOTE Confidence: 83% (HIGH)

00:01:06.300 --> 00:01:15.000
word list reading fluency in kindergarten, first grade and second grade, and also the measure of word

00:01:15.000 --> 00:01:21.800
reading fluency after the intervention, for those children who received the intervention, and after

00:01:21.800 --> 00:01:29.100
the same amount of time for the control children, who did not receive the intervention. In jamovi, like

00:01:29.100 --> 00:01:36.500
in many other statistical software programs, each row in this spreadsheet of data concerns one case.

NOTE Confidence: 89% (HIGH)

00:01:36.500 --> 00:01:44.800
Usually in our field this means one person; here it is one child. Therefore when I talk about

00:01:44.800 --> 00:01:53.300
independent samples this refers to different groups of children.
Therefore the different sets of

00:01:53.300 --> 00:01:59.200
values that we want to compare in an independent samples t-test would have to correspond to

00:01:59.200 --> 00:02:07.400
different rows. For example, if we want to compare the vocabulary

NOTE Confidence: 91% (HIGH)

00:02:08.199 --> 00:02:11.000
in kindergarten

NOTE Confidence: 85% (HIGH)

00:02:11.000 --> 00:02:20.100
between the two sexes, we see that we need the vocabulary variable for each child and the sex variable

00:02:20.100 --> 00:02:28.800
for each child, so the sex variable defines the groups by its values. So all the rows that have F will

00:02:28.800 --> 00:02:35.100
belong to the female group, the girls group, and all the rows that have M will belong to the boys

00:02:35.100 --> 00:02:36.200
group.

NOTE Confidence: 91% (HIGH)

00:02:36.200 --> 00:02:40.000
Once you have your data sorted out in this way,

NOTE Confidence: 79% (HIGH)

00:02:40.000 --> 00:02:46.300
we click on T-Tests, Independent Samples T-Test,

NOTE Confidence: 91% (HIGH)

00:02:46.600 --> 00:02:53.100
and we can enter our quantitative dependent variable

NOTE Confidence: 81% (HIGH)

00:02:53.900 --> 00:03:02.800
in the dependent variables panel, and our grouping or independent variable, which must be a categorical

00:03:02.800 --> 00:03:07.150
variable with only two levels, as the grouping variable.

NOTE Confidence: 88% (HIGH)

00:03:07.150 --> 00:03:13.500
If we try to use a categorical variable with more than two different levels, this is not going to

00:03:13.500 --> 00:03:20.600
work, because t-tests admit only two levels. If there are three levels or more in your variable, you

00:03:20.600 --> 00:03:26.100
will need to use a filter to select only the cases for two groups.
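
NOTE For readers who want to see the same idea outside jamovi, here is a minimal Python sketch of how a two-level grouping variable splits the rows into two independent samples, and how Student's t is then computed. The scores and the names rows, vocab_k, split_by_group and student_t are made up for illustration; jamovi does all of this through its menus, and a p-value would additionally need a statistics library.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data: one row per child, with made-up vocabulary scores.
rows = [
    {"sex": "F", "vocab_k": 10.0},
    {"sex": "M", "vocab_k": 9.0},
    {"sex": "F", "vocab_k": 12.0},
    {"sex": "M", "vocab_k": 11.0},
    {"sex": "F", "vocab_k": 14.0},
    {"sex": "M", "vocab_k": 13.0},
]


def split_by_group(rows, group_key, value_key):
    """Collect the dependent variable into one list per level of the grouping variable."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    if len(groups) != 2:
        # Same restriction as described above: exactly two levels.
        raise ValueError("need exactly two groups; filter the rows first")
    return groups


def student_t(a, b):
    """Student's t statistic and degrees of freedom (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


groups = split_by_group(rows, "sex", "vocab_k")
t, df = student_t(groups["F"], groups["M"])
print(f"t({df}) = {t:.2f}")
```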

NOTE Confidence: 91% (HIGH)

00:03:26.400 --> 00:03:34.550
As you see, as soon as we entered these two variables we received some results on the right side

00:03:34.550 --> 00:03:36.350
of the jamovi screen.

NOTE Confidence: 91% (HIGH)

00:03:36.350 --> 00:03:44.700
However, before we proceed to interpret those results, we first have to check whether our test is

00:03:44.700 --> 00:03:49.200
actually interpretable. We must go to assumption checks

NOTE Confidence: 87% (HIGH)

00:03:49.200 --> 00:03:56.700
and click on the homogeneity and normality tests. And it never hurts to take a look at the Q-Q plot as

00:03:56.700 --> 00:03:57.750
well.

NOTE Confidence: 86% (HIGH)

00:03:57.750 --> 00:04:06.700
In this case we see that the normality test, which is the familiar Shapiro-Wilk test, produces a

00:04:06.700 --> 00:04:16.300
non-significant p-value. This means that our data distribution is not significantly different from the

00:04:16.300 --> 00:04:20.500
normal distribution. This is what a normality test does:

NOTE Confidence: 89% (HIGH)

00:04:20.500 --> 00:04:28.800
it tests the hypothesis that the data are in a different distribution from the normal distribution.

00:04:28.800 --> 00:04:36.150
Therefore the null hypothesis for this test is that our data are in a distribution that is

00:04:36.150 --> 00:04:42.100
indistinguishable from the normal distribution. In other words, to pass the test for the normality

00:04:42.100 --> 00:04:49.100
assumption this p-value must be high: it must be higher than the alpha level, higher than the conventional

00:04:49.100 --> 00:04:51.049
level of 0.05,

NOTE Confidence: 81% (HIGH)

00:04:51.049 --> 00:04:59.600
in which case we cannot reject the null hypothesis, which means that we do not reject the assumption

00:04:59.600 --> 00:05:07.500
that our data are consistent with the normal distribution.
This is the case here. Then we look at

00:05:07.500 --> 00:05:14.100
the homogeneity of variances test. This is Levene's test; it tests whether the variances in the

00:05:14.100 --> 00:05:20.650
two groups are different. Therefore a statistically significant value here,

NOTE Confidence: 82% (HIGH)

00:05:20.650 --> 00:05:23.049
a low p-value,

NOTE Confidence: 91% (HIGH)

00:05:23.049 --> 00:05:31.300
would lead us to reject the null hypothesis that the two variances are equal, indicating that our two

00:05:31.300 --> 00:05:33.900
groups have unequal variances.

NOTE Confidence: 84% (HIGH)

00:05:33.900 --> 00:05:42.900
In that case, if we had unequal variances, that is, a statistically significant Levene's test, we

00:05:42.900 --> 00:05:52.250
would simply need to check Welch's instead of Student's and use these values to report our t-test,

00:05:52.250 --> 00:05:59.800
because Welch's t-test is one in which a difference in variances is accounted for.

NOTE Confidence: 90% (HIGH)

00:06:01.700 --> 00:06:08.600
If we had received a statistically significant p-value in the normality test, this would mean that we

00:06:08.600 --> 00:06:15.400
cannot use the t-test and should choose the Mann-Whitney U test, the corresponding nonparametric

00:06:15.400 --> 00:06:17.350
test, instead.

NOTE Confidence: 91% (HIGH)

00:06:17.350 --> 00:06:25.600
In our case neither of these tests produces a statistically significant value, therefore we cannot

00:06:25.600 --> 00:06:33.400
reject either null hypothesis. We cannot reject the null hypothesis that our data distribution is

00:06:33.400 --> 00:06:39.250
normal, and we cannot reject the null hypothesis that the variances of the two groups are equal.
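
NOTE The decision rule just described can be sketched in a few lines of Python. This is only an illustration of the reasoning in this video, not jamovi's code: the function names choose_test and welch_t, the example p-values and the scores are assumptions made for the sketch. The second function shows how Welch's version avoids pooling the two variances.

```python
from math import sqrt
from statistics import mean, variance

ALPHA = 0.05  # conventional alpha level


def choose_test(normality_p, levene_p, alpha=ALPHA):
    """Pick the test to report from the two assumption-check p-values."""
    if normality_p <= alpha:
        # Normality rejected: use the nonparametric alternative.
        return "Mann-Whitney U"
    if levene_p <= alpha:
        # Equal variances rejected: use the unequal-variances t-test.
        return "Welch's t"
    return "Student's t"


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


print(choose_test(normality_p=0.40, levene_p=0.30))
print(choose_test(normality_p=0.40, levene_p=0.01))
print(choose_test(normality_p=0.01, levene_p=0.30))
print(welch_t([10.0, 12.0, 14.0], [9.0, 11.0, 13.0]))
```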

00:06:39.250 --> 00:06:44.250
Therefore we can use the standard Student's t-test,

NOTE Confidence: 90% (HIGH)

00:06:44.250 --> 00:06:53.500
and we can also choose to look at the descriptives, a plot of the descriptives, the effect size and

00:06:53.500 --> 00:06:55.600
its confidence interval.

NOTE Confidence: 84% (HIGH)

00:06:55.600 --> 00:07:02.200
As you can see, this gives us all the information we will need to report. Here are the group

00:07:02.200 --> 00:07:09.300
descriptives that we need to report, the means and standard deviations. We can see graphically the

00:07:09.300 --> 00:07:16.450
difference between the two groups, and we can see the means along with their 95% confidence intervals,

00:07:16.450 --> 00:07:19.500
as well as the medians.

NOTE Confidence: 67% (MEDIUM)

00:07:21.000 --> 00:07:27.100
And in the test results, this is the value of our t statistic,

NOTE Confidence: 88% (HIGH)

00:07:27.100 --> 00:07:33.750
so the statistic goes with whatever is here. This is the Student's t statistic,

NOTE Confidence: 91% (HIGH)

00:07:33.750 --> 00:07:40.500
the number of degrees of freedom, the p-value, and here's the effect size, which in this case is

00:07:40.500 --> 00:07:49.300
Cohen's d. This is the value of d, and this is the confidence interval for d. As we see, in this case

00:07:49.300 --> 00:07:55.200
the p-value is much greater than the conventional alpha level. This is not a statistically

00:07:55.200 --> 00:08:01.900
significant difference; we cannot reject the null hypothesis that girls and boys have the same mean

00:08:01.900 --> 00:08:04.050
vocabulary in kindergarten.

NOTE Confidence: 76% (HIGH)

00:08:04.050 --> 00:08:10.100
And we see that the confidence interval for the effect size includes 0: it starts with a negative

00:08:10.100 --> 00:08:13.300
number and ends with a positive number.
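
NOTE As a rough illustration of the effect size mentioned here, Cohen's d for two independent groups is the mean difference divided by the pooled standard deviation. The sketch below uses made-up scores and made-up interval bounds; the names cohens_d and interval_includes_zero are assumptions for the example, and the confidence interval itself is not computed because it needs a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: difference in means in units of the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd


def interval_includes_zero(lower, upper):
    """True when the interval runs from a non-positive to a non-negative bound."""
    return lower <= 0.0 <= upper


# Hypothetical scores for two groups of children.
girls = [10.0, 12.0, 14.0]
boys = [9.0, 11.0, 13.0]
d = cohens_d(girls, boys)
print(f"d = {d:.2f}")

# Hypothetical interval bounds, only to show the check described above.
print(interval_includes_zero(-0.40, 0.90))
```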

NOTE Confidence: 91% (HIGH)

00:08:13.300 --> 00:08:18.200
So we cannot conclude that there is a difference in vocabulary between boys and girls in

00:08:18.200 --> 00:08:19.900
kindergarten.

NOTE Confidence: 91% (HIGH)

00:08:20.700 --> 00:08:31.200
Let us look at one more example of this. If we go back and, instead of sex, use home language as our

00:08:31.200 --> 00:08:38.600
grouping variable, this amounts to comparing those with a majority home language to those with a

00:08:38.600 --> 00:08:45.100
minority home language on the majority vocabulary in kindergarten.

NOTE Confidence: 90% (HIGH)

00:08:45.100 --> 00:08:52.250
As we see here, the homogeneity of variances test is not significant,

NOTE Confidence: 91% (HIGH)

00:08:52.250 --> 00:08:59.500
which is a little suspicious because there are actually very few children in the minority home

00:08:59.500 --> 00:09:04.500
language group. There are only eight children; this is a very small group, and you should never run

00:09:04.500 --> 00:09:07.400
analyses with such small groups.

NOTE Confidence: 89% (HIGH)

00:09:07.500 --> 00:09:14.600
It is quite possible that the variances of the two groups are in fact very different, but the very

00:09:14.600 --> 00:09:21.900
small sample size for the minority home language group does not allow the homogeneity of variance

00:09:21.900 --> 00:09:28.200
test to be significant. Of course we don't know that that's true, but we cannot really trust any

00:09:28.200 --> 00:09:33.500
estimates made on the basis of eight children alone. So we're only proceeding for the sake of

00:09:33.500 --> 00:09:35.550
illustrating this example.

NOTE Confidence: 89% (HIGH)

00:09:35.550 --> 00:09:43.800
As we see here, the normality test is actually not passed, because this is a significant p-value, which

00:09:43.800 --> 00:09:50.250
indicates a statistically significant difference of our distribution from the normal distribution.

NOTE Confidence: 85% (HIGH)

00:09:50.250 --> 00:09:53.850
Therefore we cannot report this test;

NOTE Confidence: 83% (HIGH)

00:09:53.850 --> 00:09:58.500
we uncheck Student's and check Mann-Whitney.

NOTE Confidence: 84% (HIGH)

00:09:59.200 --> 00:10:03.850
And now this is our U statistic

NOTE Confidence: 80% (HIGH)

00:10:03.850 --> 00:10:07.099
and the corresponding p-value,

NOTE Confidence: 84% (HIGH)

00:10:07.099 --> 00:10:14.200
and this is what we can report when we want to compare the majority and minority home language

00:10:14.200 --> 00:10:18.150
children. Here are the corresponding graphs

NOTE Confidence: 84% (HIGH)

00:10:18.150 --> 00:10:23.800
and the group descriptives that we also need to report.

NOTE Confidence: 84% (HIGH)

00:10:24.700 --> 00:10:30.100
Let us now look at paired-samples cases,

NOTE Confidence: 91% (HIGH)

00:10:30.100 --> 00:10:41.050
a different kind of test. In this data set we could compare, for example, the same kind of measurement

00:10:41.050 --> 00:10:49.400
across grades. Obviously we would expect children to have higher scores on every kind of test between

00:10:49.400 --> 00:10:50.550
grades.

NOTE Confidence: 88% (HIGH)

00:10:50.550 --> 00:10:57.600
For example, we can test the hypothesis that the children have a higher word reading fluency in grade

00:10:57.600 --> 00:10:59.900
2 than in grade 1.

NOTE Confidence: 90% (HIGH)

00:10:59.900 --> 00:11:05.700
These are the same children who have scores in reading fluency in both grades, so we are comparing

00:11:05.700 --> 00:11:12.300
score pairs that are on the same row, and this is very important for being able to run a paired

00:11:12.300 --> 00:11:19.450
samples test: the paired samples have to actually lie on the same row in the data spreadsheet.

NOTE Confidence: 89% (HIGH)

00:11:19.450 --> 00:11:26.700
This is very natural because the spreadsheet has one row per child, so this is how you would normally

00:11:26.700 --> 00:11:35.100
have your data anyway. However, we're not going to run this test because it doesn't make any sense,

00:11:35.100 --> 00:11:42.600
because all the scores in grade 2 are greater than in grade 1, which is of course what you

00:11:42.600 --> 00:11:49.650
would expect, therefore we cannot calculate a reasonable statistic using this method.

NOTE Confidence: 91% (HIGH)

00:11:49.650 --> 00:11:56.099
It is clear that the difference is statistically significant, as there is not even any overlap.

NOTE Confidence: 88% (HIGH)

00:11:56.099 --> 00:12:02.700
So we have to choose a different example for our test here, and we are going to check whether the

00:12:02.700 --> 00:12:10.600
children who received the intervention improved in their word fluency scores, that is, whether they had a

00:12:10.600 --> 00:12:17.800
greater word reading fluency score after the intervention than before the intervention. It's not

00:12:17.800 --> 00:12:25.200
enough to just compare these two variables now, because first we have to select only those

00:12:25.200 --> 00:12:26.500
children who received

NOTE Confidence: 88% (HIGH)

00:12:26.500 --> 00:12:36.400
the intervention. So I clicked on filter and I have to type that I want condition to be intervention;

NOTE Confidence: 87% (HIGH)

00:12:36.400 --> 00:12:43.300
this is the value that I want to calculate the analysis for. So now all the control group children

00:12:43.300 --> 00:12:45.150
are deselected.

NOTE Confidence: 87% (HIGH)

00:12:45.150 --> 00:12:51.700
As we've said, from a methodological point of view this is not the appropriate way to run the study

00:12:51.700 --> 00:12:58.300
or the analysis, but here it is just an example of how to run a paired samples t-test.
So we have

00:12:58.300 --> 00:13:01.950
selected only children in the intervention group.

NOTE Confidence: 89% (HIGH)

00:13:01.950 --> 00:13:11.100
We can also specify, perhaps, that we're interested in children with a majority home language as an

00:13:11.100 --> 00:13:13.600
additional restriction,

NOTE Confidence: 91% (HIGH)

00:13:14.400 --> 00:13:21.500
because you might be afraid that the effects could be different for minority language speakers.

NOTE Confidence: 91% (HIGH)

00:13:23.300 --> 00:13:30.600
So you can exclude those children even if some of them actually received the intervention,

NOTE Confidence: 84% (HIGH)

00:13:36.100 --> 00:13:48.200
and now we are ready to run our tests. Going to Analyses, T-Tests, Paired Samples T-Test,

NOTE Confidence: 86% (HIGH)

00:13:48.200 --> 00:13:55.900
we want to compare word reading fluency in grade 2, which is our baseline measure before the

00:13:55.900 --> 00:14:04.100
intervention, with the word reading fluency post-test, after the intervention, for this selected subset of

00:14:04.100 --> 00:14:05.300
children.

NOTE Confidence: 91% (HIGH)

00:14:05.300 --> 00:14:12.600
Before we get excited about reporting this, we should go and check the assumptions.

NOTE Confidence: 85% (HIGH)

00:14:12.600 --> 00:14:19.250
As you note, there isn't any homogeneity of variance test here, and that's because there aren't any

00:14:19.250 --> 00:14:21.550
groups but just one.

NOTE Confidence: 88% (HIGH)

00:14:21.550 --> 00:14:30.000
The paired samples t-test is just a one-sample t-test which is run on the differences, so there

00:14:30.000 --> 00:14:34.600
aren't two groups whose variances we could compare; there's just one group, and all that we're

00:14:34.600 --> 00:14:41.400
interested in is its distribution.
The p-value here indicates that there is no

00:14:41.400 --> 00:14:48.650
statistically significant difference from the normal distribution, so we are good to report Student's t

NOTE Confidence: 86% (HIGH)

00:14:48.650 --> 00:14:57.400
for this comparison, and here is our t statistic. And here are the degrees of freedom and the p-value.

00:14:57.400 --> 00:15:05.200
And in case you're wondering whether we can choose how many decimal places to show here, jamovi gives

00:15:05.200 --> 00:15:12.100
us the opportunity of setting this if we go and click on the three vertical dots at the top right

00:15:12.100 --> 00:15:19.050
corner, so below the x that closes the program and above the plus sign for the modules. In between

NOTE Confidence: 86% (HIGH)

00:15:19.050 --> 00:15:21.450
there are three vertical dots,

NOTE Confidence: 83% (HIGH)

00:15:21.450 --> 00:15:28.849
and in the menu that pops out we can choose the number format. jamovi by default shows three

00:15:28.849 --> 00:15:36.000
significant figures, that means three significant digits, which means that it will show as many

00:15:36.000 --> 00:15:41.750
decimal places as needed so that we have three significant digits.

NOTE Confidence: 83% (HIGH)

00:15:41.750 --> 00:15:50.100
Three reliable places, no extra decimals, but we can choose to have, for example, three decimal places

00:15:50.100 --> 00:15:52.000
no matter what.

NOTE Confidence: 82% (HIGH)

00:15:52.400 --> 00:15:59.200
We go back here, and so now we have three decimal places. You do not need to report t with three

00:15:59.200 --> 00:16:06.600
decimal places in any result, so this was just to show you how it's possible to change this. You

00:16:06.600 --> 00:16:10.800
should be reporting t with two decimal places.

NOTE Confidence: 78% (HIGH)

00:16:11.300 --> 00:16:21.200
You can choose two decimal places here, or you can leave the default and get two or one as necessary;

00:16:21.200 --> 00:16:25.500
most of the time this isn't what you want.
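
NOTE The difference between three significant digits and a fixed number of decimal places can be seen in a small Python sketch. This uses Python's own format codes as an analogy and is not a description of how jamovi formats numbers internally; the function names and the example values are made up for illustration.

```python
def three_significant(x):
    """Three significant digits; Python's 'g' format drops trailing zeros."""
    return f"{x:.3g}"


def fixed_decimals(x, places):
    """Always show the requested number of decimal places."""
    return f"{x:.{places}f}"


# Hypothetical statistics of different magnitudes.
for value in (3.881984, 12.3456, 0.04567):
    print(
        three_significant(value),   # as many decimals as three digits need
        fixed_decimals(value, 3),   # three decimals no matter what
        fixed_decimals(value, 2),   # two decimals, as recommended for t
    )
```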

NOTE Confidence: 89% (HIGH)

00:16:27.100 --> 00:16:34.500
You can also choose here to show the descriptives and plots of the descriptives, so we can get the

00:16:34.500 --> 00:16:40.100
mean and standard deviation for each of the two conditions. These aren't different groups now; these are

00:16:40.100 --> 00:16:47.400
different time points, before and after the intervention. And these are the graphical representations,

00:16:47.400 --> 00:16:54.300
including means with confidence intervals and medians. Note that these confidence intervals are not

00:16:54.300 --> 00:16:56.400
relevant for our comparison,

NOTE Confidence: 89% (HIGH)

00:16:56.400 --> 00:17:01.200
because we're not interested in the differences between children; we're interested in the

00:17:01.200 --> 00:17:04.300
differences between time points.

NOTE Confidence: 81% (HIGH)

00:17:04.300 --> 00:17:12.300
And this is the Q-Q plot, which doesn't show any reason to worry about deviation from normality. If we

00:17:12.300 --> 00:17:20.599
had gotten a significant p-value here, so a low p-value below the conventional alpha level, we should

00:17:20.599 --> 00:17:28.000
not be reporting Student's t-test; instead we should be reporting the Wilcoxon rank test.

NOTE Confidence: 84% (HIGH)

00:17:28.000 --> 00:17:37.750
And here is the W statistic and the associated p-value, which of course in this case is again

00:17:37.750 --> 00:17:44.600
statistically significant. There is no difference between these two in what the conclusion should be.

00:17:44.600 --> 00:17:52.100
Either with the parametric t-test or with the nonparametric Wilcoxon rank test, we reject the null

00:17:52.100 --> 00:17:57.550
hypothesis and conclude that there was a difference between word fluency

NOTE Confidence: 91% (HIGH)

00:17:57.550 --> 00:18:03.000
scores before and after the intervention for this group of children.
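
NOTE To make the paired comparison concrete, here is a minimal Python sketch with made-up before and after scores. It computes the paired t statistic as a one-sample t on the row-wise differences, and a Wilcoxon W as the sum of the ranks of the positive differences. The function names, the data and the after-minus-before direction are assumptions for the example; software packages differ in which direction and which rank sum they report, and p-values would need a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired t: a one-sample t-test on the per-row differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def wilcoxon_w(before, after):
    """Sum of ranks of the positive differences; zero differences are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r for d, r in zip(diffs, ranks) if d > 0)


# Hypothetical scores; each position is one child (one spreadsheet row).
before = [20.0, 25.0, 30.0, 22.0, 28.0]
after = [24.0, 27.0, 36.0, 23.0, 33.0]
t, df = paired_t(before, after)
print(f"t({df}) = {t:.2f}, W = {wilcoxon_w(before, after)}")
```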

NOTE Confidence: 86% (HIGH)

00:18:05.400 --> 00:18:15.500
You can also report the effect size for the Student's t-test, that's Cohen's d, and you can report

00:18:15.500 --> 00:18:21.050
its confidence interval if that's required by where you are submitting.

NOTE Confidence: 90% (HIGH)

00:18:21.050 --> 00:18:25.700
And that's how you run a t-test in jamovi.