Background
Quantitative data
Many measurement studies in computer science yield numerical results that can be ordered and measured in spaces where zero, addition and multiplication have their traditional meaning. Sometimes different axiomatic systems are equally capable.
When the evaluation is performed by people, you must carefully evaluate whether you are in this situation. People have quite different perceptions and preferences, and two people may rank the same two observations in opposite orders. A categorical study (Friedman's test) allows you to test whether a statistically significant number of people make the same ranking decision, or whether a rank between two observations cannot be established. A quantitative study (ANOVA) allows you to assign values to options and assess whether they differ from other options in a statistically significant way.
With Friedman's test, you can establish a partial order between observations. With ANOVA, you can establish average values with a standard deviation and determine the statistical significance of the difference between options.
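To make the rank-based approach concrete, here is a minimal sketch of the Friedman statistic in plain Python. The ratings are invented for illustration and are not taken from the user study; the function names are hypothetical helpers. Each row holds one participant's ratings and each column one video. The sketch applies the usual correction for tied ranks, which matters for Likert data, and it would fail with a division by zero if every participant gave identical ratings to all videos.

```python
import math

def average_ranks(values):
    """Rank values ascending (1 = lowest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(table):
    """Tie-corrected Friedman statistic for rows = raters, columns = options."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
        for value in set(row):
            t = row.count(value)
            tie_term += t * (t * t - 1)
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    return q / correction

# Invented example: 4 participants rate 3 videos on a 5-point scale.
toy_ratings = [
    [5, 3, 1],
    [4, 3, 2],
    [5, 2, 1],
    [4, 3, 1],
]
q = friedman_statistic(toy_ratings)

# The statistic is compared against a chi-square distribution with k - 1
# degrees of freedom. For k = 3 (2 degrees of freedom) the tail probability
# has the closed form exp(-q / 2); this shortcut does NOT hold for other k.
p = math.exp(-q / 2)
```

With so few participants the chi-square approximation is rough, so treat the p-value as illustrative only. If SciPy is available, `scipy.stats.friedmanchisquare` performs the same test for any number of options.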
System-under-test
Together, you have conducted a user study in which participants compared the quality of several videos. The results of the user study can be found here: all.csv
These videos emulate the uneven visual quality that is typical when a visual region of a 360-degree panorama video is delivered to a mobile phone or VR headset using adaptive video streaming over HTTP. You have collected several ratings on a 5-point Likert scale for each of the example videos, and are now prepared to determine whether the individual videos can be ranked according to a global order, or whether the videos can be grouped so that the groups form a total order.
Task
First, arrange the video ratings from each participant in the study in a table according to the procedure for ANOVA. Test whether you can achieve a high confidence in distinguishing all example videos from each other. If yes, document and illustrate your results.
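The arrangement step can be sketched as follows. The layout of all.csv is not described here, so this example assumes a long format with the hypothetical columns `participant`, `video` and `rating`, and it reads a small invented sample from a string instead of the real file. It builds one row per participant and one column per video, then computes the one-way ANOVA F statistic with the videos as groups. Note that one-way ANOVA treats all ratings as independent; since every participant rates every video, a repeated-measures design may be the better fit.

```python
import csv
import io
from collections import defaultdict

# Invented sample in an ASSUMED long format; the real all.csv may differ.
SAMPLE = """participant,video,rating
p1,v1,5
p1,v2,3
p1,v3,1
p2,v1,4
p2,v2,3
p2,v3,2
p3,v1,5
p3,v2,4
p3,v3,1
p4,v1,4
p4,v2,2
p4,v3,2
"""

def load_table(handle):
    """Return {participant: {video: rating}} from a long-format CSV."""
    table = defaultdict(dict)
    for row in csv.DictReader(handle):
        table[row["participant"]][row["video"]] = int(row["rating"])
    return table

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of rating lists."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

table = load_table(io.StringIO(SAMPLE))
videos = sorted({video for ratings in table.values() for video in ratings})

# One list of ratings per video, in the same participant order throughout.
groups = [[table[p][video] for p in table] for video in videos]
f_value, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

To turn the F value into a p-value you need the F distribution with the two degrees of freedom shown; `scipy.stats.f_oneway` returns both numbers directly if SciPy is installed. To use the real data, replace `io.StringIO(SAMPLE)` with an opened file handle and adjust the column names to whatever all.csv actually contains.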
If no, define groups of videos. It is typical at this point to group videos by properties of the test input. In the given test data, you could group videos by the percentage of low- vs. high-quality pixels in the video, by the intensity of the introduced blur, or by content. However, you could also group the videos by the similarity of the values that you discovered in the first step.
The grouping method is your choice. You should choose a single grouping method that allows you to establish a ranking, and then explain your grouping method as well as the results.
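One possible way to apply a grouping is sketched below. Both the ratings and the blur labels are invented, and `blur_level` is a hypothetical mapping that you would replace with whichever property you choose. The sketch pools the ratings of all videos in a group and orders the groups by mean rating, reporting the standard deviation alongside.

```python
from collections import defaultdict
from statistics import mean, stdev

# Invented ratings per video (one value per participant).
ratings = {
    "v1": [5, 4, 5, 4],
    "v2": [4, 5, 4, 4],
    "v3": [3, 2, 3, 3],
    "v4": [1, 2, 2, 1],
}

# Hypothetical grouping property; substitute your own grouping method.
blur_level = {"v1": "low", "v2": "low", "v3": "medium", "v4": "high"}

def pool_by_group(ratings, group_of):
    """Merge the ratings of all videos that share a group label."""
    pooled = defaultdict(list)
    for video, values in ratings.items():
        pooled[group_of[video]].extend(values)
    return dict(pooled)

def rank_groups(pooled):
    """Return (group, mean, sample standard deviation), best mean first."""
    summary = [(name, mean(values), stdev(values)) for name, values in pooled.items()]
    return sorted(summary, key=lambda item: item[1], reverse=True)

pooled = pool_by_group(ratings, blur_level)
ranking = rank_groups(pooled)
for name, avg, sd in ranking:
    print(f"{name}: mean={avg:.2f} sd={sd:.2f}")
```

Ordering by mean is only a candidate ranking. To claim that the groups are distinguishable, the pooled groups still have to go through the same significance test that you applied to the individual videos.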