NONPARAMETRIC TESTS OF DISTRIBUTIONS
The simplest case is a two-sample test of distributions, the Kolmogorov-Smirnov (K-S) Test. The K-S Test is based on the entire set of ICA values obtained from a single subject on two tasks. Because this test is not as well-known as alternative tests such as ANOVA or regression, we provide more detail here than for the others.
Consider the nature of the ICA. It is an estimate of brain activity based upon the signal of pupil size. When an individual performs a task, the brain responds, and a stream of ICA values is computed, with one value for each second of the task. Now suppose the individual performs a second task. If his brain is responding in basically the same way under the same conditions as on the first task, then we should expect to see roughly the same workload for the second task as for the first. The values won’t necessarily line up exactly the same, but the entire set of ICA values on each task should be very similar. This is the rationale behind the two-sample test of distributions. Each task generates a sample set of values, and the null hypothesis is that the two samples do not differ in shape or location.
An example may be useful: three ICA signals are presented in the graphs below. These ICA signals were generated by one subject performing the three different tasks described in FollowSolveListen, with each task lasting 40 seconds (time is on the x axis and ICA size is on the y axis). Notice the many oscillations in all signals. The means of the signals confirm that Task3 is the largest, followed by Task2 and then Task1, with values of 0.262, 0.399, and 0.485 for Task1, Task2, and Task3, respectively.
What makes the K-S Test so useful is that it allows us to compare signals like these. If we only looked at mean values, we would not be able to make a statistical test. With only three data points, it would be impossible to say anything other than one appears larger than the other. However, when we consider all ICA values generated across any two tasks, we have both a sufficient number of data points and an underlying rationale for making the tests.
Each signal has many high and low values, and it is impossible to tell by looking at any two signals that they are, in fact, significantly different. We look instead at the empirical distribution functions derived from them. Suppose we wish to compare signal 1 with signal 2. Signal 1 has n1 observations and signal 2 has n2 observations. The mechanics of the K-S statistic combine all n1+n2 observations and order them from least to greatest. Cumulative probability distributions are then created with respect to each signal. The statistical test compares these two cumulative probability distributions.
Three tests can be done here, one for each possible pairing of signals. If the same process is generating the workload elicited by the two tasks under comparison, the resulting cumulative probability distributions should be similar. If the process has changed from one task to the next, the distributions should differ.
To make the cumulative probability distributions, create two new vectors each of size n1+n2. The first element in one vector will be the number of elements in signal 1 equal to or less than the first value in the combined set. (Remember that the smallest value in the combined set will be first, and the largest value will be last.) Do the same thing for the second value in the combined set, and so on until you reach the largest (last) value. The final element in the vector must be equal to n1, because all elements in signal 1 must be less than or equal to the largest value in the combined set. Repeat the process with the second new vector, comparing the elements in signal 2 to the combined set. The final element in the second vector will be n2. Rescale the values in both vectors by dividing each element in the first vector by n1 and each element in the second vector by n2. You now have the cumulative probability functions based on the two signals. The K-S statistic is calculated as the maximum absolute difference between the corresponding elements of these two cumulative probability functions.
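The steps above can be sketched in a few lines of Python. This is an illustrative implementation, not the code used to produce the examples in this section; the function and variable names are our own.

```python
import numpy as np

def ks_statistic(signal1, signal2):
    """Two-sample K-S statistic, following the steps described above."""
    s1 = np.sort(np.asarray(signal1, dtype=float))
    s2 = np.sort(np.asarray(signal2, dtype=float))
    n1, n2 = len(s1), len(s2)

    # Combine all n1 + n2 observations and order them from least to greatest.
    combined = np.sort(np.concatenate([s1, s2]))

    # For each value in the combined set, count how many elements of each
    # signal are less than or equal to it, then rescale by the sample size
    # to obtain the two cumulative probability functions.
    cdf1 = np.searchsorted(s1, combined, side="right") / n1
    cdf2 = np.searchsorted(s2, combined, side="right") / n2

    # The K-S statistic is the maximum absolute difference between
    # corresponding elements of the two cumulative probability functions.
    return np.max(np.abs(cdf1 - cdf2))
```

Identical samples give a statistic of 0, while completely non-overlapping samples give the maximum possible value of 1.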
The three pairwise tests for this example are all statistically significant at p<0.001. The graphs below show the two cumulative probability distributions used for each test. The green arrows indicate the maximum deviations of the K-S Tests. In this case, the three maximum observed differences are 0.625, 0.825, and 0.500.
So, what can we learn from the K-S Tests? We would most like to know two things: at which values of the ICA are the signals most different? And which signals contain the highest and lowest levels of workload?
We can see the answer to the first question in the plots of the probability distributions above. In this example, all three plots show that the largest difference between the plotted lines is around the midpoint of the x axis.
We can answer the second question by turning to a different visualization of the signals. Take each original signal of ICA values, re-order each one from lowest to highest, and compute the cumulative sum over the entire set. The results are:
This graph shows that the cumulative sums taken over 40 seconds of data are quite different. The black line has the highest cumulative sum and lies well above the other two. The red line climbs more rapidly than the blue one. Notice the clear separation among all three lines, especially over the last half of the graph. In general, the more two lines differ, the larger the statistical difference between the two. For example, the greatest difference on the K-S test described above was between Task1 and Task3, with a max difference of 0.825. The two lines that are most distinct from each other in this plot are the same two tasks.
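The cumulative-sum curves described above are simple to compute: sort each signal in ascending order, then take a running total. A minimal sketch (the function name is ours):

```python
import numpy as np

def cumulative_sum_curve(ica_signal):
    """Re-order the ICA values from lowest to highest, then accumulate
    them to produce one curve of the cumulative sum plot."""
    return np.cumsum(np.sort(np.asarray(ica_signal, dtype=float)))

# A signal containing more large ICA values ends at a higher cumulative
# sum, so its curve lies above those of lower-workload signals.
```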
As a frame of reference, here is a second example in which the ICA signals are not different from each other. Below are three ICA signals, all taken from the same person, all spanning 40 seconds, and all based on a common task (reading news on the internet). Note that it is not possible to tell just by looking whether the signals differ. It is necessary to test the differences statistically.
Using the calculations described above for the K-S test, we get the following cumulative probability distributions and cumulative sum plots. All statistical tests on these data were non-significant. The maximum observed differences for the K-S tests were 0.15, 0.10, and 0.15 for the three comparisons, with p-values of 0.70, 0.98, and 0.70.
Compare these graphs with the ones from the first example. The visual differences are striking.
Remember that a detected difference between two signals may be due to location and/or shape. Plotting the probability distributions as well as the cumulative sums will tell you something about signal differences. The probability distributions indicate where the signals are most different. The cumulative sums indicate which signals have a greater number of large ICA values in them and thus have overall higher workload.
When two signals are similar, their plotted probability distributions lie very close to each other, with no large discernible differences. When the two signals are highly different, the plots separate widely at some point. Likewise, for similar signals, their respective cumulative sum plots tend to overlap, with very little space between them. When the signals are different, the cumulative sum plots spread apart. The signal corresponding to the highest level of workload will be on top and the signal corresponding to the lowest level on the bottom.
It is especially useful to plot the cumulative sums when you have more than two signals to compare. Doing so lets you see quickly which ones are relatively similar—their plots will lie almost on top of each other—and which are distinctly different. You can also see at a glance which correspond to higher workload (because those graphs will lie uppermost in the set) and which correspond to lower workload (because those will lie at the bottom of the set). These graphs do not replace the statistical tests. It is still necessary to make the K-S tests and derive the appropriate p-values for each.
The probability values associated with maximum K-S differences are available in published mathematical tables, and the Kolmogorov-Smirnov two-sample test is a standard option in most statistical analysis packages. However, the output from these tests is usually minimal, consisting only of the maximum difference value and its associated p-value.
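In Python, for instance, the two-sample test is available as scipy.stats.ks_2samp. A minimal sketch is below; the two arrays are simulated stand-ins for 40-second ICA signals (one value per second), not the data from the examples above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated placeholders for two 40-second ICA signals.
task_a = rng.normal(loc=0.26, scale=0.05, size=40)
task_b = rng.normal(loc=0.49, scale=0.05, size=40)

result = stats.ks_2samp(task_a, task_b)
# As noted above, the output is minimal: just the maximum
# difference between the two distributions and its p-value.
print(result.statistic, result.pvalue)
```

Because the packaged output is so sparse, it is worth supplementing it with the probability distribution and cumulative sum plots described earlier.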