This week’s assignment was to apply ANOVA to a dataset of patient pain ratings at different levels of stress. The patients rated their pain levels while on the test drug on a scale from 1 to 10 (with 10 being the most pain), and their stress states were recorded as high, moderate, and low stress.
The null hypothesis is that there is no difference between the means of the pain ratings at different stress levels.
To test this, I copied the table of data from the assignment website. Then I cleaned the dataset up a bit using the gather() function from the tidyr package to convert the data to two columns: one with the stress level and one with the pain rating. Then I used aov() to fit the ANOVA and TukeyHSD() to compare mean pain between specific pairs of stress groups.
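As a rough illustration of the reshaping step, here is a small sketch on a toy table. The name toy_wide and its values are made up for demonstration and are not the assignment data; only the column names mirror the stress groups.

```r
library(tidyr)

# Hypothetical wide table: one column per stress level (values are invented)
toy_wide <- data.frame(
  high_stress     = c(9, 8),
  moderate_stress = c(7, 8),
  low_stress      = c(4, 3)
)

# gather() stacks every column into key/value pairs:
# "stress" holds the old column name, "pain" holds the rating.
# factor_key = TRUE stores stress as a factor in the original column order.
toy_long <- gather(toy_wide, stress, pain, factor_key = TRUE)

print(toy_long)
```

The long format is what aov() expects: one row per observation, with the grouping variable in its own column. Newer versions of tidyr recommend pivot_longer() for the same job, though gather() still works.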
My code follows.
library("tidyr")

# Read the table and then convert it to two columns detailing stress and pain
migraine <- read.table("G:/week10data.txt", header=TRUE)
migraine_clean <- gather(migraine, stress, pain, factor_key=TRUE)

# Get ANOVA information for the data
migraine_aov <- aov(migraine_clean$pain ~ migraine_clean$stress)

# Print summary information from ANOVA
summary(migraine_aov)

# Print comparisons between groups to determine where variance lies
TukeyHSD(migraine_aov)
The ANOVA summary output was:
                      Df Sum Sq Mean Sq F value   Pr(>F)
migraine_clean$stress  2  82.11   41.06   21.36 4.08e-05 ***
Residuals             15  28.83    1.92
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The p-value (4.08e-05) is well below 0.05 and the F value is large, so the null hypothesis can be rejected. Mean pain ratings differ for at least one of the stress groups.
(To be sure that F was large enough, I ran qf() with 2 and 15 degrees of freedom. At the 95% level the critical F value is 3.682, so the observed F of 21.36 clearly exceeds it.)
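The critical-value check can be sketched as follows. The variable names are my own, and the sums of squares are typed in from the rounded ANOVA table rather than pulled from the fitted model, so the results will only match to a couple of decimal places.

```r
# Degrees of freedom and sums of squares as reported in the ANOVA table
df_between <- 2
df_within  <- 15
ss_between <- 82.11
ss_within  <- 28.83

# F is the ratio of the between-group to the within-group mean square
f_observed <- (ss_between / df_between) / (ss_within / df_within)

# Critical F at the 95% level for (2, 15) degrees of freedom
f_critical <- qf(0.95, df1 = df_between, df2 = df_within)

# Upper-tail probability of seeing an F at least this large under the null
p_value <- pf(f_observed, df_between, df_within, lower.tail = FALSE)

print(c(f_observed = f_observed, f_critical = f_critical, p_value = p_value))
```

Comparing F to the critical value and comparing the p-value to 0.05 are two views of the same test, so they should always agree.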
To find which groups differed, I used TukeyHSD(). The output of the TukeyHSD() function was:
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = migraine_clean$pain ~ migraine_clean$stress)

$`migraine_clean$stress`
                                 diff       lwr        upr     p adj
moderate_stress-high_stress -1.166667 -3.245845  0.9125117 0.3382642
low_stress-high_stress      -5.000000 -7.079178 -2.9208216 0.0000440
low_stress-moderate_stress  -3.833333 -5.912512 -1.7541550 0.0006586
The difference in mean pain between moderate and high stress was not significant (adjusted p = 0.338), but both comparisons involving low stress were significant (adjusted p < 0.001). It therefore appears that pain ratings at low stress are significantly lower than at moderate or high stress, while there is little difference in pain between moderate and high stress.
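As a sanity check on the Tukey intervals, the half-width of each interval can be roughly reproduced by hand. This sketch assumes a balanced design with 6 ratings per group; that is my inference from the 17 total degrees of freedom and the identical interval widths, and the raw data are not shown here. The variable names are my own, and the inputs are the rounded values from the output.

```r
mse         <- 28.83 / 15   # residual mean square from the ANOVA table
n_groups    <- 3
df_residual <- 15
n_per_group <- 6            # ASSUMPTION: balanced design, 18 ratings / 3 groups

# Critical value of the studentized range distribution at the 95% level
q_crit <- qtukey(0.95, nmeans = n_groups, df = df_residual)

# With equal group sizes, every pairwise interval is diff +/- this half-width
half_width <- q_crit * sqrt(mse / n_per_group)

# Differences in means as reported by TukeyHSD()
diffs <- c("moderate-high" = -1.166667,
           "low-high"      = -5.000000,
           "low-moderate"  = -3.833333)

# A pair differs significantly when its interval excludes zero
significant <- abs(diffs) > half_width

print(round(half_width, 3))
print(data.frame(diff = diffs,
                 lwr  = diffs - half_width,
                 upr  = diffs + half_width,
                 significant = significant))
```

The half-width should land near 2.08, which matches the gap between diff and lwr in the TukeyHSD() output. Only the moderate-high difference is smaller than that.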