Using the pbc data set from the last assignment, answer the following questions:
library(survival)
data(pbc)
What is the probability of the null hypothesis that the age of the study participants in normally distributed?
swtest <- shapiro.test(pbc$age)
swtest$p.value
## [1] 0.008918234
Is the proportion of women in the study what would be expected if particpants were selcted randomly from the us population (females = 50.8% of population). What is the p-value associated with that null hypothesis?
p0 = 0.508
ptest <- prop.test(x = sum(pbc$sex=='f'),
n = nrow(pbc),
p = p0)
ptest.alternative <- prop.test(x = rev(table(pbc$sex)), p=p0)
# Note, we have to reverse the order, since the table tabulates males first, but we need females first
ptest$p.value
## [1] 5.263938e-56
ptest.alternative$p.value
## [1] 5.263938e-56
Is there a statistically significant difference between the cholesterol levels of men versus women?
ttest <- t.test(pbc$chol[pbc$sex=='m'], pbc$chol[pbc$sex=='f'])
ttest1 <- t.test(chol ~ sex, data=pbc)
ttest$p.value
## [1] 0.8129404
ttest1$p.value
## [1] 0.8129404
wtest <- wilcox.test(pbc$chol[pbc$sex=='m'], pbc$chol[pbc$sex=='f'])
wtest1 <- wilcox.test(chol ~ sex, data=pbc)
wtest$p.value
## [1] 0.945674
wtest1$p.value
## [1] 0.945674
Is there a statistically significant relationship between sex and disease stage? You will need to use the “table” command to put these columns (pbc[,c(“sex”,“stage”)]) into a contingency table first.
tab1 <- table(pbc[,c('sex','stage')])
chitst <- chisq.test(tab1)
## Warning in chisq.test(tab1): Chi-squared approximation may be incorrect
fishertst <- fisher.test(tab1)
chitst$p.value
## [1] 0.8307365
fishertst$p.value
## [1] 0.7765807
Is there a statistically significant correlation between serum albumin and serum cholesterol?
cortst <- cor.test(pbc$albumin, pbc$chol)
cortst1 <- cor.test(~ albumin + chol, data=pbc)
cortst$p.value
## [1] 0.2414376
cortst1$p.value
## [1] 0.2414376
Is there a statistically significant correlation between age and cholesterol?
cortst <- cor.test(~age +chol, data=pbc)
cortst1 <- cor.test(pbc$age, pbc$chol)
cortst$p.value
## [1] 0.007786129
cortst1$p.value
## [1] 0.007786129
Typically one chooses parametric tests unless there is an obvious deviation from the assumptions based on descriptive statistics and figures. Generally one should not test for normality (i.e. whether a distribution follows a normal distribution), since it will affect your overall Type I error rate which will no longer be 0.05 (if that’s what you planned). Moreover, the tests of normality (Shapiro-Wilks or Kolmogorov-Smirnov) have poor statistical power to detect deviations from normality in small to medium sample sizes. Most parametric tests, including the t-test and chi-square test, are fairly robust to deviations from normality. However if you think the normality assumptions are not met, go ahead and perform the non-parametric versions of your tests, with the knowledge that except for substantial deviations from normality, the statistical power of these tests are lower than the statistical power of the corresponding parametric tests. The Wilcox and Fisher tests are commonly used in small sample situations.
A threshold of 0.05 is typically chosen for statistical significance. However, in many cases a different threshold is used. For example, in GWAS studies, a typical threshold of 5 x 10-8 is used to correct for Type I error.
Generally a p-value should be reported without commenting on statistical significance. The p-value that would meet statistical significance can be decided by the reader based on their understanding of the problem. The traditional thresholds of 0.05, 0.01 and 0.001 are artificial and reflect merely the inability of the founders of statistics to compute critical values of standard distributions for all possible Type I error levels. Current computers make this moot.
The general rubric for hypothesis tests in R is that the p-value can te extracted as testobj$p.value
, where the test that is run is stored in testobj
. For example,
library(survival)
data(pbc)
ttst <- t.test(age ~ sex, data=pbc)
ttst$p.value
There is a package called broom
which makes extracting information from hypothesis tests (and models) a lot easier. This package has a function tidy
that converts the results of a test into a data.frame
. For example,
if(!('broom' %in% installed.packages()[,'Package'])){
install.packages('broom')
}
library(broom)
tidy(ttst)
## # A tibble: 1 x 10
## estimate estimate1 estimate2 statistic p.value parameter conf.low
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 5.55 55.7 50.2 3.20 0.00236 52.2 2.07
## # ... with 3 more variables: conf.high <dbl>, method <chr>,
## # alternative <chr>
Specially for two-sample tests with small sample sizes, the correct test to do is often the permutation test. This test assumes that under the null hypothesis there is no difference in means between the two groups. In that case, under the null hypothesis, we should be able to shuffle the group assignments and see no real difference in the means. To implement this to test, for example, whether in the pbc data, age is different between males and females, we can do the following:
obs_difference <- mean(pbc$age[pbc$sex=='m']) - mean(pbc$age[pbc$sex=='f'])
n <- nrow(pbc)
n_males <- sum(pbc$sex=='m')
combined_data <- pbc$age
perm_means <- rep(0,1000) # initialize storage of permuation null distribution
set.seed(12) # set seed for the random numbers
for(i in 1:1000){ # 100 interations to generate the null distribution
index_male <- sample(1:n, size=n_males, replace = FALSE) # which elemnts are deemed male under H0
index_female <- setdiff(1:n, index_male) # which elements are deemed female under H0
perm_means[i] = mean(pbc$age[index_male])-mean(pbc$age[index_female])
}
hist(perm_means, main = 'Permutation null distribution')
p_value = mean(abs(perm_means) > abs(obs_difference)) # Two-sided alternative
print(p_value)
## [1] 0.001
The permutation test can also be done by the coin
package:
if(!('coin' %in% installed.packages()[,'Package'])){
install.packages('coin')
}
library(coin)
# Permutation test based on 1000 samples
wilcox_test(age ~ sex, data=pbc, distribution=approximate(B=1000))
##
## Approximative Wilcoxon-Mann-Whitney Test
##
## data: age by sex (m, f)
## Z = 3.0052, p-value = 0.001
## alternative hypothesis: true mu is not equal to 0