Hypothesis Testing in R
Key point: Hypothesis testing is one of the workhorses of science. It is how we can draw conclusions or make decisions based on finite samples of data. [stanford - bios221 - chapter6]
Recommended web resource: 生物统计学与R极简手册 (a minimal handbook of biostatistics with R)
Law of large numbers: when the sample size is large enough, the observed frequency of an event approximates its probability.
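A minimal R sketch of the law of large numbers, simulating fair-coin flips (the probability 0.5 and the sample sizes are illustrative assumptions, not from the original text):
# the running frequency of heads approaches the true probability 0.5 as n grows
set.seed(1)
flips = rbinom(10000, size = 1, prob = 0.5)      # 10,000 simulated fair-coin flips
running_freq = cumsum(flips) / seq_along(flips)  # frequency of heads after each flip
running_freq[c(10, 100, 1000, 10000)]            # values get closer and closer to 0.5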
Central limit theorem: in rigorous mathematical form, the central limit theorem states that for large samples the sample mean is approximately normally distributed, regardless of the distribution of the population. If a random variable can be written as the sum of a sequence of independent, identically distributed random variables, the central limit theorem can be applied directly. Used appropriately, it is extremely useful for solving practical problems.
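A minimal R sketch of the central limit theorem, drawing samples from a skewed exponential population (the sample size of 30 and the 5000 replicates are illustrative assumptions):
# means of samples from a clearly non-normal population are approximately normal
set.seed(1)
sample_means = replicate(5000, mean(rexp(30, rate = 1)))  # 5000 sample means, n = 30 each
hist(sample_means, breaks = 50)   # roughly bell-shaped, centered near the population mean 1
qqnorm(sample_means); qqline(sample_means)   # points fall close to a straight line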
1. Common tests fall into parametric tests and non-parametric tests. Parametric tests include t.test; non-parametric tests include wilcox.test and fisher.test (Fisher's exact test).
2. The five steps of hypothesis testing (a minimal worked example is sketched after this list)
1. Design the experiment according to the research question and choose an appropriate test statistic.
2. State the null hypothesis: that there is no difference, no effect.
3. Determine the rejection region.
4. Collect and process the data, and compute the test statistic.
5. Make a decision: if the statistic falls in a small-probability (rejection) region, reject the null hypothesis, on the grounds that a small-probability event should not occur in a single experiment.
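A minimal worked example following these five steps, assuming a hypothetical coin-tossing experiment (39 heads in 50 tosses; the numbers are made up for illustration):
# Steps 1-2: the statistic is the number of heads in 50 tosses; H0: the coin is fair (p = 0.5)
# Step 3: use a two-sided rejection region at the alpha = 0.05 level
# Step 4: collect the data (39 heads observed) and compute the exact binomial p-value
bt = binom.test(39, 50, p = 0.5, alternative = "two.sided")
# Step 5: the p-value is far below 0.05, so we reject the null hypothesis of a fair coin
bt$p.value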
3. The two types of error in hypothesis testing.
|                        | Null hypothesis is true       | Null hypothesis is false       |
| Reject null hypothesis | Type I error (false positive) | True positive                  |
| Do not reject          | True negative                 | Type II error (false negative) |
Which of the two errors to prioritize reducing depends on the situation; there is always a trade-off between them.
For the disclosure and reporting of epidemics such as SARS, the criterion should be relatively lenient, because the cost of a false negative (failing to report a real outbreak) is too high.
For clinical trials of new drugs (e.g. for pregnant women), the criterion should be as strict as possible, because the cost of a false positive (accepting an ineffective or unsafe drug) is too high.
Type I error (false positive):
With a stricter threshold (a smaller p-value cutoff, e.g. 0.01), a higher proportion of the selected results are true, but more true effects are wrongly filtered out.
The false positive rate (FPR) is the same as the significance level α mentioned above; 1 − α is also called the specificity of a test.
Type II error (false negative):
With a looser threshold (a larger p-value cutoff, e.g. 0.05), more true effects are selected in total, but more false ones are wrongly accepted.
The false negative rate (FNR) is sometimes also called β, and 1 − β is the power, sensitivity, or true positive rate of a test.
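A small simulation sketch of these two error rates for a two-sample t-test (the effect size, group size, and number of simulations are illustrative assumptions):
# estimate the Type I error rate (alpha) and the power (1 - beta) by simulation
set.seed(1)
alpha = 0.05
# Type I error: both groups drawn from the same N(0, 1) distribution
p_null = replicate(5000, t.test(rnorm(20), rnorm(20))$p.value)
mean(p_null < alpha)    # close to 0.05, the false positive rate
# Power: the second group's true mean is shifted by 1 standard deviation
p_alt = replicate(5000, t.test(rnorm(20), rnorm(20, mean = 1))$p.value)
mean(p_alt < alpha)     # roughly 0.87 here; 1 minus this estimates beta, the false negative rate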
# Test whether iris sepal length and sepal width differ
# two-sample (Welch) t-test, treating the two columns as independent groups
> t1=t.test(iris$Sepal.Length, iris$Sepal.Width); t1
Welch Two Sample t-test
data: iris$Sepal.Length and iris$Sepal.Width
t = 36.463, df = 225.68, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
2.63544 2.93656
sample estimates:
mean of x mean of y
5.843333 3.057333
> t1$p.value
[1] 1.459543e-96
# paired t-test (a one-sample t-test on the per-flower differences); pairing is usually more efficient when measurements are correlated
> t2=t.test(iris$Sepal.Length - iris$Sepal.Width); t2
One Sample t-test
data: iris$Sepal.Length - iris$Sepal.Width
t = 34.815, df = 149, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
2.627874 2.944126
sample estimates:
mean of x
2.786
> t2$p.value
[1] 1.849554e-73
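For comparison with the parametric tests above, a non-parametric counterpart can be run on the same differences; this Wilcoxon signed-rank example is an illustrative addition, not part of the original analysis:
# non-parametric counterpart of the paired t-test: Wilcoxon signed-rank test on the differences
# (the many ties make R fall back to a normal approximation and print a warning)
w1 = wilcox.test(iris$Sepal.Length - iris$Sepal.Width)
w1$p.value   # also extremely small, consistent with the t-tests above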
Multi-factor, multi-level analysis of variance (ANOVA) in R, including interaction effects.
References: pdf1, pdf2, pdf3
# five factors with interactions among them; A has 4 levels, the others have varying numbers of levels
# simulate the data
set.seed(202112)
data2=data.frame(
value=c(rnorm(100, 70, 15), rnorm(100, 80, 15),
rnorm(100, 90, 15), rnorm(100, 100, 15)),
A=rep(c(1,2,3,4), each=100),   # A should be significant (its level tracks the mean of value)
B=rep(c(1,2,3,4), times=100),
C=rep(c(4,1,2,3, 5), times=80),
D=rep(c(3,4,1,2, 5,6,7,8,9,100), times=40),
E=rep(c(2,3,4,1, 5,6,7,8), times=50)
)
# make B significant: add an offset that depends on the row index modulo 4, which tracks B's level
for(i in 1:400){
data2[i, "value"]=data2[i, "value"] + (i %%4)*2
}
# convert the numeric level codes to factors (required, otherwise they are treated as continuous covariates)
data2$A=as.factor(data2$A)
data2$B=as.factor(data2$B)
data2$C=as.factor(data2$C)
data2$D=as.factor(data2$D)
data2$E=as.factor(data2$E)
str(data2)
dim(data2)
head(data2)
# ANOVA with crossed (interaction) effects
aov2=aov(value~A*B*C*D*E, data2) #very slow(>1 min)
summary(aov2)
# method 2: the equivalent fit via lm() followed by anova()
result=anova(lm(value ~ A*B*C*D*E, data=data2)) #very slow(>1min)
result
Output:
Analysis of Variance Table
Response: value
Df Sum Sq Mean Sq F value Pr(>F)
A 3 45445 15148.3 72.5612 < 2e-16 ***
B 3 2130 710.2 3.4017 0.01844 *
C 4 662 165.5 0.7928 0.53085
D 4 679 169.7 0.8127 0.51808
E 4 1557 389.2 1.8643 0.11739
A:B 9 1799 199.9 0.9573 0.47619
A:C 12 2185 182.1 0.8722 0.57602
B:C 8 1337 167.1 0.8004 0.60276
A:D 12 2468 205.6 0.9850 0.46380
A:E 12 4005 333.7 1.5986 0.09259 .
C:E 16 4218 263.6 1.2629 0.22207
A:B:C 24 6261 260.9 1.2496 0.20081
A:C:E 48 14631 304.8 1.4601 0.03521 *
Residuals 240 50104 208.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Multiple comparisons
> TukeyHSD(aov(value~A, data2))
> TukeyHSD(aov(value~A*C*E, data2)) #includes all the interactions
> TukeyHSD(aov2) #many outputs
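A short sketch of how the Tukey output can be inspected further (illustrative; `tk` is just a local name, not from the original text):
# extract and plot the pairwise comparisons for factor A
tk = TukeyHSD(aov(value ~ A, data2))
tk$A        # difference, 95% CI, and adjusted p-value for every pair of levels of A
plot(tk)    # confidence intervals of all pairwise differences; those not crossing 0 are significant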
The permutation test, introduced by R. A. Fisher in the 1930s, is often used to compare means. It is similar in spirit to the bootstrap, but a fundamental difference between the two is that the bootstrap resamples with replacement while the permutation test does not.
Why does the permutation test work? Under H0 the two samples come from the same distribution, so randomly exchanging the elements of the two groups gives a new data set from that same distribution.
Steps of a permutation test:
1) Determine and calculate the initial test statistic, e.g. d0 = mean(X1) - mean(X2).
2) Permute the pooled data, recompute the statistic, and repeat N times (N large enough) to build the approximate null distribution d1, ..., dN.
3) The p-value is the proportion of permuted statistics at least as extreme as the initial one: p = #(|dn| >= |d0|) / N.
If we want the Type I error probability to be 0.05, we conclude that the groups differ if the middle 95% of the values d1, ..., dN does not contain d0.
Both the bootstrap and the permutation test are nonparametric procedures in the sense that they make no specific assumptions about the form of any underlying probability distributions.
Bootstrap methods measure the accuracy of an estimator of a parameter, while permutation tests yield new ways to test certain statistical hypotheses.
# Permutation test: is there a difference between the means of two groups?
Group1 =c(6, 19, 34, 15)
Group2 =c(9, 21, 8, 53, 25)
d0=mean(Group1)-mean(Group2); d0 #-4.7
# pool the two groups
pool=c(Group1, Group2); pool
# after shuffling, the first 4 values form one group and the last 5 the other; compute the difference in means
oneTest=function(){
dt2=sample(pool, size=length(pool), replace = F)
mean(dt2[1:4]) - mean(dt2[5:9])
}
N=10000
set.seed(2021)
dn=replicate(N, oneTest())
# plot the permutation distribution
hist(dn, n=50)
abline(v=c(d0, -d0), lty=2, lwd=2, col="red")
# count how many permutation statistics exceed |d0|
len=length( dn[which( abs(dn) > abs(d0) )] ); len
# compute the p-value
p=len/N; p #[1] 0.7111
# the p-value is large, so we cannot reject the null hypothesis: there is no significant difference between the two group means
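For contrast with the permutation test above, a percentile bootstrap for the same mean difference resamples each group with replacement (an illustrative sketch, not from the original text):
# bootstrap: resample each group WITH replacement and build a 95% CI for the mean difference
set.seed(2021)
boot_diff = replicate(10000,
  mean(sample(Group1, replace = TRUE)) - mean(sample(Group2, replace = TRUE)))
quantile(boot_diff, c(0.025, 0.975))  # wide interval containing 0, consistent with the permutation result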
# Permutation test: is there a difference between the medians of two groups?
SX = c(-2, -1, -1, 0, -2, -1, 0, -1, -2, 100)
SY = c(7, 13, 11, 5, 14, 9, 8, 10, 12, 11)
# the test statistic is the difference in medians
d0=median(SX) - median(SY); d0 #-11.5
length(SX) #10
length(SY) #10
oneTest=function(){
SS=sample( c(SX, SY) )
median(SS[1:10]) - median(SS[11:20])
}
# shuffle the pooled data and recompute the difference in medians; repeat many times
set.seed(202108)
TestNum=10000
result=replicate(TestNum, oneTest() )
# plot the permutation distribution
hist(result, n=50)
abline(v=c(d0, -d0), lty=2, lwd=2, col="red")
# count how many permutation statistics exceed |d0|
len=length( result[abs(result) > abs(d0)] ); len #33
p=len/(TestNum); p #0.0033
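# the p-value is well below 0.05, so we reject the null hypothesis: the two medians differ significantly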
References:
http://web.stanford.edu/class/bios221/book/Chap-Testing.html
https://www.sciencedirect.com/topics/mathematics/permutation-test
https://www.sciencedirect.com/topics/mathematics/bootstrap-method