R假设检验 Hypothesis testing

目录

要点: Hypothesis testing is one of the workhorses of science. It is how we can draw conclusions or make decisions based on finite samples of data.[stanford - bios221 - chapter6]

统计学检验概述

精彩网络资源 推荐: 生物统计学与R极简手册


大数定理:当样本量足够多时,样本发生的频率近似于概率。


中心极限定理: 中心极限定理以严格的数学形式阐明了在大样本条件下,不论总体的分布如何,样本的均值总是近似地服从正态分布。如果一个随机变量能够分解为独立同分布的随机变量序列之和,则可以直接利用中心极限定理进行解决。总之,恰当地使用中心极限定理解决实际问题有着极其重要意义。


1.常见的检验包括 参数检验 和 非参数检验。参数检验包括 t.test, fisher.test 等, 非参数检验包括 wilcox.test 等。


2.假设检验的五个步骤

1. 根据目的,设计实验,选择合适的统计量。
2. 设置 零假设 (null hypothesis)。就是认为没有差别,没有影响。
3. 确定拒绝区间 rejection region
4. 收集处理数据,计算统计量。
5. 做决定:如果统计学参数落在一个小概率事件的范围,则根据小概率事件在一次实验中不可能发生,拒绝原假设。

3.假设检验的两类错误 Types of error。

Test vs reality Null hypothesis is true ... is false
Reject null hypothesis Type I error (false positive) True positive
Do not reject True negative Type II error (false negative)

通常可以根据具体情况权衡降低两种错误中的哪一类。

两类错误总是有一个权衡。
对于非典等流行病的公开与上报,标准要适当宽松,因为弃真的代价太大。
而对于(孕产妇等)新药的临床试验,标准要尽量严格,因为纳伪的代码太大。

弃真: 
	条件严格了(更小的p值,比如0.01),筛选到的真值比例高了,但是被错误过滤掉的真值也多了。
	I类错误 假阳性,false positive rate (FPR) 
	the FPR is the same as the probability α that we mentioned above. 1−α is also called the specificity of a test. 
纳伪: 
	条件宽松了(更大的p值,比如0.05),筛选到的真值总数多了,但是被错误接受的假值也多了。
	II类错误 假阴性,false negative rate (FNR)
	The FNR is sometimes also called β, and 1−β the power, sensitivity or true positive rate of a test.

参数检验

t检验

# 检验鸢尾花花萼的长宽是否有差异 # 分组t检验 > t1=t.test(iris$Sepal.Length, iris$Sepal.Width); t1 Welch Two Sample t-test data: iris$Sepal.Length and iris$Sepal.Width t = 36.463, df = 225.68, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 2.63544 2.93656 sample estimates: mean of x mean of y 5.843333 3.057333 > t1$p.value [1] 1.459543e-96 # 配对 t 检验,检验效率更高 > t2=t.test(iris$Sepal.Length - iris$Sepal.Width); t2 One Sample t-test data: iris$Sepal.Length - iris$Sepal.Width t = 34.815, df = 149, p-value < 2.2e-16 alternative hypothesis: true mean is not equal to 0 95 percent confidence interval: 2.627874 2.944126 sample estimates: mean of x 2.786 > t2$p.value [1] 1.849554e-73

方差分析(1-way, 2-way, multi-way)

使用R进行多因素多水平的方差分析,考虑交叉效应。

参考: pdf1pdf2pdf3

# 有5个因素,每个因素4个水平,因素之间有相互作用 # 造数据 set.seed(202112) data2=data.frame( value=c(rnorm(100, 70, 15), rnorm(100, 80, 15), rnorm(100, 90, 15), rnorm(100, 100, 15)), A=rep(c(1,2,3,4), each=100), #A should be sig. B=rep(c(1,2,3,4), time=100), C=rep(c(4,1,2,3, 5), time=80), D=rep(c(3,4,1,2, 5,6,7,8,9,100), time=40), E=rep(c(2,3,4,1, 5,6,7,8), time=50) ) # improve sig of B # 10 sig for(i in 1:400){ data2[i, "value"]=data2[i, "value"] + (i %%4)*2 } # to factors 一定要把水平转为因子 data2$A=as.factor(data2$A) data2$B=as.factor(data2$B) data2$C=as.factor(data2$C) data2$D=as.factor(data2$D) data2$E=as.factor(data2$E) str(data2) dim(data2) head(data2) # anova 交叉效应 aov2=aov(value~A*B*C*D*E, data2) #very slow(>1 min) summary(aov2) # method2 result=anova(lm(value ~ A*B*C*D*E, data=data2)) #very slow(>1min) result 输出: Analysis of Variance Table Response: value Df Sum Sq Mean Sq F value Pr(>F) A 3 45445 15148.3 72.5612 < 2e-16 *** B 3 2130 710.2 3.4017 0.01844 * C 4 662 165.5 0.7928 0.53085 D 4 679 169.7 0.8127 0.51808 E 4 1557 389.2 1.8643 0.11739 A:B 9 1799 199.9 0.9573 0.47619 A:C 12 2185 182.1 0.8722 0.57602 B:C 8 1337 167.1 0.8004 0.60276 A:D 12 2468 205.6 0.9850 0.46380 A:E 12 4005 333.7 1.5986 0.09259 . C:E 16 4218 263.6 1.2629 0.22207 A:B:C 24 6261 260.9 1.2496 0.20081 A:C:E 48 14631 304.8 1.4601 0.03521 * Residuals 240 50104 208.8 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 多重比较 > TukeyHSD(aov(value~A, data2)) > TukeyHSD(aov(value~A*C*E, data2)) #includes all the interactions > TukeyHSD(aov2) #many outputs

非参数检验

置换检验

置换检验常常用来比较均值,和 bootstrap 很像,基本区别是后者重抽样时允许重复,前者不重复。 //The so-called permutation test, introduced by R. A. Fisher in the 1930s, is sometimes recommended for comparing means. The method is somewhat similar in spirit to the bootstrap, but a fundamental difference between it and the bootstrap is that the bootstrap resamples with replacement and the permutation test does not.

置换检验是怎么工作的呢? 在H0假设下,2个样本来自相同的分布,所以随机置换两组中的元素会给出相同分布下的新的一套数据。 //Why permutation test works? It works because under H0, the two samples are from the same distribution. Thus, randomly exchanging the elements in the two samples should give us a new set of data from the same distribution

置换检验的步骤:

1). 计算开始的统计量 d0=mean(X1)-mean(X2); Determine & calculate the initial test-statistic.

2). 抽样重新计算该统计量,重复N次(足够大),得到其分布 dn=mean(X1new)-mean(X2new); Construct approximate test-statistic distribution.

3). 超出最初统计量的值/n就是p值 p=(larger than d0)/N; Calculate the p-value.

如果想要控制I类错误为0.05,就是要让95%的d1,...,dN 不包含 d0。 //If we want the Type I error probability to be .05, we conclude that the groups differ if the middle 95% of the values d1, …, dB do not contain d.

//Both are nonparametric procedures in the sense that they make no specific assumptions about the form of any underlying probability distributions.

Bootstrap方法能够测量一个参数的估计值的有效性,而置换检验产生了检验某些统计假设的新方法。 // Bootstrap methods enable to measure the efficacy of an estimator of a parameter, while permutation tests yield new ways to test certain statistical hypotheses.

# 计算两组数据的均值是否有差异 Group1 =c(6, 19, 34, 15) Group2 =c(9, 21, 8, 53, 25) d0=mean(Group1)-mean(Group2); d0 #-4.7 # 组合到一起 pool=c(Group1, Group2); pool # 置换后,前4个一组,后5个一组,计算平均值差 oneTest=function(){ dt2=sample(pool, size=length(pool), replace = F) mean(dt2[1:4]) - mean(dt2[5:9]) } N=10000 set.seed(2021) dn=replicate(N, oneTest()) # 画图展示 hist(dn, n=50) abline(v=c(d0, -d0), lty=2, lwd=2, col="red") # 看超过d0的百分比 len=length( dn[which( abs(dn) > abs(d0) )] ); len # 计算p值 p=len/N; p #[1] 0.7111 # p值很大,无法拒绝原假设,就是说接受两组间无显著差异。
#计算两组数据的 中位数 是否有差异 SX = c(-2, -1, -1, 0, -2, -1, 0, -1, -2, 100) SY = c(7, 13, 11, 5, 14, 9, 8, 10, 12, 11) #检验统计量是 中位数差 d0=median(SX) - median(SY); d0 #-11.5 length(SX) #10 length(SY) #10 oneTest=function(){ SS=sample( c(SX, SY) ) median(SS[1:10]) - median(SS[11:20]) } # 重排后,计算均值差;重复多次 set.seed(202108) TestNum=10000 result=replicate(TestNum, oneTest() ) # 画分布 hist(result, n=50) abline(v=c(d0, -d0), lty=2, lwd=2, col="red") #计算超过 |T1| 的个数 len=length( result[abs(result) > abs(d0)] ); len #33 p=len/(TestNum); p #0.0033

参考资料

http://web.stanford.edu/class/bios221/book/Chap-Testing.html
https://www.sciencedirect.com/topics/mathematics/permutation-test
https://www.sciencedirect.com/topics/mathematics/bootstrap-method