Parametric Tests

1. Sample Tests

1.1 z-test

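The variances of the two samples are known and equal (common standard deviation sigma; sample sizes m and n).

z = (xbar - ybar) / (sigma * sqrt(1/m + 1/n))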
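Base R has no built-in two-sample z-test, so the statistic above can be computed by hand. The following is a minimal sketch on simulated data; the sample sizes, means, and the value of sigma are illustrative assumptions, not values from these notes.

```r
# Two-sample z-test with a known, common sigma (simulated, illustrative data)
set.seed(1)
sigma <- 2                               # assumed known standard deviation
x <- rnorm(30, mean = 5, sd = sigma)     # sample 1, size m
y <- rnorm(40, mean = 5, sd = sigma)     # sample 2, size n
m <- length(x)
n <- length(y)

z <- (mean(x) - mean(y)) / (sigma * sqrt(1 / m + 1 / n))
p_value <- 2 * pnorm(-abs(z))            # two-sided p-value under H0: equal means
print(c(z = z, p.value = p_value))
```

A small p-value would be evidence against equal means; for a one-sided alternative, use pnorm(z) or pnorm(z, lower.tail = FALSE) instead.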

1.2 t-test, F-test

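Use these when sigma is unknown.

One sample: test a hypothesized value of mu (the mean).

Two samples: compare the means of the two samples.

t.test( )

It takes many arguments, including whether the samples are paired (paired), whether the variances are assumed equal (var.equal), and what the alternative hypothesis is (alternative).

Use an F-test (var.test( ) in R) to check whether the variances are equal.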
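As a rough sketch of how these calls fit together, the example below runs a one-sample t-test, an F-test for equal variances, and a two-sample t-test on simulated data. The data and the 0.05 cutoff used to choose var.equal are illustrative assumptions only.

```r
# t-tests and an F-test on simulated data (all values are illustrative)
set.seed(2)
x <- rnorm(25, mean = 10, sd = 3)
y <- rnorm(30, mean = 12, sd = 3)

one_sample <- t.test(x, mu = 10)            # H0: mu = 10, sigma unknown

f_res <- var.test(x, y)                     # H0: the two variances are equal
equal_var <- f_res$p.value > 0.05           # illustrative decision rule

# Two-sample test; paired = TRUE would apply to paired samples of equal length
two_sample <- t.test(x, y, var.equal = equal_var, alternative = "two.sided")

print(one_sample$p.value)
print(f_res$p.value)
print(two_sample$p.value)
```

With var.equal = FALSE (the default), t.test( ) uses the Welch approximation for the degrees of freedom instead of the pooled variance.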

*Special case

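In practice, with observational samples, before drawing a conclusion from comparing the treatment group with the control group, you can remove the influence of other factors by regression, or alternatively use propensity score methods.

Step 1: compute the propensity score with logistic regression.

Step 2, method 1: match units with similar scores into pairs.

Step 2, method 2: convert the propensity score to odds, use the odds as weights, and then compare the groups.

See the following code for details: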

# lalonde_PropensityScoreExample.r

# Created by Siddhartha Dalal on 4/15/14.

# The example uses the LaLonde (1986) experimental data, which is based on a nationwide job training experiment. The observations are individuals, and the outcome of interest is real earnings in 1978 (re78). There are eight baseline variables: age (age), years of education (educ), real earnings in 1974 (re74) and in 1975 (re75), and a series of indicator variables. The indicator variables are black (black), Hispanic (hisp), married (married) and lack of a high school diploma (nodegr).

require(Matching)

data(lalonde)

#attach(lalonde) #data from library Matching - lalonde

par(ask = TRUE)

for (i in seq_along(lalonde)) {hist(lalonde[[i]], xlab = names(lalonde)[i], main = names(lalonde)[i])} # histogram of every variable in the data set

Tr <- lalonde$treat

# Propensity score computation

glm1 <- glm(Tr ~ age + educ + black + hisp + married + nodegr + re74 + re75, family = binomial, data = lalonde)

plot(ecdf(fitted(glm1))) # ECDF of the fitted treatment probabilities

propscore <- fitted(glm1) / (1 - fitted(glm1)) # odds of treatment, used as weights below

plot(ecdf(propscore))

#Estimation of Average Effect for Treated Population (ATT)

before_after_prop <- function(comp_var, propscore, Tr, balance = FALSE) {

# Given the variable to compare (comp_var), the propensity odds (propscore) and the treatment indicator (Tr), compute the group means before and after weighting, and the ATT. Set balance = TRUE to also plot the scores and compute a balance statistic.

y1 <- comp_var[Tr == 1] # comp_var is the target variable; values for treated units

y0 <- cbind(comp_var, propscore)[Tr == 0, ] # values and weights for control units

meany1_tr <- mean(y1) # treated, unweighted

meany0_ntr <- mean(y0[, 1]) # control, unweighted

meany0_tr <- weighted.mean(y0[, 1], y0[, 2]) # control, weighted by the odds to resemble the treated

meany1_ntr <- weighted.mean(y1, 1 / propscore[Tr == 1]) # treated, weighted by the inverse odds to resemble the controls

print(c("actual group", "mean if not treated", "mean if treated"))

print(c("treated", meany1_ntr, meany1_tr))

print(c("controlled", meany0_ntr, meany0_tr))

ATT <- meany1_tr - meany0_tr # average treatment effect on the treated: treated mean minus odds-weighted control mean

ATT.w <- meany1_tr - meany0_ntr # naive difference, without propensity score weighting

ad2.obs <- NA # balance statistic; stays NA unless balance = TRUE

# Check how well the samples are balanced

if (balance) {

py1 <- propscore[Tr == 1]; py0 <- propscore[Tr == 0]

par(mfrow = c(2, 1))

hist(py1); hist(py0)

# Observed Anderson-Darling-type statistic comparing the two empirical CDFs (the statistic for a permutation test; the permutation step itself is not included here)

x <- seq(min(propscore), max(propscore), by = 0.001)

Fx1 <- sapply(x, function(v) mean(py1 <= v))

Fx2 <- sapply(x, function(v) mean(py0 <= v))

keep <- which(Fx2 != 0 & Fx2 != 1)

ad2.obs <- sum((Fx1[keep] - Fx2[keep])^2 / (Fx2[keep] * (1 - Fx2[keep])))

}

return(list(mean_tr_tr = meany1_tr, mean_tr_ntr = meany0_tr, mean_ntr = meany0_ntr, ATT = ATT, ATT.w = ATT.w, balance.statistic = ad2.obs))

}

# The following variables were matched on. Check their balance after adjustment.

before_after_prop(lalonde$nodegr,propscore,Tr)

before_after_prop(lalonde$hisp,propscore,Tr)

before_after_prop(lalonde$re75,propscore,Tr)

# The following variable is the outcome variable. Check the performance on it.

before_after_prop(lalonde$re78,propscore,Tr, balance=TRUE)
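The script above covers method 2 (weighting by odds). Method 1, pairing units with similar scores, is not shown there; the following is a minimal base-R sketch of nearest-neighbour matching with replacement. It uses simulated data rather than lalonde, and the variable names, sample size, and true effect of 1.5 are all assumptions made for illustration. The Matching package loaded earlier provides Match( ) for real use.

```r
# Nearest-neighbour matching on the propensity score (simulated, illustrative data)
set.seed(3)
n <- 500
x <- rnorm(n)                                   # a single confounder
tr <- rbinom(n, 1, plogis(0.8 * x))             # treatment is more likely for large x
y <- 1 + 2 * x + 1.5 * tr + rnorm(n)            # true treatment effect set to 1.5

ps <- fitted(glm(tr ~ x, family = binomial))    # step 1: propensity score

treated <- which(tr == 1)
control <- which(tr == 0)

# Step 2, method 1: for each treated unit, pick the control unit with the
# closest score (with replacement, so a control can be reused)
nearest <- sapply(treated, function(i) {
  control[which.min(abs(ps[control] - ps[i]))]
})

att_matched <- mean(y[treated] - y[nearest])    # ATT estimate from matched pairs
att_naive <- mean(y[treated]) - mean(y[control])
print(c(naive = att_naive, matched = att_matched))
```

Because treatment depends on x, the naive difference mixes the treatment effect with the effect of x, while the matched difference compares units with similar scores.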

2. Linear Regression

2.1 t-test

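In the summary(fit) output of the fitted model, the t value listed with each coefficient follows a t distribution with df = n - p under the null hypothesis that the coefficient is 0.

confint(fit, level=)

This returns confidence intervals for the coefficients.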
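To make the link between the summary table and the t distribution concrete, here is a small sketch on simulated data that recomputes one t value and its p-value by hand. The data frame d and the variable names are assumptions for illustration.

```r
# Coefficient t-tests and confidence intervals for a linear model (simulated data)
set.seed(4)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- 1 + 2 * d$x1 + rnorm(50)
fit <- lm(y ~ x1 + x2, data = d)

coefs <- summary(fit)$coefficients      # Estimate, Std. Error, t value, Pr(>|t|)
df_res <- df.residual(fit)              # n - p = 50 - 3 = 47

# Recompute the t statistic and two-sided p-value for x1 by hand
t_x1 <- coefs["x1", "Estimate"] / coefs["x1", "Std. Error"]
p_x1 <- 2 * pt(-abs(t_x1), df = df_res)

ci <- confint(fit, level = 0.95)        # 95% confidence intervals
print(coefs)
print(ci)
```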

2.2 General linear test (F-test)

Using the ANOVA table: F = [ (SSE(R) - SSE(F)) / (dfR - dfF) ] / MSE(F) ~ F(dfR - dfF, dfF), where R is the reduced model, F is the full model, and MSE(F) = SSE(F) / dfF.

anova(fit)
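The F statistic above compares a reduced model with a full model, which in R can be done by passing both fits to anova( ). The sketch below, on simulated data with assumed variable names, computes the statistic by hand and checks it against the table.

```r
# General linear test: reduced vs. full model (simulated, illustrative data)
set.seed(5)
d <- data.frame(x1 = rnorm(60), x2 = rnorm(60), x3 = rnorm(60))
d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + rnorm(60)

full <- lm(y ~ x1 + x2 + x3, data = d)
reduced <- lm(y ~ x1, data = d)          # H0: coefficients of x2 and x3 are both 0

sse_r <- sum(resid(reduced)^2)
df_r <- df.residual(reduced)
sse_f <- sum(resid(full)^2)
df_f <- df.residual(full)

f_stat <- ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
p_val <- pf(f_stat, df_r - df_f, df_f, lower.tail = FALSE)

tab <- anova(reduced, full)              # same test from the ANOVA table
print(tab)
print(c(F = f_stat, p.value = p_val))
```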

3. Generalized Linear Regression

3.1 Wald Test

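In the summary(fit) output of the fitted model, the z value listed with each coefficient is approximately standard normal under the null hypothesis that the coefficient is 0.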
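As a small illustration, the sketch below fits a logistic regression to simulated data and recomputes the Wald z value and its p-value from the estimate and standard error. The data and coefficients are assumptions for the example.

```r
# Wald test for a logistic regression coefficient (simulated, illustrative data)
set.seed(6)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + x))
fit <- glm(y ~ x, family = binomial)

coefs <- summary(fit)$coefficients      # Estimate, Std. Error, z value, Pr(>|z|)

# Wald statistic: estimate divided by its standard error
z_x <- coefs["x", "Estimate"] / coefs["x", "Std. Error"]
p_x <- 2 * pnorm(-abs(z_x))             # two-sided p-value from the normal distribution
print(c(z = z_x, p.value = p_x))
```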

3.2 Drop-in-Deviance Test

Again in summary(fit), look at the Residual Deviance; the smaller it is, the better the fit.

For two nested models, the difference in deviance ~ Chi-square with df = the difference in df.
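The comparison can be sketched as follows, using simulated data and assumed variable names. The drop in deviance is computed by hand and then checked against anova( ) with test = "Chisq".

```r
# Drop-in-deviance test between nested logistic models (simulated, illustrative data)
set.seed(7)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- rbinom(200, 1, plogis(-0.5 + d$x1))

full <- glm(y ~ x1 + x2, family = binomial, data = d)
reduced <- glm(y ~ x1, family = binomial, data = d)   # H0: coefficient of x2 is 0

drop_dev <- deviance(reduced) - deviance(full)        # difference in residual deviance
drop_df <- df.residual(reduced) - df.residual(full)   # difference in df
p_val <- pchisq(drop_dev, df = drop_df, lower.tail = FALSE)

tab <- anova(reduced, full, test = "Chisq")           # same test in table form
print(tab)
print(c(deviance.drop = drop_dev, df = drop_df, p.value = p_val))
```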

3.3 The Likelihood Ratio Test
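A minimal sketch of a likelihood ratio test for two nested GLMs, assuming a Poisson model on simulated data (the data and variable names are illustrative). The statistic is twice the difference in log-likelihoods, compared with a chi-square distribution whose df is the difference in the number of parameters.

```r
# Likelihood ratio test between nested Poisson models (simulated, illustrative data)
set.seed(8)
d <- data.frame(x1 = rnorm(150), x2 = rnorm(150))
d$y <- rpois(150, lambda = exp(0.3 + 0.5 * d$x1))

full <- glm(y ~ x1 + x2, family = poisson, data = d)
reduced <- glm(y ~ x1, family = poisson, data = d)    # H0: coefficient of x2 is 0

lr_stat <- as.numeric(2 * (logLik(full) - logLik(reduced)))
lr_df <- attr(logLik(full), "df") - attr(logLik(reduced), "df")
p_val <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(LR = lr_stat, df = lr_df, p.value = p_val))
```

For nested models from the same GLM family with a fixed dispersion, as here, this statistic coincides with the drop in deviance.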

Wednesday, April 2, 2014
