* TWELVE PROCEDURES TO DO LOGISTIC REGRESSION IN SAS

  I have been interested in the logistic regression model for some years now.
  Because SAS has always been my preferred statistical software, SAS code for
  fitting logistic regression models has accumulated over the years. Below are
  12 different SAS procedures (LOGISTIC, GENMOD, PROBIT, GAM, LIFEREG, GLIMMIX,
  QLIM, NLMIXED, NLIN, IML, MDC, PHREG) that fit a simple logistic regression.

  The data set is from a project which I conducted with Dr. Stefan Rimbach from
  the Gynecology Department of the University of Heidelberg, Germany. The sample
  consisted of 162 women who wanted to become pregnant and were observed at the
  department. The response was pregnancy within the first 3 years of observation
  and the covariates were age at baseline (AGE), years of infertility at
  baseline (INFER), and a physiological tube defect (TUBPHYSD).

  All of the procedures below reproduce exactly the result of a simple maximum
  likelihood fit in terms of the parameter estimates and their standard errors
  (in parentheses), which are

    Intercept   2.0117 (1.3734)
    AGE        -0.0510 (0.0422)
    INFER      -0.1409 (0.0791)
    TUBPHYSD   -0.8880 (0.4284)

  Though I certainly agree that PROC LOGISTIC is sufficient for most practical
  cases, some additional things can be learned from the other procedures:

    - PROC QLIM and PROC MDC report a number of R-square measures (note that
      the PROC QLIM values are the correct ones, because PROC MDC maximizes a
      partial likelihood).
    - PROC GENMOD, PROC LIFEREG, PROC GLIMMIX and PROC NLMIXED give additional
      information criteria.
    - PROC PROBIT gives inverse probabilities.
    - PROC IML makes it instructive to look at the actual fitting algorithm.
    - PROC QLIM gives estimates of marginal effects.
    - PROC NLMIXED allows the estimation of nonlinear contrasts.
    - PROC GENMOD (with the REPEATED statement) gives robust standard errors.

  I am curious whether you can find more PROCs to do the job.
  For example, I did not succeed with the MODEL and the SURVEYLOGISTIC procedure.

  Date: 10/27/2008
;

data pregnancy;
  input age infer tubphysd pregnant @@;
  int = 1;
  patid = _N_;
  cards;
22 3 1 1  22 3 1 0  24 0 0 0  24 2 1 0  24 4 1 0  25 2 0 1
25 8 0 0  26 5 0 0  27 0 1 0  27 1 1 1  27 1 1 0  27 2 1 1
27 2 1 1  27 2 1 0  27 2 1 0  27 3 1 0  27 4 0 0  27 4 1 0
28 2 1 1  28 2 1 1  28 3 1 1  28 4 1 1  29 0 0 1  29 1 1 1
29 1 1 0  29 1 1 0  29 2 0 1  29 3 0 1  29 4 1 0  29 6 1 0
29 7 1 0  30 1 0 1  30 1 1 1  30 1 1 0  30 2 0 0  30 2 1 0
30 4 1 1  30 5 0 0  30 5 1 0  31 1 1 1  31 1 1 0  31 2 0 1
31 2 1 0  31 2 1 0  31 2 1 0  31 3 0 0  31 3 1 1  31 3 1 0
31 3 1 0  31 4 1 1  31 4 1 0  31 4 1 0  31 4 1 0  31 5 0 0
31 5 1 0  31 7 1 0  31 8 1 0  32 1 0 1  32 2 0 0  32 2 1 0
32 2 1 0  32 2 1 0  32 2 1 0  32 2 1 0  32 3 1 1  32 3 1 0
32 3 1 0  32 3 1 0  32 4 1 0  32 6 1 0  32 7 0 0  32 8 1 1
32 8 1 0  33 2 1 0  33 3 1 1  33 3 1 0  33 3 1 0  33 4 0 1
33 4 0 0  33 4 1 1  33 4 1 0  33 5 1 0  33 5 1 0  33 6 1 0
33 12 1 0  34 1 1 0  34 2 1 1  34 2 1 0  34 3 0 1  34 3 0 1
34 3 1 1  34 3 1 0  34 5 1 1  34 7 1 0  34 7 1 0  34 10 0 1
34 10 1 1  35 2 1 1  35 2 1 1  35 2 1 0  35 2 1 0  35 3 0 1
35 3 1 1  35 3 1 0  35 4 1 0  35 4 1 0  35 4 1 0  35 5 0 1
35 5 1 0  35 5 1 0  35 6 0 1  35 6 1 0  36 1 1 0  36 2 1 0
36 2 1 0  36 2 1 0  36 2 1 0  36 3 1 1  36 4 1 0  36 4 1 0
36 4 1 0  36 4 1 0  36 8 1 1  36 15 1 0  36 18 1 0  37 2 0 0
37 2 1 0  37 2 1 0  37 3 1 1  37 3 1 0  37 3 1 0  37 4 1 0
37 5 1 1  37 5 1 0  37 8 0 0  37 10 1 0  37 17 1 0  38 1 1 1
38 3 1 1  38 3 1 1  38 4 1 0  38 8 1 0  38 12 1 0  39 2 1 1
39 2 1 0  39 2 1 0  39 3 1 0  39 10 1 0  39 12 1 0  39 14 1 0
40 3 1 0  40 4 0 0  40 5 0 0  42 1 0 0  42 2 1 0  42 3 1 0
42 5 1 0  42 7 1 0  42 20 1 0  43 4 1 0  45 7 1 0  46 3 1 0
;
run;

proc logistic data=pregnancy descending;
  model pregnant = age infer tubphysd;
  title "PROC LOGISTIC";
run;

proc genmod data=pregnancy descending;
  model pregnant = age infer tubphysd / link=logit d=bin;
  title "PROC GENMOD";
run;

proc probit data=pregnancy order=data;
  class pregnant;
  model pregnant = age infer tubphysd / d=logistic;
  title "PROC PROBIT";
run;

proc gam data=pregnancy;
  model pregnant(event="1") = param(age infer tubphysd) / dist=logist;
  title "PROC GAM";
run;

proc lifereg data=pregnancy;
  model pregnant/int = age infer tubphysd / distribution=logistic;
  title "PROC LIFEREG";
run;

proc glimmix data=pregnancy order=data;
  model pregnant = age infer tubphysd / solution dist=binomial ddfm=none;
  title "PROC GLIMMIX";
run;

proc qlim data=pregnancy;
  model pregnant = age infer tubphysd / discrete(d=logit);
  title "PROC QLIM";
run;

proc nlmixed data=pregnancy df=10000;
  parms beta0=-1 beta_age=0 beta_infer=0 beta_tubphysd=0;
  eta    = beta0 + beta_age*age + beta_infer*infer + beta_tubphysd*tubphysd;
  expeta = exp(eta);
  p      = expeta/(1+expeta);
  model pregnant ~ binomial(1,p);
  title "PROC NLMIXED";
run;

/* Weighted nonlinear least squares with weights 1/(p*(1-p)) and fixed sigma**2 = 1 */
proc nlin data=pregnancy method=gauss sigsq=1 nohalve;
  parms beta0=2 beta_age=0 beta_infer=0 beta_tubphysd=-1;
  num = exp(beta0 + beta_age*age + beta_infer*infer + beta_tubphysd*tubphysd);
  den = 1 + num;
  f   = num/den;
  model pregnant = f;
  der.beta0         = num          /(den*den);
  der.beta_age      = age*num      /(den*den);
  der.beta_infer    = infer*num    /(den*den);
  der.beta_tubphysd = tubphysd*num /(den*den);
  predict  = f;
  _weight_ = 1/(predict*(1-predict));
  title "PROC NLIN";
run;

proc iml;
  use pregnancy;
  read all var {age infer tubphysd} into xv;
  read all var {pregnant} into y;
  read all var {int} into m;
  n      = nrow(xv);
  design = repeat(1,n,1) || xv;
  /* Stack every observation twice: once as a success (weight y) */
  /* and once as a failure (weight m-y)                          */
  x      = design // design;
  _y     = repeat(1,n,1) // repeat(0,n,1);
  wgt    = y // (m-y);
  parm   = {intercept age infer tubphysd}`;
  b      = repeat(0,ncol(x),1);
  oldb   = b+1;
  /* Iterate until the largest change in any coefficient is below 1e-8 */
  do iter=1 to 20 until ((max(abs(b-oldb))<1e-8));
    oldb    = b;
    p       = 1/(1+exp(-(x*b)));
    f       = p#p#exp(-(x*b));      /* derivative of p with respect to x*b, equal to p#(1-p) */
    loglik  = sum( ((_y=1)#log(p) + (_y=0)#log(1-p))#wgt );
    btransp = b`;
    w       = wgt/(p#(1-p));
    xx      = f # x;
    xpxi    = inv(xx`*(w#xx));
    step    = xpxi*(xx`*(w#(_y-p)));
    b       = b + step;
  end;
  stderr = sqrt(vecdiag(xpxi));
  title "PROC IML";
  print "Parameter estimates and SEs",, parm b stderr,;
quit;

/* Two records per woman (one per choice alternative) for PROC MDC */
data mdcpregnancy;
  set pregnancy;
  retain id 0;
  id + 1;
  /*-- first choice --*/
  choice1 = 1; choice2 = 0;
  decision = (pregnant = 0);
  agemdc = 0; infermdc = 0; tubphysdmdc = 0;
  output;
  /*-- second choice --*/
  choice1 = 0; choice2 = 1;
  decision = (pregnant = 1);
  agemdc = age; infermdc = infer; tubphysdmdc = tubphysd;
  output;
run;

proc mdc data=mdcpregnancy;
  model decision = choice2 agemdc infermdc tubphysdmdc / type=clogit nchoice=2 covest=hess;
  id id;
  title "PROC MDC";
run;

data phregpregnancy;
  set pregnancy;
  id = _N_;
  do i=1 to 4;
    output;
  end;
run;

data phregpregnancy2;
  set phregpregnancy;
  if i=1 then do; choice=1; dummy=1; freq=pregnant;   end;
  if i=2 then do; choice=2; dummy=0; freq=pregnant;   end;
  if i=3 then do; choice=2; dummy=1; freq=1-pregnant; end;
  if i=4 then do; choice=1; dummy=0; freq=1-pregnant; end;
  dummy_age      = dummy*age;
  dummy_infer    = dummy*infer;
  dummy_tubphysd = dummy*tubphysd;
run;

proc phreg data=phregpregnancy2;
  model choice*choice(2) = dummy dummy_age dummy_infer dummy_tubphysd;
  strata id;
  freq freq;
  title "PROC PHREG";
run;
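
/* The two steps below are illustrative sketches, not part of the twelve fits.  */
/* They show two extras mentioned in the header comment: robust standard errors */
/* from PROC GENMOD with the REPEATED statement, and a nonlinear contrast from  */
/* PROC NLMIXED.                                                                */
/* Assumptions: PATID (created in the PREGNANCY data step) identifies           */
/* independent subjects, and the odds ratio for TUBPHYSD is only an example of  */
/* a nonlinear function of the parameters.                                      */

/* Sketch 1: an independence working correlation with one record per subject   */
/* should leave the point estimates unchanged. The empirical (robust) standard */
/* errors are reported with the GEE parameter estimates.                       */
proc genmod data=pregnancy descending;
  class patid;
  model pregnant = age infer tubphysd / link=logit dist=bin;
  repeated subject=patid / type=ind;
  title "PROC GENMOD with REPEATED (robust standard errors)";
run;

/* Sketch 2: the ESTIMATE statement evaluates a function of the parameters, */
/* here exp(beta_tubphysd), with a delta-method standard error.             */
proc nlmixed data=pregnancy df=10000;
  parms beta0=-1 beta_age=0 beta_infer=0 beta_tubphysd=0;
  eta = beta0 + beta_age*age + beta_infer*infer + beta_tubphysd*tubphysd;
  p   = 1/(1+exp(-eta));
  model pregnant ~ binomial(1,p);
  estimate "Odds ratio TUBPHYSD (example)" exp(beta_tubphysd);
  title "PROC NLMIXED with ESTIMATE (nonlinear contrast)";
run;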