<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Quantitative and Statistical Consulting Blog &#187; case cohort</title>
	<atom:link href="https://missionalconsulting.com/methods/tag/case-cohort/feed/" rel="self" type="application/rss+xml" />
	<link>https://missionalconsulting.com/methods</link>
	<description>The Consulting Blog</description>
	<lastBuildDate>Tue, 11 Nov 2014 18:15:19 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.0.38</generator>
	<item>
		<title>R code: six case-cohort and two nested case-control methods</title>
		<link>https://missionalconsulting.com/methods/rcode-cch-ncc/</link>
		<comments>https://missionalconsulting.com/methods/rcode-cch-ncc/#comments</comments>
		<pubDate>Fri, 17 Oct 2014 15:45:00 +0000</pubDate>
		<dc:creator><![CDATA[Ryung S. Kim]]></dc:creator>
				<category><![CDATA[R codes]]></category>
		<category><![CDATA[case cohort]]></category>
		<category><![CDATA[nested case-control]]></category>
		<category><![CDATA[R code]]></category>

		<guid isPermaLink="false">http://missionalconsulting.com/methods/?p=81</guid>
		<description><![CDATA[I used two nested case-control and six case-cohort methods in the following paper: Kim RS, A New Comparison of Nested Case-Control and Case-Cohort Designs and Methods (Under Review) I originally intended to use existing packages but found inaccuracies in some functions.  I have discussed the inaccuracies with R community (here) as well as with one of the [&#8230;]]]></description>
				<content:encoded><![CDATA[<p class="PlainText">I used two nested case-control and six case-cohort methods in the following paper:</p>
<div class="PlainText">
<p>Kim RS, A New Comparison of Nested Case-Control and Case-Cohort Designs and Methods (Under Review)</p>
<p class="PlainText">I originally intended to use existing packages but found inaccuracies in some functions. I have discussed the inaccuracies with the R community (<a href="http://r.789695.n4.nabble.com/A-couple-of-questions-regarding-the-survival-cch-function-td4674359.html" target="_blank">here</a>) as well as privately with one of the authors. I therefore wrote my own code for the methods and am sharing it here with the research community.</p>
<p>For each method, one can use the &#8216;robust&#8217; or &#8216;sandwich&#8217; option in software to compute variance estimators that converge to the approximate jackknife estimators (the &#8216;robust&#8217; estimators of Lin &amp; Wei, 1989) as the sample size increases to that of the full cohort. Where such use has already been reported, I give the authors&#8217; names (e.g. Lin and Ying); otherwise, I call them Type B variance estimators.</p>
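<p>As a minimal sketch of what the &#8216;robust&#8217; option does in R, the example below fits a Cox model with a cluster() term using the survival package. It uses the package&#8217;s built-in lung data purely for illustration (an assumption; it is not the simulated cohort used in this post), and the id column is added here only to serve as the cluster identifier. With a cluster() term, coxph stores the robust variance in fit$var and the model-based variance in fit$naive.var.</p>
<pre>library(survival)

# Illustration only: built-in lung data, not the simulated cohort of this post
d &lt;- na.omit(lung[, c("time", "status", "age", "sex")])
d$id &lt;- seq_len(nrow(d))   # one cluster per subject (hypothetical identifier)

# cluster(id) requests the robust (sandwich) variance
fit &lt;- coxph(Surv(time, status) ~ age + sex + cluster(id), data = d)

se.naive  &lt;- sqrt(diag(fit$naive.var))  # model-based standard errors
se.robust &lt;- sqrt(diag(fit$var))        # robust (sandwich) standard errors
cbind(coef = coef(fit), se.naive, se.robust)</pre>
<p>The functions in this post read fit$var in the same way whenever the formula contains cluster(cluster.id).</p>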
</div>
<div class="PlainText">t.entry.nm = NULL;t.exit.nm = "time";fail.nm &lt;- "status";match.nm =NULL</div>
<div class="PlainText">dt &lt;- simulation(N=N,beta1=beta1.choices[j],max.censor=max.censor.choice[j])</div>
<div class="PlainText">risk.table &lt;- risk.table.f(fail.nm=fail.nm,data=dt,t.entry.nm=t.entry.nm,t.exit.nm=t.exit.nm)<br />
inclusion.prob &lt;- inclusion.prob.ncc.f(data=dt,t.entry.nm=t.entry.nm, t.exit.nm=t.exit.nm, fail.nm=fail.nm,controls=m,risk.table=risk.table,match.nm=match.nm)<br />
sub.dt.clogit &lt;-  nccdata(t.entry.nm = t.entry.nm, t.exit.nm = t.exit.nm,  fail.nm=fail.nm,controls = m, match.nm =NULL, data=dt,risk.table=risk.table,inclusion.prob=inclusion.prob)<br />
sub.dt.ncc &lt;- sub.dt.clogit[!duplicated(sub.dt.clogit$Map),]<br />
sub.dt.ncc &lt;- cbind(sub.dt.ncc,cluster.id=1:nrow(sub.dt.ncc))<br />
n.ncc&lt;-nrow(sub.dt.ncc);n0.ncc&lt;-sum(sub.dt.ncc[,fail.nm]==0);N0&lt;-sum(dt[,fail.nm]==0)<br />
cch.p =n0.ncc/N0 # so that the expected sample sizes of the ncc and cch designs are the same<br />
CCH &lt;-casecohortdata(dt,cch.p,fail.nm)<br />
CCH  &lt;- cbind(CCH,cluster.id=1:nrow(CCH))</div>
<div class="PlainText">The selfprentice function defined in the case-cohort section below returns the Self &amp; Prentice estimates of the regression coefficients together with two variance estimates, one by Self &amp; Prentice and one by Lin &amp; Ying.</div>
<h1 class="PlainText">Nested Case-Control Methods</h1>
<h6 class="PlainText">Thomas (1977)&#8217;s Hazard Ratio and Standard Error Estimation (&#8220;Conditional Logistic Method&#8221;)</h6>
<p>First, make a &#8216;trick&#8217; data set &#8220;by including multiple inputs for subjects who are selected multiple times, and recording all randomly selected failures as non-failures.&#8221; (Kim 2013)</p>
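<p>To make the layout concrete, here is a small hand-built sketch of such a &#8216;trick&#8217; data set. All values are invented, and the meaning given to each column (Set as the matched-set identifier, Map as the subject&#8217;s row in the full cohort, time as the failure time that defines the set) is an assumption based on how these names are used in the code of this post, not the output of nccdata itself.</p>
<pre># Hypothetical toy layout: 2 cases, 1 control per case (values invented)
trick &lt;- data.frame(
  Set  = c(1, 1, 2, 2),  # one stratum per case
  Map  = c(3, 7, 7, 5),  # assumed subject index; subject 7 appears twice
  time = c(4, 4, 9, 9),  # assumed failure time defining each risk set
  Fail = c(1, 0, 1, 0)   # subject 7: control in set 1 (Fail=0), case in set 2
)
trick

# Each set holds exactly one failure; later failures drawn as controls count as 0
tapply(trick$Fail, trick$Set, sum)</pre>
<p>Subject 7 fails at time 9 but is sampled as a control at time 4, so it is entered twice and recorded as a non-failure in the first set.</p>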
<div class="PlainText">
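<pre>library(survival)  # provides coxph(), Surv() and strata()

# Conditional logistic fit via a stratified Cox model with exact handling of ties
clogit.fn &lt;- function(formula,data){
 fit &lt;- coxph(formula=formula,data=data,method="exact")
 list(coef=fit$coef, se=sqrt(diag(fit$var)))  # returned visibly
 }
formula.clogit &lt;- Surv(time, Fail) ~ X1+X2 + strata(Set)
clogit.fn(formula.clogit,data=sub.dt.clogit)</pre>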
<h6>Samuelsen (1997)&#8217;s Hazard Ratio and Kim (2013)&#8217;s Standard Error Estimation  (&#8220;IPW method&#8221;)</h6>
<pre>robust.coxph.fn &lt;-function(formula,sub.dt,wt="wt"){
 if(is.null(wt)) {wt&lt;-rep(1,nrow(sub.dt))} else { wt&lt;-sub.dt[,wt]}
 fit&lt;-coxph(formula=formula,data=sub.dt, weights=wt)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$var)))
}
robust.formula &lt;- Surv(time, status) ~ X1+X2+cluster(cluster.id)
robust.coxph.fn(robust.formula,sub.dt.ncc,wt="wt")</pre>
<h1 class="PlainText">Case-Cohort Methods</h1>
<h6>Barlow (1994)&#8217;s Hazard Ratio and Standard Error Estimation</h6>
<p>Barlow (1994) originally proposed two approaches for weighting: 1) using time dependent subcohort proportion π(<em>t</em>) which is the number of subjects at risk among the subcohort at <em>t</em> divided by the number of subjects at risk in the cohort at <em>t</em>, or 2) using the baseline subcohort proportion π. Later Barlow et al. (1999) used only π.  Here the program uses the  baseline subcohort proportion π.</p>
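<p>The two proportions can be sketched on a toy cohort as follows. The six subjects and their values are invented for illustration, entry times are assumed to be zero, and the names pi.base and pi.t are hypothetical helpers that are not part of the program below.</p>
<pre># Toy cohort (invented values), zero entry time
time         &lt;- c(2, 3, 5, 7, 8, 10)
in.subcohort &lt;- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)

# Approach 2: baseline subcohort proportion
pi.base &lt;- mean(in.subcohort)

# Approach 1: at-risk subcohort members divided by at-risk cohort members at t
pi.t &lt;- function(t) sum(in.subcohort &amp; time &gt;= t) / sum(time &gt;= t)

pi.base
sapply(c(2, 5, 8), pi.t)
1 / pi.base  # weight for subcohort members under the baseline approach</pre>
<p>The last line corresponds to wt.temp=1/cch.p in barlow.dt below, where cch.p plays the role of π.</p>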
<pre>barlow.dt &lt;- function(CCH,cch.p){
#assumes no ties in failure time &amp; zero entry time
#http://lib.stat.cmu.edu/general/robphreg
eps&lt;-1e-08
wt.temp=1/cch.p
ST &lt;- CCH$status
SC &lt;- CCH$in.subcohort
temp1 &lt;- CCH[ST==1,]
temp1 &lt;- cbind(temp1, T=temp1$time,WWTT=1,start=temp1$time-eps,CASE=1)
situation2&lt;-ST==1 &amp; SC==T
if(sum(situation2)&gt;0){
 temp2 &lt;- CCH[ST==1 &amp; SC==T,]
 temp2 &lt;- cbind(temp2,T=temp2$time-eps,WWTT=wt.temp,start=temp2$t.entry,CASE=0)
 }
 temp3 &lt;- CCH[ST==0 &amp; SC==T,]
 temp3 &lt;- cbind(temp3,T=temp3$time,    WWTT=wt.temp,start=temp3$t.entry,CASE=0)
 if(sum(situation2)&gt;0){
 res&lt;-rbind(temp1,temp2,temp3)
 } else res &lt;-rbind(temp1,temp3)
 return(cbind(res,logwt=log(res$WWTT)))
 }
barlow.fn &lt;-function(formula=formula.Barlow,data=CCH.barlow){
 fit&lt;-coxph(formula=formula,weights=WWTT, data=data)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$var)))
 }

formula.Barlow&lt;-Surv(start,T, CASE) ~ X1+X2+cluster(cluster.id)
CCH.barlow &lt;- barlow.dt(cbind(CCH, t.entry=0), cch.p)
barlow.fn(formula.Barlow, data=CCH.barlow)
</pre>
<h6>Prentice (1986)&#8217;s Hazard Ratio and Self &amp; Prentice (1988)&#8217;s Standard Error Estimation</h6>
<pre>prentice.dt &lt;-function(CCH){
 eps  &lt;- 1e-08
 TM   &lt;-CCH$time
 start&lt;-CCH$t.entry
 SC   &lt;-CCH$in.subcohort
 start[SC == FALSE] &lt;- TM[SC == FALSE] - eps
return(cbind(CCH,start,TM))
}

CCH.prentice &lt;- prentice.dt(cbind(CCH,t.entry=0))
prentice &lt;-function(formula.prentice,data=CCH.prentice,cohort.size=nrow(dt)){
 SC &lt;- data$in.subcohort
 fit &lt;- coxph(formula.prentice,data=data,x=T)
 dfbeta &lt;- resid(fit,type="dfbeta")
 d2 &lt;- dfbeta[SC,]
 fit$naive.var &lt;- fit$naive.var + (1- sum(SC)/cohort.size)* t(d2)%*% d2
 fit$SelfPrentice.var &lt;- fit$naive.var
fit$AJK.var &lt;- fit$var
 fit&lt;-fit[c("coefficients","SelfPrentice.var","AJK.var")]
 return(fit)
 }
prentice.fn &lt;-function(formula=formula.prentice,data=CCH.prentice){
 fit&lt;-prentice(formula=formula,data=data)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$SelfPrentice.var)))
 }
formula.prentice &lt;- Surv(start, TM, status) ~ X1+X2+cluster(cluster.id)
prentice.fn(formula.prentice,data =CCH.prentice)</pre>
<h6>Prentice (1986)&#8217;s Hazard Ratio and Type B (Approximate Jackknife) Standard Error Estimation</h6>
<pre>prentice.AJK.fn &lt;-function(formula=formula.prentice,data=CCH.prentice){
 fit&lt;-prentice(formula=formula,data=data)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$AJK.var)))
 }
formula.prentice&lt;-Surv(start, TM, status) ~ X1+X2+cluster(cluster.id)
prentice.AJK.fn(formula.prentice,data =CCH.prentice)</pre>
<h6>Self &amp; Prentice (1988)&#8217;s Hazard Ratio and Standard Error Estimation</h6>
<pre>temp1 &lt;- data.frame(CCH[CCH$status==1, ],         group=1, dummy= -100)
temp2 &lt;- data.frame(CCH[CCH$in.subcohort==TRUE, ],group=0, dummy= 0)
CCH.SP         &lt;- rbind(temp1, temp2)

selfprentice &lt;-function(formula.SP,data=sub.dt,cohort.size=nrow(dt)){
    SC &lt;- data$in.subcohort
   fit &lt;- coxph(formula.SP,data=data,x=T)
dfbeta &lt;- resid(fit,type="dfbeta")
    d2 &lt;- dfbeta[SC,]
fit$SelfPrentice.var &lt;- fit$naive.var + (1- sum(SC)/cohort.size)* t(d2)%*% d2
     fit$LinYing.var &lt;- fit$var
fit&lt;-fit[c("coefficients","SelfPrentice.var","LinYing.var")]
return(fit)
}
selfprentice.fn &lt;-function(formula,data=sub.dt,cohort.size=nrow(dt)){
 fit&lt;-selfprentice(formula=formula,data=data,cohort.size=cohort.size)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$SelfPrentice.var)))
 }
formula.SP &lt;- Surv(time, group) ~ X1+X2+offset(dummy) + cluster(cluster.id)
selfprentice.fn(formula.SP, data=CCH.SP, cohort.size=nrow(dt))</pre>
<h6>Self &amp; Prentice (1988)&#8217;s Hazard Ratio and Lin &amp; Ying (1993)&#8217;s (Approximate Jackknife) Standard Error Estimation</h6>
<pre>linying.fn &lt;-function(formula,data=sub.dt,cohort.size=nrow(dt)){
 fit&lt;-selfprentice(formula=formula,data=data,cohort.size=cohort.size)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$LinYing.var)))
 }
formula.SP &lt;- Surv(time, group) ~ X1+X2+offset(dummy) + cluster(cluster.id)
linying.fn(formula.SP, data =CCH.SP, cohort.size=nrow(dt))</pre>
<h6>Binder (1992)&#8217;s Hazard Ratio and Type B (Approximate Jackknife) Standard Error Estimation (&#8220;IPW method&#8221;)</h6>
<pre>robust.coxph.fn &lt;-function(formula,sub.dt,nm="Robust.Cox",wt="wt"){
 if(is.null(wt)) {wt&lt;-rep(1,nrow(sub.dt))} else { wt&lt;-sub.dt[,wt]}
 fit&lt;-coxph(formula=formula,data=sub.dt, weights=wt)
 res&lt;-list(coef=fit$coef,se=sqrt(diag(fit$var)))
}
robust.formula &lt;- Surv(time, status) ~ X1+X2+cluster(cluster.id)
robust.coxph.fn(robust.formula,CCH,wt="wt")</pre>
<p>Contact me for beta versions of the following:</p>
<h6>Samuelsen (1997)&#8217;s Hazard Ratio and Standard Error Estimation (&#8220;IPW method&#8221;)</h6>
<h6>Binder (1992)&#8217;s Hazard Ratio and Lin (2000)&#8217;s Standard Error Estimation (&#8220;IPW method&#8221;)</h6>
<pre>library(survey)
svycoxph.fn&lt;-function(formula,design){
 fit&lt;-svycoxph(formula,design=design)
 list(coef=fit$coef,se=sqrt(diag(fit$var)))
 }
formula &lt;- Surv(time, status) ~ X1+X2
svycoxph.fn(formula,design=svydesign(id=~1, weights=~wt, data=CCH))</pre>
</div>
]]></content:encoded>
			<wfw:commentRss>https://missionalconsulting.com/methods/rcode-cch-ncc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
