
This example shows how to convert survival data to counting process form and then construct a Cox proportional hazards model with time-dependent covariates.

A Cox model with time-dependent covariates requires survival data to be in counting process form and not in standard layout. To see the difference between survival data in standard layout and in counting process form, load the following sample data.

`load simplesurvivaldata`

This sample data contains two tables: `relapseS` and `relapseCP`. These two tables represent the same simple survival data in standard layout and in counting process form, respectively.

Display the data in standard layout.

relapseS

`relapseS=`*2×5 table*
ID Time Censoring Age StopTreatment
__ ____ _________ ___ _____________
1 5 0 20 NaN
2 20 1 30 12

This data represents two patients whose treatment status changes over time. Patient 1 was not taking treatment for the interval from week 0 to 5 and relapsed at the end of the interval. Patient 2 was taking treatment for the interval from week 0 to 12, but not for the interval from week 12 to 20. Patient 2 did not relapse and left the study after week 20.

Now display the same data in counting process form.

relapseCP

`relapseCP=`*3×6 table*
ID tStart tStop Censoring Age TreatmentStatus
__ ______ _____ _________ ___ _______________
1 0 5 0 20 0
2 0 12 1 30 1
2 12 20 1 30 0

In counting process form, each row represents the risk interval (`tStart`,`tStop`] instead of a single value of an event time. `Censoring` is 0 if the event is observed at the end of the risk interval, and 1 if it is not. `TreatmentStatus` is a time-dependent covariate that carries the same information as `StopTreatment` in standard layout. Note that a Cox model assumes time-dependent covariates to be constant in each risk interval.
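The mapping between the two layouts can be sketched by hand for the two patients above. This is a minimal illustration, not part of the original example: it uses plain numeric arrays instead of the `relapseS` table, the variable name `cp` is hypothetical, and it assumes that a `NaN` in `StopTreatment` means the patient was never on treatment and that `StopTreatment` is always earlier than `Time`.

```matlab
% Standard layout: one row per patient (values copied from relapseS).
ID            = [1; 2];
Time          = [5; 20];
Censoring     = [0; 1];
Age           = [20; 30];
StopTreatment = [NaN; 12];   % Assumption: NaN means never on treatment

cp = zeros(0,6);  % Columns: ID tStart tStop Censoring Age TreatmentStatus
for i = 1:numel(ID)
    if isnan(StopTreatment(i))
        % No change in treatment: a single risk interval (0,Time]
        cp(end+1,:) = [ID(i) 0 Time(i) Censoring(i) Age(i) 0];
    else
        % On treatment in (0,StopTreatment]. No event is observed at the
        % end of this interval, so Censoring is 1.
        cp(end+1,:) = [ID(i) 0 StopTreatment(i) 1 Age(i) 1];
        % Off treatment in (StopTreatment,Time]. The patient's own
        % Censoring value applies to this final interval.
        cp(end+1,:) = [ID(i) StopTreatment(i) Time(i) Censoring(i) Age(i) 0];
    end
end
disp(cp)
```

The rows of `cp` should match the rows of `relapseCP` shown above: one row for patient 1 and two rows for patient 2, split at week 12.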

Next, load sample data to convert.

`load survivaldatacp`

This sample data contains a table `labS`, which is simulated survival data in standard layout that includes repeated measurements for each patient.

Display the simulated survival data in standard layout.

labS

`labS=`*6×7 table*
ID Time Censoring Sex Lab_0 Lab_50 Lab_100
__ ____ _________ ___ _____ ______ _______
1 46 0 1 0.3 NaN NaN
2 138 1 0 0.2 0.23 0.39
3 94 0 1 0.18 0.22 NaN
4 50 0 0 0.21 0.2 NaN
5 106 0 0 0.25 0.21 0.42
6 98 0 0 0.21 0.22 NaN

In standard layout, each row of the table shows information for one patient.

- `ID` indicates the ID of a patient. You do not include `ID` as an input of a Cox model. Include `ID` in a data set to confirm that the data set is correctly converted to counting process form.
- `Time` represents time to event in days, which corresponds to a response variable.
- `Censoring` has the censorship information for each patient, where 1 indicates censored data and 0 indicates that the exact time to event is observed at the end of the observation period.
- `Sex` is a time-independent predictor where 1 indicates female, and 0 indicates male.
- `Lab_0`, `Lab_50`, and `Lab_100` represent three consecutive laboratory results measured at day 0, 50, and 100, which correspond to a time-dependent predictor.

To convert the survival data `labS` to counting process form, execute the code below. This code converts `Time` to a risk interval (`tStart`,`tStop`] and combines three vectors of the time-dependent predictor, `Lab_0`, `Lab_50`, and `Lab_100`, into one vector, `Lab`.

mTime = [0 50 100]; % Measurement time
threeLabs = [labS.Lab_0 labS.Lab_50 labS.Lab_100];
nLabMeasure = sum(sum(~isnan(threeLabs))); % Number of lab measurements
data = zeros(nLabMeasure,6); % One row for each observation
oID = 0; % Observation ID
for i = 1 : size(labS,1)
    idx = find(mTime <= labS.Time(i));
    for j = 1 : length(idx)-1
        oID = oID + 1;
        data(oID,:) = [labS.ID(i) mTime(j:j+1) 1 labS.Sex(i) threeLabs(i,j)];
    end
    oID = oID + 1;
    data(oID,:) = [labS.ID(i) mTime(length(idx)) labS.Time(i) ...
        labS.Censoring(i) labS.Sex(i) threeLabs(i,length(idx))];
end
labCP = table(data(:,1),data(:,2),data(:,3),data(:,4),data(:,5),data(:,6), ...
    'VariableNames', {'ID','tStart','tStop','Censoring','Sex','Lab'});

Display the survival data in counting process form.

labCP

`labCP=`*13×6 table*
ID tStart tStop Censoring Sex Lab
__ ______ _____ _________ ___ ____
1 0 46 0 1 0.3
2 0 50 1 0 0.2
2 50 100 1 0 0.23
2 100 138 1 0 0.39
3 0 50 1 1 0.18
3 50 94 0 1 0.22
4 0 50 1 0 0.21
4 50 50 0 0 0.2
5 0 50 1 0 0.25
5 50 100 1 0 0.21
5 100 106 0 0 0.42
6 0 50 1 0 0.21
6 50 98 0 0 0.22

In counting process form, each row of table `labCP` shows information of one observation corresponding to one risk interval. Note that a Cox model assumes `Lab` to be constant in the risk interval (`tStart`,`tStop`]. The value in `Censoring` is 0 if an event is observed at the end of the risk interval, and 1 if an event is not observed.

For example, patient 3 has two laboratory measurements, at days 0 and 50, so there are two rows of data for patient 3 in counting process form. A Cox model assumes the lab results 0.18 and 0.22 to be constant in the intervals (0,50] and (50,94], respectively. `Censoring` is 1 in (0,50] and 0 in (50,94] because the exact event time of patient 3 is observed at day 94.
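The properties described for patient 3 can be checked for every patient. The sketch below is an assumed sanity check, not part of the original example: it copies the first four columns of `labCP` into a numeric matrix and verifies that each patient's intervals start at 0, join end to start, and carry an observed event only in the last row. The names `cp`, `ids`, and `isConsistent` are hypothetical, and the rows are assumed to be sorted by patient and time.

```matlab
% Columns: ID tStart tStop Censoring (copied from labCP above)
cp = [1   0  46 0
      2   0  50 1
      2  50 100 1
      2 100 138 1
      3   0  50 1
      3  50  94 0
      4   0  50 1
      4  50  50 0
      5   0  50 1
      5  50 100 1
      5 100 106 0
      6   0  50 1
      6  50  98 0];

ids = unique(cp(:,1));
isConsistent = true(numel(ids),1);
for k = 1:numel(ids)
    r = cp(cp(:,1) == ids(k),:);                 % Rows for one patient
    startsAtZero  = r(1,2) == 0;                 % First interval starts at 0
    contiguous    = all(r(2:end,2) == r(1:end-1,3)); % Each start equals the previous stop
    eventLastOnly = all(r(1:end-1,4) == 1);      % Earlier rows have no observed event
    isConsistent(k) = startsAtZero && contiguous && eventLastOnly;
end
disp([ids isConsistent])   % One row per patient: ID and check result
```

This check deliberately does not test the length of each interval, so it passes for all six patients even though one interval has `tStart` equal to `tStop`.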

Find a patient who has a zero-length risk interval.

idxInvalid = labCP.ID(find(labCP.tStart == labCP.tStop))

idxInvalid = 4

Review the data for patient 4.

labCP(find(labCP.ID==idxInvalid),:)

`ans=`*2×6 table*
ID tStart tStop Censoring Sex Lab
__ ______ _____ _________ ___ ____
4 0 50 1 0 0.21
4 50 50 0 0 0.2

The time to event of patient 4 coincides with the measurement day 50. However, (50,50] is an invalid risk interval for a Cox model because the model does not accept a zero-length interval. Adjust the risk interval to make it valid. You can choose any value less than the time unit as the adjustment amount. The choice of the adjustment amount is arbitrary, and it does not change the result.

idxAdjust = find(labCP.ID==idxInvalid);
labCP.tStop(idxAdjust(1)) = labCP.tStop(idxAdjust(1))-0.5;
labCP.tStart(idxAdjust(2)) = labCP.tStart(idxAdjust(2))-0.5;
labCP(idxAdjust,:)

`ans=`*2×6 table*
ID tStart tStop Censoring Sex Lab
__ ______ _____ _________ ___ ____
4 0 49.5 1 0 0.21
4 49.5 50 0 0 0.2
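The adjustment above relies on exactly one patient having a zero-length interval. As a hedged sketch of how the same idea might be applied to any number of such intervals, the code below works on a small numeric matrix rather than the `labCP` table. The names `cp`, `delta`, and `zeroLen` are hypothetical, and the sketch assumes that rows are sorted by patient and time and that the preceding interval is longer than `delta`.

```matlab
% Columns: ID tStart tStop (rows for patients 3 and 4, before adjustment)
cp = [3  0 50
      3 50 94
      4  0 50
      4 50 50];
delta = 0.5;   % Assumed adjustment, smaller than the one-day time unit

zeroLen = find(cp(:,2) == cp(:,3));   % Rows with a zero-length interval
for r = zeroLen'
    % Move the start of the zero-length interval back by delta
    cp(r,2) = cp(r,2) - delta;
    % If the previous row belongs to the same patient, end it at the new start
    if r > 1 && cp(r-1,1) == cp(r,1)
        cp(r-1,3) = cp(r,2);
    end
end
disp(cp)
```

For patient 4 this yields the intervals (0,49.5] and (49.5,50], the same as the manual adjustment, and it leaves patient 3 unchanged.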

Fit a Cox proportional hazards model with the time-independent variable `Sex`

and time-dependent variable `Lab`

.

X = [labCP.Sex labCP.Lab];
T = [labCP.tStart labCP.tStop];
b = coxphfit(X,T,'Censoring',labCP.Censoring,'Baseline',0)

`b = `*2×1*
2.0054
29.7530
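A Cox coefficient is a log hazard ratio, so exponentiating it gives the multiplicative change in hazard for a one-unit increase in the predictor. The sketch below applies this to the coefficients reported above. It is an illustration only: the names `hrSex` and `hrLab` are hypothetical, and the 0.01 step for `Lab` is an assumed choice, made because the observed `Lab` values span a range much narrower than one unit. With six simulated patients, the resulting numbers are not meaningful estimates.

```matlab
b = [2.0054; 29.7530];     % Coefficients reported above, in the order [Sex; Lab]

hrSex = exp(b(1));         % Hazard ratio for Sex = 1 relative to Sex = 0
hrLab = exp(b(2)*0.01);    % Hazard ratio for a 0.01 increase in Lab

fprintf('Hazard ratio, Sex: %.2f\n', hrSex);
fprintf('Hazard ratio, Lab (+0.01): %.2f\n', hrLab);
```

Under these assumptions, the fitted model puts the hazard for `Sex` = 1 at roughly 7.4 times the hazard for `Sex` = 0, and each 0.01 increase in `Lab` raises the hazard by a factor of about 1.35.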

For details on how to assess a Cox proportional hazards model, see Cox Proportional Hazards Model for Censored Data.

- Hazard and Survivor Functions for Different Groups
- Survivor Functions for Two Groups
- Cox Proportional Hazards Model for Censored Data