Multi-state model with msm package
- Ref: Multi-state modelling with R: the msm package
- Informative sampling times
- Fixed: patients are observed at fixed time intervals specified in advance
- Random: the sampling times are independent of the current state of the disease
- Doctor’s care: the next sampling time is chosen on the basis of the current disease state (more severe ill patients are monitored more closely)
- Patient self-selection: patients decide to see doctors when they are in poor conditions
- \[P(t) = Exp(tQ)\]
- \(P\) is the probability of transition during the time \(t\)
- \(Q\) is the transition intensity maxtrix
- \(Exp\) is the matrix exponential
- Interval censored: in the intermittently-observed processess, the exact times of the state changes are usually interval censored, it is known to be within the bounded intervals.
- Sojourn Time: a single period of occupancy
CAV Data
Sharples et al. studied the progression of coronary allograft vasculopathy (CAV), a post-transplant deterioration of the arterial walls, using these data. Risk factors and the accuracy of the screening test were investigated using multi-state Markov and hidden Markov models.
- 622 patients
- Approximately each year after transplant, every patient has an angiograme when CAV could be diagnosed.
- age: age at screen
- dage: donor’s age
- sex: 0=male, 1=female
- pdiag: primary diagnosis
- cumrej: cumulative number of rejection episodes
- firstobs: 1 = the first observation (transplant), 0 = later angiogram
L.D. Sharples, C.H. Jackson, J. Parameshwar, J. Wallwork, and S.R. Large. Diagnostic accuracy of coronary angiography and risk factors for post-heart-transplant cardiac allograft vasculopathy. Transplantation, 76(4):679682, 2003.
Data Summary
Vars <- c(
"age"
, "years"
, "dage"
, "sex"
, "pdiag"
, "cumrej"
, "state"
)
FactorVars <- c(
"sex"
, "pdiag"
, "cumrej"
, "state"
)
t <- Table1n(obj = cav, Vars = Vars, FactorVars = FactorVars)
t %>% prt()
what
|
level
|
Overall
|
Overall
|
Missing
|
n
|
|
2846
|
2846
|
|
age
|
mean (SD)
|
48.9 (10.9)
|
48.9 (10.9)
|
0.0
|
|
median [IQR]
|
51.2 [43.7, 56.7]
|
51.2 [43.7, 56.7]
|
0.0
|
|
median [range]
|
51.2 [6.3, 74.3]
|
51.2 [6.3, 74.3]
|
0.0
|
years
|
mean (SD)
|
3.8 (3.3)
|
3.8 (3.3)
|
0.0
|
|
median [IQR]
|
3.9 [1.0, 6.0]
|
3.9 [1.0, 6.0]
|
0.0
|
|
median [range]
|
3.9 [0.0, 19.5]
|
3.9 [0.0, 19.5]
|
0.0
|
dage
|
mean (SD)
|
28.8 (11.4)
|
28.8 (11.4)
|
0.0
|
|
median [IQR]
|
26.0 [19.0, 37.0]
|
26.0 [19.0, 37.0]
|
0.0
|
|
median [range]
|
26.0 [0.0, 61.0]
|
26.0 [0.0, 61.0]
|
0.0
|
sex (%)
|
0
|
2504 (88.0)
|
2504 (88.0)
|
0.0
|
|
1
|
342 (12.0)
|
342 (12.0)
|
|
pdiag (%)
|
CVCM
|
88 ( 3.1)
|
88 ( 3.1)
|
1.1
|
|
Hyper
|
7 ( 0.2)
|
7 ( 0.2)
|
|
|
IDC
|
1283 (45.1)
|
1283 (45.1)
|
|
|
IHD
|
1413 (49.6)
|
1413 (49.6)
|
|
|
Other
|
10 ( 0.4)
|
10 ( 0.4)
|
|
|
Restr
|
15 ( 0.5)
|
15 ( 0.5)
|
|
|
NA
|
30 ( 1.1)
|
30 ( 1.1)
|
|
cumrej (%)
|
0
|
1082 (38.0)
|
1082 (38.0)
|
0.0
|
|
1
|
557 (19.6)
|
557 (19.6)
|
|
|
2
|
425 (14.9)
|
425 (14.9)
|
|
|
3
|
294 (10.3)
|
294 (10.3)
|
|
|
4
|
179 ( 6.3)
|
179 ( 6.3)
|
|
|
5
|
159 ( 5.6)
|
159 ( 5.6)
|
|
|
6
|
73 ( 2.6)
|
73 ( 2.6)
|
|
|
7
|
42 ( 1.5)
|
42 ( 1.5)
|
|
|
8
|
26 ( 0.9)
|
26 ( 0.9)
|
|
|
9
|
3 ( 0.1)
|
3 ( 0.1)
|
|
|
10
|
1 ( 0.0)
|
1 ( 0.0)
|
|
|
11
|
2 ( 0.1)
|
2 ( 0.1)
|
|
|
12
|
3 ( 0.1)
|
3 ( 0.1)
|
|
state (%)
|
1
|
2039 (71.6)
|
2039 (71.6)
|
0.0
|
|
2
|
351 (12.3)
|
351 (12.3)
|
|
|
3
|
205 ( 7.2)
|
205 ( 7.2)
|
|
|
4
|
251 ( 8.8)
|
251 ( 8.8)
|
|
Description
dt <- cav
setDT(dt)
dt <- dt[, state := factor(state)][, years := as.integer(floor(years))]
clrs <- c("#2EAEE6"
## , "#2EE6CA"
## , "#2EE677"
, "#37E62E"
## , "#8AE62E"
, "#DCE62E"
## , "#E69C2E"
, "#E6492E"
## , "#aaaaaa"
)
ggplot(dt, aes(x=years, fill = state)) + geom_bar() +
scale_fill_manual(values = clrs) +
labs(x="Year of State"
, y="Count"
, title="Overall"
) +
## scale_x_discrete(breaks=seq(0, 20, 5)) +
## scale_y_continuous(breaks=seq(1, 5, 1)) +
theme(panel.background = element_rect(fill = "transparent")
, plot.background = element_rect(fill = "transparent", color = NA)
, panel.grid.major = element_blank()
, panel.grid.minor = element_blank()
, legend.background = element_rect(fill = "transparent")
, legend.box.background = element_rect(fill = "transparent")
, legend.position = 'top'
## , plot.margin=unit(c(1,2,1,2),"cm")
) +
## coord_fixed(ratio = 3 / 2
## , xlim = c(0.5, 28.5)
## , ylim = c(0.5, 10.5)
## , expand = FALSE) +
guides(fill = guide_legend(nrow = 1))
ggplot(dt, aes(x=years, fill = state)) + geom_bar(position = "fill") +
scale_fill_manual(values = clrs) +
labs(x="Year of State"
, y="Count"
, title="Proportiona by Sex"
) +
## scale_x_discrete(breaks=seq(0, 20, 5)) +
## scale_y_continuous(breaks=seq(1, 5, 1)) +
facet_wrap(as.formula(paste0("~", "sex"))) +
theme(panel.background = element_rect(fill = "transparent")
, plot.background = element_rect(fill = "transparent", color = NA)
, panel.grid.major = element_blank()
, panel.grid.minor = element_blank()
, legend.background = element_rect(fill = "transparent")
, legend.box.background = element_rect(fill = "transparent")
, legend.position = 'top'
## , plot.margin=unit(c(1,2,1,2),"cm")
) +
## coord_fixed(ratio = 3 / 2
## , xlim = c(0.5, 28.5)
## , ylim = c(0.5, 10.5)
## , expand = FALSE) +
guides(fill = guide_legend(nrow = 1))
Simple bidirectional model
- Death is the absorbing state
- The day of death is assumed to be recorded exactly. The state 4 is death, deathexact=4
- If the data has two death states 4 and 5 with two competing risks, deathexact=c(4,5)
- State transition table: from state (in row) to state (in column)
- Transition intensity matrix Q: zeros in the matrix
- obstype: a vector specifying the observation scheme for each row of the data
- =1: a snapthot of the process, the states are unkown between observation times
- =2: an exact transition time, the state at the previous observation retained untile the current observation
- =3: an exact transition time, but the state at the instant before entering this state is unknown.
- exacttime=TRUE: all observations are of obstype 2.
- deathexact=death.states specifies that all observations of death.states are of type 3.
- deathexact=TRUE specifies all observations in the final absorbing state are of type 3.
Transition Frequencies
1
|
2
|
3
|
4
|
1367
|
204
|
44
|
148
|
46
|
134
|
54
|
48
|
4
|
13
|
107
|
55
|
Q <- rbind ( c(0, 0.25, 0, 0.25),
c(0.166, 0, 0.166, 0.166),
c(0, 0.25, 0, 0.25),
c(0, 0, 0, 0) )
Q %>% prt(caption="Initial Transition Intensity")
Initial Transition Intensity
0.000
|
0.25
|
0.000
|
0.250
|
0.166
|
0.00
|
0.166
|
0.166
|
0.000
|
0.25
|
0.000
|
0.250
|
0.000
|
0.00
|
0.000
|
0.000
|
Observed Frequencies
|
State 1
|
State 2
|
State 3
|
State 4
|
Total
|
0
|
622
|
0
|
0
|
0
|
622
|
1.94602739726027
|
537
|
4
|
5
|
54
|
600
|
3.89205479452054
|
356
|
35
|
24
|
87
|
502
|
5.83808219178081
|
208
|
41
|
28
|
127
|
404
|
7.78410958904108
|
122
|
44
|
27
|
158
|
351
|
9.73013698630135
|
71
|
25
|
22
|
187
|
305
|
11.6761643835616
|
31
|
11
|
13
|
218
|
273
|
13.6221917808219
|
12
|
6
|
5
|
236
|
259
|
15.5682191780822
|
5
|
1
|
3
|
244
|
253
|
17.5142465753424
|
1
|
0
|
2
|
249
|
252
|
19.4602739726027
|
0
|
0
|
0
|
251
|
251
|
Observed percentages
|
State 1
|
State 2
|
State 3
|
State 4
|
0
|
100.0000000
|
0.0000000
|
0.0000000
|
0.00000
|
1.94602739726027
|
89.5000000
|
0.6666667
|
0.8333333
|
9.00000
|
3.89205479452054
|
70.9163347
|
6.9721116
|
4.7808765
|
17.33068
|
5.83808219178081
|
51.4851485
|
10.1485149
|
6.9306931
|
31.43564
|
7.78410958904108
|
34.7578348
|
12.5356125
|
7.6923077
|
45.01425
|
9.73013698630135
|
23.2786885
|
8.1967213
|
7.2131148
|
61.31148
|
11.6761643835616
|
11.3553114
|
4.0293040
|
4.7619048
|
79.85348
|
13.6221917808219
|
4.6332046
|
2.3166023
|
1.9305019
|
91.11969
|
15.5682191780822
|
1.9762846
|
0.3952569
|
1.1857708
|
96.44269
|
17.5142465753424
|
0.3968254
|
0.0000000
|
0.7936508
|
98.80952
|
19.4602739726027
|
0.0000000
|
0.0000000
|
0.0000000
|
100.00000
|
Expected Frequencies
|
State 1
|
State 2
|
State 3
|
State 4
|
Total
|
0
|
622.00000
|
0.00000
|
0.00000
|
0.00000
|
622
|
1.94602739726027
|
449.07969
|
75.24240
|
23.48806
|
52.18985
|
600
|
3.89205479452054
|
295.63609
|
71.64422
|
39.76870
|
94.95098
|
502
|
5.83808219178081
|
191.64727
|
53.65790
|
38.71516
|
119.97967
|
404
|
7.78410958904108
|
135.79981
|
40.89343
|
33.86263
|
140.44413
|
351
|
9.73013698630135
|
96.93930
|
30.41968
|
27.20555
|
150.43546
|
305
|
11.6761643835616
|
71.59456
|
23.02624
|
21.53417
|
156.84503
|
273
|
13.6221917808219
|
56.19788
|
18.34853
|
17.62184
|
166.83175
|
259
|
15.5682191780822
|
45.49718
|
14.99375
|
14.63348
|
177.87558
|
253
|
17.5142465753424
|
37.59868
|
12.46273
|
12.28366
|
189.65493
|
252
|
19.4602739726027
|
31.09172
|
10.34315
|
10.25666
|
199.30848
|
251
|
Expected percentages
|
State 1
|
State 2
|
State 3
|
State 4
|
0
|
100.00000
|
0.000000
|
0.000000
|
0.000000
|
1.94602739726027
|
74.84662
|
12.540399
|
3.914677
|
8.698308
|
3.89205479452054
|
58.89165
|
14.271757
|
7.922053
|
18.914539
|
5.83808219178081
|
47.43744
|
13.281659
|
9.582959
|
29.697938
|
7.78410958904108
|
38.68941
|
11.650548
|
9.647473
|
40.012573
|
9.73013698630135
|
31.78338
|
9.973666
|
8.919853
|
49.323103
|
11.6761643835616
|
26.22512
|
8.434519
|
7.887975
|
57.452391
|
13.6221917808219
|
21.69802
|
7.084374
|
6.803800
|
64.413804
|
15.5682191780822
|
17.98308
|
5.926383
|
5.783985
|
70.306555
|
17.5142465753424
|
14.92011
|
4.945527
|
4.874470
|
75.259891
|
19.4602739726027
|
12.38714
|
4.120776
|
4.086317
|
79.405769
|
Transition Probabilities from time 0 to time 1
|
State 1
|
State 2
|
State 3
|
State 4
|
State 1
|
0.8539587
|
0.0883695
|
0.0147554
|
0.0429163
|
State 2
|
0.1555769
|
0.5666328
|
0.2059956
|
0.0717946
|
State 3
|
0.0099040
|
0.0785369
|
0.6596573
|
0.2519018
|
State 4
|
0.0000000
|
0.0000000
|
0.0000000
|
1.0000000
|
predicted prevalence percentage at time 5
|
x
|
State 1
|
51.965804
|
State 2
|
13.851775
|
State 3
|
9.119847
|
State 4
|
25.062574
|
Model Fitness
Model with Covariates
Hazard Ratio of sex 1 against 0
|
HR
|
L
|
U
|
State 1 - State 2
|
0.5632779
|
0.3333382
|
9.518320e-01
|
State 1 - State 4
|
1.1289701
|
0.6261976
|
2.035417e+00
|
State 2 - State 1
|
1.2905854
|
0.4916004
|
3.388139e+00
|
State 2 - State 3
|
1.0765518
|
0.5193868
|
2.231408e+00
|
State 2 - State 4
|
0.0003805
|
0.0000000
|
1.999137e+57
|
State 3 - State 2
|
1.0965531
|
0.1345395
|
8.937364e+00
|
State 3 - State 4
|
2.4135380
|
1.1762927
|
4.952140e+00
|
Estimated Time for Each State sex=1
|
estimates
|
SE
|
L
|
U
|
State 1
|
8.106755
|
1.4777620
|
5.5868064
|
11.233842
|
State 2
|
1.553960
|
0.8304127
|
0.0000000
|
2.472661
|
State 3
|
1.218479
|
0.4488469
|
0.5005559
|
2.259481
|
Estimated Time for Each State sex=0
|
estimates
|
SE
|
L
|
U
|
State 1
|
5.634780
|
0.3301271
|
5.041437
|
6.321529
|
State 2
|
1.653057
|
0.1387556
|
1.362682
|
1.934526
|
State 3
|
2.436049
|
0.3371423
|
1.816691
|
3.192994
|
Model Comparison
cav.msm <- msm(state ~ years
, subject = PTNUM
, data = cav
, qmatrix = Q
, deathexact = 4)
cav.msm.1 <- msm(state ~ years
, subject = PTNUM
, data = cav
, qmatrix = Q
, deathexact = 4
, covariates = ~ sex
)
t <- lrtest.msm(cav.msm, cav.msm.1)
t %>% prt(caption="Likelihood Ratio Test: Null model vs sex model"
, col.names=attr(t, "dimnames")[[2]]
)
Likelihood Ratio Test: Null model vs sex model
|
-2 log LR
|
df
|
p
|
cav.msm.1
|
14.02089
|
7
|
0.0508111
|
Computing Environment
R version 4.2.0 (2022-04-22) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.04.3 LTS
Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale: [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] msm_1.6.9 Wu_0.0.0.9000 flexdashboard_0.6.0 [4] lme4_1.1-30 Matrix_1.4-0 mgcv_1.8-38
[7] nlme_3.1-152 png_0.1-7 scales_1.2.0
[10] nnet_7.3-16 labelled_2.9.1 kableExtra_1.3.4
[13] plotly_4.10.0 gridExtra_2.3 ggplot2_3.3.6
[16] DT_0.24 tableone_0.13.2 magrittr_2.0.3
[19] lubridate_1.8.0 dplyr_1.0.9 plyr_1.8.7
[22] data.table_1.14.2 rmdformats_1.0.4 knitr_1.39
loaded via a namespace (and not attached): [1] webshot_0.5.3 httr_1.4.4 tools_4.2.0 bslib_0.4.0
[5] utf8_1.2.2 R6_2.5.1 DBI_1.1.3 lazyeval_0.2.2
[9] colorspace_2.0-3 withr_2.5.0 tidyselect_1.1.2 compiler_4.2.0
[13] cli_3.3.0 rvest_1.0.2 expm_0.999-6 xml2_1.3.3
[17] labeling_0.4.2 bookdown_0.28 sass_0.4.2 mvtnorm_1.1-3
[21] proxy_0.4-27 systemfonts_1.0.4 stringr_1.4.0 digest_0.6.29
[25] minqa_1.2.4 rmarkdown_2.14 svglite_2.1.0 pkgconfig_2.0.3
[29] htmltools_0.5.3 fastmap_1.1.0 highr_0.9 htmlwidgets_1.5.4 [33] rlang_1.0.4 rstudioapi_0.13 farver_2.1.1 jquerylib_0.1.4
[37] generics_0.1.3 zoo_1.8-10 jsonlite_1.8.0 crosstalk_1.2.0
[41] Rcpp_1.0.9 munsell_0.5.0 fansi_1.0.3 lifecycle_1.0.1
[45] stringi_1.7.8 yaml_2.3.5 MASS_7.3-54 grid_4.2.0
[49] forcats_0.5.1 lattice_0.20-45 haven_2.5.0 splines_4.2.0
[53] hms_1.1.1 klippy_0.0.0.9500 pillar_1.8.1 boot_1.3-28
[57] glue_1.6.2 evaluate_0.16 mitools_2.4 vctrs_0.4.1
[61] nloptr_2.0.3 gtable_0.3.0 purrr_0.3.4 tidyr_1.2.0
[65] assertthat_0.2.1 cachem_1.0.6 xfun_0.32 survey_4.1-1
[69] e1071_1.7-11 class_7.3-19 survival_3.2-13 viridisLite_0.4.0 [73] tibble_3.1.8 ellipsis_0.3.2