wc1y$Tt>k^Z~e@Z0J$qgC$
zeR4WOVSpP-Jbd{ZbP59Oh4vS*-xI^XiD(_c0AtKW%@LD{-k
zj?YIr?uRH~+~2%WK`4
z2$+g|Khr4NYA3$jU&A$cXvefqROCwGtz;Lag`F{Ufq0xwRy7QDO*M>j?^I{oRi8!&hhUA$yj_mr7P9DrIL-
zGPJr$Ao5Jz-!BZ@hCgaN()&lg2^Kw*TJhb)s=$reH5TD{WEpQx604F@i-yPVZpl7_
zS;;ScZ6W#=ZOW*nwNK~Y?J2oy$x1My-ltr(`4bQI+!S~5ZMmRxd^5uYqh`jA{*jj0
z_xM(Kq_X|q(sT_^+)W;Kjbxs{$pe99?1pH|Wf0w&)UMTZmIub07P6a(vM{)cRqjG?
zXXxGbb(?{!Ek-DfZ?Zjtxwl4i8JjpCF8*j^(`U5Y+FZjcXk{_#|4QvTLm>M(@HCCx
zR*SN3pYj;DL
zv!^93rHP(Z?2OQDt=Vvkydk&KCl1|Xv`GGX8OTFP*n18Z2FM5%_edE)x~jip8B+I(
zl5tBu-LTgazY)dptZsuCDeU|uURw~uu3@bezbX6lX8{+@Wb|HR;qw8UFqYruH+O30
z3+a#~Cm65g%)ZN*SYfbf6y#nZP9nw9;a(aUZ(phXIF*d_G0`(c17AyIFLYnySrrg`)=eAM&eYX0n7w
z2~_qX8_z{bx(H=
z6KZF5UeJ9M+n83DNryPvYj(wcQZZ#Uf`14_7(`HMUO)2M=X@+l7_E@~2nRpi%L
zpIN5*b=Yf#Wl`CEFo4r4Y4H2HG6EG6fdrzm)Di}_;RId6LUa<()zl!*4_vHQusx)w
zG?KZ?^`ahsFO1!FkQH)`*j7AoHw6qAFFUqf(c0g;1vUg*CWd+yd@!~~$Nc8;o-}(!
ztcZ}RTJzv+dI?S*I7g*2K#{S~T$s`B)CQk;hVC>u5@lKBc;7+90Z(;#Qgr$8b$O@4
z*y!W)cj|qgm*p8s93g5O&obf58uve)4DVlXq&+9s#k#2TH_QdTyrjosyQ<$7qBlqG
z&%O;K8X?X)7LMv#?=(OwnvNMUmY%5FKB1Wwn4@iH9`HCv>>F(R^T$J3zH!RK=Ki1L
zuC66~pI_6*l+`zB8FS_1jYis@Ol-I1)O@UkI-tqaIehnyJ1&MV&{K`=6{`}3GRBexhCNeU>
z1Yp++1cz#!79C+33o;8tjNrg$0(3_9-OpsD&Aj@@XEuvt!tN#?kPhfFYfjWg9pD-G
z8$?3PQT|rX7{dNj3tejme3?+BKrKAuN*`gZi-}K6V<9CNYXyv{O2^1n#wGX2(%@S1
z1t_@Z;rfI-2DcNpeF#vl-IJMfyH_sa7;W{zw~_|QKMfPU<2UHqm)#(w-esGDy~Ads
zvgzmcL%vZ^87pZwrQ1)jYk*`2_kA9{HQplD_xo>L7_ZGqQLJfWc9d_Nh?Kv54mY@sBE@tDp+qz9f
zB1G}=`cbfDYuaAru8T0=D(y&g1E@Uhhb{1FLlpny%9<=1)vZc4IGWQeOmXLG(P-Zp
zGzEMshT?Ub{@Om7RYtn2?z0GIM81kOY0V-R+m%2#N|#4L21|QT)3@2eV~vY&U$ayu
z$k^);sV!dwt%x#Ly5-Mw{@D3@m~>)eZ9n?)`a$BqH~9G5ZLq4Nw`(3kpjJZBloi6A
zcw6P#;Ka7tOhMzvVJ7|VvWrczJdLsF`ZZbBo=pR0H3N*Q=Wwrf-TpOii9%+ki*Vwe
zl6Fl}4gcZb?o(%lpl3V`sa!(x-!MLxgOtmeB_Cw7#9l3wYPjfAAE12M$#2RiA({tVNzToF#rIx6=`NyuMDwL4mD{o4h4!DiSQ
zSD3sWk>l>u0>_^}VMMfy{25ss_qCCkulda~mn)-&`cmeKGSt(P8;vw5wqc1lPO^*&RT$Tw@#^R&OG38EUQ7wBIa-6sU^4H`+AkBxLZqf|XvOy2{HR
zZqH~{DBBGme9`&&aVvC5G^Ix8ho|snd6rLt7*j#i=Ey93Ao&diX*aIg&!|5qKU(>{
zl(nP_qfXXkF)|`WfUBBy`?c7H2l8LXO)>6f!jvtBxYdg;=WA;cHxn2}hjnYwyWB&i
zQ-xl6w-@q+Dd|e%q(}rRO`rp<6K9OSIX(L1Pm<&4o)yh$+fVpB+S3>hz$EPfOf##)
z8LW)6ne11UcaW*|mY*p0yj-1qV^--;Mi=HPRc#qtE{0LCN2F4kYvA03=b@ToSMC^T
zagCxC;f~MWFA|eS_sqgG(y+haq;$yc&EVU!9hdNQyYX55Btcn+#1VH#;}_7tE+_YU
zhw#6xm*dM98Hf9w
z>t0LKk-)o{p{Z5P*sZklO=Yrpg~Ote;*Mi%rxd=dzL`xmnCI6dE+y;mdY_*NupR8G
zX?-BHuEhpFFzgh3FR}uWC=Gd<%xh!qkE5Nh>1V9Jj0-+G?lF@
zQ(Qo6EMrxb%9--}$yZmT<=(-G9HjJlF2;!#Fn=wL&aLLx@^vt7eLd_P^P#=EPw?+Q
zicE>BGh_vzf=eQEbV;dEEQZ0tW5%4p#cBRfc@7`Ha_jE4;YRVot%(*7vMussT{B7J
zP6-_~MG9eY7Lxb5sXt^CR24QhY4EiizhOP#|J7704q5mpY9sk<^>H`mqFh(HT*osc
z$n&e^&vmNA>?_D?{Z%&o1-Zs%(ozx}sDLl!w4?TBJNp-jlOG0l)3JIZnG$Kc*w(VE
zA%A3atP%vWa-w09OGDdp4~&c(se~|x|FFPW^)ut&kxyIA1<6Mrv7Ryq8G#vgZAUKx
zR#TqllVR3mCzV|p<~;8s&nfYOh+b>3G_6^{(y#B(zHc|=%O6Xaq<^mqV=6~wz_;7n
z8h0Muls*Vxo1{`~gs_cS^cjxaSdlk07S|r4kITky>?`!WNwJG@oa30_@5tlhUoPnM
zvWg+`Y>4#tpS&@zyH#O9+OMNxSKMtdW97Sg`;7Rx$u^VVgbB~}?VWTv8_Y#e!wGJ^
z=+rCScknN|-259xrr+YIiP%s^I2Nw!LW*ZQwy_3sz)P=+6H25xDj8}DmhjMHeOcSw
zwO-t;EPj+?1Sd8}nKTP``T-?$dc2D`@a3rm!nXR&N(fuH&hF2X~OwxH7D8Y3$XfRC#rWP`03r+#9@~9TlEAQsO%wQ}X@B
zs}eFT(Y}+0rrR8r)H{_UGxJJ`P>rJfz{Gndk8%bU7F8~pq7kXeHLfI>(l$#q}Irs
zL%08gdRg)J_3lZhle@t61<)Or;Q0BhzgioDdhQCrljrW5x#YEA5Np}bpeQMeOk-M?46%7
zy>5ps1IL0I^$Tzgr-Vdd@c615y*PCSIGv;BANN`z6+hm;6Z|r)K@fj$>PPC+whaLX
zP)^v)ryej-QF~Pu1G1UGN}FEG!>SN55d9)pBLge#az5WL9R{)9`HYO@;a5`t=09k?
zJL;hUYJB8??qReYo-bDG@~jr)uOtl;v%22w_He*syo>h=blyA@uB~#VB-y^
z1t~`3)(w|eJbsmUM)(i?8-QT2i6~@Sk3p&XP_HqB;hYzN_a>
zJe)RP=qUN?_taN*99_~h5I;FGFa-LjmeQUY$#`;ZYkQcpvAKbp`LYAAM^=hRpF|_o
z$Pk_R`lLOtkNY|+Y=Yq`g|JvM--+oX{&DYl?s?8#&$+i$yX*E_9a<}48vWaV#nb;Nxy(89A1$M=b9UFXOv6~3{&p!J%WDTr>
zgAV7%Q=x(8(I*{2R8zkFM&oZwFCLBfWGwb9B}yK@Z;=7Y&YY|bd{c?kz)wfLDRszP
zuzwpy{fKs2Qr|0_RM~dkNJe;l_Wi!)q=Z_aB$iGPCF|Cg$SQ5q4_k4Xtg>;Z91OeC
z#G_tgb(>cXuUFZ}u$DExWf7V&@AiWH?}k#VM*14ulXynY?QZ4X|J|=;8fPLuWLcvR
zO6Zf`O2;f57f#8HE*;KC!4-qn_&8ypH`D*0n8{yl4)D>;{i`)qC3`sYgmkUlKL=?z@}Ahsld#ddTFRxkXia
z@7^Zhg|M&`0~mV-G!rhrnF0tr!Y(2J(}}}*&UQQ_ZY`VrIgZT&L&wGm#+uN8+Q<>c
zTS%CUDlLo&(3PYCM9RtZRA22EGE&52eo1mOu8V-{9dAGQVS4SqdoR3rFefUF`^x6C
zCl^j>Szxvd;JO%aLU5BVJ)^O3NU+N#vgYn+%?EY;Bx&|*5(++Unq-u+hBemdZL
zvR2k~G{osITpQyGV9P-Hx~6C0w7{O#=G+nAh^Z$S&VD-On!3GN)j~a3bTJ>+fy}1D
zw(AwgVq=xa&4;S`Umyu!#->x!zi8#~bW4OD#kuqC^w`eMhHcFM=(@QGNUC7N=tEf*
zu6O9+|J{ck6iRQ^A$yf!s$2=)9H{%&?*_vGyn}lAI~lB0`8zwP`c2z=XzS*^e0$j-ifKClvB@(P5|;G5!0XB{*n-|C-b%G!JL77thX2O~}FOaB7I4
zm2{so+jHFE)Q$58Pbyz75)M`J6SQmTfIa-GL2^rWYCv+r;MY02fBDGAVK)yDK;7E&
zTTuNh5&OH_Vl(md`E_BkIKz;t{*dRo9rXeY&=+1;z$!mjT>JASQW)K#7O4#}v+Ys8
zBz$Tsm-}<~1+I`WR1M3~+3N6(4b?&9BDC>fSI;euexo#b?TiRrh@Q0z!tZgNA3-6Z
zqa9yI1-X54B5ZDjfzoHlbZ~-iU>qk%+~2Cl3bE@||DI`~R2_J|NP%LCpPZMwf_5<&
zFmx!xE&iy*2P8jaCX6CsA+&M|gT*4`($w^fNL&s#8W?~V%w7e?-M^`v>CPG*4phP<
zpKrCiH&8PdN_`g}!)}pUz{7*tri0&t;-zb3rS3w%zDGM~BQ>;<-OFcot!2@>ou1e-
zU!5527CB|&s(m=l!Q1ZWh{dgLZt5m_v(r=3S3BkO_xrav$@61X^Shlb|KWE3P25(-
zevmqq1GjI7%iMz$__FO+WYwQi=8e@eaqyg0c3^k`^83R+M8%STU%^R;VvN^cnzbd~=KT0Q(ss6M(a9+b4YOkfXsKjLHSyx;5qw#iVGkH-U+}`sb+g~4fcZX
z8*pV|*Blmn?c~G!tkncNEP-!?3O6oFn47OEcl9gWXV
zpENR8rgKzoJ-0lysER*Q1h8SF89pOTpU=+H*Dw)CJ?)t|
zRYx=)H_BJ?%=WNDl$-TI4(>`c)&I(7E0|oA37cL5*V{OMug=pMgED!F88%vj{U0As
zi4}}%{w$#%X8?xE(l^X?0m^|~OxOk7rwu9TuGiv0r
z*)p_>#T^q;_UF534J-I2w8>P5qjt3w#Z||TmLcXc17NhXzg|=)IX61m=+9gdzd(*7
zmrh{(eW<(YFUH_bXga*P?CSFO4aj;9BubBdn0Y
zw=&Kid&G{sHxjEa!Oc>nW>Y3pO2r&1AwWEmx~!+y@yx6&CQ-Ga
z*8`N7&R}UP$4y~aVq`&`9EJsjs%TY(2wap7zg5n6M0$Ar35Mu|g0A;q3UQi!
z)4J`EE6vj>DK}Fg`>Gp2cl|b_xQij{)$;!R4`*-jhqZokP()lnj6B8?3#wTitLC=R
zj=T~%%VC=#IvMo2LbNCImelw}31EpL)%2AA3_L1>`_8~GmP=n3tjNi@5(B*$1P5^Z
z)*+us|J;Zfb?3LTrZ%*%GRA;nD)@DvrY}3SA^~LOGlQfrm$!{4Ikj8zqu6YZNq+TQ^9Sg5%YltX|LDa#
zQRfTAy|^*kF4a0cLh?zi`OE>&xNqU->Q=z?TIexwMFfF`Wm9vuW{eoF?#jO_bPpk*
z(#?3d@!-1|iGYe_jXe$Gy-bOe8l~uQrVBV$eF=qXoIa!bSE0}Jiv9#mjR*gFYRR9{
z^)S#|9%jo}K+uI+$}E){FuCHJc!}~o{1SzPVKD2?X*!*hLXXn*T1nj(nQGSLCdn?1
zmC7e`W6vAjv5uWB+d@q}WL?}R$d((_&U#{9Cm3?w0yHiOATm~#rF^>*WZL&tR&$3i
z4c9lz6Z>!%wabpj?V-t&QoeP(0!iHQZi5thnwHf&sX)k@MVun-Nq!7+YfGUDOqq(B
zFMN$Mv2%k*Bo>QZ5E#8gR=jNeQXdvTi%_<}5@m=8>IJ>jr`3WlDqIk(sC5KwO41m8p_c
zf_Cfr5!H$R?Ntmo*+!}v&;Pr(%WiHdnd3=tIS$Ju
z1129;iA7S+u%cfXe$_LkCSlmz<%?t*SU-CoBA#bUqu8}0@@p>;A4>6H))F`oT2!$}
zj~x8_yF8wrR>s519zEI+fiHmkW#oLwO{GDka@jy#g@=|`kAI6yIkC}I)Bn?>4|04v
zMOCuTe99VJ`;LrO{^p%eTLPoem#{GkHmo;F6=sLhc&B-O#6Rz7!XoLL?<6
zWx6Re%o
z3Nm`(D7I7=ndR9*Y`#iv{#nL3b6kQj>$-^;>Q8~AXAsD#x
z>AXR_IM98pDVuwA`{G}feq}pJ#}BuNCEZ*5ey{vJmJ^f=%MKLgUIfzCeI)(hRGkLz8RRAS!}+NsC}EsD1%S
zksH3>1AT&bjPsd!&(~{F-s387=A}vq*v3~EY#e4a&3QYi)a$ajB%2OIAs^kj+toC)
zugjAA$0ca~@0%F`^c}T#j!2hsN>vncNhZ$5uV*+Jx2^7|)HLC-c$hMfmp{D$Dolno
zgirt4slXmm_9kYnfAwYSn8rIAUV%-!%lppz$7a}38KG2ak(WLuUnzCdJXP0wseeT~+gmWRZtY
zipb<^uHqtyhQmShm!Bc?HzET!-$q(h+PTP3u(*ohS$o6m)@L(svGD1
z)k+mxwM>GgH#o-NBP_3QdUTi3cVbcoK)o5&lJ9GiCuZ$#zC4#j#^|A^{9XdS0s&X)
zupR6w3UkUcaiEM+%Hq2L?xq&IlmGAwBg##8ZCt+Rhqk^@nR8CI-5niT-Abx>sBdg0
zpY~9>=kpBQE*oABVaDXNd!{rxj&l;kIn`X?_izEUF6DK4%VV{yGldpNJ6QyW>UUe{
zoCb!hObE%^fjh$}DM@Q`ua)q;;g}fg=q{8R)F3RiTqPN0?bVv-Cg{4w^~!nfl2wlp
za)GE{k+1)`AS(r4E*||QWV*pft5A95WrA^{Xw`!A`dOuBH(efvAvwp2ZJ6r6xcmBL
zFP=$zm5DjFfNnxX7Dl2>)B8%5eHT62)21XzY`_qRDM5lt$|!UjK6Emj9($E9#1-fs
zv8=!?hkTBKtbmwSI@bsv)g%1Cm%Y?K>`VE-X;_$X`qqh;veP#qp5yOV%OBqp2MCL*y!7@buuPppT9ek9y8j9|`tTAz>
zUor#PU3L1QET&TwQL^Fpdad^a=X2b$iRp|0w
z!|@tbGFhbEsrtKWngAbj-XBO1TOfQV>P>yivYLw_lKp(WM??1;dL>ESQ;A>|p&~lsX
z;5~Pl6_-1Z7DsKmw({)tFI9TB4&OXKx<_klm^nCW3EO8=L1Jms>o~B~yYL-OHH-Zk
z!#gp8Y^2%VmI$#2N|>QI%c?}+n&!#|b?I`Fo`y))h+>r|CK#s5oynH9Osag92RqYW
zWUHh%?IB3uE!=gM$C&qhy`kzs^;-@1mQRl{d=32CQ$LZ{Q(nWy*-g#~Zl7Jx6udZe
zEar}RW-hRYzh1fqqAyzU-=(hobygx|P&%6!LevUfa;`?SO$g`x-~zm1L&IFNO9SWl
zm}Z-=&S2>^eXFgN=;nt+Q?6kJ#Edeop?#z@Ubpg7qTXw-J~RStuP)!)u00~tU)9fp
zn+Gh=81NLjL?7Q}v82i@$KrB?@_nXrOX2hF&93cTomL8`Ag&-6^>V?E-ZCtF>)W`R
z<#}Au;aQ%;!aR6awwz_<+;MTt1w_YB_P$#rXKcgV9dY20^!%3lYvF-7la2WmvSGgQ
zaER`A=UH}gE3Rnko|h$5qjk&4(yV@*6oYkYoa}gA_o(?(`E40=_8CR1z}{xo|83Xh
zcxxwao(48;W=frx)q=lT4fwIlYAL?>3+nSiSow%azqc&?cZa)@@7jlHJV=Te|cYAOM-c7XjkNypTOoWT7(F$mLXOzhI9wL5Isce
z42A$VV2z`PX8JNYdN#ArW&5$&J!A$p#n17MNMl1m?<`4H)&MGNmg~cGdy7Bx66NJh
zAD~A~{|E$>)gu+>Mb{(h91HUV4m&$K_j`*v6d&s%pKK;h$oH<(`T9*8jU??q)1enw
ze{#7krR%5gOiWh1PYC(xq}_KyID`51fIMoF6xc@C{w0CZ?v+a=zODwa&
zy836UL1N00_g#167Y=s%!~@aA`?j?swZvyl-+KLF(_3s-+QSa&3$w!Y-NgD}(mB?Y
z@wo?fzk*W?!IjDJSIgui%-Sf{2bJAtlN0mNF%GAw1hWp;xmO|mz2emW|T>_eDzt$Ur-%-?ANk?QZdV&GtDj_XW{cDwTk
zVsXVVj{vU~^r}CF*yixufC@;MAN3mu>ZIl}KGd*jS
zOhfOuKp@ALs%Zpy6g}d+OpYxKtrcr)lkR=t!hmhe{iFUomNPZLl$)W(&u{86!JyzfT+vveh?E`>@z3-YY^YH|v?HJ#=H3@c^fs=t
zsFJJ+ED6DBe
zsu}s-@QLTCA5eu>0j)T>R0kiDx#`>9=cn<#TVZAwMvF4&>M|3MGun{O&3s&us%xDX
zv#qzP1`bDTx?%(2_xB%wK4CjU_t7nLt1Nsop$?z8cAnAh(29~j&K{_oyJ9W$rD5a(
z==1Xz{mfr-4*t(AaQ%Dr17NsMvB4a8RNsMmoCXRun6owH&cK;rA@(wC4PVv+8R#|S
zxL_e?Yt12-=Qti3aQz(UKm#KP{kz2qlf#9{4F4|9FaF=KbPE8uQfI=?b8tRqrooE{
zfOL3a{(H1oDB?YK%cB*0KmYj2vmm>CxW_FhB9k6mf$|4`paTn0z?w@b{79
zq{@R!u#obNMjjd(nwkH70ohETP!3MXZSAUUhYG23OVY6307d{7!sdW{^cXaOlgdsX
zr!c9uMd0H9(SL9OgW0+yH0ugp2Aj?Rtd|>rT>mCR#O@rbL&`Hk8BaF{D}E|<;?9ibT<;N~VihhuACOi2>3;>kWe$6k4;v#M)k
R^N$#fzOKpbTI~nm{{fn$kDCAh
diff --git a/man/midoc-package.Rd b/man/midoc-package.Rd
index d85a866..032345b 100644
--- a/man/midoc-package.Rd
+++ b/man/midoc-package.Rd
@@ -6,7 +6,7 @@
\alias{midoc-package}
\title{midoc: A Decision-Making System for Multiple Imputation}
\description{
-A guidance system for analysis with missing data. It incorporates expert, up-to-date methodology to help researchers choose the most appropriate analysis approach when some data are missing. You provide the available data and the assumed causal structure, including the likely causes of missing data. 'midoc' will advise whether multiple imputation is needed, and if so, how best to perform it.
+A guidance system for analysis with missing data. It incorporates expert, up-to-date methodology to help researchers choose the most appropriate analysis approach when some data are missing. You provide the available data and the assumed causal structure, including the likely causes of missing data. 'midoc' will advise which analysis approaches can be used, and how best to perform them. 'midoc' follows the framework for the treatment and reporting of missing data in observational studies (TARMOS); see Lee et al. (2021) \doi{10.1016/j.jclinepi.2021.01.008}.
}
\seealso{
Useful links:
diff --git a/man/proposeMI.Rd b/man/proposeMI.Rd
index bbd6f48..d03d2da 100644
--- a/man/proposeMI.Rd
+++ b/man/proposeMI.Rd
@@ -4,7 +4,7 @@
\alias{proposeMI}
\title{Suggests multiple imputation options}
\usage{
-proposeMI(mimodobj, data, message = TRUE, plot = TRUE, plotprompt = TRUE)
+proposeMI(mimodobj, data, plot = TRUE, plotprompt = TRUE, message = TRUE)
}
\arguments{
\item{mimodobj}{An object, or list of objects, of type 'mimod', which stands
@@ -14,38 +14,45 @@ for 'multiple imputation model', created by a call to
\item{data}{A data frame containing all the variables required for imputation
and the substantive analysis}
-\item{message}{If TRUE (the default), displays a message describing the
-proposed 'mice' options; use message=FALSE to suppress the message}
-
-\item{plot}{If TRUE (the default), displays diagnostic plots for the
-proposed 'mice' call; use plot=FALSE to disable the plots}
+\item{plot}{If TRUE (the default), displays diagnostic plots for the proposed
+'mice' call; use plot=FALSE to disable the plots}
\item{plotprompt}{If TRUE (the default), the user is prompted before the
second plot is displayed; use plotprompt=FALSE to remove the prompt}
+
+\item{message}{If TRUE (the default), displays a message describing the
+proposed 'mice' options; use message=FALSE to suppress the message}
}
\value{
An object of type 'miprop', which can be used to run 'mice' using the
-proposed options, plus, optionally, a message and diagnostic plots describing
-the proposed 'mice' options
+proposed options, plus, optionally, a message and diagnostic plots
+describing the proposed 'mice' options
}
\description{
-Suggests the \link[mice]{mice} options to perform multiple
-imputation, based on the proposed set of imputation models (one for each
-partially observed variable) and specified dataset.
+Suggests the \link[mice]{mice} options to perform multiple imputation, based
+on the proposed set of imputation models (one for each partially observed
+variable) and specified dataset.
}
\examples{
-# First specify each imputation model as a 'mimod' object (suppressing the
-## output)
-mimod_bmi7 <- checkModSpec(
-formula="bmi7~matage+I(matage^2)+mated+pregsize",
- family="gaussian(identity)", data=bmi, message = FALSE)
+# First specify each imputation model as a 'mimod' object
+## (suppressing the message)
+mimod_bmi7 <- checkModSpec(formula="bmi7~matage+I(matage^2)+mated+pregsize",
+ family="gaussian(identity)",
+ data=bmi,
+ message=FALSE)
mimod_pregsize <- checkModSpec(
-formula="pregsize~bmi7+matage+I(matage^2)+mated",
- family="binomial(logit)", data=bmi, message = FALSE)
+ formula="pregsize~bmi7+matage+I(matage^2)+mated",
+ family="binomial(logit)",
+ data=bmi,
+ message=FALSE)
# Display the proposed 'mice' options (suppressing the plot prompt)
- ## When specifying a single imputation model
-proposeMI(mimodobj=mimod_bmi7, data=bmi, plotprompt = FALSE)
- ## When specifying more than one imputation model (suppressing the plot)
-proposeMI(mimodobj=list(mimod_bmi7,mimod_pregsize), data=bmi, plot = FALSE)
+## When specifying a single imputation model
+proposeMI(mimodobj=mimod_bmi7,
+ data=bmi,
+ plotprompt = FALSE)
+## When specifying more than one imputation model (suppressing the plots)
+proposeMI(mimodobj=list(mimod_bmi7,mimod_pregsize),
+ data=bmi,
+ plot = FALSE)
}
diff --git a/tests/testthat/test-checkcra.R b/tests/testthat/test-checkcra.R
index b69119a..67e5319 100644
--- a/tests/testthat/test-checkcra.R
+++ b/tests/testthat/test-checkcra.R
@@ -7,9 +7,9 @@ res1<-evaluate_promise(checkCRA(y="bmi7", covs="matage", r_cra="r",
test_that("checkCRA correctly identifies that CRA is not valid given the mDAG
and analysis model but could be valid for a different set of covariates",
{
- expect_equal(trimws(paste0(gsub("\n","",res1$output), collapse=" "),"right"),
-"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are not independent given analysis model covariates. Hence, in general, complete records analysis is not valid. In special cases, depending on the type of analysis model and estimand of interest, complete records analysis may still be valid. See, for example, Bartlett et al. (2015) (https://doi.org/10.1093/aje/kwv114) for further details. Consider using a different analysis model and/or strategy, e.g. multiple imputation. For example, the analysis model outcome and complete record indicator are independent if, in addition to the specified covariates, the following sets of variables are included as covariates in the analysis model (note that this list is not necessarily exhaustive, particularly if your DAG is complex): { mated }{ mated, sep_unmeas }")
- }
+ expect_equal(trimws(paste0(gsub("\n"," ",res1$messages), collapse=" "),"right"),
+"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are not independent given analysis model covariates. Hence, in general, complete records analysis is not valid. In special cases, depending on the type of analysis model and estimand of interest, complete records analysis may still be valid. See, for example, Bartlett et al. (2015) (https://doi.org/10.1093/aje/kwv114) for further details. Consider using a different analysis model and/or strategy, e.g. multiple imputation. For example, the analysis model outcome and complete record indicator are independent if, in addition to the specified covariates, the following sets of variables are included as covariates in the analysis model (note that this list is not necessarily exhaustive, particularly if your DAG is complex): mated c(\"mated\", \"sep_unmeas\")")
+ }
)
# Check output when complete records analysis is valid
@@ -18,7 +18,7 @@ res2<-evaluate_promise(checkCRA(y="bmi7", covs="matage mated", r_cra="r",
sep_unmeas -> r"))
test_that("checkCRA correctly identifies that CRA is valid",
{
- expect_equal(trimws(paste0(gsub("\n","",res2$output), collapse=" "),"right"),
+ expect_equal(trimws(paste0(gsub("\n"," ",res2$messages), collapse=" "),"right"),
"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are independent given analysis model covariates. Hence, complete records analysis is valid.")
}
)
@@ -31,8 +31,8 @@ res3<-evaluate_promise(checkCRA(y="bmi7", covs="matage mated", r_cra="r",
test_that("checkCRA correctly identifies that CRA is not valid, but could be
valid for a different estimand",
{
- expect_equal(trimws(paste0(gsub("\n","",res3$output), collapse=" "),"right"),
-"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are not independent given analysis model covariates. Hence, in general, complete records analysis is not valid. In special cases, depending on the type of analysis model and estimand of interest, complete records analysis may still be valid. See, for example, Bartlett et al. (2015) (https://doi.org/10.1093/aje/kwv114) for further details. There are no other variables which could be added to the model to make the analysis model outcome and complete record indicator conditionally independent, without changing the estimand of interest. Consider using a different strategy e.g. multiple imputation. Alternatively, consider whether a different estimand could be of interest. For example, the analysis model outcome and complete record indicator are independent given each of the following sets of variables: { bmi3, mated }{ bmi3, matage, mated }{ bmi3, sep_unmeas }{ bmi3, matage, sep_unmeas }{ bmi3, mated, sep_unmeas }{ bmi3, matage, mated, sep_unmeas }")
+ expect_equal(trimws(paste0(gsub("\n"," ",res3$messages), collapse=" "),"right"),
+"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are not independent given analysis model covariates. Hence, in general, complete records analysis is not valid. In special cases, depending on the type of analysis model and estimand of interest, complete records analysis may still be valid. See, for example, Bartlett et al. (2015) (https://doi.org/10.1093/aje/kwv114) for further details. There are no other variables which could be added to the model to make the analysis model outcome and complete record indicator conditionally independent, without changing the estimand of interest. Consider using a different strategy e.g. multiple imputation. Alternatively, consider whether a different estimand could be of interest. For example, the analysis model outcome and complete record indicator are independent given each of the following sets of variables: c(\"bmi3\", \"mated\") c(\"bmi3\", \"matage\", \"mated\") c(\"bmi3\", \"sep_unmeas\") c(\"bmi3\", \"matage\", \"sep_unmeas\") c(\"bmi3\", \"mated\", \"sep_unmeas\") c(\"bmi3\", \"matage\", \"mated\", \"sep_unmeas\")")
}
)
@@ -42,7 +42,7 @@ res4 <- evaluate_promise(checkCRA(y="bmi7", covs="matage mated", r_cra="r",
sep_unmeas -> r bmi7 -> r"))
test_that("checkCRA correctly identifies that CRA is never valid",
{
- expect_equal(trimws(paste0(gsub("\n","",res4$output), collapse=" "),"right"),
-"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are not independent given analysis model covariates. Hence, in general, complete records analysis is not valid. In special cases, depending on the type of analysis model and estimand of interest, complete records analysis may still be valid. See, for example, Bartlett et al. (2015) (https://doi.org/10.1093/aje/kwv114) for further details. There are no other variables which could be added to the model to make the analysis model outcome and complete record indicator conditionally independent. Consider using a different strategy e.g. multiple imputation.")
+ expect_equal(trimws(paste0(gsub("\n"," ",res4$messages), collapse=" "),"right"),
+"Based on the proposed directed acyclic graph (DAG), the analysis model outcome and complete record indicator are not independent given analysis model covariates. Hence, in general, complete records analysis is not valid. In special cases, depending on the type of analysis model and estimand of interest, complete records analysis may still be valid. See, for example, Bartlett et al. (2015) (https://doi.org/10.1093/aje/kwv114) for further details. There are no other variables which could be added to the model to make the analysis model outcome and complete record indicator conditionally independent. Consider using a different strategy e.g. multiple imputation.")
}
)
diff --git a/tests/testthat/test-checkmi.R b/tests/testthat/test-checkmi.R
index 6cd839c..d37fe72 100644
--- a/tests/testthat/test-checkmi.R
+++ b/tests/testthat/test-checkmi.R
@@ -5,7 +5,7 @@ res1<-evaluate_promise(checkMI(dep="bmi7", preds="matage mated pregsize", r_dep=
#There's a trailing blank, but only visible in testing, so just trim for test purposes
test_that("checkMI correctly identifies when MI is valid given the mDAG and imputation model",
{
- expect_equal(trimws(paste0(gsub("\n","",res1$output), collapse=" "),"right"),
+ expect_equal(trimws(paste0(gsub("\n"," ",res1$messages), collapse=" "),"right"),
"Based on the proposed directed acyclic graph (DAG), the incomplete variable and its missingness indicator are independent given imputation model predictors. Hence, multiple imputation methods which assume data are missing at random are valid in principle.")
}
)
@@ -19,8 +19,8 @@ res2<-evaluate_promise(checkMI(dep="bmi7", preds="matage mated bwt", r_dep="r",
test_that("checkMI correctly identifies imputation model is not valid, but could
be valid for a different set of predictors",
{
- expect_equal(trimws(paste0(gsub("\n","",res2$output), collapse=" "),"right"),
-"Based on the proposed directed acyclic graph (DAG), the incomplete variable and its missingness indicator are not independent given imputation model predictors. Hence, multiple imputation methods which assume data are missing at random are not valid. Consider using a different imputation model and/or strategy (e.g. not-at-random fully conditional specification). For example, the incomplete variable and its missingness indicator are independent if, in addition to the specified predictors, the following sets of variables are included as predictors in the imputation model (note that this list is not necessarily exhaustive, particularly if your DAG is complex): { pregsize }{ pregsize, sep_unmeas }")
+ expect_equal(trimws(paste0(gsub("\n"," ",res2$messages), collapse=" "),"right"),
+"Based on the proposed directed acyclic graph (DAG), the incomplete variable and its missingness indicator are not independent given imputation model predictors. Hence, multiple imputation methods which assume data are missing at random are not valid. Consider using a different imputation model and/or strategy (e.g. not-at-random fully conditional specification). For example, the incomplete variable and its missingness indicator are independent if, in addition to the specified predictors, the following sets of variables are included as predictors in the imputation model (note that this list is not necessarily exhaustive, particularly if your DAG is complex): pregsize c(\"pregsize\", \"sep_unmeas\")")
}
)
diff --git a/tests/testthat/test-checkmodspec.R b/tests/testthat/test-checkmodspec.R
index ea9b906..d2e905d 100644
--- a/tests/testthat/test-checkmodspec.R
+++ b/tests/testthat/test-checkmodspec.R
@@ -6,8 +6,8 @@ res1<-evaluate_promise(checkModSpec(formula="bmi7~matage+mated+pregsize",
test_that("checkModSpec correctly identifies that the proposed gaussian model is
mis-specified",
{
- expect_equal(trimws(paste0(gsub("\n","",res1$output), collapse=" "),"right"),
-"Model mis-specification method: regression of model residuals on a fractional polynomial of the fitted values P-value: 0 A small p-value means the model may be mis-specified. Check the specification of each relationship in your model.")
+ expect_equal(trimws(paste0(gsub("\n"," ",res1$messages), collapse=" "),"right"),
+"Model mis-specification method: regression of model residuals on a fractional polynomial of the fitted values P-value: 0 A small p-value means the model may be mis-specified. Check the specification of each relationship in your model.")
expect_equal(res1$result$formula, "bmi7~matage+mated+pregsize")
expect_equal(res1$result$family, "gaussian(identity)")
expect_equal(res1$result$datalab, "bmi")
@@ -23,8 +23,8 @@ res2<-evaluate_promise(checkModSpec(
test_that("checkModSpec correctly identifies that the proposed gaussian model is
correctly specified",
{
- expect_equal(trimws(paste0(gsub("\n","",res2$output), collapse=" "),"right"),
-"Model mis-specification method: regression of model residuals on a fractional polynomial of the fitted values P-value: 1 A large p-value means there is little evidence of model mis-specification.")
+ expect_equal(trimws(paste0(gsub("\n"," ",res2$messages), collapse=" "),"right"),
+"Model mis-specification method: regression of model residuals on a fractional polynomial of the fitted values P-value: 1 A large p-value means there is little evidence of model mis-specification.")
}
)
@@ -36,8 +36,8 @@ res3<-evaluate_promise(checkModSpec(formula="mated~matage+bmi7+pregsize",
test_that("checkModSpec correctly identifies that the proposed logistic model
is mis-specified",
{
- expect_equal(trimws(paste0(gsub("\n","",res3$output), collapse=" "),"right"),
- "Model mis-specification method: Pregibon's link test P-value: 0.01275569 A small p-value means the model may be mis-specified. Check the specification of each relationship in your model.")
+ expect_equal(trimws(paste0(gsub("\n"," ",res3$messages), collapse=" "),"right"),
+ "Model mis-specification method: Pregibon's link test P-value: 0.012756 A small p-value means the model may be mis-specified. Check the specification of each relationship in your model.")
expect_equal(res3$result$formula, "mated~matage+bmi7+pregsize")
expect_equal(res3$result$family, "binomial(logit)")
expect_equal(res3$result$datalab, "bmi")
@@ -52,7 +52,7 @@ res4<-evaluate_promise(checkModSpec(
#There's a trailing blank, but only visible in testing, so just trim for test purposes
test_that("checkModSpec correctly identifies that the proposed model is corretly specified",
{
- expect_equal(trimws(paste0(gsub("\n","",res4$output), collapse=" "),"right"),
-"Model mis-specification method: Pregibon's link test P-value: 0.381826 A large p-value means there is little evidence of model mis-specification.")
+ expect_equal(trimws(paste0(gsub("\n"," ",res4$messages), collapse=" "),"right"),
+"Model mis-specification method: Pregibon's link test P-value: 0.381826 A large p-value means there is little evidence of model mis-specification.")
}
)
diff --git a/tests/testthat/test-descMissData.R b/tests/testthat/test-descMissData.R
index c6576e0..58c6ad6 100644
--- a/tests/testthat/test-descMissData.R
+++ b/tests/testthat/test-descMissData.R
@@ -4,8 +4,8 @@ res1<-evaluate_promise(descMissData(y="bmi7", covs="matage mated", data=bmi))
#There's a trailing blank, but only visible in testing, so just trim for test purposes
test_that("descMissData output is as expected when data are missing",
{
- expect_equal(trimws(paste0(gsub("\n","",res1$output), collapse=" "),"right"),
-" pattern bmi7 matage mated n pct 1 1 1 1 592 59 2 0 1 1 408 41")
+ expect_equal(trimws(paste0(gsub("\n","",res1$result), collapse=" "),"right"),
+"1:2 c(1, 0) c(1, 1) c(1, 1) c(592, 408) c(59, 41)")
}
)
diff --git a/tests/testthat/test-doMImice.R b/tests/testthat/test-doMImice.R
index 3fd8dd2..eb7b6e1 100644
--- a/tests/testthat/test-doMImice.R
+++ b/tests/testthat/test-doMImice.R
@@ -5,7 +5,7 @@ mimod_bmi7 <- checkModSpec(
family="gaussian(identity)", data=bmi, message=FALSE)
# Save the proposed 'mice' options as a 'miprop' object, suppressing the
## message and plots
-miprop <- proposeMI(mimodobj=mimod_bmi7, data=bmi, message=FALSE, plot=FALSE)
+miprop <- proposeMI(mimodobj=mimod_bmi7, data=bmi, plot=FALSE, message=FALSE)
# Check both the output when a substantive model is specified and that a
## mice object is created
res1<-evaluate_promise(doMImice(miprop, 123,
@@ -14,9 +14,9 @@ res1<-evaluate_promise(doMImice(miprop, 123,
test_that("doMImice creates both the correct output when a substantive model is
specified and a mice 'mids' object",
{
- expect_equal(substr(trimws(paste0(gsub("\n","",res1$output), collapse=" "),
+ expect_equal(substr(trimws(paste0(gsub("\n"," ",res1$messages), collapse=" "),
"right"),1,100),
-"Given the substantive model: lm(bmi7 ~ matage + I(matage^2) + mated) multiple imputation estimates ")
+"Given the substantive model: lm(bmi7 ~ matage + I(matage^2) + mated) , multiple imputation estimates")
expect_equal(mice::is.mids(res1$result),TRUE)
}
)
diff --git a/tests/testthat/test-exploreDAG.R b/tests/testthat/test-exploreDAG.R
index 17c96ab..935eb3c 100644
--- a/tests/testthat/test-exploreDAG.R
+++ b/tests/testthat/test-exploreDAG.R
@@ -5,10 +5,10 @@ res1<-evaluate_promise(exploreDAG(mdag="matage -> bmi7 mated -> matage
test_that("exploreDAG correctly identifies both the implied independencies and that
none of them are testable",
{
- expect_equal(substr(trimws(paste0(gsub("\n","",res1$output), collapse=" "),
- "right"),345,595),
-"None of the fully observed variables are conditionally independent. Hence, no consistency checks will be performed. Consider whether it is valid and possible to explore relationships between partially observed variables using the observed data, e.g.")
-}
+ expect_equal(substr(trimws(paste0(gsub("\n"," ",res1$messages), collapse=" "),
+ "right"),358,606),
+"None of the fully observed variables are conditionally independent. Hence, no consistency checks will be performed. Consider whether it is valid and possible to explore relationships between partially observed variables using the observed data, e.g")
+ }
)
# Check output when there are testable paths
@@ -18,8 +18,8 @@ res2<-evaluate_promise(exploreDAG(mdag="matage -> bmi7 mated -> matage
test_that("exploreDAG correctly identifies both the implied independencies and
the testable subset",
{
- expect_equal(substr(trimws(paste0(gsub("\n","",res2$output), collapse=" "),
- "right"),386,551),
+ expect_equal(substr(trimws(paste0(gsub("\n"," ",res2$messages), collapse=" "),
+ "right"),378,543),
"These (conditional) independence statements are explored below using the canonical correlations approach for mixed data. See ??dagitty::localTests for further details")
}
)
diff --git a/tests/testthat/test-proposeMI.R b/tests/testthat/test-proposeMI.R
index d66db52..3b76c59 100644
--- a/tests/testthat/test-proposeMI.R
+++ b/tests/testthat/test-proposeMI.R
@@ -13,9 +13,9 @@ res1<-evaluate_promise(proposeMI(mimodobj=list(mimod_bmi7,mimod_pregsize),
#Trim output for test purposes
test_that("proposeMI suggests correct mice options and creates expected object",
{
- expect_equal(substr(trimws(paste0(gsub("\n","",res1$output), collapse=" "),
+ expect_equal(substr(trimws(paste0(gsub("\n"," ",res1$messages), collapse=" "),
"right"),1,110),
-"Based on your proposed imputation model and dataset, your mice() call should be as follows: mice(data = bmi ,")
+"Based on your proposed imputation model and dataset, your mice() call should be as follows: mice(data = bmi ,")
expect_equal(res1$result$m,41)
expect_equal(res1$result$method,list("norm", "logreg"))
expect_equal(paste0(res1$result$formulas),c("bmi7 ~ matage + I(matage^2) + mated + pregsize", "pregsize ~ bmi7 + matage + I(matage^2) + mated"))
diff --git a/vignettes/midoc.rmd b/vignettes/midoc.rmd
index 957e2ea..4369dfd 100644
--- a/vignettes/midoc.rmd
+++ b/vignettes/midoc.rmd
@@ -55,7 +55,8 @@ Missing data is a common issue in health and social research, often addressed by
However, using MI in practice can be complex. Application of MI involves multiple decisions which are rarely justified or even documented, and for which little guidance is available.
The Multiple Imputation DOCtor (`midoc`) R package is a decision-making system which incorporates expert, up-to-date guidance to help you choose the most appropriate analysis method
-when there are missing data. `midoc` will guide you through your analysis, examining both the hypothesised causal relationships and the observed data to advise on whether MI is needed, and if so how to perform it. We assume you are interested in obtaining unbiased estimates of regression coefficients - note that bias is not necessarily a concern if your interest is in prediction (*i.e.* diagnostic/prognostic modelling).
+when there are missing data. `midoc` will guide you through your analysis, examining both the hypothesised causal relationships and the observed data to advise on whether MI is needed, and if so how to perform it. `midoc` follows the framework for the treatment
+and reporting of missing data in observational studies (TARMOS) [1](https://doi.org/10.1016/j.jclinepi.2021.01.008). We assume you are interested in obtaining unbiased estimates of regression coefficients - note that bias is not necessarily a concern if your interest is in prediction (*i.e.* diagnostic/prognostic modelling).
Here, we will demonstrate the key features of `midoc` using a worked example.
@@ -72,15 +73,19 @@ We will start by specifying the relationships between our variables, assuming th
We will assume maternal age (`matage`) causes BMI at age 7 years (`bmi7`), and maternal education level (`mated`) causes both maternal age and BMI at age 7 years. We can express these relationships using "dagitty" syntax, as follows:
```{r, eval=FALSE}
-matage -> bmi7 mated -> matage mated -> bmi7
+matage -> bmi7
+mated -> matage
+mated -> bmi7
```
-Next, for each partially observed variable, we will specify the variables related to its probability of being missing (its "missingness") by adding these relationships to our DAG. This type of DAG is often referred to as a "missingness" DAG (mDAG) [1](https://doi.org/10.1177/0962280210394469), [2](https://doi.org/10.1093/ije/dyad008).
+Next, for each partially observed variable, we will specify the variables related to its probability of being missing (its "missingness") by adding these relationships to our DAG. This type of DAG is often referred to as a "missingness" DAG (mDAG) [2](https://doi.org/10.1177/0962280210394469), [3](https://doi.org/10.1093/ije/dyad008).
We will first use the `midoc` function `descMissData` to identify which variables in our dataset are partially observed, specifying our outcome (`y`), covariates, *i.e.* our independent variables, (`covs`), and dataset (`data`), as follows.
```{r}
-descMissData(y="bmi7", covs="matage mated", data=bmi)
+descMissData(y="bmi7",
+ covs="matage mated",
+ data=bmi)
```
We see that there are two missing data patterns: either all variables are observed, or BMI at age 7 years is missing and all covariates are observed. We will use indicator variable "R" to denote the missingness of BMI at age 7 years (for example, R=1 if BMI at age 7 years is observed, and 0 otherwise). In this specific example, R also indicates a complete record (R=1 if all variables are fully observed, and 0 otherwise) because all other variables are fully observed. We will suppose that R is related to maternal education level via socio-economic position (SEP), *i.e.* SEP is a cause of both maternal education level and R, but neither BMI at age 7 years itself nor maternal age are causes of R. We will further suppose that SEP is missing (unmeasured) for all individuals in our dataset; to remind us of this fact, we will name this variable `sep_unmeas`.
@@ -88,13 +93,20 @@ We see that there are two missing data patterns: either all variables are observ
Our mDAG is now as follows (note that we follow the convention of using lower case names for variables in our code, so R becomes "r", and so on):
```{r, eval=FALSE}
-matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r
+matage -> bmi7
+mated -> matage
+mated -> bmi7
+sep_unmeas -> mated
+sep_unmeas -> r
```
Note that if instead you believe maternal education is a direct cause of R, the mDAG would be as follows:
```{r, eval=FALSE}
-matage -> bmi7 mated -> matage mated -> bmi7 mated -> r
+matage -> bmi7
+mated -> matage
+mated -> bmi7
+mated -> r
```
We will now draw our mDAG and visually check that the relationships are specified as we intended:
@@ -120,14 +132,19 @@ plot(dagitty::dagitty('dag {
As a final check of our mDAG, we will use the `midoc` function `exploreDAG` to explore whether relationships in the dataset are consistent with the proposed mDAG, specifying both our mDAG (`mdag`) and dataset (`data`), as follows.
```{r}
-exploreDAG(mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r", data=bmi)
+exploreDAG(mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r",
+ data=bmi)
```
-Based on the relationships between fully observed variables maternal age, maternal education, and missingness of BMI at age 7 years, we can see that there is little evidence of inconsistency between our dataset and proposed mDAG. In particular, our mDAG assumes that maternal age (`matage`) is unrelated to missingness of BMI at age 7 years (`r`), given maternal education (`mated`); our results suggest this is plausible. Note that we cannot use our observed data to explore whether BMI at age 7 years is unrelated to its own missingness - we would need the missing values of BMI at age 7 years in order to do this.
+Based on the relationships between fully observed variables maternal age, maternal education, and missingness of BMI at age 7 years, we can see that there is little evidence of inconsistency between our dataset and proposed mDAG. In particular, our mDAG assumes that maternal age (`matage`) is unrelated to missingness of BMI at age 7 years (`r`), given maternal education (`mated`); our results suggest this is plausible. Note that we cannot use our observed data to determine whether BMI at age 7 years is unrelated to its own missingness - we would need the missing values of BMI at age 7 years in order to do this. However, if BMI at age 7 years were a cause of its own missingness, then we would expect maternal age also to be related to its missingness (via BMI at age 7 years). Since maternal age seems to be unrelated to missingness, we are reassured that BMI at age 7 years is also likely to be unrelated to its own missingness, given maternal education.
**Tips for specifying a "missingness" DAG**
-* First specify the DAG for the analysis model, as it would be if there were no missing data. You may find this introduction to DAGs useful [3](https://doi.org/10.1038/s41390-018-0071-3).
+* First specify the DAG for the analysis model, as it would be if there were no missing data. You may find this introduction to DAGs useful [4](https://doi.org/10.1038/s41390-018-0071-3).
* Next add missingness indicator(s) to your DAG. If you have multiple variables with missing data, you may want to start by including just the complete records indicator in your DAG.
@@ -138,13 +155,19 @@ Based on the relationships between fully observed variables maternal age, matern
* Data exploration, for example, by performing a logistic regression of each missingness indicator on your analysis model variables - noting that you may have to exclude any variables with a large proportion of missing data to avoid perfect prediction
## Step 2 Check whether complete records analysis is likely to be a valid strategy
-Our next step is to determine whether complete records analysis (CRA) is a valid strategy, using our mDAG. Remember that, in general, CRA will be valid if the analysis model outcome is unrelated to the complete records indicator, conditional on the analysis model covariates [4](https://doi.org/10.1093/ije/dyz032) (in special cases, depending on the type of analysis model and estimand of interest, this rule can be relaxed [5](https://doi.org/10.1093/aje/kwv114) - here, we will consider the general setting without making any assumptions about the fitted model).
+Our next step is to determine whether complete records analysis (CRA) is a valid strategy, using our mDAG. Remember that, in general, CRA will be valid if the analysis model outcome is unrelated to the complete records indicator, conditional on the analysis model covariates [5](https://doi.org/10.1093/ije/dyz032) (in special cases, depending on the type of analysis model and estimand of interest, this rule can be relaxed [6](https://doi.org/10.1093/aje/kwv114) - here, we will consider the general setting without making any assumptions about the fitted model).
Suppose we decide to estimate the unadjusted association between BMI at age 7 years and maternal age, without including our confounder maternal education in the model. We will use the `midoc` function `checkCRA` applied to our mDAG to check whether CRA is valid for this model, specifying our outcome (`y`), covariates, *i.e.* our independent variables, (`covs`), complete records indicator (`r_cra`), and mDAG (`mdag`), as follows:
```{r}
-checkCRA(y="bmi7", covs="matage", r_cra="r",
- mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r")
+checkCRA(y="bmi7",
+ covs="matage",
+ r_cra="r",
+ mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r")
```
@@ -154,8 +177,14 @@ We can see that CRA would not be valid (we can also tell this by inspecting our
If we add `mated` to the model and re-run `checkCRA`, as below, we see that CRA is now valid.
```{r}
-checkCRA(y="bmi7", covs="matage mated", r_cra="r",
- mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r")
+checkCRA(y="bmi7",
+ covs="matage mated",
+ r_cra="r",
+ mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r")
```
@@ -164,25 +193,34 @@ checkCRA(y="bmi7", covs="matage mated", r_cra="r",
If our outcome, BMI at age 7 years, was itself a cause of missingness, CRA would always be invalid, *i.e.* there would be no other variables we could add to the analysis model to make CRA valid. See below to see the results of `checkCRA` in this case (note, in the code, we have added a path from `bmi7` to `r` to the specified mDAG).
```{r}
-checkCRA(y="bmi7", covs="matage mated", r_cra="r",
- mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r bmi7 -> r")
+checkCRA(y="bmi7",
+ covs="matage mated",
+ r_cra="r",
+ mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r
+ bmi7 -> r")
```
## Step 3 Check whether multiple imputation is likely to be a valid strategy
-Although CRA is valid for our example, we may also wish to perform MI. Remember that MI is valid in principle if each partially observed variable is unrelated to its missingness, given its imputation model predictors. Furthermore, we should include all other analysis model variables in the imputation model for each partially observed variable, in the form implied by the analysis model, so that the analysis and imputation models are "compatible". In theory, given multiple partially observed variables, validity of MI may imply different causes of missingness for each missing data pattern. For example, if both BMI at age 7 years and maternal education were partially observed, MI would only be valid if missingness of BMI at age 7 years was unrelated to maternal education among individuals missing both BMI at age 7 years and maternal education (given the other observed data), although missingness of BMI at age 7 years could be related to maternal education among individuals with observed maternal education. In practice, we recommend focusing on the most common missing data patterns and/or variables with the most missing data. Less common missing data patterns can often be assumed to be missing completely at random - it is unlikely to change your final conclusions if this assumption is incorrect.
+Although CRA is valid for our example, we may also wish to perform MI. Remember that MI is valid in principle if each partially observed variable is unrelated to its missingness, given its imputation model predictors. Furthermore, we should include all other analysis model variables in the imputation model for each partially observed variable, in the form implied by the analysis model, so that the analysis and imputation models are "compatible". In theory, given multiple partially observed variables, validity of MI may imply different causes of missingness for each missing data pattern. For example, if both BMI at age 7 years and maternal education were partially observed, MI would only be valid if missingness of BMI at age 7 years was unrelated to maternal education among individuals missing both BMI at age 7 years and maternal education (given the other observed data). Missingness of BMI at age 7 years could be related to maternal education among individuals with observed maternal education. In practice, we recommend focusing on the most common missing data patterns and/or variables with the most missing data. Less common missing data patterns can often be assumed to be missing completely at random - it is unlikely to change your final conclusions if this assumption is incorrect.
-In our example, we only have a single partially observed variable (BMI at age 7 years), so it is relatively simple to check the validity of MI based on our mDAG. We have already verified (using `checkCRA`) that BMI at age 7 years is unrelated to its missingness, given maternal age and maternal education. Therefore, we know that MI will be valid if we use only these variables in the imputation model for BMI at age 7 years (because the analysis model and the imputation model are exactly the same in this case). However, MI using just maternal age and maternal education in the imputation model for BMI at age 7 years will recover no additional information compared to CRA. Therefore, we may wish to include "auxiliary variables" in our imputation model for BMI at age 7 years (*i.e.* variables that are predictive of BMI at age 7 years, but that are not required for the analysis model) to improve the precision of our MI estimate, compared to the CRA estimate.
+In our example, we only have a single partially observed variable (BMI at age 7 years), so it is relatively simple to check the validity of MI based on our mDAG. We have already verified (using `checkCRA`) that BMI at age 7 years is unrelated to its missingness, given maternal age and maternal education. Therefore, we know that MI will be valid if we use only these variables in the imputation model for BMI at age 7 years (because the analysis model and the imputation model are exactly the same in this case). However, MI using just maternal age and maternal education in the imputation model for BMI at age 7 years will recover no additional information compared to CRA. Therefore, we may wish to include "auxiliary variables" in our imputation model for BMI at age 7 years. These are additional variables that are included as predictors in the imputation model but that are not required for the analysis model. If we choose auxiliary variables that are predictive of BMI at age 7 years, we can improve the precision of our MI estimate (*i.e.* reduce its standard error) compared to the CRA estimate.
-In our example, we have two variables that could be used as auxiliary variables: pregnancy size - singleton or multiple birth - (`pregsize`) and birth weight (`bwt`). We will inspect the missing data patterns in our dataset once again, having included our auxiliary variables, using `descMissData`.
+In our example, we have two variables that could be used as auxiliary variables: pregnancy size - singleton or multiple birth - (`pregsize`) and birth weight (`bwt`). We will inspect the missing data patterns in our dataset once again using `descMissData`, including our auxiliary variables.
```{r}
-descMissData(y="bmi7", covs="matage mated pregsize bwt", data=bmi)
+descMissData(y="bmi7",
+ covs="matage mated pregsize bwt",
+ data=bmi)
```
We can see that our auxiliary variables are fully observed.
-We assume that pregnancy size is a cause of BMI at age 7 years, but not its missingness, whereas we assume birth weight is related to both BMI at 7 years (via pregnancy size) and its missingness (via SEP). We will now add these variables to our mDAG. Below, we have shown our updated mDAG.
+We assume that pregnancy size is a cause of BMI at age 7 years, but not its missingness. We assume birth weight is related to both BMI at age 7 years (via pregnancy size) and its missingness (via SEP). We will now add these variables to our mDAG. Below, we have shown our updated mDAG.
```{r, echo=FALSE, out.width="500px", out.height="400px", dpi=200}
plot(dagitty::dagitty('dag {
@@ -207,62 +245,94 @@ plot(dagitty::dagitty('dag {
We will also once again explore whether relationships in the dataset are consistent with the updated mDAG using `exploreDAG`, as follows.
```{r}
-exploreDAG(mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r
- pregsize -> bmi7 pregsize -> bwt sep_unmeas -> bwt", data=bmi)
+exploreDAG(mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r
+ pregsize -> bmi7
+ pregsize -> bwt
+ sep_unmeas -> bwt",
+ data=bmi)
```
Our results suggest that our updated mDAG is plausible.
Note that CRA is still valid for our updated mDAG. We can check this using `checkCRA` once more:
-```{r}
-checkCRA(y="bmi7", covs="matage mated", r_cra="r",
- mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r
- pregsize -> bmi7 pregsize -> bwt sep_unmeas -> bwt")
+```{r, R.options=list(width=80)}
+checkCRA(y="bmi7",
+ covs="matage mated",
+ r_cra="r",
+ mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r
+ pregsize -> bmi7
+ pregsize -> bwt
+ sep_unmeas -> bwt")
```
-We will now use the `midoc` function `checkMI` applied to our DAG to check whether MI is valid when the imputation model predictors for BMI at age 7 years include pregnancy size or birth weight, as well as maternal age and maternal education, specifying the partially observed variable (`dep`), predictors (`preds`), missingness indicator for the partially observed variable (`r_dep`), and mDAG (`mdag`).
+We will now use the `midoc` function `checkMI` applied to our mDAG to check whether MI is valid when the imputation model predictors for BMI at age 7 years include pregnancy size or birth weight, as well as maternal age and maternal education. We will specify the partially observed variable (`dep`), predictors (`preds`), missingness indicator for the partially observed variable (`r_dep`), and mDAG (`mdag`).
We will first consider the imputation model including pregnancy size. The results are shown below. These suggest that MI would be valid in principle if we included pregnancy size as well as the other analysis model variables in the imputation model for BMI at age 7 years.
-```{r}
-checkMI(dep="bmi7", preds="matage mated pregsize", r_dep="r",
- mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r
- pregsize -> bmi7 pregsize -> bwt sep_unmeas -> bwt")
+```{r, R.options=list(width=80)}
+checkMI(dep="bmi7",
+ preds="matage mated pregsize",
+ r_dep="r",
+ mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r
+ pregsize -> bmi7
+ pregsize -> bwt
+ sep_unmeas -> bwt")
```
-We will next consider the imputation model including birth weight. The results are shown below. These suggest that MI would not be valid if we included birth weight as well as the other analysis model variables in the imputation model for BMI at age 7 years (we can also tell this by inspecting our mDAG: since `bwt` shares a common cause with both `bmi7` and `r`, it is a "collider", and hence conditioning on `bwt` opens a path from `bmi7` to `r` via `bwt`).
-
-```{r}
-checkMI(dep="bmi7", preds="matage mated bwt", r_dep="r",
- mdag="matage -> bmi7 mated -> matage mated -> bmi7 sep_unmeas -> mated sep_unmeas -> r
- pregsize -> bmi7 pregsize -> bwt sep_unmeas -> bwt")
+We will next consider the imputation model including birth weight. The results are shown below. These suggest that MI would not be valid if we included birth weight as well as the other analysis model variables in the imputation model for BMI at age 7 years. We can also tell this by inspecting our mDAG: since `bwt` shares a common cause with both `bmi7` and `r`, it is a "collider", and hence conditioning on `bwt` opens a path from `bmi7` to `r` via `bwt`.
+
+```{r, R.options=list(width=80)}
+checkMI(dep="bmi7",
+ preds="matage mated bwt",
+ r_dep="r",
+ mdag="matage -> bmi7
+ mated -> matage
+ mated -> bmi7
+ sep_unmeas -> mated
+ sep_unmeas -> r
+ pregsize -> bmi7
+ pregsize -> bwt
+ sep_unmeas -> bwt")
```
**Note**
-In theory, and as suggested by the `checkMI` results shown above, MI would be valid if we added both birth weight and pregnancy size as auxiliary variables in our imputation model (note that SEP is not needed, conditional on the other imputation model predictors). However, in practice, this strategy may still result in biased estimates, due to unmeasured confounding of the relationship between BMI at age 7 years and birth weight. We recommend not including colliders of the partially observed variable and its missingness as auxiliary variables [6](https://doi.org/10.3389/fepid.2023.1237447).
+In theory, and as suggested by the `checkMI` results shown above, MI would be valid if we added both birth weight and pregnancy size as auxiliary variables in our imputation model (note that SEP is not needed, conditional on the other imputation model predictors). However, in practice, this strategy may still result in biased estimates, due to unmeasured confounding of the relationship between BMI at age 7 years and birth weight. We recommend not including colliders of the partially observed variable and its missingness as auxiliary variables [7](https://doi.org/10.3389/fepid.2023.1237447).
## Step 4 Check that all relationships are correctly specified
So far, we have explored whether CRA and MI are valid *in principle* using our mDAG, without making any assumptions about the form of our variables, or their relationships with each other.
-However, for MI to give unbiased estimates, imputation models must be both compatible with the analysis model and correctly specified: they must contain all the variables required for the analysis model, they must include all relationships implied by the analysis model e.g. interactions, and they must specify the form of all relationships correctly [7](https://doi.org/10.1016/j.jclinepi.2023.06.011).
+However, for MI to give unbiased estimates, imputation models must be both compatible with the analysis model and correctly specified: they must contain all the variables required for the analysis model, they must include all relationships implied by the analysis model (e.g. interactions), and they must specify the form of all relationships correctly [8](https://doi.org/10.1016/j.jclinepi.2023.06.011).
-Since CRA and MI are valid in principle for our worked example, we will use the complete records in the `bmi` dataset to explore the specification of relationships between BMI at age 7 years and the predictors (the analysis model variables, maternal age and maternal education, plus auxiliary variable, pregnancy size) in our imputation model for BMI at age 7 years.
+Since CRA and MI are valid in principle for our worked example, we will use the complete records in the `bmi` dataset to explore the specification of relationships between BMI at age 7 years and its predictors (the analysis model variables, maternal age and maternal education, plus auxiliary variable, pregnancy size) in its imputation model.
-We will use the `midoc` function `checkModSpec` applied to the `bmi` dataset to check whether our imputation model is correctly specified, specifying the formula for the imputation model using standard R syntax (`formula`), the type of imputation model (`family`) (note that `midoc` currently supports either linear or logistic regression models), and the name of the dataset (`data`).
+We will use the `midoc` function `checkModSpec` applied to the `bmi` dataset to check whether our imputation model is correctly specified. We will specify the formula for the imputation model using standard R syntax (`formula`), the type of imputation model (`family`) (note that `midoc` currently supports either linear or logistic regression models), and the name of the dataset (`data`).
Since maternal education and pregnancy size are binary variables, we only need to explore the form of the relationship between BMI at age 7 years and our continuous exposure, maternal age. We will first assume there is a linear relationship between BMI at age 7 years and maternal age (note, this is the default in most software implementations of MI). We will assume there are no interactions.
The results are shown below. These suggest that our imputation model is mis-specified. A plot of the residuals versus the fitted values from our model (which is automatically displayed if there is evidence of model mis-specification), suggests there may be a quadratic relationship between BMI at age 7 years and maternal age.
-```{r}
-checkModSpec(formula="bmi7~matage+mated+pregsize", family="gaussian(identity)",
+```{r, R.options=list(width=80)}
+checkModSpec(formula="bmi7~matage+mated+pregsize",
+ family="gaussian(identity)",
data=bmi)
```
@@ -270,25 +340,29 @@ We will use the `midoc` function `checkModSpec` again, this time specifying a qu
The results below suggest there is no longer evidence of model mis-specification.
-```{r}
-checkModSpec(formula="bmi7~matage+I(matage^2)+mated+pregsize", family="gaussian(identity)",
+```{r, R.options=list(width=80)}
+checkModSpec(formula="bmi7~matage+I(matage^2)+mated+pregsize",
+ family="gaussian(identity)",
data=bmi)
```
**Note** We must make sure we account for the non-linear relationship between BMI at age 7 years and maternal age in all other imputation models. For example, the imputation model for pregnancy size would need to include BMI at age 7 years, maternal education, and a quadratic form of maternal age (induced by conditioning on BMI at age 7 years). Although there are no missing values for pregnancy size in our dataset, we can still explore the specification that we would need using `checkModSpec` as follows (note that we have suppressed the plot in this case using the `plot = FALSE` option):
-```{r}
-checkModSpec(formula="pregsize~matage+bmi7+mated", family="binomial(logit)",
- data=bmi, plot=FALSE)
+```{r, R.options=list(width=80)}
+checkModSpec(formula="pregsize~matage+bmi7+mated",
+ family="binomial(logit)",
+ data=bmi,
+ plot=FALSE)
```
There is some evidence of model mis-specification.
Once we include a quadratic form of maternal age in our model for pregnancy size, there is little evidence of model mis-specification:
-```{r}
-checkModSpec(formula="pregsize~matage+I(matage^2)+bmi7+mated", family="binomial(logit)",
+```{r, R.options=list(width=80)}
+checkModSpec(formula="pregsize~matage+I(matage^2)+bmi7+mated",
+ family="binomial(logit)",
data=bmi)
```
@@ -311,33 +385,39 @@ checkModSpec(formula="pregsize~matage+I(matage^2)+bmi7+mated", family="binomial(
## Step 5 Perform MI using the proposed imputation model
-We have explored both the validity of MI in principle, using our mDAG, and the specification of our imputation model, based on our observed data. We will now use the `midoc` function `proposeMI` to choose the best options when performing MI using the [mice](https://doi.org/10.18637/jss.v045.i03) package. We will first save our chosen imputation model (*i.e.* specifying a quadratic relationship between BMI at age 7 years and maternal age) as a `mimod` object (note we have suppressed the `checkModSpec` output in this case using the `message = FALSE` option). We will then use this, along with our dataset, to construct our call of the "mice" function. Note we will also save our proposed "mice" call as a `miprop` object, to be used later.
+We have explored both the validity of MI in principle, using our mDAG, and the specification of our imputation model, based on our observed data. We will now use the `midoc` function `proposeMI` to choose the best options when performing MI using the [mice](https://doi.org/10.18637/jss.v045.i03) package. We will first save our chosen imputation model (*i.e.* specifying a quadratic relationship between BMI at age 7 years and maternal age) as a `mimod` object. Note we have suppressed the `checkModSpec` message in this case using the `message = FALSE` option. We will then use this, along with our dataset, to construct our call of the "mice" function. Note we will also save our proposed "mice" call as a `miprop` object, to be used later.
-The results are shown below. In particular, note that in the proposed "mice" call, the default values for the number of imputations, method, formulas, and number of iterations have been changed. Plots of the distributions of imputed and observed data, based on a sample of five imputed datasets, suggest that extreme values are handled appropriately using the proposed imputation method. Trace plots, showing the mean and standard deviation of the imputed values across iterations, are also displayed. Note that there is no need to adjust the number of iterations when, as in our dataset, only one variable is partially observed.
+The results are shown below. In particular, note that in the proposed "mice" call, the default values for the number of imputations, method, formulas, and number of iterations have been changed. Plots of the distributions of imputed and observed data, based on a sample of five imputed datasets, suggest that extreme values are handled appropriately using the proposed imputation method. Trace plots, showing the mean and standard deviation of the imputed values across iterations, are also displayed. Note that both plots are shown without prompting (`plotprompt = FALSE`). There is no need to adjust the number of iterations when, as in our dataset, only one variable is partially observed.
-```{r}
+```{r, R.options=list(width=80)}
mimod_bmi7 <- checkModSpec(formula="bmi7~matage+I(matage^2)+mated+pregsize",
- family="gaussian(identity)", data=bmi,
- message=FALSE)
-miprop <- proposeMI(mimodobj=mimod_bmi7, data=bmi)
+ family="gaussian(identity)",
+ data=bmi,
+ message=FALSE)
+miprop <- proposeMI(mimodobj=mimod_bmi7,
+ data=bmi,
+ plotprompt=FALSE)
```
-**Note** Given multiple partially observed variables, we can specify a list of imputation models (one for each partially observed variable) in `proposeMI` (although note that the Shiny app above only allows a single imputation model to be specified). For example, suppose pregnancy size was also partially observed (assuming, for simplicity, that pregnancy size was missing completely at random). Then we could construct our proposed "mice" call using `proposeMI`, as follows:
+**Note** Given multiple partially observed variables, we can specify a list of imputation models - one for each partially observed variable - in `proposeMI`. For example, suppose pregnancy size was also partially observed. We will assume, for simplicity, that pregnancy size was missing completely at random. Then we could construct our proposed "mice" call using `proposeMI`, as follows. Here, we again suppress the model checking messages.
```{r, eval=FALSE}
-mimod_bmi7 <- checkModSpec(formula="bmi7~matage+I(matage^2)+mated+pregsize",
- family="gaussian(identity)", data=bmi,
- message=FALSE)
+mimod_bmi7 <- checkModSpec(formula="bmi7~matage+I(matage^2)+mated+pregsize",
+ family="gaussian(identity)",
+ data=bmi,
+ message=FALSE)
mimod_pregsize <- checkModSpec(formula="pregsize~bmi7+matage+I(matage^2)+mated",
- family="binomial(logit)", data=bmi,
- message=FALSE)
-proposeMI(mimodobj=list(mimod_bmi7, mimod_pregsize), data=bmi)
+ family="binomial(logit)",
+ data=bmi,
+ message=FALSE)
+proposeMI(mimodobj=list(mimod_bmi7, mimod_pregsize),
+ data=bmi)
```
Returning to our example, we will assume no further adjustment is required to the proposed "mice" call. We will use the `midoc` function `doMImice` to perform MI, specifying our proposed "mice" call (`miprop`) and the seed for our "mice" call (`seed`) (so that our results are reproducible). We will also specify our substantive model of interest (`substmod`): a regression of BMI at 7 years on maternal age (fitting a quadratic relationship) and maternal education. This is an optional step: if we specify the substantive model, it will be fitted automatically to each imputed dataset and the pooled results will be displayed (equivalent to using the "mice" functions `with` and `pool`). If the substantive model is not specified, only the imputation step will be performed.
-```{r}
-doMImice(miprop, 123, substmod="lm(bmi7 ~ matage + I(matage^2) + mated)")
+```{r, R.options=list(width=80)}
+doMImice(miprop, seed=123, substmod="lm(bmi7 ~ matage + I(matage^2) + mated)")
```
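As a rough sketch of the equivalence mentioned above, the fit-and-pool step can be written directly with the "mice" functions `with` and `pool`. This is an illustration under stated assumptions, not the internals of `doMImice`: it uses the `nhanes` example data that ships with "mice" in place of the `bmi` data, a default imputation model rather than the proposed "mice" call, and an illustrative substantive model with variable names from `nhanes`.

```r
library(mice)

# Assumption: `nhanes` (shipped with "mice") stands in for the vignette's data,
# and the default imputation settings stand in for the proposed "mice" call.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Fit an illustrative substantive model to each imputed dataset.
fits <- with(imp, lm(bmi ~ age + I(age^2)))

# Combine the per-dataset estimates using Rubin's rules.
pooled <- pool(fits)
summary(pooled, conf.int = TRUE)
```

Fixing `seed` in the call to `mice` plays the same role as the `seed` argument of `doMImice`: rerunning the code gives the same imputations and hence the same pooled estimates.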
## Illustration using our worked example
@@ -345,7 +425,7 @@ Finally, we illustrate how our choice of analysis approach affects the estimated
The parameter estimates for the linear and quadratic terms of maternal age, and their 95% confidence intervals, are shown in the table below. Note that, because we have simulated the data and its missingness, we know the "true" association *i.e.* the association if there were no missing data - this is shown in the "Full data" row of the table. Further note that the results displayed in the third row ("MI fitting quadratic relationship, using pregnancy size") are exactly those generated above. To avoid repetition, we have not shown the code for fitting the other models.
-From the table, we can see that both CRA and MI (fitting a quadratic relationship between BMI at age 7 years and maternal age in the imputation model) estimates are unbiased for both the linear and quadratic terms of maternal age. MI estimates are biased when fitting a linear relationship in the imputation model, particularly for the quadratic term of maternal age. MI estimates using the collider, birth weight, as an auxiliary variable are slightly more biased and slightly less precise than the estimates using pregnancy size as an auxiliary variable. The collider bias is relatively small because the association between BMI at age 7 years and maternal age is strong in this setting. Note that the collider bias could be relatively larger if the association was weak [8](https://doi.org/10.3389/fepid.2023.1237447).
+From the table, we can see that both CRA and MI (fitting a quadratic relationship between BMI at age 7 years and maternal age in the imputation model) estimates are unbiased for both the linear and quadratic terms of maternal age. MI estimates are biased when fitting a linear relationship in the imputation model, particularly for the quadratic term of maternal age. MI estimates using the collider, birth weight, as an auxiliary variable are slightly more biased and slightly less precise than the estimates using pregnancy size as an auxiliary variable. The collider bias is relatively small because the association between BMI at age 7 years and maternal age is strong in this setting. The collider bias could be relatively larger if the association were weak [9](https://doi.org/10.3389/fepid.2023.1237447).
```{r echo=FALSE}
results <- data.frame(approach="Full data",linest="1.17 (1.09-1.26)", quadest="0.86 (0.80-0.91)")