I quote Dr Fisher's casebook on Keep dancing: 'Dr Johnson remarked that “No man but a blockhead ever wrote, except for money.” If the great lexicographer was right, then I am a blockhead, for these little essays bring me neither fortune nor fame. They do, however, bring me something else: pleasure.' My motivation here came from my limited experience with clinical trial data, an interest in longitudinal studies and a recent post on the MedStats Google group by Chris Hunt (MedStats). Thanks to Chris for flagging that question :-) This article is aimed at statisticians and clinical trialists; there is no hardcore stuff, which I have always avoided.
There is a general rise in the appreciation and use of longitudinal designs in the clinical trial setting. Indeed, understanding how patients' profiles respond to treatment over time is important in order to come up with an effective intervention that will improve their quality of life immensely. Inference can target short-term, long-term or time-averaged effects. This is just a refresher and I don't want to dwell on it in this article!
Now straight to the point: without loss of generality, the parameter of interest in any trial is the treatment effect. In a randomised controlled trial, assuming randomisation has been effective, the analysis is straightforward compared to epidemiological studies. For simple trials, this can be done using a t-test for continuous outcomes, but in case of baseline imbalances the well-known ANCOVA model adjusting for baseline will be required. The reasons for this are now well appreciated, thanks to guys like Stephen Senn et al. for the great work in this area!
Let's now consider a longitudinal study. There is a great deal of modelling, with challenges due to missing data etc., and there is a danger of producing a higgledy-piggledy recipe. Different models under the names of fixed, mixed and random effects are potential candidates (mixed models) (multi-level models-Goldstein). There is a danger of getting carried away with the modelling and forgetting crucial basic statistical principles.
I will now consider a simple time-averaged-effect random-effects model comparing two interventions on a continuous outcome. Assuming randomisation has been effective, the treatment effects are of the order 0 and b, say (where 0 is the effect at baseline and b is the time-averaged effect after baseline). In my experience, repeated measures reduce the required sample size immensely, so the number of patients required is relatively small, which increases the chances of baseline imbalances. Of course, randomisation techniques such as minimisation, biased coin optimization etc. can be used to improve balance in prognostic factors at baseline. I am neither advocating nor arguing against these techniques; the take-home message is that the chances of baseline imbalance are relatively high when using simple randomisation techniques in longitudinal studies. So fitting the model above will end up giving effects of the order 0+a and b+e (where a is the mean difference in outcome response at baseline and e is the inherent effect due to a). The effect of a is to shift the "true" effect b by an amount e through the regression to the mean phenomenon. (Regression to the mean)
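One way to write down the model described above, as a sketch only: the symbols y_ij, z_i, p_j, mu_j, delta_0, delta_1, u_i and rho are notation assumed here for illustration, while a, b and e are the quantities defined in the text. The last line assumes a random-intercept structure with equal outcome variance at every occasion, so that rho is the within-patient correlation.

```latex
% Assumed notation: y_{ij} = outcome of patient i at occasion j (j = 0 is baseline),
% z_i = 1 for the new intervention and 0 for control, p_j = 1 if j >= 1 and 0 otherwise.
\begin{align*}
y_{ij} &= \mu_j + \delta_0\, z_i\,(1 - p_j) + \delta_1\, z_i\, p_j + u_i + \varepsilon_{ij},
  \qquad u_i \sim N(0, \sigma_u^2),\ \ \varepsilon_{ij} \sim N(0, \sigma^2) \\
\text{balanced baseline:}\quad & \delta_0 = 0, \qquad \delta_1 = b \\
\text{observed baseline imbalance } a:\quad & \hat\delta_0 \approx a, \qquad \hat\delta_1 \approx b + e \\
\text{regression to the mean:}\quad & e \approx \rho\, a, \qquad \rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma^2}
\end{align*}
```

Read this way, the carry-over e grows with both the size of the baseline imbalance a and the strength of the within-patient correlation, which is typically high for repeated measures on the same patient.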
This is normally neglected in the longitudinal setting: attention is placed on the difference in mean profiles and we forget to constrain a to 0. I guess maybe we forget the basic principles of trials. I view this as wrong, and the resulting effect on inference is substantial. One way to get around this problem is to extend the "ANCOVA" model to the longitudinal framework by including baseline as a covariate; I will call it longitudinal ANCOVA. This constrains the baseline treatment effect to be zero (this is what we expect for a fair comparison). In any case, why would we expect a non-zero treatment effect at baseline when no patient has yet been exposed to the new intervention? This can only be due to baseline imbalances, which are our number one enemy ("terrorist") in clinical trials!!
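Written out, a longitudinal ANCOVA of this kind might look as follows. This is a sketch under assumed notation (y_ij, y_i0, t_ij, z_i and the beta and u terms are not taken from a specific reference), with a random intercept and a random slope for time per patient. It also assumes that the baseline occasion enters only as a covariate, so the outcome rows are the follow-up occasions j >= 1.

```latex
% Assumed notation: y_{i0} = baseline outcome of patient i, t_{ij} = time of occasion j,
% z_i = treatment indicator, (u_{0i}, u_{1i}) = patient-level random intercept and slope.
\begin{align*}
y_{ij} &= \beta_0 + \beta_1\, y_{i0} + \beta_2\, t_{ij} + \beta_3\, z_i
          + u_{0i} + u_{1i}\, t_{ij} + \varepsilon_{ij}, \qquad j \ge 1 \\
(u_{0i}, u_{1i})^{\top} &\sim N(0, \Sigma) \ \ \text{(unstructured } 2 \times 2 \text{ covariance)}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
\end{align*}
```

Here beta_3 is the time-averaged treatment effect adjusted for baseline. Because y_i0 appears only on the right-hand side, there is no treatment term at baseline to estimate, which is the sense in which a is constrained to zero.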
A simple example is given below (done in Stata 11.1), where bdiscore (continuous) is the outcome measure and assessments are repeated at different time occasions (time). This can be replicated in SAS, MLwiN, etc.
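* Copy each patient's first recorded bdiscore (sorted by time) into a baseline covariate
bysort id (time): gen baseline = bdiscore[1] if bdiscore != .
* Baseline-adjusted treatment effect, with a random intercept and a random slope for time per patient
xtmixed bdiscore baseline time i2.treat || id: time, reml cov(unstr)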
Jos W. R. Twisk and Wieke de Vente looked at different approaches to this (Twisk-2008), although I believe a simulation study should be undertaken to check whether their assertion that this method slightly overestimates the treatment effect is true. It's one of the things on my "to do list"; I hope it won't lie idle on my table forever!
Any comments, suggestions, additions and subtractions are always welcome :-)
Hurray!!
Note: I wrote this article in a rush so errors are bound to exist!!!!!!
This is a flaw in the randomisation process rather than in the longitudinal modelling. Without randomisation, as in observational cohort studies where one wants to compare treatment effects, adjustment for baseline factors always works in fact, although one can always argue that it is not as effective as using a propensity score.
Surely, it's not about a flaw in the statistical model but its inefficiency in accounting for baseline imbalances. For instance, consider a pre-post study in the presence of baseline measures; we know that analysis of change is not a flawed method, but it is not efficient in this case due to possible regression to the mean. This argument could be extrapolated to longitudinal models. We need to find efficient methods in the presence of ineffective randomisation. Observational studies are a different situation, which has been discussed by Stephen Senn et al. I have tried to limit this to RCTs in order to avoid that confusion. The propensity score is another area of current discussion, with a lot of issues still to be researched. Maybe we'll initiate a discussion on this topic in future. Many thanks