Tuesday, 23 November 2010

Is it time to be all Bayesian?


The world is changing by the day, and the question is whether the change is for the better and whether it is worth adapting to. Systems within the world are inevitably bound to change too, and the pace has been phenomenal of late. Statistics as a profession has not been spared; knowledge is mushrooming and new methodology is being developed to deal with complex designs. A colleague of mine once said, “I foresee supply outstripping demand”. The main challenges under these circumstances are validating these methodologies and selling them to the intended consumers. A fascinating current discussion among statisticians is the one between frequentist and Bayesian inference. This article tries to stimulate that ongoing discussion and to look at it in layman’s language.

Roderick Little once said “The practice of our discipline will mature only when we can come to a basic agreement about how to apply statistics to real problems. Simple and more general illustrations are given of the negative consequences of the existing schism between frequentists and Bayesians.” It seems there is a “cold” war within the discipline and only time will tell who will win. This is a polemical issue that affects us all. Why can’t we be like chemists, for whom, no matter which route you use, only two components are needed to come up with a water molecule? I guess it is the same in physics. Ours is a bit different, with three camps: the purely frequentist, the purely Bayesian and those in the middle suffering from an identity crisis. The most annoying thing is that the results don’t always agree. What a nightmare!

Basically, Bayesian inference involves the use of explicit quantitative external evidence in the design, monitoring, analysis and reporting of a study. In other words, the conditional distribution of the parameter of interest given the observed data (the posterior) is proportional to the product of the likelihood and the prior distribution. In contrast, the frequentist approach says that information about the unknown parameter of interest comes only from the data, through the likelihood. Everyone knows that experience is the best teacher. It’s interesting how doctors and health care professionals use this route, combining experience with the current information around them to improve patient diagnosis. In reality, we all tend to operate in this mode all the time, and it makes absolute sense. Strictly speaking we are all Bayesians in a way, but formalising it is the sticking point and a different issue altogether.
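To make "posterior = likelihood combined with prior" concrete, here is a minimal sketch using the textbook Beta prior with binomial data, where the update can be done by hand. The prior (Beta(3, 7)) and the trial result (12 responders out of 20) are made-up numbers for illustration only, and the function names are my own.

```python
# Conjugate Beta-Binomial update: posterior is proportional to likelihood x prior.
# All numbers below are hypothetical and chosen only for illustration.

def beta_binomial_posterior(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior: response rate around 30%, worth about 10 patients.
prior_a, prior_b = 3.0, 7.0
# Hypothetical trial data: 12 responders among 20 patients.
successes, failures = 12, 8

post_a, post_b = beta_binomial_posterior(prior_a, prior_b, successes, failures)

prior_mean = beta_mean(prior_a, prior_b)         # external evidence alone
data_only = successes / (successes + failures)   # likelihood alone (the MLE)
post_mean = beta_mean(post_a, post_b)            # prior and data combined

print(f"prior mean     = {prior_mean:.2f}")
print(f"data only      = {data_only:.2f}")
print(f"posterior mean = {post_mean:.2f}")
```

With these numbers the posterior mean (0.50) sits between the prior mean (0.30) and the data-only estimate (0.60), which is the formal version of letting experience and current data inform each other.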

Nevertheless, Bayesian inference is no free lunch. In order to come up with the posterior distribution of the unknown parameter of interest (a drug effect, say), you need the data, an assumed model to obtain a likelihood and, most importantly, a prior distribution for the parameter of interest. I can now see how some anti-Bayesian fictional persona creeps in. Obtaining the prior distribution is another nightmare and a quagmire to cross. Choosing an “objective” prior out of infinitely many priors raises questions; at the same time, how can I trust your prior when I have got mine? But let’s also remember that the same problem exists in the frequentist approach during the planning of a trial. Sample size and power calculations use prior belief in an informal Bayesian fashion.
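The point about sample size can be seen in the usual normal-approximation formula for comparing two means, n per arm = 2(z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2. The sketch below is only an illustration: the outcome SD of 10 and the candidate effects of 4, 5 and 6 points are assumptions I have invented, and `n_per_arm` is a hypothetical helper, not a function from any package.

```python
import math
from statistics import NormalDist


def n_per_arm(delta, sigma, alpha=0.05, power=0.90):
    """Patients per arm for a two-arm comparison of means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical planning assumptions: outcome SD of 10 points, and three
# different beliefs about the treatment effect we expect to see.
sigma = 10.0
sizes = {delta: n_per_arm(delta, sigma) for delta in (4.0, 5.0, 6.0)}
for delta, n in sizes.items():
    print(f"assumed effect {delta:.0f} points -> {n} patients per arm")
```

The required number of patients moves a lot with the effect we believe in before seeing any data, which is prior belief in all but name.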

Gelman (2008), as one of the anti-Bayesians to come out openly on the offensive, noted that Bayesian inference is without doubt a coherent mathematical theory, but he is sceptical about its practical application. He went on to quote Efron (1983): “…recommending that scientists use Bayes' theorem is like giving the neighbourhood kids the key to your F-16”. This says a lot: Bayesian statistics really is much harder, mathematically and practically, than frequentist statistics. I remember the difficulties I faced when I was learning WinBUGS and the theory behind it; it’s not easy or friendly at all. This has been viewed by many as a major obstacle (Efron). Nevertheless, I don’t agree with Gelman on the notion that it’s better to stick to tried, tested and “true” methods. I wonder whether they are true methods; logic tells me that there are no true methods, only that some are better than others. In any case, sticking to an old gun in the belief that it is the best ever is disastrous at best. Remember that the so-called "true" methods were once dismissed by sceptics as rubbish but are now the cornerstone of our discipline.

The superiority of Bayesian methods over frequentist ones in many areas, such as indirect treatment effects, evidence synthesis and small trials (just to mention a few), should be recognised. For example, in a small trial it is ludicrous and naïve to assume that we are in the promised land of asymptotia and to rely on likelihood theory alone. Another simple example is the interpretation of frequentist confidence intervals as opposed to Bayesian credible intervals. Honestly, it’s confusing and ambiguous to invoke repeated sampling (a long run process) when interpreting a confidence interval when in fact the trial has been conducted only once.
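Both points, the long run interpretation and the danger of assuming asymptotia, can be illustrated with a small simulation. This is a sketch under assumptions of my own choosing: normally distributed outcomes, a large-sample interval of mean ± 1.96 standard errors, and trial sizes of 5 and 100. The "95%" refers to the share of repeated hypothetical trials whose interval covers the truth, not to the single trial actually run.

```python
import math
import random


def wald_coverage(n, reps=4000, true_mean=0.0, sd=1.0, seed=1):
    """Share of simulated trials whose mean +/- 1.96*SE interval covers the truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        se = math.sqrt(var / n)
        if mean - 1.96 * se <= true_mean <= mean + 1.96 * se:
            hits += 1
    return hits / reps


coverage_small = wald_coverage(n=5)    # a very small trial
coverage_large = wald_coverage(n=100)  # closer to "asymptotia"
print(f"n=5:   coverage = {coverage_small:.3f}")
print(f"n=100: coverage = {coverage_large:.3f}")
```

In the small trial the large-sample interval covers the true mean noticeably less often than the nominal 95%, while with 100 patients it gets close to it.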

There is as much Bayesian junk as there is frequentist junk; the use of Bayesian methods in hierarchical models with an assumption of exchangeability, whatever the case might be, is worrying, because that assumption is not always tenable. This is one of the areas which requires extensive research to incorporate different correlation structures. What about the use of the magic Markov chain Monte Carlo (MCMC)? I was really sceptical and uneasy about using MCMC, but when you are in a multi-dimensional space with a complex function, which is normally the case in the Bayesian framework (when you multiply the prior distribution by the likelihood function), it is a fantastic approach for evaluating the resulting integrals. I don’t agree with those who think that this approach will erode our mathematical understanding of calculus. Let us not forget that the same MCMC approach is an important tool in frequentist multiple imputation techniques. I believe that too much attention has been given to the subjectivity of Bayesian methods, to the extent that we are beginning to lose sight of the objectivity of this approach.
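For readers who have never looked inside the "magic", here is a minimal random-walk Metropolis sampler, the simplest member of the MCMC family. It is a sketch, not what WinBUGS does internally. I use a one-parameter toy problem (a hypothetical Beta(3, 7) prior and 12 responders out of 20) precisely because the exact posterior, Beta(15, 15), is known, so the sampler can be checked; the step size and number of draws are arbitrary choices.

```python
import math
import random


def log_posterior(theta, successes, failures, prior_a, prior_b):
    """Unnormalised log posterior: binomial likelihood times Beta prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return ((prior_a + successes - 1) * math.log(theta)
            + (prior_b + failures - 1) * math.log(1.0 - theta))


def metropolis(n_draws, step_sd, seed, **model):
    """Random-walk Metropolis sampler for a single parameter theta."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, **model)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, **model)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


# Hypothetical data and prior, chosen so the exact answer is available.
model = dict(successes=12, failures=8, prior_a=3.0, prior_b=7.0)
draws = metropolis(n_draws=20000, step_sd=0.2, seed=1, **model)
kept = draws[2000:]  # drop the early draws as burn-in
mcmc_mean = sum(kept) / len(kept)
exact_mean = (3.0 + 12) / (3.0 + 7.0 + 12 + 8)  # mean of Beta(15, 15)

print(f"MCMC posterior mean  = {mcmc_mean:.3f}")
print(f"exact posterior mean = {exact_mean:.3f}")
```

Note that the sampler never needs the normalising integral; it only compares the posterior at two points, which is why the approach scales to problems where the integral cannot be written down.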

Bayesian inference should not be viewed as a replacement for the frequentist approach but as an alternative and a step forward. The recognition of the strengths and weaknesses of the two has led Little to propose a truce in the form of a Calibrated Bayes roadmap. He argued that this is a compromise hybrid that harnesses the strengths of the two approaches. The hybrid paradigm should be strong both on model formulation and assessment and on inference and prediction under the assumed model. I can see how Little was influenced by the rule of averages here: “if you dip one hand in boiling water and the other in ice, on average you will feel warm and better” :-) Only time will tell whether this dream will be realised or one of the extremes will win, maybe through “democracy”. Declaring my allegiance in advance is suicidal, but the risk is worth taking. I am slightly above average but not purely Bayesian...!!

I hope this will leave us with more issues to argue about.

All your comments are welcome.

Munya Dimairo (mdimairo@gmail.com)




   

Friday, 12 November 2010

What could go wrong in the analysis of a longitudinal clinical trial?

I quote Dr Fisher's casebook on Keep dancing: 'Dr Johnson remarked that “No man but a blockhead ever wrote, except for money.” If the great lexicographer was right, then I am a blockhead, for these little essays bring me neither fortune nor fame. They do, however, bring me something else: pleasure.' My motivation here came from my little experience with clinical trial data, my interest in longitudinal studies and a recent post on the medstats google group by Chris Hunt (MedStats). Thanks to Chris for flagging that question :-) This article is aimed at statisticians and clinical trialists; there is no hardcore stuff, which I always avoid.


There is a general rise in the appreciation and utilisation of longitudinal designs in the clinical trial setting. Indeed, understanding how patients' profiles respond to treatment over time is important in order to come up with an effective intervention that will improve their quality of life immensely. Inference can target short term, long term or time averaged effects. This is just a refresher and I don't want to dwell much on it in this article!

Now straight to the point; without loss of generality, the parameter of interest in any trial is the treatment effect. In a randomised controlled trial, assuming randomisation has been effective, the analysis is straightforward compared to epidemiological studies. For simple trials, this can be done using a t-test for continuous outcomes, but in case of baseline imbalances the well known ANCOVA model adjusting for baseline will be required. The reasons for this are now well appreciated, thanks to guys like Stephen Senn et al for their great work in this area!

Let's now consider a longitudinal study. There is a great deal of modelling involved, with challenges due to missing data and so on, and there is a danger of producing a higgledy-piggledy recipe. Different models under the names of fixed, mixed and random effects are potential candidates (mixed models; multi-level models, Goldstein). There is a danger of getting carried away with the modelling and forgetting crucial basic statistical principles. I will now consider a simple random effects model for a time averaged effect, comparing two interventions on a continuous outcome; assuming randomisation is effective, the treatment effects are 0 and b, say (where 0 is the effect at baseline and b is the time averaged effect after baseline). In my experience, and not forgetting the advantage of repeated measures in reducing the sample size immensely, the number of patients required is relatively small, which increases the chances of baseline imbalances. Of course, randomisation techniques such as minimisation, biased coin optimisation, etc. can be used to improve the balance of prognostic factors at baseline. I am not advocating for or against these techniques; the take home message is that the chances of baseline imbalances are relatively high when using simple randomisation in longitudinal studies. So fitting the model above will end up giving effects of 0+a and b+e (where a is the mean difference in outcome response at baseline and e is the inherent effect due to a). The effect of a is to shift the "true" effect b by an amount e through the regression to the mean phenomenon (Regression to the mean).
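How likely is a baseline imbalance under simple randomisation? The sketch below simulates it under assumptions of my own: a fair coin toss per patient, baseline scores on a standardised scale (mean 0, SD 1), and "imbalance" defined arbitrarily as a difference in baseline means of more than a quarter of a standard deviation. The trial sizes of 20 and 200 are hypothetical.

```python
import random


def imbalance_probability(n_patients, threshold_sd=0.25, reps=4000, seed=1):
    """Chance that the arms differ at baseline by more than threshold_sd SDs."""
    rng = random.Random(seed)
    exceed = 0
    done = 0
    while done < reps:
        arms = {0: [], 1: []}
        for _ in range(n_patients):
            # Simple randomisation: a fair coin toss per patient.
            arm = rng.randint(0, 1)
            # Baseline score on a standardised scale (mean 0, SD 1).
            arms[arm].append(rng.gauss(0.0, 1.0))
        if not arms[0] or not arms[1]:
            continue  # skip the rare trial in which one arm is empty
        diff = sum(arms[1]) / len(arms[1]) - sum(arms[0]) / len(arms[0])
        exceed += abs(diff) > threshold_sd
        done += 1
    return exceed / reps


p_small = imbalance_probability(n_patients=20)
p_large = imbalance_probability(n_patients=200)
print(f"20 patients:  P(imbalance > 0.25 SD) = {p_small:.2f}")
print(f"200 patients: P(imbalance > 0.25 SD) = {p_large:.2f}")
```

Under these assumptions an imbalance of that size is the rule rather than the exception in the small trial, and fairly rare in the large one.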

This is normally neglected in the longitudinal setting; concentration is placed on the difference in mean profiles and we forget to constrain a to 0. I guess maybe we forget the basic principles of trials. I view this as wrong, and the resulting effect on inference is substantial. One way to get around this problem is to extend the "ANCOVA" model to the longitudinal framework by including baseline as a covariate; I will call it longitudinal ANCOVA. This constrains the baseline treatment effect to be zero (this is what we expect for a fair comparison). In any case, why would we expect a non-zero treatment effect at baseline when no patient has yet been exposed to the new intervention? This can only be due to baseline imbalances, which are our number one enemy, the "terrorist" of clinical trials!!
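A stripped-down simulation shows what adjusting for baseline buys. To keep it short I use a single follow-up measurement rather than a full longitudinal model, so this is ordinary ANCOVA and only a stand-in for the longitudinal version. Everything here is an assumption for illustration: 15 patients per arm, a true effect of 0.5, and a baseline to follow-up correlation (rho) of 0.7.

```python
import math
import random


def simulate_trial(rng, n_per_arm, effect, rho):
    """Return (baseline imbalance, unadjusted estimate, ANCOVA estimate)."""
    x = {0: [], 1: []}  # baseline scores by arm
    y = {0: [], 1: []}  # follow-up scores by arm
    for arm in (0, 1):
        for _ in range(n_per_arm):
            base = rng.gauss(0.0, 1.0)
            noise = rng.gauss(0.0, 1.0)
            x[arm].append(base)
            y[arm].append(rho * base + math.sqrt(1 - rho ** 2) * noise + effect * arm)
    mx = {g: sum(x[g]) / n_per_arm for g in (0, 1)}
    my = {g: sum(y[g]) / n_per_arm for g in (0, 1)}
    # Pooled within-arm slope of follow-up on baseline (the ANCOVA slope).
    sxy = sum((xi - mx[g]) * (yi - my[g]) for g in (0, 1) for xi, yi in zip(x[g], y[g]))
    sxx = sum((xi - mx[g]) ** 2 for g in (0, 1) for xi in x[g])
    slope = sxy / sxx
    imbalance = mx[1] - mx[0]                # "a" in the text
    unadjusted = my[1] - my[0]               # estimates b + e
    ancova = unadjusted - slope * imbalance  # baseline difference removed
    return imbalance, unadjusted, ancova


def rmse(estimates, truth):
    """Root mean squared error of a list of estimates around the truth."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


rng = random.Random(1)
true_effect, rho = 0.5, 0.7  # hypothetical values
results = [simulate_trial(rng, 15, true_effect, rho) for _ in range(3000)]
imbalances = [r[0] for r in results]
unadjusted = [r[1] for r in results]
adjusted = [r[2] for r in results]

# How strongly the unadjusted error tracks the baseline imbalance
# (least-squares slope through the origin); expected to be close to rho.
carry_over = (sum(a * (u - true_effect) for a, u in zip(imbalances, unadjusted))
              / sum(a * a for a in imbalances))

print(f"RMSE unadjusted = {rmse(unadjusted, true_effect):.3f}")
print(f"RMSE ANCOVA     = {rmse(adjusted, true_effect):.3f}")
print(f"error carried over per unit of imbalance = {carry_over:.2f}")
```

In this toy setting the unadjusted estimate inherits roughly rho times the baseline imbalance (the e above), while the baseline-adjusted estimate removes it and is less variable as a result.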

A simple example is given below (done in Stata 11.1), where bdiscore (continuous) is the outcome measure and assessments are repeated at different time occasions (time). This can be replicated in SAS, MLwiN, etc.

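* Copy each patient's first (baseline) score to all of that patient's records
bysort id (time): gen baseline = bdiscore[1] if bdiscore != .
* Longitudinal ANCOVA: baseline as covariate, random intercept and time slope per patient
xtmixed bdiscore baseline time i2.treat || id: time, reml cov(unstr)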

Jos W. R. Twisk and Wieke de Vente looked at different approaches to this (Twisk-2008), although I believe a simulation study should be undertaken to check whether their assertion that this method slightly overestimates the treatment effect is true. It's one of the things on my "to do list"; I hope it won't lie idle on my table forever!

Any comments, suggestions, additions and subtractions are always welcome :-)

Hurray!!

Note: I wrote this article in a rush so errors are bound to exist!!!!!!