Sunday 21 August 2011

When the Godfather’s empire crumbles as a result of denial: the Libyan scenario

I’ve never been an aficionado of war, nor do I enjoy playing war games on Xbox, despite “The Godfather” being my favourite gangster film of all time alongside “Goodfellas”. It may seem odd, but it is the storylines and the lessons learnt that keep me close to these films, not their violence. I’m not a politician, nor do I intend to become one someday, but I’ve watched and followed the events unfolding in the current Libyan crisis. This conflict has raised a lot of questions with wider implications.

My wrong perception of Gaddafi (when I was a kid)

For the past 41 years, Muammar Gaddafi has been viewed as “The Godfather” of Libya. He built an empire of patronage, indoctrination, aggression, fear and repression. Without any doubt, he was the untouchable, the invincible “Don Vito Corleone” of Libya – remember that he came to power through a coup when he was aged just 27. When I was growing up, he was one of my idols – I don’t actually know why; maybe my memory fails me, you never know. I only remember a bit of it. Imagine an African leader who lived in a tent (his birthplace) instead of a posh mansion with a swimming pool, tennis and basketball courts, and a golf course nearby. A man who preferred walking to taking a Cadillac or a Mercedes-Benz CLK-GTR – you name it. I used to associate these little things with the view that Gaddafi was the only African leader with the poor at heart. Was I correct? I regret my bloody wrong perception of the guy. Yes, it seems I was WRONG! I was a kid, though. I know I’m not the only one here – there may be many of us, if not countless. The guy had a taste for bling bling for real.

Despite Libya being one of the richest countries in oil, which is in demand across the whole world, why does it have poor infrastructure, poor communities, etc.? Imagine a population of 6.4 million owning 3% of the world’s resources, with an income of around £60 billion a year (assuming my figures are correct). This country should have been the Abu Dhabi of Africa, without any reasonable doubt. I’m not talking about democracy and human rights here but about the standard of living and quality of life of the general population. I personally don’t get it. Gaddafi could have ruled that country for a century without any major problems if there had been an equal distribution of wealth. When you sing and preach an anti-Western and anti-American mantra whilst investing billions of dollars in those same countries under your family’s name or your own, there is a serious problem. When you can’t improve the living standards of your own people, there is a serious problem. When you suppress your own people and butcher them, there is a serious problem. Moreover, practising double standards against your own political message and ideology is ludicrous. Why don’t you just walk the talk?

What went wrong to ignite the current crisis?

I saw it coming; maybe you didn’t. The Arab intifada was the tip of the iceberg for the current crisis in Libya. The Libyan regime’s response to democratic protests was predictable, given that Gaddafi had ruled the country for more than 40 years. This was a moment of madness and a lost opportunity for the regime. How do you justify an assault against your own people? Is it logical to suppress their views? This moment of denial by Gaddafi was regrettable at best. I used to believe that he was a real strategist – I was wrong again. The response by the Security Council was justified given the response from Gaddafi. Don’t misquote me here; I’m not supporting every action taken against Gaddafi. For instance, the Security Council was swift to react on Libya but not on Syria, say. Surely these double standards are worrying and raise serious questions about the personal interests of members of the Security Council. Despite the resolution being limited to a “no-fly zone” with “necessary measures” to protect civilians, the objective was very clear even before the resolution. Cameron, Sarkozy and Obama (among other leaders) had called for him to go beforehand, and their stance remained the same even after the resolution. Were they correct to take that stance? I guess not! This is because I don’t like war, and I believe leaders have a responsibility to act in a reasonable, humane manner towards their own people without the dictation of other countries. Nevertheless, this shouldn’t distract us from the real issues. I was hoping for a national dialogue initiated by Gaddafi himself with the help of the African Union. But, to my regret, Gaddafi acted like a madman under the influence of a horse tranquiliser. When you call your own people rats and promise to wipe them out instead of initiating a national dialogue, you don’t deserve respect as a leader. He lost it again, and the later initiative by the AU was just a waste of time. His egoism and the NATO bombing undermined the role of the AU in resolving the conflict. Did the AU have the leverage to force Gaddafi into dialogue with the rebels? It’s hard to tell, given the weak relationships between Gaddafi and most African leaders. I doubt it! Personal EGO is in the blood and DNA of Gaddafi. I may be wrong but, unfortunately, we don’t have the opportunity to test that hypothesis anymore. It’s a shame really.

Is the Western and American intervention linked to democracy and human rights?

I’m anti-war and don’t advocate for one. The costs and consequences are too great, but what do you do when a leader goes mad and starts butchering their own people? Someone has to step in to prevent a massacre. Some say “two wrongs don’t make a right” – is that a fallacy? The oppressor and the oppressed will view it differently from the same angle. I don’t want to be drawn into religion here – it would be an endless debate. My main concern is the double standards of the West and the Americans. Their actions are not consistent and seem to be driven by greed and strategic interests. Is that surprising? I don’t think so. The killing of civilians being reported in Syria, for instance, is alarming, but no one is willing to take any of the necessary actions similar to those in Libya. Is this being driven by oil and strategic interests under the pretext of democracy and human rights?

All the nations leading this operation in Libya have sizeable investments there (Spain’s Repsol, France’s Total, Italy’s Eni and Britain’s BP, just to name a few). They are investing in war in anticipation of long-term oil contracts. That’s a fact! Do you believe that any country would squander taxpayers’ money and resources enforcing the UN resolution for democracy and human rights alone? NO – it’s dirty business! Furthermore, the UN mandate was clear and meant to protect civilians. The current NATO operation is not about that but about regime change through direct support of the rebels and the NTC. This is not surprising at all, because it was made very clear to the whole world, before the UN resolution, that Gaddafi should go. The resolution was just a vehicle to justify war against a tyrant. Those countries that voted for the resolution or abstained and are now complaining about NATO overstepping its mandate should shut up. This was anticipated by any layman, and you wonder how these diplomats didn’t see it coming.

The scandal surrounding the London School of Economics and Gaddafi donations linked to Saif al-Islam Muammar al-Gaddafi’s PhD raised more questions than answers. Honestly, accepting a donation from a dictator and tyrant in exchange for favours is scandalous. Advocating for democracy and human rights whilst being involved in such malpractice is regrettable. The release of the Lockerbie bomber is another scandal yet to be told – was it linked to BP contracts in Libya? Maybe the truth will come out some day.

What have we learnt so far, especially from an African perspective?

I blame nobody for our problems except ourselves. We are the masters of our own destruction and downfall. Firstly, we don’t have a platform for national dialogue in our political systems. This is a disgrace. Great nations today are the result of nation building that brings all citizens on board. You don’t build a nation by creating divisions and marginalising communities; you don’t build a nation by classifying some citizens as third-class citizens. That is retrogressive and against national progress. The consequences are grave – it creates leeway for interference. A sense of national belonging is paramount to any citizen.

Secondly, let’s not take people for granted. Gone are the days when politics was a game of promising heaven while delivering dust. Let’s walk the talk and be accountable for our own actions. Adapting to change is one of the pillars of progress; otherwise you’ll stagnate or even go backwards.

Thirdly, a culture of tolerance should be an integral part of our society at all levels. The use of force to silence the voices of the people is utterly shameful and unacceptable. That doesn’t mean people shouldn’t respect the institutions in place in the name of freedom of speech, democracy and human rights. Why should people preach hatred and violence against their own brothers and sisters because of political views? It’s sickening. The solution is simple – political will, so that those involved in such practices are held to account without any favour.

Fourthly, accountability, accountability, accountability! We have systems where accountability is not in our dictionary or vocabulary. It’s not part of our DNA at the political level. This directly fuels high-level corruption, with perpetrators getting away with it. Our politics should change. A bottom-up approach is the only way forward. Are our leaders accountable to the people they are serving?

Fifth, why should we be taught about democracy and human rights? Don’t we know what is right and wrong for our own societies? Our societies are exposed and clamouring to be heard. Who doesn’t know that killing, or burning someone’s property, for whatever reason, is wrong? Our failure in this respect makes our people view international voices on democracy and human rights as saviours. I don’t blame them at all; it’s the only option at their disposal. People want to live in peace, with freedom of association and speech. Living in fear is retrogressive, and we need to change.
Sixth, are we really so poor that we must depend on international aid? No – we have plenty of natural resources, but I believe our priorities are wrong and there is no accountability for how resources are distributed across the population. No one is against empowerment, not even a lunatic, but the question is: “are we empowering those in need, or is it just for patronage?” My heart bleeds!
Last but not least, we are the masters of our destiny! Blaming foreign interference alone is utter nonsense. Let’s remove our blinkers and look at issues from a bird’s-eye view. We seem to be short-sighted, dwelling too much on the past, with no plan for the generation after us. It’s a sickening syndrome that will leave us begging forever, with a fragmented society.

You can be in touch with Munya Dimairo on email: mdimairo@gmail.com



Tuesday 31 May 2011

Stata Snippets: How to export formatted results ready for publication?

This post is all about producing automated, publication-quality statistical outputs in Stata. Note that there are many ways to do this and you don't need to be a geek. Why do we, as statisticians, bother writing code and programs for automated quality outputs? Honestly, I used to write simple do-files, save the output as log files, copy and paste the contents into a text editor (e.g. Word), and then create and edit tables within the text editor. You can tell the pain and complexity of this sequence of events. It's just a sheer waste of time and energy. Imagine if you're working on more than five trials, say, and you need to produce statistical reports every fortnight or month. Surely you'll struggle to find time to see your girlfriend. More importantly, there is an increased risk of making errors while copying and pasting statistical outputs. The only solution is to invest the time once and write code and programs that produce outputs ready for the printer. Isn't that appealing? I don't need to go through a tiresome process each time I need an output; I just push the button and make some coffee. I love this approach. It also gives you time to do your own research work.

There are a lot of user-written commands in Stata that can be used to produce these outputs (e.g. tabout, estout, postfile, parmest, etc.) – a tiny estout sketch follows the rules below. There are no rules of thumb when it comes to writing code, but a few tips always help. I have my own phrase which goes like "Why should I do it your way?". Sometimes it works, BUT not always. You have to take some snippets from others here and there. Here are some:
  1. Rule 1: Be smart! I'm not talking about wearing a tie or gold chains "bling bling". A statistician needs to be ORGANISED especially when working on different projects.
  2. Rule 2: Invest in time to produce automated outputs. It's a pain and frustrating at first BUT that's the nature of every learning process. 
  3. Rule 3: Always give the code to your colleague to review. This is important. Of course, to err is human and to forgive divine, BUT a statistician should avoid giving out a WRONG output at all costs.
  4. Rule 4: Always check your code on a few variables (categorical and continuous) to ascertain that it's executing the intended task. Beware of missing values.
  5. Rule 5: Don't be too obsessed with programming and forget the statistical principles in model building.
  6. Rule 6: Push the button and remember to take your loved one for dinner or holiday. LIFE GOES ON!   
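As promised above, here is a tiny sketch of one of the user-written commands mentioned earlier. It assumes the estout package has been installed (ssc install estout); the auto data set and the file name are just placeholders for illustration, not part of the example that follows.

* tiny sketch: export a regression table straight to a file with esttab (estout package)
* run "ssc install estout" once beforehand; mytable.rtf is a hypothetical file name
sysuse auto, clear
regress price mpg weight
esttab using mytable.rtf, se label replace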
Now that we are done with the rules, let's start the real fun. I've prepared a simple program to illustrate my point. You can copy the contents, paste them into a Stata do-file and check the output. As I've said before, there are many ways of doing this, and they differ across text editors. For instance, LaTeX is different, and there are special user-written commands in Stata to produce outputs straight into LaTeX; I'll illustrate this in a coming series. Here I'll concentrate on text editors such as Word, Notepad, etc. Let's suppose we conducted a randomised clinical trial to test the effectiveness of a new intervention using a before-and-after design, and we need to present the baseline characteristics of patients at randomisation and efficacy measures based on an ANCOVA model. The code below shows how to do this; the results appear in the Stata Results window. All you need to do is highlight everything, copy and paste into Notepad or Word (use the Courier New font, size 8, and adjust the line spacing in the paragraph settings), and print. You could then save as a PDF if you want. Done! There is nothing unique about Stata; you could produce outputs in a similar way in SAS, for example. So enjoy it. (A small sketch below also shows how to capture the same output in a log file instead of copying and pasting.)
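As a minimal sketch of that alternative to the highlight-and-copy step (assuming the exportme program defined below has already been run, and with results.log as a hypothetical file name), the same formatted output can be captured straight into a plain-text log file:

* minimal sketch: capture the formatted output to a plain-text log file
* run the exportme program definition below first; results.log is a hypothetical file name
cap log close
log using results.log, text replace
exportme, obs(200)
log close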



*****************************************
* Program Author: Munya Dimairo
* Email: mdimairo@gmail.com or m.dimairo@sheffield.ac.uk
* Date: 31-05-2011
****************************************
* Results are automatically exported to the Stata Results window
* formatted for publication (copy and paste results into a text editor)
* to be modified a bit
****************************************

cap prog drop exportme
prog define exportme

*set trace on /*for debugging*/
*program syntax [pass the number of observations to generate hypothetical dataset]
syntax [,OBS(integer 4)]
    set more off
    qui drop _all
    version `c(version)'

    cap confirm integer number `obs' /*capture the return code so the check below works*/
if _rc!=0{
    di as error "Number of observations should be an integer"
    exit 198
}
else{
 if `obs'<30{
    di as error "Number of Observations should be greater than 30"
    exit 198
 }
 else{

* set number of observations
qui set obs `obs'
    qui gen  id=1000+ _n
    label var id "Patient ID Number"

* now let's simulate a hypothetical data set (pre and post) and baseline data: just for illustration
qui gen sex=rbinomial(1,0.35) /*don't ask why I started with SEX :-) */
    qui replace sex=sex+1 /*just want categories to be 1 and 2*/
    label define sex 1"male" 2"female", modify
    label val sex sex
    label var sex "Sex"
qui gen treat=rbinomial(1,0.5)

    qui replace treat=treat+1  /*just want categories to be 1 and 2*/
    label define treat 1"Control" 2"Intervention", modify
    label val treat treat
    label var treat "Intervention Group"

qui gen age=round(rnormal(45,6))
    label var age "Age(years)"

* don't worry about how the variables are generated (that's not the point here)
qui gen qol=round(rnormal(50,5.6))
qui gen qolp= qol + round(rnormal(3,2.5))
    label var qol "Qolife(pre)"
    label var qolp "Qolife(p)"
qui gen dizz=round(rnormal(25,3.5))
qui gen dizzp= dizz - round(rnormal(3,2.5))
    label var dizz "Dizziness(pre)"
    label var dizzp "Dizziness(p)"
qui gen sym=round(rnormal(65,3.5))
qui gen symp= sym - round(rnormal(10,2.5))
    label var sym "Symptoms(pre)"
    label var symp "Symptoms(p)"
qui gen anxi=round(rnormal(72,5.5))
qui gen anxip= anxi - round(rnormal(15,2.5)) 
    label var anxi "Anxiety(pre)"
    label var anxip "Anxiety(p)"
qui gen cd4=round(rnormal(800,40))
qui gen cd4p= cd4+ round(rnormal(50,10)) 
    label var cd4 "CD4 count(pre)"
    label var cd4p "CD4 count(p)"
qui gen total=sym+qol
qui gen totalp=symp+qolp
    label var total "Total(pre)"
    label var totalp "Total(p)"

 * Now we have created our hypothetical data set; let the fun begin
***************************************
* the idea is to create baseline characteristics and regression results
* which is ready for publication
* we want to keep editing at its minimum level
***************************************

di _n
di as text _dup(35) "-"
di "Program Author" _col(20) ":" _col(22) "Munya Dimairo"
di "Email"          _col(20) ":" _col(22) "{it:mdimairo@gmail.com}"
di                  _col(20) ":" _col(22) "{it:m.dimairo@sheffield.ac.uk}"
di "Date"           _col(20) ":" _col(22) "`c(current_date)'"
di "Time"           _col(20) ":" _col(22) "`c(current_time)'"
di _dup(35) "-"

di _skip(2)
di _dup(55) "_"
di "Baseline Characteristics of Randomised Participants"
di _dup(55) "_"

qui count
global AN=r(N)
forvalues g=1/2{
    local trt`g': label treat `g'
    qui count if treat==`g'
    global N`g'=r(N)
    global N`g'f: di %3.1f (r(N)/$AN)* 100
    local col=`g'*20
    di _col(`col') as text "`trt`g''" _cont
}
 di _n
 di _col(2) as res "Total[N(%)]" _cont
forvalues g=1/2{
    local col=`g'*20
    di _col(`col') as res "${N`g'}(${N`g'f}%)" _cont
}
di _n


* Sex
local sex1: variable label sex
di _col(2) "`sex1'[n(%)]"
forvalues i=1/2{
 local sex`i': label sex `i'
    }
forvalues i=1/2{
  forvalues g=1/2{
    qui count if treat==`g' & sex==`i'
    local n`g'`i'=r(N)
    local n`g'`i'f: di %3.1f (`n`g'`i''/${N`g'})*100
    }
if "`i'"=="1"{
    di _col(4) "`sex`i''" _col(20) "`n11'(`n11f'%)" _col(40) "`n21'(`n21f'%)""
}
else if "`i'"=="2"{
    di _col(4) "`sex`i''" _col(20) "`n12'(`n12f'%)" _col(40) "`n22'(`n22f'%)""
}
}
 
di

* Continuous variables
foreach var of varlist age dizz anxi cd4 sym qol total{
local `var'1: variable label `var'
di _col(2) "``var'1'"


* quietly produce summary statistics and save returned results
* tabstat will save into matrix r(Stat1), r(Stat2) and r(StatTotal)
* that is stats in group 1, 2 and all respectively


qui tabstat `var', stats(N mean sd q) by(treat) save
    matrix def treat1=r(Stat1)
    matrix def treat2=r(Stat2)

* extracts the elements of the matrices with respect to position
* that is, row by column position e.g treat1[2,1] means element on row #2 column #1

* save the elements as local macros for later use
* loop over by group to obtain results for group one and two
* this could be extended to include total summaries

forvalues i=1/2{
 local rn`i'=treat`i'[1,1]
 local mu`i': di %3.1f treat`i'[2,1]
 local sd`i': di %3.1f treat`i'[3,1]
 local p25_`i': di %3.1f treat`i'[4,1]
 local p50_`i': di %3.1f treat`i'[5,1]
 local p75_`i': di %3.1f treat`i'[6,1]
}

* now display the results saved above in specified columns
* e.g contents of local macro rn1 will be displayed in col(20)
* contents of local macro rn2 in col(40)

di _col(4) "n" _cont
forvalues i=1/2{
 local col=`i'*(20)
 di  _col(`col') "`rn`i''" _cont
}

* this will display the mean and std deviation as described above
di
di _col(4) "Mean(SD)" _cont
forvalues i=1/2{
 local col=`i'*(20)
 di  _col(`col') "`mu`i''(`sd`i'')" _cont
}

* this will display the median with InterQuartile Range
di
di _col(4) "Median (IQR)" _cont
forvalues i=1/2{
 local col=`i'*(20)-2
 di _col(`col') "`p50_`i''(`p25_`i''-`p75_`i'')" _cont
}

* just skip one row
di _n
}

* just a line of length 55 characters
di _dup(55) "_"

**************Now we want to fit an ancova regression model (say)
*we want to display the following results
*N=Number of Observations used in the model
*Effect=mean difference between the treatment groups adjusted for baseline values (group 1 as reference)
*SE=Standard Error of the Effect
* LCL to UCL=lower and Upper 95% Confidence Limits Respectively

di  _skip(4)
di as text  "Analysis of change in the VBQ (and its subscales) and CD4 count using the ANCOVA model"
di
di _dup(78) "_"
di _col(58) as text  "95% CI"
di "Variable" _col(30) "N" _col(35) "Effect" _col(45) "SE"  _col(55) "LCL to UCL" _col(73) "p value" /*line 168*/
di

*loop over all relevant continuous outcomes
foreach x of varlist dizz anxi cd4 sym qol total{
* count if the baseline and follow-up measurements are not missing (regression set)
    qui count if (`x'!=.&`x'p!=.)
 *save it in the local macro N
    local N=r(N)
 * extract the variable label for the post measurement variables
    * save this in local macro i`x' for later use

    local i`x': variable label `x'p
 * now we quietly run the ANCOVA model (nothing magic here)
qui regress `x'p `x' i.treat
 * extract the intervention effect and save in macro b
    local b: di %5.2f _b[2.treat]
 * extract the std error of intervention effect and save in macro se
    local se: di %5.2f _se[2.treat]
 * now obtain the t statistic which we will use to calculate the p value and save in macro t
    local t=_b[2.treat]/_se[2.treat] /*use unrounded estimates so the p value isn't distorted by rounding*/
 * obtain the p value[e(df_r)=returned degrees of freedom] under the t-distribution
    local p: di %5.3f 2*ttail(e(df_r),abs(`t'))
 *calculate the lower and upper confidence 95% CI
    local cl: di %5.2f `b'- invttail(e(df_r),0.025)*`se'
    local cu: di %5.2f `b'+ invttail(e(df_r),0.025)*`se'

 * now we're done, we just need to reference the saved macros corresponding to each element in the column header line above
di as res "`i`x''" as res _col(30) "`N'" _col(35) "`b'" _col(45) "`se'" _col(55) "`cl' to `cu'" _col(73) "`p'"
di

}
di _dup(78) "_"
}
}
exit

end

exportme,obs(200)
exit

Tuesday 23 November 2010

Is it time to be all Bayesian?


The world is evolving by the day, and the question is whether the change is for the better and whether it is worth adapting to. It is inevitable that systems within the world are bound to change, and the pace has been phenomenal of late. Statistics as a profession has not been spared; knowledge is mushrooming and new methodology is being developed to deal with complex designs. A colleague of mine once said, “I foresee supply outstripping demand”. The main challenges under these circumstances are validating these methodologies and selling them to the intended consumers. A fascinating current discussion among statisticians is between frequentist and Bayesian inference. This article will try to stimulate this ongoing discussion and look at it in layman’s language.

Roderick Little once said, “The practice of our discipline will mature only when we can come to a basic agreement about how to apply statistics to real problems. Simple and more general illustrations are given of the negative consequences of the existing schism between frequentists and Bayesians.” It seems there is a “cold” war within the discipline, and only time will tell who will win. This is a polemical issue that affects us all. Why can’t we act like chemists, say: no matter which route you take, the same two components come together to make a water molecule? I guess it is the same with physics. Ours is a bit different, with three camps: the purely frequentist, the purely Bayesian, and those in the middle suffering from an identity crisis. The most annoying thing is that the results don’t always agree. What a nightmare!

Basically, Bayesian inference involves the use of explicit quantitative external evidence in the design, monitoring, analysis and reporting of a study. In other words, the conditional distribution of the parameter of interest given the observed data is a function of the likelihood and the prior distribution. In contrast, the frequentist approach says that the information about the unknown parameter of interest comes only from the data, through the likelihood. Everyone knows that experience is the best teacher. It’s interesting how doctors and health care professionals use this route, together with the current information around them, to improve patient diagnosis. In reality, we all tend to operate in this mode all the time, and it absolutely makes sense. Strictly speaking, we are all Bayesians in a way, but formalising it is a sticking point and a different issue altogether.
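In symbols (just a sketch of the statement above, with theta the unknown parameter and y the observed data):

p(theta | y) ∝ p(y | theta) × p(theta),   i.e.   posterior ∝ likelihood × prior.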

Nevertheless, the Bayesian approach is not a platform for a free lunch. In order to come up with the posterior distribution of the unknown parameter of interest (a drug effect, say), you need the data, an assumed model to obtain a likelihood and, most importantly, the prior distribution of the parameter of interest. I can now see how some anti-Bayesian sentiment creeps in. Obtaining the prior distribution is another nightmare and a quagmire to cross. Choosing an “objective” prior out of infinitely many priors raises questions; at the same time, how can I trust your prior when I have got mine? But let’s also remember that the same problem exists in the frequentist world during the planning of a trial: sample size and power calculations use prior belief in an informal Bayesian fashion.

Gelman (2008), one of the anti-Bayesians to come out openly on the offensive, noted without any doubt that Bayesian statistics is a coherent mathematical theory, but he is sceptical about its practical application. He went on to quote Efron (1983): “…recommending that scientists use Bayes' theorem is like giving the neighbourhood kids the key to your F-16.” This means a lot: Bayesian statistics is much harder, mathematically and practically, compared with frequentist statistics. I remember the difficulties I faced when I was learning WinBUGS and the theory behind it; it’s not easy or friendly at all. This has been viewed by many as a major obstacle (Efron). Nevertheless, I don’t agree with Gelman on the notion that it’s better to stick to tried, tested and “true” methods. I wonder if they are true methods; logic tells me that there are no true methods, only better ones. In any case, sticking to an old gun thinking it is the best ever is disastrous at best. Remember that the so-called “true” methods were once viewed by sceptics as rubbish but are now the cornerstone of our discipline.

The superiority of Bayesian methods over frequentist ones in many areas, such as indirect treatment comparisons, evidence synthesis and small trials (just to mention a few), should be recognised. For example, in a small trial it is ludicrous and naïve to assume you are in the promised land of asymptotia and rely on likelihood theory alone. Another simple example is the interpretation of frequentist confidence intervals as opposed to Bayesian credible intervals. Honestly, it’s confusing and ambiguous to talk about repeated sampling (a long-run process) when interpreting confidence intervals in the frequentist approach when in fact the trial has been conducted only once.

There is as much Bayesian junk as there is frequentist junk; the use of Bayesian methods in hierarchical models with an assumption of exchangeability, whatever the case might be, is worrying and not always tenable. This is one of the areas that require extensive research to incorporate different correlation structures. What about the use of the magic of Markov chain Monte Carlo (MCMC)? I was really sceptical and uneasy using MCMC, but when you are in a multi-dimensional space with a complex function, which is normally the case in the Bayesian framework (when you multiply the prior distribution and the likelihood function), it is a fantastic approach for evaluating these integrals. I don’t agree with those who think this approach will erode our mathematical understanding of calculus. Let us not forget that the same MCMC machinery is an important tool in frequentist multiple imputation techniques. I believe too much has been made of the subjectivity of the Bayesian approach, to the extent that we are beginning to lose sight of its objectivity.
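As a rough illustration of the integrals in question (same notation as the sketch above): a posterior summary such as the mean of theta is

E[theta | y] = ∫ theta p(theta | y) d(theta)  ≈  (1/M) × [theta(1) + … + theta(M)],

where theta(1), …, theta(M) are draws from the posterior generated by the MCMC sampler.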

The Bayesian approach should not be viewed as a replacement for the frequentist one but as an alternative and a step forward. The recognition of the strengths and weaknesses of the two has led Little to propose a truce in the form of the Calibrated Bayes roadmap. He argued that this is a compromise hybrid intended to harness the strengths of the two approaches. The hybrid paradigm should be strong both on model formulation and assessment and on inference under the assumed model and prediction. I can see how Little was influenced by the rule of averages here: “if you dip one hand in boiling water and the other in ice, on average you will feel warm and better” :-)  Only time will tell whether this dream will be realised or one of the extremes will win, maybe through “democracy”. Declaring my allegiance in advance is suicidal but worth the risk. I am slightly above average but not purely Bayesian...!!

I hope this will leave us with more issues to argue.

All your comments are welcome.

Munya Dimairo (mdimairo@gmail.com)




   

Friday 12 November 2010

What could go wrong in the analysis of a longitudinal clinical trial?

I quote Dr Fisher's casebook, "Keep dancing": 'Dr Johnson remarked that “No man but a blockhead ever wrote, except for money.” If the great lexicographer was right, then I am a blockhead, for these little essays bring me neither fortune nor fame. They do, however, bring me something else: pleasure.' My motivation here comes from my little experience with clinical trial data, my interest in longitudinal studies and a recent post on the MedStats Google group by Chris Hunt (MedStats). Thanks to Chris for flagging that question :-) This article is targeted at statisticians and clinical trialists; there is no hardcore stuff, which I have always avoided.


There is a general rise in the appreciation and utilisation of longitudinal designs in the clinical trial setting. Indeed, understanding patients' profiles with respect to treatment effect over time is important in order to come up with an effective intervention that improves their quality of life. Inference can be targeted at short-term, long-term or time-averaged effects. This is just a refresher, and I don't want to dwell on it in this article!

Now straight to the point: without loss of generality, the parameter of interest in any trial is the treatment effect. In the case of a randomised controlled trial, assuming randomisation is effective, the analysis is straightforward compared to epidemiological studies. For simple trials, this can be done using a t-test for continuous outcomes, but in the case of baseline imbalances the well-known ANCOVA model adjusting for baseline is required (a minimal sketch is given below). The reasons for this are now well appreciated; thanks to guys like Stephen Senn et al. for their great work in this area!
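As a minimal sketch in Stata (with hypothetical variable names: y1 the follow-up measurement, y0 the baseline measurement and treat the randomised group), the ANCOVA model is just a linear regression of the follow-up outcome on treatment, adjusted for baseline:

* ANCOVA: follow-up outcome on treatment, adjusting for the baseline measurement
* y1, y0 and treat are hypothetical variable names
regress y1 y0 i.treat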

Let's now consider a longitudinal study. There is a great deal of modelling, with challenges due to missing data, etc., and there is a danger of producing a higgledy-piggledy recipe. Different models in the name of fixed, mixed and random effects are potential candidates (mixed models; multi-level models – Goldstein). There is a danger of getting carried away with the modelling and forgetting crucial basic statistical principles. I will now consider a simple time-averaged-effect random-effects model comparing two interventions on a continuous outcome. Assuming randomisation is effective, the effects are of the order 0 and b, say (where 0 and b are the effects at baseline and the time-averaged effect after baseline, respectively). From my experience, and not forgetting the advantage of repeated measures in reducing the sample size immensely, the number of patients required is relatively small, thereby increasing the chances of baseline imbalance. Of course, randomisation techniques such as minimisation, biased-coin optimisation, etc. can be used to improve the balance of prognostic factors at baseline. I am neither advocating for nor against these techniques, but the take-home message is that the chances of baseline imbalance are relatively high when simple randomisation is used in longitudinal studies. So fitting the model above will end up giving effects of the order 0+a and b+e (where a is the mean difference in outcome response at baseline and e is the inherent bias due to a). The effect of a is to change the "true" effect b by a factor e through the regression-to-the-mean phenomenon (regression to the mean). A sketch of this in symbols follows.
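In symbols (a rough sketch under the assumptions above, not a formal model): for patient i at time j,

E[y_ij] = mu_j + delta_j × treat_i,   with delta_0 = 0 at baseline under effective randomisation;

with a baseline imbalance, delta_0 = a instead, and the estimated post-baseline (time-averaged) effect becomes b + e rather than b, where e is the bias induced by a through regression to the mean.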

This is normally neglected in the longitudinal setting; attention is placed on the difference in mean profiles, and we forget to constrain a to 0. I guess maybe we forget the basic principles of trials. I view this as wrong, and the resulting effect on inference is substantial. One way to get around this problem is to extend the "ANCOVA" model to the longitudinal framework by including the baseline as a covariate; I will call it longitudinal ANCOVA. This constrains the baseline treatment effect to be zero (which is what we expect for a fair comparison). In any case, why would we expect a non-zero treatment effect at baseline when no patient has yet been exposed to the new intervention? This can only be due to baseline imbalance, which is our number one enemy, the "terrorist" of clinical trials!!

A simple example is given below (done in Stata 11.1), where bdiscore (continuous) is the outcome measure and assessments are repeated at different time occasions (time). This can be replicated in SAS, MLWiN, etc.

* carry each patient's first (time-ordered) measurement forward as the baseline covariate
bysort id (time): gen baseline=bdiscore[1] if bdiscore!=.
* longitudinal ANCOVA: baseline as a covariate, random intercept and slope for time
xtmixed bdiscore baseline time i2.treat || id: time, reml cov(unstr)

Jos W. R. Twisk and Wieke de Vente looked at different approaches to this (Twisk 2008), although I believe a simulation study should be undertaken to check whether their assertion that this method slightly overestimates the treatment effect is true. It's one of the things on my "to do" list; I hope it won't lie idle on my table forever!

Any comments, suggestions, additions and subtractions are always welcome :-)

Hurray!!

Note: I wrote this article in a rush so errors are bound to exist!!!!!!