Going to attempt my first TWEETORIAL (h/t @ProfDFrancis, @venkmurthy, @VinayPrasad82 and others) about an innovative clinical trial & how its design may be useful in making RCTs pragmatic, fleet & adaptable
Some shiny objects to get everyone’s attention: Bayesian adaptive-enrichment design and how it might help resolve the #RCT4Impella controversy
(Disclaimer: the DAWN trial deals with a clinical situation a little bit outside my poor-man’s medical knowledge, please forgive me)
Insofar as I can understand the clinical question, the DAWN trial investigated the use of mechanical thrombectomy vs standard medical therapy in acute ischemic stroke
Credit where it is due: the DAWN trial was performed with a Bayesian adaptive-enrichment design from @statberry and his team
This design offers several unique aspects which might appeal to those reluctant to consider the idea of an #RCT4Impella.
Like what, you ask?
Sequential interim analyses were performed to allow adaptation of enrollment criteria and adjustments of sample size. Those two design elements are particularly attractive in high-risk populations, like, say, pts with cardiogenic shock being considered for Impella.
Let’s talk about why.
The DAWN trial was designed to enroll anywhere from 150 patients up to 500 patients, with interim analyses every 50 patients from 150 onwards
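(To make that schedule concrete, here is a minimal Python sketch of the interim “looks”; the numbers come straight from the description above, but the code itself is purely illustrative and is not the trial’s actual software)

```python
# Illustrative sketch of the DAWN-style look schedule (not the trial's actual software).
MIN_N, MAX_N, LOOK_EVERY = 150, 500, 50

# Interim analyses at 150, 200, ..., 500 enrolled patients.
interim_looks = list(range(MIN_N, MAX_N + 1, LOOK_EVERY))
print(interim_looks)  # [150, 200, 250, 300, 350, 400, 450, 500]
```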
At each decision point, a couple of neat things could happen
First neat thing: at each interim analysis, there was the possibility for population enrichment, with the goal of focusing the trial on those most likely to benefit
In DAWN, enrichment decisions could be made based on baseline infarct size, with five possible nested subpopulations (0-50, 0-45, 0-40, 0-35, and 0-30 mL) that could be considered
The trial began by including all patients (0-50 mL). If an interim analysis demonstrated strong evidence of futility in patients with larger infarcts, a decision could be made to discontinue enrollment of those patients
(thereby “enriching” the population to focus on those most likely to benefit and stopping randomization in patients who are unlikely to benefit)
The final efficacy analysis would then be performed on patients whose infarct sizes fell within the subpopulations still being enrolled after the interim analyses
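If it helps to see the mechanics, here is a hypothetical Python sketch of one enrichment step. The nested infarct-size caps match the list above, but the futility threshold, the posterior probabilities, and the decision rule are all invented for illustration; DAWN’s actual enrichment algorithm is more sophisticated.

```python
# Hypothetical sketch of an enrichment step (illustrative only, NOT DAWN's actual rule).
# Subpopulations are nested by maximum baseline infarct volume (mL).
INFARCT_CAPS = [50, 45, 40, 35, 30]

def enrich(prob_benefit, current_cap, futility_threshold=0.10):
    """Shrink the enrolled population while the largest remaining stratum looks futile.

    prob_benefit: dict mapping each cap to an (assumed) posterior probability of
    treatment benefit for patients up to that infarct size.
    """
    caps = [c for c in INFARCT_CAPS if c <= current_cap]
    for i, cap in enumerate(caps):
        if cap == min(INFARCT_CAPS):
            break  # never shrink below the smallest prespecified subpopulation
        if prob_benefit[cap] < futility_threshold:
            current_cap = caps[i + 1]  # drop the largest-infarct patients; restrict to next cap
        else:
            break  # keep the current cap once it shows a reasonable signal
    return current_cap

# Example: the full 0-50 mL population looks futile but 0-45 mL does not,
# so enrollment would be restricted to infarcts of 0-45 mL going forward.
assumed_probs = {50: 0.05, 45: 0.40, 40: 0.55, 35: 0.70, 30: 0.80}
print(enrich(assumed_probs, current_cap=50))  # 45
```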
This design feature cuts STRAIGHT TO THE HEART of an oft-repeated complaint about RCTs: that trials only provide an estimate of the average treatment effect, when what we really need is to use the therapy on the “right patients”
If an adaptive-enrichment trial is designed well (like DAWN!) it will TELL YOU who the “right patients” are and provide feedback that the therapy was effective in subgroups A, B, C but futile in subgroups D and E.
In the cardiogenic shock/Impella population, a common theme I hear in Twitter debates is that, even without RCT evidence, there must be a place for this device in “the right patients”
Great. Let’s use an adaptive-enrichment scheme like DAWN’s and, based on some combination of hemodynamic parameters, figure out exactly who those “right patients” who will benefit from Impella are.
Second neat thing about this design: if outcomes were proving so good in the active-therapy arm that it was difficult to justify continued enrollment, the trial could be stopped for efficacy (expected success)
There are some nuances to this a little outside the scope of the Tweetorial, but essentially, if the Bayesian posterior probability exceeded a prespecified threshold for expected success, the trial would be stopped
(this is what actually happened in DAWN, by the way)
Enrollment stopped at 206 patients because outcomes were so good with mechanical thrombectomy that there was sufficient evidence to conclude the therapy was highly effective
This was reported as a posterior probability of superiority of >0.999, meaning that, based on the observed data, there was a >99.9% chance that mechanical thrombectomy offered superior outcomes to standard medical therapy
(That is very different from, and far more useful than, a small p-value. That will have to be a topic for another day. This is already a really long series of Tweets!)
(And also, yes, there are interim analysis strategies deployed in “regular” trials; I would argue that this design is much more fleet and adaptable and gets to the answer much faster. We can discuss that in a side conversation if you like)
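For anyone who wants to see what a “posterior probability of superiority” looks like mechanically, here is a minimal Python sketch using a simple binary good-outcome endpoint and made-up counts. DAWN’s actual primary endpoint and model were different; this only illustrates the calculation P(treatment rate > control rate | data).

```python
# Minimal sketch of a posterior probability of superiority with a binary endpoint
# and made-up counts (NOT the DAWN data, endpoint, or model).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: good outcomes / patients per arm.
good_t, n_t = 50, 100   # active therapy
good_c, n_c = 15, 100   # control

# Beta(1,1) priors give Beta posteriors for each arm's good-outcome rate.
draws_t = rng.beta(1 + good_t, 1 + n_t - good_t, size=200_000)
draws_c = rng.beta(1 + good_c, 1 + n_c - good_c, size=200_000)

# Monte Carlo estimate of P(rate_treatment > rate_control | data).
# With counts this lopsided, the probability comes out essentially 1.
print("posterior probability of superiority ~", np.mean(draws_t > draws_c))
```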
OK, back to the point: why is this relevant in the #RCT4Impella debate?
In very high-risk populations, there is a major concern about exposing a large number of patients to an RCT if the active therapy is efficacious (in which case randomizing to control means withholding effective therapy) or futile (randomizing to active therapy offers no benefit and is potentially harmful).
A version of the adaptive design deployed in DAWN would get to the answer more quickly than a standard RCT and minimize the number of patients exposed, as opposed to “traditional” power calculations based on set-it-and-forget-it enrollment targets
(Yes, I’m oversimplifying a little bit, as there are safety and efficacy interim analyses during standard RCTs as well)
The numbers needed for the trial may be somewhat different from DAWN’s, depending on mortality in the specific enrolled population, but the potential exists to reach an “answer” very quickly if strong evidence of efficacy or futility emerges
It is certainly possible this could be designed to enroll anywhere from a few hundred up to 1,000 patients, and it would likely be stopped early if there were a strong signal in either direction
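To put a toy number on “stopped early if there’s a strong signal,” here is a small Python simulation of a group-sequential Bayesian comparison with a binary mortality endpoint. Every input (mortality rates, look schedule, stopping thresholds, maximum N) is an assumption I invented for illustration; a real design would be simulated and calibrated by the trial statisticians. Under an assumed true benefit like this one, most simulated trials stop well before the 1,000-patient maximum, which is the whole point of building the looks in.

```python
# Toy simulation of early stopping in a group-sequential Bayesian trial.
# All inputs (event rates, looks, thresholds) are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(1)

P_CONTROL = 0.50                      # assumed control-arm mortality
P_TREATED = 0.35                      # assumed treated-arm mortality (a "strong signal" scenario)
LOOKS = list(range(300, 1001, 100))   # interim looks at 300, 400, ..., 1000 total patients
SUCCESS, FUTILITY = 0.99, 0.05        # posterior-probability stopping thresholds
N_DRAWS = 20_000

def prob_treatment_better(deaths_t, n_t, deaths_c, n_c):
    """Posterior P(treated mortality < control mortality) with Beta(1,1) priors."""
    post_t = rng.beta(1 + deaths_t, 1 + n_t - deaths_t, N_DRAWS)
    post_c = rng.beta(1 + deaths_c, 1 + n_c - deaths_c, N_DRAWS)
    return np.mean(post_t < post_c)

def simulate_one_trial():
    max_per_arm = max(LOOKS) // 2
    # Simulate every patient's outcome up front; each look uses the first n per arm.
    deaths_t = rng.binomial(1, P_TREATED, max_per_arm).cumsum()
    deaths_c = rng.binomial(1, P_CONTROL, max_per_arm).cumsum()
    for n_total in LOOKS:
        n_arm = n_total // 2
        p = prob_treatment_better(deaths_t[n_arm - 1], n_arm, deaths_c[n_arm - 1], n_arm)
        if p > SUCCESS:
            return n_total, "stopped for efficacy"
        if p < FUTILITY:
            return n_total, "stopped for futility"
    return max(LOOKS), "ran to max N"

results = [simulate_one_trial() for _ in range(200)]
sizes = [n for n, _ in results]
print("average N at stopping:", np.mean(sizes))
print("fraction stopped at first look:", np.mean([n == LOOKS[0] for n in sizes]))
```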
Adaptive enrichment could be used to identify subpopulations (based on hemodynamics) and adjust enrollment as the trial progresses: keep enrolling in subgroups where there is a sign of benefit that needs to firm up, but close out enrollment where the therapy is proving futile
Deploying such a design would be the most ethical way to balance answering the question in a statistically valid way with minimizing the “n” exposed to the RCT (if the device works, few patients are randomized to “no device” before we know; if the device is no good, we minimize the n receiving a futile therapy)
Otherwise, we continue to fly blind, with people on one side insisting that it’s unethical to withhold Impella therapy and the other side insisting that good-quality RCT data are necessary to firmly establish that the device actually improves survival
Thanks for reading, all. If you enjoyed this, please RT across #cardiotwitter, #biostats, and especially to your trainees who may not have had the opportunity to learn what an “adaptive trial” is and why it can be so useful
I will now run away and hide and defer all questions to people like @statberry and @f2harrell, who are smarter and better spoken/written than I am on the subject of Bayesian trial designs.
pls share with trainees who may find this a useful read on what an “adaptive design” means in RCTs
I also must apologize to @VinayPrasadMD for my typo in the first post (doesn't someone on here have a twitter handle that ends in "82"? For some reason I goofed and thought that was your handle, sorry dude)