Risk Ratio, Odds Ratio, Hazard Ratio

2nd and final part of the tweetorial, from ORBITA-HQ!

Fun, easy and informative [*]

#MedEd #FOAMed #cardiology #cardiotwitter
[*] Results may vary and are not guaranteed. See small print.

Not all fears are alike.

* Some fears are one-and-done:
I get a cold which progresses to pneumonia. Will I die, or recover?

* Other things hang over us for much longer, perhaps all our lives.
Will you get hit by a bus?
Get a heart attack?
There's no "sell by" date on the fear.
Practice Qs

You are walking through Hyde Park, minding your own business.
Hardly causing any trouble at all.
Perhaps the odd humorous tweet.

You get a Direct Message:
"Enough of ur abuse!
You will be hearing from my agent shortly.
AJ Kirtane"

You see the agent:
Shortly, the "swoosh" of an arrow pierces the afternoon quiet, and as you throw yourself into the shrubbery, you notice the archer has no spare arrows.

There is a single arrow.

If it misses, can that particular shot suddenly kill you later that day, or tomorrow?
Fortunately, you escape.
But you do have a bit of a sweet tooth, and a big tummy.

Your doctor calculates your 10-year CV risk as "15% over 10 years".
You are not very good with numbers, but this doesn't sound good to you.
"Cheer up," says your friend Maricris Tuason.

"Per year, it's only 1.5%! You are 98.5% infarct free this year!"

This slowly sinks in as better news.
"Per month, 0.1%," she continues.

"Per day, 0.004%. So stop worrying."
Is a probability of survival of 99.996% good?
For things that "depend on how long", there is no point quoting a risk or an odds on its own. You MUST also quote how long it is for.
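Here is a rough sketch of Maricris's arithmetic, for anyone who wants to check it. The function names are mine, invented for illustration. The first function just divides the 10-year risk by the number of periods, as she does. The second is an assumption on my part, not something from the conversation above: it supposes the risk per period is constant, so that survival multiplies up period by period.

```python
ten_year_risk = 0.15  # from the thread: "15% over 10 years"

def naive_risk(total_risk, n_periods):
    """Maricris's method: share the total risk out equally across periods."""
    return total_risk / n_periods

def constant_risk_per_period(total_risk, n_periods):
    """Assumption: the same risk p in every period, so that
    (1 - p) ** n_periods equals the overall survival."""
    return 1 - (1 - total_risk) ** (1 / n_periods)

for label, n_periods in [("year", 10), ("month", 120), ("day", 3650)]:
    naive = naive_risk(ten_year_risk, n_periods)
    steady = constant_risk_per_period(ten_year_risk, n_periods)
    print(f"per {label}: naive {naive:.4%}, constant-risk {steady:.4%}")
```

Either way, the same 15% shrinks to a tiny-looking number once the time window is made small enough, which is why the window has to be quoted alongside the risk.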

Let's summarise so far.
When you have a one-and-done event, such as a pneumonia where you die or not, a single bullet which kills you or not, then you


The only thing that matters is how many people died (or whatever).
To see whether a treatment makes one-and-done events SAFER, the measure to use is the ODDS RATIO or the RISK RATIO (also called relative risk).
The ODDS RATIO is mathematically best. Frank Harrell has no doubt got a few chapters in his "Baffling Book of foRmulas" [BBR] about why Odds Ratio is preferable.

In short, ODDS ratio lets us see that 90% mortality is much better than 99%, even though the RISK ratio is almost 1.
The RISK RATIO is easiest for general people to understand.

For probabilities that are much closer to 0 than 1, odds ratio and risk ratio are almost the same. So we can almost interchange them.
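A minimal sketch to check both claims with numbers. The 90% versus 99% mortality figures are the ones quoted above. The 1% versus 2% pair is a hypothetical example I have added for the "close to 0" case, and the function names are invented for illustration.

```python
def odds(probability):
    """Convert a probability (risk) into odds."""
    return probability / (1 - probability)

def risk_ratio(risk_treated, risk_control):
    return risk_treated / risk_control

def odds_ratio(risk_treated, risk_control):
    return odds(risk_treated) / odds(risk_control)

# High-mortality case from the text: 90% with treatment versus 99% without.
print(risk_ratio(0.90, 0.99))  # close to 1
print(odds_ratio(0.90, 0.99))  # far below 1

# Hypothetical rare-event case: 1% with treatment versus 2% without.
print(risk_ratio(0.01, 0.02))
print(odds_ratio(0.01, 0.02))  # nearly the same as the risk ratio
```

With high mortality the two measures disagree sharply. With rare events they are almost interchangeable.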

But both are for "one and done" things.
ODDS and RISK are when the only thing that matters is THAT people died, NOT WHEN they died.

Whether you die at 5 am or 9 am, of your pneumonia, is no great difference to your family. There is an answer to "did you survive or not", which is what matters.

In the case of dying of life in general, there is no ultimate escape. Eventually it's gonna get you. The only question is when.

So a treatment that makes life safer for you doesn't change the "final" percentage. It's 100% one way or another.
Have a look at the CONSENSUS trial, of Enalapril in chronic heart failure.
In the placebo arm, what is the risk of death?
In the enalapril arm, what is the risk of death?
So which of these is the closest to the RISK RATIO (also called RELATIVE RISK) for death, in the enalapril arm versus the placebo arm, in the graph shown above?
Not very exciting, eh?

So why are ACEi now routine care in chronic heart failure?
Yes, you calculated all of the above correctly. By the time everyone was dead in one arm, almost everyone was dead in the other arm.

So the relative risk was almost 1.

That's not the way to study the benefit of an intervention in patients who have very severe, lethal disease.
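To see why counting up the deaths at the end goes wrong, here is a toy sketch. It does NOT use CONSENSUS data. The hazards are hypothetical numbers I have made up, and I am assuming a steady death rate in each arm, with the active arm always at half the rate of the control arm.

```python
import math

def cumulative_risk(hazard_per_year, years):
    """Proportion dead by a given time, assuming a constant death rate."""
    return 1 - math.exp(-hazard_per_year * years)

control_hazard = 0.8  # hypothetical: a very lethal disease
active_hazard = 0.4   # hypothetical: treatment halves the death rate

def risk_ratio_by(years):
    """Risk ratio if we simply count everyone dead by this time."""
    return (cumulative_risk(active_hazard, years)
            / cumulative_risk(control_hazard, years))

for years in [0.1, 1, 5, 20]:
    print(f"counted up to {years} years: risk ratio {risk_ratio_by(years):.3f}")
```

In this toy, the treatment halves the death rate at every moment, yet the risk ratio drifts from about 0.5 towards 1 the longer you wait before counting.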
Let's look at another problem with just _counting_ up the deaths.

Look carefully at the second year of the trial.

In which arm did MORE ACTUAL PEOPLE die, in the second year?

This is almost the same as asking,
"Which red mortality line is steeper?"
Exactly! If we had only turned up at trial HQ at the start of the second year, and ONLY counted the notifications of deaths coming in, we wouldn't have been very happy at the end of the second year.

We'd have convened a secret committee to tell us to change the Primary Endpt to:
Fortunately, there was no need for these shenanigans, tweeting, letter-to-editoring and blocking.

There's a perfectly good reason why there were more deaths in the enalapril arm than the placebo arm, in the second year:
[Note to statisticians. I am avoiding describing what Cumulative Hazard itself is, as doctors tend to find this, and logging and exping, rather distressing. So I am aiming to go straight for hazard ratio, skipping Hazard.]
The idea of hazard ratio (in my simplified mind) focuses on short periods of time. So short that, roughly speaking, there is a steady rate of death in both arms. Different between the arms, but within each arm steady.

Let's look at a short bit of time in the placebo arm.
In this short bit of time, roughly what percentage of people died in this arm (placebo)?
The deaths are the red wedge, accumulating steadily through time (even though it is a short time).
And now let's look at the treatment arm (Francisopril).
Roughly by eye, what proportion of people died in the Francisopril arm, during this short period of time?
So, in this very short period of time, what is the relative risk in the francisopril arm, versus the placebo arm as reference?
At a later point in the trial, there have been many deaths in the placebo arm.
There have been fewer deaths in the francisopril arm.

There are 80 patients in the francisopril arm, but only 40 in the placebo arm.
As you watch for a while, 4 people die in each arm.
For the patients alive at the beginning of this little time segment, what is the relative risk of dying in the francisopril arm, versus in the placebo arm?
Wait: I'm not happy with this answer distribution.

In the active arm 4/80 people died
In the control arm 4/40 people died

Think about it. If you now had a free choice of "active" or "control" treatment, which would you take?

4 people died in each arm.

Which arm was better?

Don't look at "how many people died in that arm, during that time".

Look at "what PROPORTION of the people in that arm at the start of the time, died during that time."

Which arm had a better mortality rate?
So what is the relative risk, during this time window, of being in the active versus the control arm?
I don't normally give the answer away but in this case I feel obliged, due to the still-disappointingly high proportion getting the answer wrong.

If you did NOT pick the majority answer, please click here to be beaten to your senses.
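For anyone who wants the arithmetic spelled out, here is a minimal sketch using the numbers above (4 deaths out of 80 in the active arm, 4 out of 40 in the control arm). The function name is just for illustration.

```python
def relative_risk(deaths_active, alive_at_start_active,
                  deaths_control, alive_at_start_control):
    """Ratio of the PROPORTIONS dying, active divided by control."""
    risk_active = deaths_active / alive_at_start_active
    risk_control = deaths_control / alive_at_start_control
    return risk_active / risk_control

# Same NUMBER of deaths in each arm, but different numbers at risk.
rr = relative_risk(4, 80, 4, 40)
print(rr)
```

The death counts are equal, but the proportions are 5% and 10%, so the active arm carries half the risk during this window.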

Now, at last, we come to explore the new bit!

Hazard Ratios
Hazard ratios look at each little bit of time from the start of the trial to the end.
In each little bit of time, we ask "how risky is it to live through THIS BIT OF TIME in the active arm, compared with being in the placebo arm".
In each little bit of time, what should we ask?
What would you choose, if you are sensibly evaluating the riskiness of each arm?

[A] How MANY people died in the active versus control arms?

[B] What PROPORTION of the people died in the active versus control arms?
Suppose you decide to slice a year at a time.

You might start with:

Time | Number of people alive
     | Active   Control
0y   | 100      100
1y   | 90       80
What is the relative risk of the active arm, for that window of time, from 0 to 1 y?
That was easy. Hopefully everyone got it right.

Now let's look at the second year, i.e. from t=1yr, to t=2yr.

Time | Number alive
     | Active   Control
1y   | 90       80
2y   | 81       72
What is the relative risk during THIS window of time?
If you got the answer to either of the above wrong, please note that when we say "RELATIVE risk", "risk RATIO", "odds RATIO", or "hazard RATIO", we are always comparing an active arm to a control arm.

Some of you were just answering "RISK" in the active arm, not "RELATIVE risk".
Hint, if you are stuck on either of the above 2 questions.

Step 1. What is the RISK (i.e. proportion of those alive at the beginning of the period, who die during the period) in the active arm?

Step 2. Ditto for control arm.

Step 3. Calculate ratio, active÷control (as always).
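The three steps can be sketched in a few lines, using the numbers alive from the two little tables above. The variable and function names are my own, for illustration only.

```python
# Number alive at 0, 1 and 2 years, taken from the tables above.
alive = {
    "active": [100, 90, 81],
    "control": [100, 80, 72],
}

def window_risk(alive_at_start, alive_at_end):
    """Steps 1 and 2: proportion of those alive at the start who die."""
    return (alive_at_start - alive_at_end) / alive_at_start

def window_relative_risk(alive, year):
    """Step 3: active risk divided by control risk, for one yearly window."""
    risk_active = window_risk(alive["active"][year], alive["active"][year + 1])
    risk_control = window_risk(alive["control"][year], alive["control"][year + 1])
    return risk_active / risk_control

first_year = window_relative_risk(alive, 0)
second_year = window_relative_risk(alive, 1)
print(first_year, second_year)
```

Note that the denominator for the second year is the number alive at 1 year, not the number who started the trial.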
People stuck on EITHER of the two questions above should click here for more info:

Look at the relative risk you found in the FIRST year, and then the relative risk you found in the SECOND year.

Are they the same or different?
In reality throughout any trial, the relative risks at each moment in time keep varying between the two arms.

This is because real life is noisy and wiggly.

We don't want to track all those ups and downs: they are mostly noise and don't tell us anything useful.
We just want to know, at trial end:
* which arm was more risky on average?
* by how much?
* could this have easily arisen by chance, or is it more dramatic than normally happens by chance, i.e. statistically significant?

i.e. we want an overall Hazard Ratio, a CI and p value.
Well, here you go.
For people who are not @f2harrell, @jhoward or @mshunshin, i.e. don't converse in binary or ask for a multi-core GPU for a Christmas present, here is my explanation.

It is not guaranteed to be correct, due to me not really understanding how hazard ratios are calculated.
However it is good enough for you to be able to convince OTHER people who don't themselves know, i.e. basically everyone.
The Cox Proportional Hazards program makes the ASSUMPTION that at EACH PARTICULAR INSTANT IN TIME, the risk in the active arm is a certain multiple of the risk in the control arm.

Here is the Cox robot:
The robot says to itself:

"maybe the active arm is 0.9 x the risk in the control arm, at every point in time,
or maybe 0.8 x,
or maybe 1.2 x?

Which fits the data the best?"

It is finding a SINGLE scale factor that relates the risk in the active to that in the control.
A SINGLE scale factor, NEVER a time-varying scale factor.
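Here is a toy version of the robot, as a rough sketch only. The ten deaths are hypothetical data I have invented, not from any trial. To keep it short, I assume everyone is followed until death and no two deaths happen at the same time. The scoring rule is the Cox partial likelihood for a two-arm comparison. Real software does not try a list of candidates like this: it homes in on the best value numerically, and also copes with tied death times and patients lost to follow-up.

```python
import math

# Hypothetical toy data: (month of death, arm). Not from any real trial.
deaths = [(1, "control"), (2, "control"), (3, "active"), (4, "control"),
          (5, "control"), (6, "active"), (7, "active"), (8, "control"),
          (9, "active"), (10, "active")]

def log_partial_likelihood(hazard_ratio, deaths):
    """How well a SINGLE scale factor explains who died at each death time."""
    total = 0.0
    for time, arm in deaths:
        # Everyone still alive just before this death (the "risk set").
        n_active = sum(1 for t, a in deaths if t >= time and a == "active")
        n_control = sum(1 for t, a in deaths if t >= time and a == "control")
        # Each active patient counts hazard_ratio times a control patient.
        weight = hazard_ratio if arm == "active" else 1.0
        total += math.log(weight / (n_active * hazard_ratio + n_control))
    return total

# The robot's candidates: 0.05 x, 0.10 x, ... up to 3.00 x.
candidates = [k / 100 for k in range(5, 301, 5)]
best = max(candidates, key=lambda hr: log_partial_likelihood(hr, deaths))
print("best-fitting single scale factor:", best)
```

The same scale factor is used at every death time. That is the proportional hazards assumption, written as code.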

Just to be sure you understand what I mean by a single scale factor working across the whole trial, let's do a quick check.

Look at the two event curves in this study below.
Do you think there is a single scale factor
(a single vertical stretching or shrinking manoeuvre)
that can change one red curve into the other?
For those disbelievers, here is a hint.
I will re-ask, and suggest that those who were in the minority view the video before re-answering.

Do you think there is a single scale factor
(a single vertical stretching or shrinking manoeuvre)
that can change one red curve into the other?
Question 2.

Look at these 2 curves.
Can one be transformed into the other by a single vertical stretching or shrinking?
I had a call from a Dr G Onstegger.

"You think you are doing something useful with these idiotic tweetorials?
Well you are not!
How many companies line up to get you to run their trials?
I'll give you a clue: NONE.
You can only do trials ordinary riff-raff are interested in"
"Not landmark trials for the Captains of Industry.

And your puzzles are ridiculously easy.
That one you posted this very minute, that people can't solve?
I could do it!
In a second.
So fast it would make your head spin!
Let me show you."

OK, take it away G Onstegger!
The Cox Proportional Hazards calculation system / robot / thingy makes the ASSUMPTION that the scaling is possible, and then does the best possible job of scaling.

It will never be a perfect fit because
But nevertheless it comes up with a best-fitting SINGLE SCALE factor that turns one curve into the other.

Over a very short period of time, Hazard Ratio is numerically identical to Risk Ratio.

However, over the trial as a whole, Hazard Ratio is weighted in a special way.
The trial-averaged Hazard Ratio is weighted towards the times when we are most confident of the precise value of the Risk Ratio at that time.

For example, at the beginning of the survival graph, there are many patients in it.

Near the end, there are many fewer patients.
At which end of the trial survival graph do we have the MOST PRECISE estimate of risk ratio?

The beginning, where there are lots of patients, and so each "%" is made up of many actual people?

Or near the end, where there are few patients, and one death could be worth many %?

The first two votes to the question of "why will the curves not be an exact scaling of each other" are "because each patient is unique"

What on earth are you thinking?

That was a vacuous moronic "Top-People-speak" phrase I put in that NOBODY should pick.
It is beloved of panels because:

(a) It is true
(b) It is not the explanation for the question at hand and yet
(c) The average audience member thinks it is some sort of deep analysis.

Next time anyone says that to you, throw a rotten tomato at them.
When they are clearing the pips from their eyes, shout out,
"That tomato was unique too.
Is that relevant?

If it was the same as this tomato [Throw another one]
would it matter?"

That'll teach them.
The reason the two curves can't be scaled versions of each other is that PEOPLE DIE IN UNITS OF 1.

So the only way for them to be EXACTLY scaled versions of each other, is: EVERY time someone dies in one arm, EXACTLY 2 people die in the other arm at the same moment.

Or similar.
The vertical steps being "1 person high" prevent perfect scaling from one to the other.

More info on how those steps arise is in this perma-torial:
Next question I asked is also causing people difficulties.
Try this as a hint.

At the beginning, if in a particular month 4% of people die, this might be made up of 40 people out of 1000.

Near the end of the trial, 4% of people dying in a particular month might be made up of 1 person dying out of 25.
Which percentage will have the smaller error bar?

The percentage made up of lots of people (4% of 1000, near the beginning of the trial)?

Or the percentage made up of few people (4% of 25, near the end of the trial)?
Better! Most people getting it right.

So during SOME periods of the trial we are more confident about what the hazard ratio is, than we are at other periods.
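A rough way to put numbers on that confidence is the usual standard error of a proportion, sqrt(p(1-p)/n). This is only a sketch: with 1 death out of 25 the formula is a crude approximation, but it shows the direction clearly enough. The 4%, 1000 and 25 are the figures from the hint above.

```python
import math

def standard_error(proportion, n_people):
    """Approximate standard error of an observed proportion."""
    return math.sqrt(proportion * (1 - proportion) / n_people)

early = standard_error(0.04, 1000)  # 40 deaths out of 1000, early in the trial
late = standard_error(0.04, 25)     # 1 death out of 25, near the end

print(f"early: 4% give or take about {early:.1%}")
print(f"late:  4% give or take about {late:.1%}")
```

The same 4% is several times fuzzier when it is built from 25 people than when it is built from 1000.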
The Cox-calculating robot automatically weights this info in a complex way that I won't be able to explain to you in a way you can understand[*]...

[*] This is because despite extensive deep research, i.e. reading the Wikipedia page a few times, I still can't get the hang of it.
... and spits out a SINGLE hazard ratio covering the whole trial.

It also gives a confidence interval and P value.
It does NOT give a graph. The graphs we normally see are simply Kaplan-Meier Graphs, calculated in the manner described in the perma-torial above.
Here is a nice example of a condition in which events accumulate gradually over time, i.e. just basically being alive, not just having got out of hospital from a heart attack.

And a therapy (statins) which reduces coronary events by a steady fraction.
The above was ASCOT-LLA.

Below is the Zijlstra et al NEJM trial of Primary Angioplasty versus Streptokinase for acute MI. We all know how that turned out. It keeps us up at night. (Literally!)

Notice that the graphs are NOT scaled versions of each other.
Would they have been right to use the Cox Proportional Hazards model, and calculate a Hazard Ratio for PPCI?
And indeed they did not. They used the simple Kaplan-Meier curve and its corresponding log-rank test to compare the arms, because they were not scaleable versions of each other.
How about that big question we always like to have an opinion about, "CABG versus PCI, which is better" or whatever?

Here are the shapes of the event curves:
Is PCI a scaled up/down version of the CABG curve?
Well, maybe you think it is just a random wobble that is making it look non-scaleable?

Look at other examples, and see. Is the pattern the same?
Is the same one always coming up at higher risk at the beginning, and lower risk later?
In general, CABG has higher risk than PCI, in the early hours and days after the procedure.

But as time passes, the CABG patients seem to accumulate events more slowly than the PCI patients.
Overall, it is close to even.

The earlier your time window of observation, the more it looks like PCI is safer.
The longer you wait before starting to count events, the more it looks like CABG is safer.

But CABG-v-PCI is certainly not a simple proportional hazard.
OK, we are done for this weekend!

Less torturous than most weekends, due to the jollifications Saturday night (happy birthday @rallamee!)

Another weekend I will look at Hazard Ratios per unit something e.g. per mmol/L cholesterol, or per kg weight, and how to interpret them.
Open to questions!
