BAD DATA

In 2018, the British Medical Journal published a paper by Professor Robert Yeh of Harvard Medical School.

It concerned jumping from aircraft with and without a parachute.

The main finding was that a parachute made no difference to your chances of survival.

“The paper, titled ‘Parachute Use to Prevent Death and Major Trauma When Jumping from Aircraft: Randomised Controlled Trial’, finds that the safety devices do not significantly reduce the likelihood of death or major injury for people jumping from an aircraft as compared with the control group, equipped only with empty backpacks.”

92 people were screened for the study; of those, just 23 were found eligible to take part.

12 jumped from a plane with a parachute; the other 11 jumped without one.

Neither group suffered any significant harm or injury; in fact, outcomes were identical whether the jumpers wore a parachute or not.

But hidden in the middle of the paper was a sentence with a barely noticeable caveat:

“The participants might have been at lower risk of death or major trauma because they jumped from an average altitude of 0.6m (standard deviation 0.1) from an aircraft moving at an average 0 km/h (standard deviation 0).”

In other words, the plane they jumped from was stationary on the ground.

A bit like jumping off a chair.

That detail, although it was only included as a caveat in the paper, a mere aside, actually made all the difference.

Without reading that part, everyone automatically assumes the results concern jumping from a plane thousands of feet high moving at hundreds of miles an hour.

Our brain adds together the words ‘airplane’, ‘jump’, ‘parachute’ and supplies a picture of someone jumping into a clear blue sky.

This is the reason Professor Robert Yeh wrote the paper.

He wanted to demonstrate the way we leap on details in research papers and sensationalise the sexy parts.

We then run away with the part that makes a great headline without bothering with the mundane part that won’t help the story.

He wrote the study as a satire on the misuse of research and the misunderstanding of data. As he said, the context is critical.

I watched TV and read the papers a while back.

The news was full of the fact that research had shown a bottle of wine a week increased the risk of cancer.

But that finding was reported in isolation, with no context.

No one bothered to ask why anyone was drinking a bottle of wine a week.

Maybe it was due to increased stress; maybe stress causes cancer, with or without the wine.

But no one bothered to dig any deeper because they’d found an exciting story.

As Karl Pearson, the founder of the discipline of mathematical statistics, said: “Correlation does not imply causation.”

As in the parachute example, asking the wrong question still produces data, but stripped of its context that data gives the wrong answer.
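The wine-and-stress idea can be sketched as a toy simulation. The numbers below are entirely hypothetical, invented for illustration and not taken from any real study: in this model, a hidden confounder (stress) drives both wine drinking and cancer, while the wine itself has no causal effect at all. Yet the drinkers still show a higher cancer rate.

```python
import random

random.seed(0)

# Hypothetical model: stress raises the chance of both drinking a
# bottle of wine a week AND developing cancer. Wine itself does
# nothing here -- any correlation is pure confounding.
n = 100_000
wine, cancer = [], []
for _ in range(n):
    stressed = random.random() < 0.5
    # Stressed people are more likely to drink...
    drinks = random.random() < (0.7 if stressed else 0.2)
    # ...and, in this model, stress alone raises cancer risk.
    sick = random.random() < (0.10 if stressed else 0.02)
    wine.append(drinks)
    cancer.append(sick)

def rate(keep):
    """Cancer rate among people selected by the predicate `keep`."""
    rows = [c for w, c in zip(wine, cancer) if keep(w)]
    return sum(rows) / len(rows)

drinkers = rate(lambda w: w)
abstainers = rate(lambda w: not w)
print(f"cancer rate among drinkers:   {drinkers:.3f}")
print(f"cancer rate among abstainers: {abstainers:.3f}")
# Drinkers show a markedly higher rate, even though wine causes
# nothing in this model -- the correlation comes from stress.
```

The point of the sketch is exactly Pearson's: the data alone shows wine correlating with cancer, and only knowledge of the context (here, the stress variable we built in) reveals that the correlation is not causation.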

The real question wasn’t between jumping with and without a parachute.

The real question was between jumping from a stationary object on the ground and a moving object in the air.

But the data can’t reveal that, because the data wasn’t asked that question.

Data is just numbers; it can’t provide conclusions. That’s not its job.

Because a human should possess software that data doesn’t: common sense.