There have been some crackles in recent days on twitter, as the Bangladeshi mask trial caught light again. The crackling started with the publication of a ‘short note’ that provided a ‘simple analysis’ of the recently released raw data from the Bangladeshi trial, and claimed that, on this new ‘simple analysis’, the trial failed to show any covid protection benefit from mask wearing. Not content with blowing holes in masks, the authors of the ‘short note’ also report that they did nonetheless find some other highly significant differences between the intervention and control groups, including one that could introduce more than enough bias to explain the original trial report’s marginal benefit from wearing surgical masks. In the limp language of academic writing, the authors suggest their findings ‘urge caution’ (sic) in the interpretation of small differences, and that ‘bias-susceptible endpoints…should be used with care’. Translated into plain English: the masks don’t work, and the mandates should go.
The authors of the original report, who deserve full credit for releasing the raw data, have fought back in gentlemanly fashion. Much of the argument is focused on statistical niceties, in particular the type of analysis used to assess the statistical significance of the findings. These are complex academic matters, and, for those keen to avoid the pain of a four week crash course in medical statistics, not readily grasped in detail, though the crux is straightforward enough: have the conditions necessary for the more sophisticated statistical analysis been met? The original report’s authors think they have, and so used more powerful (parametric) statistical methods on rates, while the authors of the ‘short note’ think they have not, and so used less powerful (non-parametric) methods on counts. Dr No takes the middle ground in this debate: it is useful to have both analyses. Such largesse is more than reasonable because there are other much more compelling reasons in the study for concluding that masks are to covid what the fig leaf was to Adam in the Garden of Eden — a cover-up, and not much else.
There are a number of clues in the original report that suggest something is amiss. The first is that something is, quite literally, missing: the raw counts. Neither the 111 page pre-print, nor the 19 page peer-reviewed version of the report, includes numerators and denominators for the primary outcome, covid seropositivity following covid symptoms; instead, just the computed rates are reported. Authors (and so the media hacks that report the findings) omit raw counts for a reason: they flag up hotspots that suggest the conclusions may be suspect. This is our old enemy at work, presenting results only as relative risk reductions, rather than as, or at least with, absolute risk reductions. It allows utterly inconsequential differences to sound jolly impressive: a reduction from two in a million in the control group to one in a million in the intervention group yields a jolly impressive relative risk reduction of 50%, while the absolute risk reduction is a rather less impressive 0.000001 (2/1,000,000 minus 1/1,000,000). More pragmatically, among 2 million people, one person benefited. Even if such a result could ever achieve statistical significance (perhaps there were trillions of people in the study), it would never achieve practical significance.
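The arithmetic is worth seeing in the raw. A minimal sketch in Python, using the hypothetical two-in-a-million figures above:

```python
# Hypothetical figures from the text: 2 cases per million in the control
# group, 1 per million in the intervention group.
control_rate = 2 / 1_000_000
intervention_rate = 1 / 1_000_000

# Relative risk reduction: sounds jolly impressive.
rrr = (control_rate - intervention_rate) / control_rate
# Absolute risk reduction: rather less so.
arr = control_rate - intervention_rate

print(f"Relative risk reduction: {rrr:.0%}")   # 50%
print(f"Absolute risk reduction: {arr:.6f}")   # 0.000001
```

The same 50% headline would arise from 2,000 versus 1,000 cases per million; only the absolute figures tell you how many people actually benefited.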
This is such an important point that there are times when Dr No thinks the first rule of appraisal of clinical reports should be: if there are no absolute numbers, only relative risks, then the paper goes in the bin there and then. On this first rule, the original Bangladeshi report fails, and that would in a sensible world be the end of the matter, were it not for the fact that the establishment, the mainstream media, and a shocking 83% of respondents in a recent poll approve of the government’s recent mask mandate for shops and public transport, presumably because the masks stop covid infection. No doubt the same people, polled on whether fig leaves stop men raping women, would reply yes, even in the face of millennia of evidence to the contrary.
Back to the Bangladeshi study. First, let’s take it at face value, and assume that the parametric statistical methods the authors of the original report used were valid. The first thing to note is that the study was a randomised trial, albeit a clustered one, in which villages rather than individuals were randomised, but it is still a randomised trial — or is it? The whole point of randomisation is that all the factors, known and unknown, that might influence the outcome of a study are randomly allocated to both the intervention and the control groups, and so, being present in equal measure in each group, they cannot explain any differences in outcome. So far so good, but there is a problem in the Bangladeshi study. The villages were randomised, but the recruitment of volunteers to the study suffered a significant bias. Individuals in villages allocated to the intervention were significantly more likely to be recruited to the study than those in villages allocated to the control group. This can be seen in the numbers in each group: 170,497 in the intervention group, compared to 156,938 in the control group. With true randomisation, the numbers should be much closer, and the fact that they are not tells us that recruitment was not random.
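The imbalance is easy to quantify from the enrolment figures above. A back-of-envelope sketch (a formal comparison would need to respect the village-level clustering, but the gap itself is the point):

```python
# Enrolment figures quoted in the text.
intervention_n = 170_497
control_n = 156_938
total = intervention_n + control_n

# How much larger is the intervention arm?
imbalance = (intervention_n - control_n) / control_n
intervention_share = intervention_n / total

print(f"Intervention arm share of all subjects: {intervention_share:.1%}")  # ~52.1%
print(f"Excess recruitment over the control arm: {imbalance:.1%}")          # ~8.6%
```

Under clean 50:50 recruitment the two arms of a trial this size should differ by a fraction of a percent, not by nearly nine.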
We have a trial that was randomised in name, but not in practice. The reason for the differential recruitment was almost certainly greater zeal on the part of the non-blinded recruiters (they knew whether the village was an intervention or a control village) in intervention villages. It matters because of the very real possibility that the difference in numbers is also reflected in differences in behaviour in those recruited, particularly in the marginal volunteers, the ones sucked into the trial in the intervention villages, but left out in the control villages. Let us consider how this might affect results in practice.
Imagine two identical villages each with 100 inhabitants. Everything about the villages is identical, even down to the number of covid infections: say every tenth person gets infected, and so the incidence in each village is an identical 10%. Now imagine the two villages were selected for a mask trial, with one village getting masks and the other not getting masks. At the same time — as happened in the Bangladeshi trial — the recruiters in the mask village were more zealous, and managed to recruit 60 volunteers, compared to only 50 in the control village. Suppose, too, that those 10 marginal recruits were, as marginal recruits tend to be, more indolent than the more easily recruited, and so when the time came to report their symptoms and undergo a blood test, they simply didn’t bother. The trial runs its course. Let’s see what happens.
In the control village, 5 of the 50 willing participants get infected and test positive, giving a rate of 10% (5/50), as expected. But something different happens in the mask village. The 5 infected willing participants come forward and become cases, but the sixth infection, which falls among the 10 marginal participants, never appears, because that marginal participant fails to report it, giving an apparent infection rate of 5 out of 60 subjects, or 8.3%, even though the underlying infection rate was an identical 10%. This is another example of lies, damned lies and denominators. The differential behaviour in the mask village bumped up the denominator (from 50 to 60) but the number of cases that came forward remained the same (the 5 willing participants in each village), and so the apparent rate appears lower in the mask village, 8.3% compared to 10%, even though the underlying true rate is identical in both villages.
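The thought experiment can be run as a few lines of Python, with all figures as in the two villages above:

```python
# Two-village thought experiment: an identical 10% underlying infection
# rate in both villages, but differential recruitment and reporting.
true_rate = 0.10
control_recruits = 50
mask_recruits = 60                 # includes 10 extra 'marginal' recruits

# Infections that actually occur among each village's recruits.
control_infections = round(true_rate * control_recruits)   # 5
mask_infections = round(true_rate * mask_recruits)         # 6

# The sixth mask-village case falls among the marginal recruits
# and is never reported.
control_reported = control_infections                      # 5
mask_reported = mask_infections - 1                        # 5

control_apparent = control_reported / control_recruits     # 10%
mask_apparent = mask_reported / mask_recruits              # ~8.3%
rrr_apparent = (control_apparent - mask_apparent) / control_apparent

print(f"Apparent rates: control {control_apparent:.1%}, mask {mask_apparent:.1%}")
print(f"Apparent relative risk reduction: {rrr_apparent:.0%}")   # ~17%
```

Nothing about masks enters the calculation; the apparent 17% reduction is produced entirely by the mismatched denominators.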
Now, as it happens, the relative risk reduction of 17% in this hypothetical study with identical underlying rates is actually considerably larger than the roughly 10% reduction reported by the authors of the Bangladeshi study, which also had unbalanced denominators (170k versus 157k). It is eminently plausible that the observed effect in the Bangladeshi study appeared solely because such a bias was at work. We have to conclude that the Bangladeshi study — the biggest, most ambitious study of its kind ever undertaken — failed to demonstrate that masks do anything to prevent the spread of covid infection.
In a sane world, that would be a big enough nail to keep the mask coffin shut forever. But we are not in a sane world, so Dr No is going to bang in another nail, just to make sure the coffin remains forever shut. Recall that the authors reported a headline relative risk reduction of around 10% (the exact percentage varies slightly, depending on how it is presented). This, of itself, is a marginal reduction — 10% is hardly impressive — and furthermore, it is of marginal statistical significance: the reported adjusted prevalence ratio of 0.91 has a confidence interval of 0.82 to 1.00, which just includes 1, meaning the result only scrapes in at the margin of significance. But what about the absolute numbers, and so the absolute risk reduction and number needed to treat?
As we noted earlier, neither the pre-print nor the peer reviewed published report contains these numbers, though they did report symptomatic seropositive rates: 0.76% in the control villages, against 0.68% in the intervention villages. This gives us again a 10% or so relative risk reduction (0.08/0.76 = 0.105, or 10.5%) and an apparent absolute risk reduction of 0.08% (0.76 – 0.68), but we still don’t have the absolute numbers, that is, how many individuals benefited. We get the same percentages from 76 out of a thousand against 68 out of a thousand (8 individuals benefit) as we do from, say, 760 out of ten thousand against 680 out of ten thousand (80 benefit). The absolute numbers did however become available when the authors, to their credit, released the raw data: the number of seropositives (see page 3) was 1,106 in the control group, and 1,086 in the intervention group, an absolute difference of only 20 individuals. Not exactly a corking result from a study that enrolled over a third of a million subjects.
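The point that percentages conceal the number of individuals who benefited can be sketched with the illustrative counts above:

```python
# The same rates, and the same relative risk reduction, arise from
# very different absolute counts.
for control_cases, mask_cases, n in [(76, 68, 1_000), (760, 680, 10_000)]:
    rrr = (control_cases - mask_cases) / control_cases
    benefited = control_cases - mask_cases
    print(f"n={n}: {control_cases / n:.1%} vs {mask_cases / n:.1%}, "
          f"RRR {rrr:.1%}, {benefited} individuals benefit")
```

Both lines print an identical 10.5% relative risk reduction, while the number of individuals who benefit differs tenfold.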
It is not clear what denominators the authors used to generate their rates of 0.68% and 0.76%, but by reverse engineering it appears they used a number close to all subjects in each arm. In doing this, they have perhaps done themselves a disservice, in that only around one third of those with symptoms had their blood tested for antibodies, but if we run with these numbers (1,106/145,526 = 0.76% in the control group, 1,086/159,706 = 0.68% in the intervention group) we find, unsurprisingly (Dr No has done the standard sums just to confirm everything adds up, and that there is no numerology involved), the absolute risk reduction again to be 0.08% (0.76 – 0.68), a rather less impressive result than the already far from impressive 10% or so relative risk reduction. The number needed to treat, in this case to submit to an eight week intensive mask promotion exercise, is well over a thousand: 1,250 individuals, to prevent one extra case of seropositive covid infection. That is a monumental amount of work to achieve one less covid infection. And to cap it all, the study result includes the possibility that the one less case may have arisen because of bias, or even merely by chance.
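For completeness, the sums, using the released counts and the reverse-engineered denominators quoted above:

```python
# Released counts over the apparent (reverse-engineered) denominators.
control_rate = 1_106 / 145_526        # ~0.76%
intervention_rate = 1_086 / 159_706   # ~0.68%

arr = control_rate - intervention_rate   # absolute risk reduction
nnt = 1 / arr                            # number needed to treat

print(f"Control rate: {control_rate:.2%}")            # 0.76%
print(f"Intervention rate: {intervention_rate:.2%}")  # 0.68%
print(f"Absolute risk reduction: {arr:.2%}")          # 0.08%
print(f"Number needed to treat: {nnt:.0f}")           # ~1250
```

The number needed to treat is simply the reciprocal of the absolute risk reduction, which is why a tiny absolute difference balloons into well over a thousand people per case prevented.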
There are a number of other flaws in the study, but this post is getting over-long, and has already made the key point: the Bangladeshi Mask RCT, the biggest, most ambitious of its kind ever done, failed to demonstrate that masks prevent covid infection. Let those who believe in masks wear masks. For the rest of us, it is time, Dr No thinks, to have a bonfire of the straw masks.