For some time, Dr No has been bothered by the fact that some people with covid continue to have positive PCR tests long after they can be considered ill and infectious. These people, the long distance shedders, will, because of the way cases are defined and counted, appear as cases, even when clearly they are not cases. They will themselves suffer the loneliness of imposed but pointless self-isolation, but they will also inflate the apparent size of any covid wave, and in so doing aggravate alarm and panic, even though that alarm is unfounded, because they are neither cases nor infectious. In short, every covid wave is inflated, and the question for today is: by how much does the way we count covid cases inflate our estimate of the number of true cases?
This is not an easy question to answer, but Dr No is going to have a stab at answering it anyway. He will get almost too close for comfort to doing some of that dratted numerology, but it is necessary if we are to get something of an answer to the question. He will go through some of the facts, estimates and assumptions needed to get ‘a good enough answer to the right question’, rather than an exact answer to the wrong question, and apply them to a hypothetical epidemic curve. Note the upfront declaration of the use of estimates, assumptions and hypothetical curves: this isn’t precision engineering, it is more a case of doing good enough science on the hoof. Consider it more of a discussion paper than a done and dusted conclusion on the matter.
The facts, estimates and assumptions to consider concern the course of covid in a cohort of patients. For convenience, Dr No will say that the duration of true infection, including the pre-symptomatic interval, and the duration of infectiousness are the same, at two weeks. This is the red line in the chart in Figure 1, the period during which live virus can be isolated from the patient. This chart comes from a paper published in JAMA, and although it only shows estimates, it is in broad agreement with other reports, for instance Figure 1 in this paper. He will then say that, in the pre-symptomatic phase, a PCR test, if done, for whatever reason, is likely to be positive, as it is in the week after the onset of symptoms. These are true positives: a positive PCR test in a patient who really is infected and infectious. Finally, we need to estimate the numbers, or rather percentages, of long shedders, shown by the blue line in Figure 1. Eyeballing the chart, with the aid of an onscreen ruler, we can see that in week 2 after symptom onset, some 89% of patients remain PCR positive, even though they have cleared the infection. The corresponding percentages for weeks 3 to 6 are 63%, 29%, 12% and 8% respectively, with long shedding effectively over by week 7. These percentages don’t need to be precise; reasonable estimates are good enough for our purposes.
Figure 1: Estimated time intervals and rates of viral detection are based on data from several published reports. Because of variability in values among studies, estimated time intervals should be considered approximations and the probability of detection of SARS-CoV-2 infection is presented qualitatively. SARS-CoV-2 indicates severe acute respiratory syndrome coronavirus 2; PCR, polymerase chain reaction. (a) Detection only occurs if patients are followed up proactively from the time of exposure. (b) More likely to register a negative than a positive result by PCR of a nasopharyngeal swab. Reproduced from Sethuraman N, Jeremiah SS, Ryo A. Interpreting Diagnostic Tests for SARS-CoV-2. JAMA. 2020;323(22):2249–2251. doi:10.1001/jama.2020.8259
Next, Dr No generated a hypothetical epidemic curve using R, a statistical computing package. He set the duration of the epidemic to be 100 days (near enough three months, so say March-April-May, or November-December-January), and the total number of cases to 100,000. He then, for each day during the epidemic, summed the true (new, incident) cases and the numbers of long shedders still hanging around, using the percentages given above. Thus, for example, on any one day, of 100 patients in week 3 after symptom onset, 63 of these now well patients will still nonetheless contribute positive PCR tests to the pool; and so on for the other days. This calculation was done in Excel (easier, and more transparent, than trying to do it in R), and the file is available here. He then plotted the two curves, one for the true cases, counted as such in their first week of illness, and one for the apparent cases, that is, true cases plus long shedders, as shown in Figure 2.
Figure 2: Hypothetical epidemic curve: lower curve is true cases, counted on the day they became symptomatic/tested positive; upper curve is true cases plus all those long term shedders still hanging around…
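For readers who would rather see the sums than take them on trust, here is a minimal Python sketch of the kind of calculation behind Figure 2. It is not Dr No's Excel file, and it rests on assumptions the post does not state: the epidemic curve is taken to be bell-shaped, with a hypothetical `peak_day` and `spread`, and a patient in week N after symptom onset is taken to be one whose symptoms began between 7 × (N − 1) and 7 × N − 1 days ago. The function names are made up for illustration.

```python
import math

# Fraction of recovered patients still PCR positive, by week after symptom
# onset (the post's eyeballed readings from Figure 1).
SHED_FRACTION = {2: 0.89, 3: 0.63, 4: 0.29, 5: 0.12, 6: 0.08}


def epidemic_curve(days=100, total_cases=100_000, peak_day=50.0, spread=15.0):
    """New (true) cases per day on a bell-shaped curve.

    ASSUMPTION: the bell shape, peak_day and spread are illustrative only;
    the post gives just the duration (100 days) and total (100,000 cases).
    """
    weights = [math.exp(-0.5 * ((day - peak_day) / spread) ** 2)
               for day in range(days)]
    scale = total_cases / sum(weights)
    return [weight * scale for weight in weights]


def long_shedders(incident, day):
    """Recovered patients who would still test PCR positive on `day`."""
    total = 0.0
    for week, fraction in SHED_FRACTION.items():
        # ASSUMPTION: patients in `week` had symptom onset between
        # 7*(week-1) and 7*week-1 days before `day`.
        for lag in range(7 * (week - 1), 7 * week):
            if day - lag >= 0:
                total += fraction * incident[day - lag]
    return total


def apparent_cases(incident):
    """True new cases plus long shedders, for every day of the epidemic."""
    return [incident[day] + long_shedders(incident, day)
            for day in range(len(incident))]


true_cases = epidemic_curve()
apparent = apparent_cases(true_cases)

for day in (20, 50, 80):
    print(f"day {day}: true {true_cases[day]:8.0f}  apparent {apparent[day]:8.0f}")
```

Changing the shape of the curve or the way weeks are mapped to days will change the numbers, but not the basic mechanism: each day's count carries a tail of earlier patients along with it.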
Clearly the total number of positive tests on any one day rapidly starts to over-estimate the true number of cases, as all the long shedders get added to the count. But this is the ‘God’s eye view’ of this hypothetical epidemic, in which we know the PCR status for every patient on every day throughout the epidemic. In the real world, while we might assume that most ill people will get tested when they are ill, and so will show up as true cases at the time when they are ill, the same cannot be said for the asymptomatic and post-symptomatic individuals, as only a fraction of them will get tested. So let’s cut the number of tests among the long shedders, and so their positives, by half. Figure 3 shows the resulting plot.
Figure 3: as Figure 2, but with the long shedder counts cut back by half. Note the chart is the same size as Figure 2, but the Y axis scale is different. We still clearly have a testdemic, with the apparent number of cases far in excess of the true number of cases for most of the epidemic’s duration.
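The effect of halving can be checked with a back-of-envelope sum in Python. This is a sketch under a simplifying assumption that is not in the post: new true cases arrive at a constant rate, so that each week after symptom onset holds exactly seven days' worth of patients. The function name and the figure of 100 new cases a day are hypothetical.

```python
# Fraction of recovered patients still PCR positive, by week after symptom
# onset (the post's eyeballed readings from Figure 1).
SHED_FRACTION = {2: 0.89, 3: 0.63, 4: 0.29, 5: 0.12, 6: 0.08}


def steady_state_counts(new_cases_per_day, shedder_test_fraction):
    """Return (true, apparent) daily counts when new cases are constant.

    ASSUMPTION: a constant daily case rate, so every week after onset
    contains 7 days' worth of patients. shedder_test_fraction is the share
    of long shedders who actually get tested (1.0 is the God's eye view).
    """
    shedders = 7 * new_cases_per_day * sum(SHED_FRACTION.values())
    tested_shedders = shedder_test_fraction * shedders
    return new_cases_per_day, new_cases_per_day + tested_shedders


for fraction in (1.0, 0.5):
    true_count, apparent_count = steady_state_counts(100, fraction)
    print(f"shedders tested: {fraction:.0%}  "
          f"apparent/true ratio: {apparent_count / true_count:.1f}")
```

Under these assumptions the apparent count is roughly 15 times the true count when every long shedder is tested, and still roughly 8 times when only half are, so halving the long shedders does not halve the inflation. A real epidemic curve is not flat, so these ratios are illustrative rather than a prediction.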
Another real world factor is that not all true cases will get tested; in other words, testing will only be applied to a sample, not the whole population. As long as the sample is a random sample, or near enough a random sample, this will have no effect on the overall conclusion, that long shedders materially alter the apparent size of the epidemic. Any sampling bias that does apply is most likely to affect the long shedders more than the true current cases; for instance, the worried well may be more likely to get tested, but Dr No suggests this is unlikely to have a major impact on the general findings. Readers are invited to contribute any possible biases they consider relevant in replies to this post.
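The sampling argument can be illustrated with a few lines of Python. The counts (1,000 true cases, 3,000 long shedders) and sampling fractions below are invented purely for illustration, and the function name is hypothetical.

```python
def apparent_to_true_ratio(true_cases, shedders, true_sampled, shedders_sampled):
    """Ratio of apparent to true cases among those who actually get tested.

    true_sampled and shedders_sampled are the fractions of each group that
    are tested; all inputs here are hypothetical illustration values.
    """
    observed_true = true_cases * true_sampled
    observed_shedders = shedders * shedders_sampled
    return (observed_true + observed_shedders) / observed_true


# Everyone tested, then a uniform 25% sample: the ratio is unchanged.
print(apparent_to_true_ratio(1000, 3000, 1.0, 1.0))    # 4.0
print(apparent_to_true_ratio(1000, 3000, 0.25, 0.25))  # 4.0

# Biased sample: long shedders tested twice as often as true cases.
print(apparent_to_true_ratio(1000, 3000, 0.25, 0.5))   # 7.0
```

A uniform sample shrinks both groups by the same factor, so the inflation is unchanged; only a difference in testing rates between the two groups moves the ratio, and it moves it in the direction of whichever group is over-sampled.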
As Dr No pointed out early on in the post, this isn’t a done and dusted piece of work. For a start, Dr No is only a jobbing mathematician, if that, and there are too many estimates and assumptions in play for it to be conclusive. Instead, it provides an insight, through a worked example, into how long shedders do have an impact on the apparent numbers of cases, with, as can be seen, the greatest impact at the peak of the epidemic. The only unanswered question (after the obvious one: is this all just a load of old Dr No baloney?) is: how many of the ‘cases’, or more correctly test positives, on the government’s corona dashboard are true cases, and how many are testdemic non-cases?