There’s something about seeing the hard, cold numbers globally tracking the coronavirus that’s turned average people into armchair epidemiologists. We’re having conversations (virtually, of course) about how successfully we’re “flattening the curve” and whether we’re doing better or worse than other places. With sports on hiatus, this has become a new kind of game. Yet there are vast differences between countries, as well as provinces, states or regions, in how data is gathered and reported, and these differences can lead to flawed or inaccurate conclusions.
To understand data, we need to consider it within a bigger context. Data are abstracted elements or representations that seek to categorize and measure phenomena (Kitchin, 2017). While data is often seen as conveying the facts, data isn’t purely objective. “Counting is political,” and how data is collected, stored and constructed, for whom and for what purposes, reflects the values of those compiling the data (Lohr, 2015, p. 91). Data are part of a “complex socio-technical system” that forms a “data assemblage” which includes “ideas, techniques, technologies, systems, people and contexts” that “evolve and mutate over time” (Kitchin, 2017, p. 22).
This is clearly illustrated in COVID data when countries willfully misreport information, as China has been accused of doing. Another form of misreporting is simply deciding not to count at all (Coronavirus? What coronavirus?), a stance that some countries took before it became all too obvious that there was a problem. Those are blatant political decisions.
However, even when testing is being done and reported diligently in good faith, how the testing is being conducted and on whom are key questions that can drive the numbers. By way of a small example, we can look at the differences between case counts in Alberta and BC:
Alberta has tested more people than BC by a very wide margin. By testing more, do you find more of what you’re looking for? If your testing is guided by who needs the test (i.e., they are showing symptoms), then perhaps the case count is lower, or perhaps there are more asymptomatic cases or people with less severe symptoms whom public health officials feel don’t warrant testing. It gets more complicated when we start to look at this on a national basis. As this article explains, case counts present a "limited picture" and can at times be a "highly distorted one" (Platt, 2020). There are also administrative errors that can occur, such as missing data, which happened with Quebec’s April numbers, and these create a spike that is misleading if the context isn’t explained.
We then roll up these provincial numbers to the country level, and they are reported by the Public Health Agency of Canada to organizations like the WHO or Johns Hopkins as part of global counts. This site examines some of the differences and gaps between various global data sets.
Global data is then used as a discussion point or, in some cases, to make loaded comparisons. This was the case when the President of the United States suggested that the US was vastly outperforming places like Belgium on the per capita death rate due to COVID. However, how countries report deaths attributable to COVID varies greatly. “Belgian officials say they are counting in a way that no other country in the world is currently doing: counting deaths in hospitals and care homes, but including deaths in care homes that are suspected, not confirmed, as Covid-19 cases” (Lee, 2020). As a counterpoint, this piece in The Atlantic suggests that many Americans are dying at home, untested, and thus are not being counted in either the reported cases or the deaths by COVID.
I’m not suggesting the data isn't useful – it is very useful! I am suggesting that we aim to better understand the assumptions made in how it was collected, and that we be critical and thoughtful when we look at it. It's far too easy to see a number and jump to a conclusion. That's not helpful. There is still so much we don’t know.
-- Katrina Ingram
Kitchin, R. (2017). The Data Revolution: Big Data, Open Data, Data Infrastructures & Their Consequences. Sage, UK.
Lee, G. (2020, May 2). Coronavirus: Why so many people are dying in Belgium? BBC News, Brussels. Retrieved from https://www.bbc.com/news/world-europe-52491210
Lohr, S. (2015). Data-ism: The Revolution Transforming Decision Making, Consumer Behaviour And Almost Everything Else. New York, NY: Harper Collins.
Platt, B. (2020, April 2). Canada’s public data on covid-19 is mostly a mess. Here’s how to find the useful info. National Post. Retrieved from https://nationalpost.com/news/canadas-public-data-on-covid-19-is-mostly-a-mess-heres-how-to-find-the-useful-info