Recently, I participated in CIFAR's AI for Health townhall. The purpose of the meeting was to gather feedback on a draft framework that will enable CIFAR to move forward with AI initiatives in healthcare. The results of the townhall meetings, which took place in Montreal, Toronto and Edmonton, will be shared by CIFAR. I won't comment on the specifics of the framework, but I can share some of the themes and big questions coming out of this discussion that require further public debate.
What is healthcare data?
It may seem like the answer should be obvious, but it's not. Increasingly, healthcare data isn't just the data gathered between clinician and patient or even the data from our interactions within the healthcare system. There is a bigger-picture view of what constitutes healthcare data, and it's everything from the information gathered from non-clinical wearable devices or apps to sensor-driven information about how we interact in our environments. Even social media data was discussed as having possible merit for healthcare applications.
Since the approach being considered is to take a wide view on healthcare data, it's hard to conceptualize what exactly is or is not healthcare data. Healthcare data has the potential to cross a lot of boundaries. This may make it increasingly difficult to protect, govern and regulate.
Keeping healthcare data in Canada
One of the boundaries that needs to be examined is a literal boundary: the ability to ensure that Canadian healthcare data stays in Canada, where it can be subject to Canadian regulation. Data that is housed in other countries isn't protected by Canadian regulation and can be subject to that country's laws. For example, the Patriot Act gives the US government sweeping powers to access information stored on US soil.
With more data residing in the cloud, it's important to understand the actual, physical location of servers. This article on Canadian Data Residency and the Public Cloud provides a good overview of key issues. It's interesting to note that there is no national data strategy on healthcare data, as healthcare delivery falls under provincial purview. In many cases, big cloud service providers, like Amazon Web Services, may have only a few data centres in Canada, which makes me wonder about differences in provincial vs federal privacy regulation as it pertains to healthcare data storage.
If we mix in privately owned data (social media, data from wearable apps) as part of what constitutes healthcare data, then where this data resides is up to the company providing the service. Most of these companies are not located in Canada.
Public vs Private: Who should benefit from our healthcare data?
Canadians are proud of our public healthcare system. It's not perfect, but generally speaking, we believe in having healthcare that's paid for collectively through taxes, accessible, affordable and delivered as a public benefit.
However, as private interests around the gathering, storage and use of data begin to intersect with the public healthcare system, there are questions around how we plan to navigate public and private interests. Who owns the data gathered for public healthcare? Who gets to benefit from its use?
There are already instances of private Electronic Medical Record companies selling Canadian healthcare data. In the UK, a public-private partnership with DeepMind Health involving the use of over a million patient records seemed promising until it went off the rails, causing public controversy. DeepMind Health has since been absorbed by Google Health, which has spawned new concerns.
Conversely, there may also be the need for the public healthcare systems to access privately held data in order to provide better care. There needs to be public discussion on this issue.
Anonymity and Privacy
One of the key ways in which we attempt to protect privacy is by ensuring that if datasets are shared, they are anonymized. Essentially, this involves removing personally identifiable information (PII), such as names, addresses and so forth. However, as many studies have shown, this isn't a foolproof method. Latanya Sweeney has done extensive research in this area. She gained notoriety by once sending the Governor of Massachusetts a copy of his own "anonymous" health record.
Increasingly, re-identification can happen simply by combining datasets. While there are statistical methods that can be deployed to decrease the risk of re-identification, in some cases they render the dataset less useful or accurate for machine learning purposes. In other words, there are trade-offs. Can we safeguard privacy without anonymity? Is the provision of anonymity already compromised as a privacy mechanism anyway? What are the risks on either side of this issue?
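To make the idea of re-identification by combining datasets a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the records are invented, and the field names (postal_code, birth_date, sex, diagnosis) and the function names are hypothetical. It is not how any particular study or attack was carried out. It only shows that a dataset with names removed can sometimes be joined back to a named list through fields the two have in common.

```python
from collections import Counter

# Fields that are not names but can still single a person out (assumed for this sketch).
QUASI_IDENTIFIERS = ("postal_code", "birth_date", "sex")

# Invented "anonymized" health records: names removed, other fields kept.
health_records = [
    {"postal_code": "T6G 2R3", "birth_date": "1961-04-12", "sex": "F", "diagnosis": "asthma"},
    {"postal_code": "M5S 1A1", "birth_date": "1985-09-30", "sex": "M", "diagnosis": "hypertension"},
    {"postal_code": "M5S 1A1", "birth_date": "1985-09-30", "sex": "M", "diagnosis": "migraine"},
    {"postal_code": "H3A 0G4", "birth_date": "1990-01-15", "sex": "F", "diagnosis": "diabetes"},
]

# Invented named list, standing in for any dataset that still carries names.
public_list = [
    {"name": "Person A", "postal_code": "T6G 2R3", "birth_date": "1961-04-12", "sex": "F"},
    {"name": "Person B", "postal_code": "M5S 1A1", "birth_date": "1985-09-30", "sex": "M"},
    {"name": "Person C", "postal_code": "K1A 0B1", "birth_date": "1972-06-02", "sex": "M"},
]


def quasi_key(record):
    """Return the combination of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymized, identified):
    """Link names to diagnoses where the quasi-identifier combination is unique in both datasets."""
    anon_counts = Counter(quasi_key(r) for r in anonymized)
    named_counts = Counter(quasi_key(p) for p in identified)
    by_key = {quasi_key(r): r for r in anonymized}
    matches = []
    for person in identified:
        key = quasi_key(person)
        # Only an unambiguous one-to-one match counts as a re-identification here.
        if anon_counts.get(key) == 1 and named_counts[key] == 1:
            matches.append((person["name"], by_key[key]["diagnosis"]))
    return matches


def k_anonymity(records):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    return min(Counter(quasi_key(r) for r in records).values())


if __name__ == "__main__":
    print(reidentify(health_records, public_list))
    print(k_anonymity(health_records))
```

In this toy data, one person is linked to a diagnosis because their combination of postal code, birth date and sex appears exactly once in each dataset, while another person is shielded because two records share the same combination. Coarsening those fields (for example, keeping only the birth year) makes the groups larger and the link harder, but it also discards detail, which is the trade-off described above.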
If you want a deep dive into data privacy and protection issues in general, check out this report by law firm Osler, Hoskin and Harcourt. It's a couple of years old, but it covers a lot of ground.
One of the side effects of critically examining any topic is the risk of becoming cynical. I'm really trying to guard against this as I dive into this topic. I see so much potential for AI to be used in productive ways to enable better healthcare. It's important to discuss risk but also to understand what is acceptable risk. The risk of maintaining a status quo system not informed by new technologies needs to be examined as well.
Looking forward to seeing the final draft report from CIFAR!