Privacy-preserving federated deep learning for medical imaging
Project summary
Training deep learning models for medical imaging commonly involves acquiring and aggregating large amounts of patient data and training a centralized model. This creates the security threat that patient data could be exposed at the central location or extracted from the model via model inversion and membership inference attacks. This project aims to investigate whether privacy-preserving federated learning techniques can be used to train models for medical imaging securely and to counteract the security threats of a central data aggregation strategy.
Project description
The advent of artificial intelligence concepts in medical imaging offers the opportunity to overcome barriers in conventional image processing. Specifically, deep learning may enable novel applications and improve speed and image contrast in medical imaging. To deliver on this promise, however, deep learning requires large amounts of training data, and obtaining this data currently involves the acquisition and central aggregation of large amounts of patient data. These centrally aggregated patient datasets are then used to train a central model that is distributed to different sites in production.
This results in a bottleneck that currently limits the impact of deep learning in medical imaging, because privacy and data ownership concerns make it challenging to aggregate the training data securely in a central location. Another problem arising from this centralized approach is the cybersecurity threat that, once the model is deployed in production, patient data could be extracted from it via model inversion and membership inference attacks.
Through this project, we aim to investigate the use of federated privacy-preserving concepts for the secure distributed training of deep learning models using magnetic resonance imaging (MRI) data directly on the MRI scanner hardware platform. Federated Learning (Konečný et al. 2017) enables the distributed training of a global model without sharing or aggregating the actual data. Although the data are not shared between sites, the global model can still contain sensitive patient information that can be leaked and extracted via model inversion attacks. To overcome this problem, Zhang et al. (2021) recently proposed a new optimization algorithm named Confined Gradient Descent (CGD) that enables each participant to contribute to the model without exposing any patient information. The authors report that this technique outperforms state-of-the-art federated learning techniques in both performance and privacy preservation.
In this project, we will implement the Confined Gradient Descent algorithm for a proof-of-concept application on the MR scanner platform of Siemens Healthineers. We will develop a privacy-preserving federated version of the manifold approximation algorithm UMAP, which will allow us to identify clusters of patients in the MRI data in an unsupervised and data-driven fashion. We hypothesize that this model can be used to identify outliers in the data that can be highlighted for detailed inspection by the clinicians.
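To make the federated setting described above concrete, the following Python sketch shows plain federated averaging on a toy linear regression problem. It is only an illustration under stated assumptions: it is not an implementation of Confined Gradient Descent, it provides no protection against model inversion, and all function names, site sizes, and hyperparameters (local_update, federated_average, make_site_data, train_federated, the learning rate, the number of rounds) are hypothetical choices made for this example. The point it illustrates is that each site trains on its own data and only model weights, never the data, are sent to the aggregator.

```python
import random


def local_update(weights, data, lr=0.05, epochs=5):
    """Run a few epochs of gradient descent on one site's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(site_weights, site_sizes):
    """Combine site models, weighting each by its number of samples."""
    total = sum(site_sizes)
    w = sum(sw[0] * n for sw, n in zip(site_weights, site_sizes)) / total
    b = sum(sw[1] * n for sw, n in zip(site_weights, site_sizes)) / total
    return (w, b)


def make_site_data(rng, n, true_w=2.0, true_b=-1.0):
    """Simulate one site's private dataset (synthetic, for illustration)."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        y = true_w * x + true_b + rng.gauss(0, 0.05)
        data.append((x, y))
    return data


def train_federated(sites, rounds=200):
    """Repeat: broadcast global model, train locally, average the results."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in sites]
    for _ in range(rounds):
        # Each site starts from the current global model; raw data stays local.
        updates = [local_update(global_weights, d) for d in sites]
        global_weights = federated_average(updates, sizes)
    return global_weights


rng = random.Random(0)  # fixed seed so the run is reproducible
sites = [make_site_data(rng, n) for n in (40, 60, 100)]  # three simulated sites
w, b = train_federated(sites)
print(f"learned w={w:.2f}, b={b:.2f}")
```

In this sketch the shared weights are exactly the object that a model inversion attack would target, which is the gap that privacy-preserving schemes such as CGD are meant to close.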
Exploring the use of the Confined Gradient Descent algorithm for medical imaging applications is highly innovative because federated privacy-preserving deep learning techniques are currently not used on clinical MRI scanners, yet they would enable training on large amounts of patient data that cannot leave the hospital environment. This would make such data available for training large models with the potential to discover insightful disease patterns.
Partner organization(s)
References
Zhang, Yanjun, Guangdong Bai, Xue Li, Surya Nepal, and Ryan K. L. Ko. ‘Confined Gradient Descent: Privacy-Preserving Optimization for Federated Learning’. ArXiv:2104.13050 [Cs], 2021. http://arxiv.org/abs/2104.13050.
Konečný, Jakub, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. ‘Federated Learning: Strategies for Improving Communication Efficiency’. ArXiv:1610.05492 [Cs], 2017. http://arxiv.org/abs/1610.05492.