Global healthcare is set to be shaken up by blockchain-enabled federated learning (FL). At FLock, we’re so sure of it that we authored an award-winning paper on the topic.
Data sharing is one of the biggest obstacles to building AI in healthcare. Medical data is private, sensitive and varies hugely. An AI model needs heaps of data, and collecting it is tiresome, costly and sometimes impossible.
Our paper is titled “Multi-Continental Healthcare Modelling Using Blockchain-Enabled Federated Learning”. We propose a framework using datasets from Europe, North America and Asia, and try it out on glucose management.
FLock was delighted to receive the Best Application Award at the IEEE Global Blockchain Conference.
Data collection for healthcare AI is fraught with obstacles
Data-driven machine learning has catalysed breakthroughs in a multitude of fields, but progress in healthcare has been much slower. Healthcare data poses unique challenges – it’s private, sensitive and heterogeneous.
Things are looking up. A survey in 2024 showed that 75% of the public support sharing some of their personal health data for the development of AI systems in the NHS.
But obstacles are still hindering data collection. For example, in the study of chronic diseases, researchers must establish partnerships with hospitals, private companies, or public authorities. Each partner treats their data as a proprietary resource, complicating the sharing process. The approval process is often complex and protracted due to regulations and country borders.
Blood glucose level prediction (BGLP) case study
FLock chose blood glucose level prediction (BGLP) as a case study, but the same framework can also be used to model other chronic diseases or broader healthcare challenges.
The goal of BGLP is to predict future glucose levels using current and historical data. This includes glucose levels (mg/dL) collected from continuous glucose monitors (CGMs), plus optional recorded data such as meal intake (grams of carbohydrate), insulin injection (mg or units) and time of day.
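To make the prediction task concrete, here is a minimal sketch of how a CGM series could be turned into training samples. The function name `make_windows` and the parameters `history_len` and `horizon` are hypothetical, and the readings are made-up illustrative values; the paper's actual preprocessing may well differ.

```python
from typing import List, Tuple


def make_windows(
    glucose_mg_dl: List[float], history_len: int, horizon: int
) -> List[Tuple[List[float], float]]:
    """Pair each run of `history_len` readings with the reading `horizon` steps after it."""
    samples = []
    # Last start index that still leaves room for the history and the target.
    last_start = len(glucose_mg_dl) - history_len - horizon
    for start in range(last_start + 1):
        history = glucose_mg_dl[start:start + history_len]
        target = glucose_mg_dl[start + history_len + horizon - 1]
        samples.append((history, target))
    return samples


# Illustrative readings only (mg/dL), not real patient data.
readings = [110.0, 115.0, 123.0, 130.0, 128.0, 121.0]
samples = make_windows(readings, history_len=3, horizon=2)
```

Each sample holds the recent history as model input and a later reading as the value to predict. Optional signals such as meal intake or insulin doses could be added to each window as extra features in the same way.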
Glucose data varies substantially from person to person, so training a BGLP model requires individual-specific datasets to develop tailored models that accurately reflect personal health metrics.
To capitalise on the advantages of a population model, one trained on data from many individuals, we propose a global collaborative framework in which individuals, organisations and healthcare institutions contribute their data to a collective model training effort in a way that keeps contributors’ data private.
Federated learning for healthcare
Federated learning enables the decentralised training of machine learning models wherein each participant (e.g., a hospital or an individual) retains control over their own data. Only model updates, such as gradients or learned parameters, are shared centrally for aggregation, without transmitting any personal or sensitive data.
This lets numerous participants collaboratively train a global ML model while preserving local data privacy.
For BGLP, that means the only things shared for knowledge aggregation are participants’ local BGLP model updates [12]. Personal or sensitive local data is never transmitted.
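The aggregation step can be sketched as a sample-weighted average of participants' parameters, in the style of federated averaging. This is an assumption for illustration: the post does not state which aggregation rule the paper uses, and the names `Params` and `federated_average` are hypothetical.

```python
from typing import Dict, List, Tuple

# A model is represented here as named lists of floats (hypothetical layout).
Params = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Params, int]]) -> Params:
    """Average parameters, weighting each participant by its local sample count."""
    total = sum(n_samples for _, n_samples in updates)
    if total <= 0:
        raise ValueError("at least one participant must contribute samples")
    first_params, _ = updates[0]
    averaged = {name: [0.0] * len(values) for name, values in first_params.items()}
    for params, n_samples in updates:
        weight = n_samples / total
        for name, values in params.items():
            for i, value in enumerate(values):
                averaged[name][i] += weight * value
    return averaged


# Only parameters and sample counts are passed in; raw patient data never appears.
hospital_a = ({"w": [1.0, 2.0], "b": [0.0]}, 100)
hospital_b = ({"w": [3.0, 4.0], "b": [1.0]}, 300)
global_params = federated_average([hospital_a, hospital_b])
```

Weighting by sample count means a hospital with more local data has more influence on the global model. Other weightings are possible, and a real deployment would also need to handle participants that drop out or send malformed updates.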
Combining blockchain and FL to make BCFL
Blockchain-enabled FL (BCFL) combines blockchain technology with FL to enhance security and trust in decentralised machine learning environments. In this paradigm, blockchain serves as an immutable ledger, recording the transactions and models exchanged across the distributed nodes participating in the federated learning process.
This integration addresses core challenges such as data privacy, security and model integrity by ensuring transparent and verifiable transactions while maintaining the confidentiality of the data.
Blockchain’s decentralised nature allows for a trustless system where no single entity has control over the entire dataset or the learning process, thereby mitigating risks associated with centralised data storage and processing.
Moreover, the use of smart contracts automates the process of data sharing and model updates in FL. Additionally, blockchain’s inherent incentive mechanisms reward honest participation and penalise malicious activities, further enhancing the security of FL systems.
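The idea of an immutable ledger of model exchanges can be illustrated with a toy hash chain. This is an in-memory sketch only, with no consensus, networking or smart contracts, and it is not the framework's actual on-chain design. The helpers `digest`, `append_block` and `is_valid` are hypothetical names.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def digest(payload: Dict) -> str:
    """SHA-256 of a dictionary, serialised with sorted keys so the hash is stable."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: List[Dict], participant: str, model_params: Dict) -> None:
    """Record a digest of a model update, linked to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {
        "index": len(chain),
        "participant": participant,
        "model_digest": digest(model_params),  # only a digest is stored, not data
        "prev_hash": prev_hash,
    }
    chain.append({**body, "hash": digest(body)})


def is_valid(chain: List[Dict]) -> bool:
    """Recompute every hash and link; any edit to an earlier block is detected."""
    prev_hash = GENESIS_HASH
    for block in chain:
        body = {key: value for key, value in block.items() if key != "hash"}
        if block["prev_hash"] != prev_hash or block["hash"] != digest(body):
            return False
        prev_hash = block["hash"]
    return True


chain: List[Dict] = []
append_block(chain, "hospital_a", {"w": [1.0, 2.0]})
append_block(chain, "hospital_b", {"w": [3.0, 4.0]})
```

Because each block's hash covers the previous block's hash, changing any recorded entry invalidates the chain from that point on. That is what makes the record of who submitted which model verifiable after the fact.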
BCFL glucose modelling
To model blockchain-enabled federated learning for glucose prediction, we designed the Multi-Continental Glucose Prediction (MCGP) Framework. This framework leverages federated learning to enable different hospitals to collaboratively train a glucose prediction model while preserving patient data privacy.
Blockchain technology is utilised to implement a reward and slashing mechanism, incentivising honest participation and detecting malicious actors.
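As a rough illustration of a reward and slashing rule, the sketch below assumes each participant has staked some amount and that each round's update receives a quality score. Both are assumptions: the scoring method, the threshold and the function `settle_round` are hypothetical, and this post does not describe the rule used in the MCGP framework.

```python
from typing import Dict


def settle_round(
    stakes: Dict[str, float],
    scores: Dict[str, float],
    min_score: float,
    reward: float,
    slash_fraction: float,
) -> Dict[str, float]:
    """Reward participants whose update meets the threshold; slash the rest."""
    new_stakes = {}
    for participant, stake in stakes.items():
        if scores[participant] >= min_score:
            new_stakes[participant] = stake + reward
        else:
            # A low score is treated as a sign of a faulty or malicious update.
            new_stakes[participant] = stake * (1.0 - slash_fraction)
    return new_stakes


# Made-up stakes and scores for illustration.
stakes = {"hospital_a": 100.0, "hospital_b": 100.0}
scores = {"hospital_a": 0.9, "hospital_b": 0.2}
after_round = settle_round(
    stakes, scores, min_score=0.5, reward=10.0, slash_fraction=0.5
)
```

In an on-chain setting, logic of this kind would typically live in a smart contract so that no single party controls payouts.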
In privacy-sensitive fields such as glucose management, FL can break down data silos among hospitals. Each patient has unique characteristics; for instance, insulin levels can vary significantly in response to eating, drinking water, and insulin injections.
Each hospital’s prediction model needs to learn these personalised features to provide timely and accurate blood sugar predictions for new patients. FL enables hospitals to collaboratively enhance their models by sharing knowledge, leading to improved patient care without compromising data privacy.
Our FL mechanism for glucose prediction follows an algorithm that allows locally trained models to share knowledge about patients’ unique characteristics without transmitting any private data.
Conclusion
In this work, we proposed a Multi-Continental Glucose Prediction (MCGP) framework using blockchain-enabled federated learning to address the challenges of data privacy and sharing in healthcare.
Our experimental results demonstrated that the MCGP framework effectively preserves data privacy while significantly improving prediction accuracy compared to traditional methods. This approach facilitates global collaboration, allowing healthcare institutions to contribute to model training without sharing sensitive healthcare data directly.
Future work will focus on extending the framework to other chronic diseases, enhancing privacy with differential privacy techniques, and improving computational efficiency.