Published in Me@Lofoten · 17 Sep 2025 · 4 min read

Artificial Intelligence (AI) is deeply changing the way we live, work, and make decisions, powering technologies that were once considered science fiction. From healthcare diagnostics and personalized medicine to autonomous vehicles and recommendation systems, AI drives research across nearly every domain. 

Yet, behind every intelligent system lies a struggle for data. To achieve accuracy and generalizability across users, populations, or tasks, models depend on massive volumes of diverse and representative examples, many of which are sensitive and tightly regulated. Centralizing such information is not only impractical, but also ethically fraught and frequently prohibited by regulations such as the GDPR and HIPAA.

This is where Federated Learning (FL) makes its entrance, enabling the collaborative training of AI models across distributed devices or data silos: each participant trains locally and shares only model updates, never raw data. In this way, hospitals and research centers, for instance, can collectively train diagnostic models for rare diseases, contributing valuable insights without ever sharing patients’ health records.

But as promising as it may seem, FL is not inherently privacy-preserving.

A Promising but Incomplete Answer

Despite its decentralized nature, FL reveals more information than it appears to. The model updates shared during training aren’t just abstract numbers: they can leak private information.

Attackers can reverse-engineer these updates to reconstruct parts of the original data, such as patient records or images, determine whether a specific individual’s data was used during training, or uncover hidden patterns and sensitive traits that were never intended to be exposed. The risk lies in the updates themselves: they retain traces of the data that generated them, making FL on its own far from truly privacy-preserving.

The Missing Piece

By enabling computation directly on encrypted data, Homomorphic Encryption (HE) supplies FL’s missing piece, turning data protection from a promise into a reality. Gradients and model updates can now be processed across devices, silos, and servers without exposing their underlying values. As a result, any party without the decryption key, whether a curious intermediary or a compromised server, remains blind to the original information. Even in the event of a breach, no meaningful data can be extracted without access to the secret key.

HE reshapes FL’s trust model entirely. Instead of exchanging raw updates, each client encrypts its gradients before sending them to the server. The server, unable to decrypt any of this information, combines the encrypted updates blindly, without ever seeing what’s inside. Once the computation is complete, the server returns the encrypted result to the clients, who decrypt it locally and continue model training. At no point does anyone but the original data owner gain access to the underlying information.
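
To make the flow concrete, here is a minimal sketch of the single-key version of this protocol, assuming the open-source python-paillier (phe) library, which is additively homomorphic. The gradient values, number of clients, and key length are purely illustrative, and a real FL system would encrypt full model update vectors, not three numbers.

```python
# Minimal single-key sketch: clients encrypt gradients, the server aggregates
# ciphertexts blindly, clients decrypt the averaged result locally.
from phe import paillier

# One keypair shared by all clients; the server never holds the private key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each client computes its local gradient and encrypts it before upload.
client_gradients = [
    [0.12, -0.05, 0.33],   # client A
    [0.08,  0.01, 0.29],   # client B
    [0.10, -0.02, 0.31],   # client C
]
encrypted_uploads = [
    [public_key.encrypt(g) for g in grad] for grad in client_gradients
]

# Server side: add encrypted values and scale by a plaintext constant,
# without ever decrypting anything.
num_clients = len(encrypted_uploads)
encrypted_avg = [
    sum(per_client[i] for per_client in encrypted_uploads) * (1 / num_clients)
    for i in range(len(client_gradients[0]))
]

# Client side: decrypt the aggregated update and continue training.
averaged_gradient = [private_key.decrypt(c) for c in encrypted_avg]
print(averaged_gradient)  # approximately [0.10, -0.02, 0.31]
```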

Depending on the encryption scheme, this can happen in two ways. In a single-key setup, all clients share the same encryption key, allowing the server to aggregate updates seamlessly and return a single ciphertext that any client can decrypt. This setup is particularly well suited to environments where all data sources belong to the same trusted entity, such as IoT devices deployed within the same organization and operating under a unified security framework. In contrast, a multi-key setup assigns a unique encryption key to each client, making it ideal for cross-organizational collaborations where strong data isolation is required. Think of hospitals jointly training a medical model without sharing patient data, or banks cooperating to detect fraud while keeping proprietary information confidential. While it introduces more complex cryptographic operations, the multi-key approach delivers a significantly higher level of privacy, which is crucial in sensitive and highly distributed settings.
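
The sketch above is the single-key case. To see why the multi-key case needs dedicated schemes rather than a naive reuse of the same mechanism, here is another small sketch, again assuming python-paillier: ciphertexts produced under different public keys cannot simply be added together by the server. The organisation names and values are illustrative, and actual multi-key HE constructions are not shown here.

```python
# Why multi-key needs dedicated schemes: in a plain single-key scheme,
# ciphertexts under different keys cannot be combined.
from phe import paillier

# Each organisation (e.g. each hospital) generates and keeps its own keypair.
pk_hospital_a, sk_hospital_a = paillier.generate_paillier_keypair(n_length=2048)
pk_hospital_b, sk_hospital_b = paillier.generate_paillier_keypair(n_length=2048)

update_a = pk_hospital_a.encrypt(0.12)
update_b = pk_hospital_b.encrypt(0.08)

try:
    # Naive aggregation across different keys fails in an ordinary
    # single-key scheme; the library rejects mixing public keys.
    _ = update_a + update_b
except ValueError as err:
    print(err)

# Multi-key homomorphic encryption schemes are designed precisely to make
# such cross-key aggregation possible, at the cost of heavier cryptography.
```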

In both cases, privacy is built into the learning process by design, eliminating the need for a trusted authority to manage sensitive information.

Not only Decentralized, but Privacy-Preserving

FL marked a significant step toward decentralizing AI, but decentralization alone does not guarantee privacy. Without stronger protections, sensitive information can still be exposed through the very updates meant to keep data local. HE fundamentally changes that. It doesn’t just improve FL; it completes it, turning it into the privacy-preserving framework it was always meant to be. Combining FL with HE is no longer optional: it’s essential for building powerful and trustworthy AI systems.

At DHIRIA, we’re pushing this vision forward, advancing research to make privacy-preserving FL not just possible, but practical and secure for real-world deployment.