Q: What is secure aggregation in federated learning?

Secure aggregation is a cryptographic protocol that allows the aggregator to compute the sum of participants' model updates without seeing any individual participant's update. Each participant masks their update using a pairwise masking scheme derived from Diffie-Hellman key exchange. The masks cancel out in the sum, so the aggregator receives the correct aggregate but cannot learn any individual update. Bonawitz et al. (2017) introduced the practical protocol used in production. Secure aggregation defends against gradient inversion by a malicious aggregator.

Federated Learning: Architecture, Threat Model and Poisoning | Track 3D

Q: What is the FedAvg algorithm?

FedAvg (Federated Averaging) was introduced by McMahan, Moore, Ramage, Hampson, and y Arcas in 2017. In each round, a subset of participants is selected. Each participant downloads the current global model, trains it on their local data for several steps, and sends the resulting model weights back to the aggregator. The aggregator averages the received weights, weighted by each participant's dataset size. This weighted average becomes the new global model. FedAvg converges to a solution similar to that of centralised training under IID data conditions, and can approximate it under non-IID conditions with sufficient rounds.

Q: What is a model poisoning attack in federated learning?

In a model poisoning attack, a malicious participant sends deliberately crafted model updates rather than honestly trained ones. The crafted updates are designed to degrade global model accuracy, introduce backdoor behaviour (trigger-specific misclassification), or bypass anomaly detection by appearing within normal norms. Unlike data poisoning, model poisoning does not require the attacker to control the training data: the attacker directly controls the gradient computation on the malicious participant's device and can submit any gradient they choose.

Q: How does gradient inversion work in federated learning?

Gradient inversion, demonstrated by Zhu, Liu, and Han in 2019 (Deep Leakage from Gradients) and refined by Zhao et al. in 2020 (iDLG), shows that an attacker who can see gradient updates can reconstruct the original training data that produced those gradients. The attack optimises dummy inputs until their gradients match the observed gradients. For small batch sizes and low-resolution images, the reconstruction can be near-exact. This means that even though federated learning participants never share raw data, a compromised aggregator or eavesdropper on the communication channel can recover training data from the gradients themselves.

Q: What is Byzantine fault tolerance in federated learning?

Byzantine fault tolerance (BFT) in federated learning means the training process produces a correct global model even when some fraction of participants send arbitrary, malicious, or faulty updates. Standard FedAvg is not Byzantine-robust: a single malicious participant can corrupt the global model by sending updates with very large norms. Byzantine-robust aggregation methods replace simple averaging with robust alternatives. Krum (Blanchard et al. 2017) selects the single update closest to all others. Coordinate-wise median takes the median of each model parameter across participants. Bulyan combines Krum-like filtering with a coordinate-wise trimmed mean over the filtered set. These methods tolerate up to a bounded fraction of malicious participants without significantly degrading model quality.

Section 01

What is federated learning

In centralised machine learning, you collect all training data in one place, train a model on it, and deploy that model. This is the standard approach, and it works well when you control all the data. It breaks down when the data is too sensitive to move, legally restricted to specific jurisdictions, or held by multiple competing organisations that will not share it with each other.

Federated learning solves this by moving the training to the data rather than moving the data to the training. Each participant keeps their data locally. Instead of sharing records, they share model updates: the changes to model weights that result from training on their local data. A central aggregator collects these updates, combines them into a new global model, and sends it back to all participants. No raw data ever leaves any participant's environment.

The term was introduced by McMahan, Moore, Ramage, Hampson, and y Arcas at Google in 2017, alongside the FedAvg algorithm. Google deployed federated learning in production the same year for Gboard's next-word prediction, training on text typed on Android phones without that text ever leaving the device.

Federated learning: one training round

Hospital A: Local patient data. Trains local model. Sends weight update only.

Hospital B: Separate patient data. Different demographics. No data shared with A.

Hospital C: Independent dataset. Participates in same training round.

Hospital D: May opt out of any round if participation criteria not met.

Model updates sent up ↑ Global model sent down ↓

Central Aggregator: Receives model updates. Runs FedAvg. Produces new global model. Never sees raw patient data.

Step 1: Aggregator selects participating clients for this round

Step 2: Participants download current global model weights

Step 3: Each participant trains locally for E epochs

Step 4: Updated weights (not data) sent to aggregator

Step 5: Aggregator averages updates, produces new global model

Why "federated"? The name refers to the federated structure of the system: multiple independent parties cooperating under a shared protocol without centralised control of their data. Each party remains sovereign over its own data while contributing to a shared outcome.

Section 02

The FedAvg algorithm

FedAvg (Federated Averaging) is the algorithm that made federated learning practical. Before FedAvg, distributed training required participants to exchange gradients after every single gradient step, which meant enormous communication overhead. FedAvg allows each participant to run many gradient steps locally before communicating, reducing communication rounds by a factor of 10 to 100 while maintaining comparable model quality.

Step 1: Initialise global model

The aggregator initialises model weights w_0. All participants start from the same initial weights. This is the only moment all parties are guaranteed to have identical models.

Step 2: Select participant subset for round t

The aggregator selects a random subset S_t containing a fraction C of the participants. Partial participation handles the reality that not all participants are available for every round.

|S_t| = max(C * K, 1) where K = total participants

Step 3: Distribute current global model

The aggregator sends the current global weights w_t to each selected participant. Each participant starts their local training from these shared weights.

Step 4: Local training: E epochs on local data

Each participant k runs E full epochs of SGD on their local dataset D_k using the received weights as the starting point. More local epochs means less communication but potentially more client drift from the global optimum.

w_k = LocalSGD(w_t, D_k, E, lr)

Step 5: Send updated weights to aggregator

Each participant sends their locally updated weights w_k back to the aggregator. Only the weights are sent, not the training data or the individual gradients. The aggregator never sees the data directly, although the weights can still leak information about it through gradient inversion (covered in Section 06).

Step 6: Weighted average: new global model

The aggregator computes the new global model as the weighted average of all received updates, weighted by each participant's dataset size n_k. Participants with more data contribute more to the global model.

w_{t+1} = sum_k (n_k / n) * w_k where n = sum of all n_k

Step 7: Repeat for T rounds

Steps 2 to 6 repeat for T rounds. The global model converges to a solution similar to centralised training under IID data. Under non-IID data (different participants have very different data distributions), more rounds are needed and the solution may differ from the centralised optimum.
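The round structure above can be sketched in a few lines of NumPy. This is a toy simulation, not the reference implementation: the least-squares objective, the full-batch `local_sgd` helper, the four synthetic clients, and all hyperparameter values are assumptions chosen only to keep the example small and runnable.

```python
import numpy as np

def local_sgd(w, X, y, epochs, lr):
    """Stand-in for LocalSGD: full-batch gradient descent on a least-squares loss."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w_global, clients, frac, epochs, lr, rng):
    """One round: select clients, train locally, average weighted by dataset size."""
    K = len(clients)
    m = max(int(frac * K), 1)                      # |S_t| = max(C * K, 1)
    selected = rng.choice(K, size=m, replace=False)
    updates, sizes = [], []
    for k in selected:
        X, y = clients[k]
        updates.append(local_sgd(w_global, X, y, epochs, lr))
        sizes.append(len(y))
    weights = np.array(sizes) / np.sum(sizes)      # n_k / n over the received updates
    return np.sum(weights[:, None] * np.stack(updates), axis=0)

# Synthetic IID clients with different dataset sizes (illustrative data only).
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
clients = []
for n_k in (50, 100, 200, 400):
    X = rng.normal(size=(n_k, 3))
    y = X @ w_true + 0.01 * rng.normal(size=n_k)
    clients.append((X, y))

w = np.zeros(3)                                    # w_0
for t in range(30):                                # T rounds
    w = fedavg_round(w, clients, frac=0.5, epochs=5, lr=0.1, rng=rng)
print(w)
```

Because the synthetic clients are IID, the global weights end up close to `w_true`; giving each client a different underlying `w_true` is a simple way to observe the non-IID drift described above.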

Non-IID data is the central challenge of federated learning. In a hospital consortium, one hospital may specialise in cardiac cases while another sees mostly trauma patients. Their local data distributions are very different. FedAvg can still converge, but may require many more rounds, and the global model may perform worse on any individual participant's data than a locally trained model would. This tension between global model quality and local model relevance is an active research area.

2017 McMahan, Moore, Ramage, Hampson, y Arcas · Communication-Efficient Learning of Deep Networks from Decentralized Data (AISTATS 2017)

Section 03

Cross-device vs cross-silo

Federated learning is deployed in two very different operational settings. They differ in the number and type of participants, the trust model, the communication pattern, and the threat model. Choosing the right design for each setting is essential.

Cross-device FL

Millions of endpoints, small data per device

Participants: millions of mobile phones or IoT devices

Dataset per participant: hundreds to thousands of examples

Participation: opportunistic (device charging, on Wi-Fi)

Each device participates rarely: a user's phone may join once a day

Communication: highly asynchronous, expect dropout

Trust: devices are untrusted; any device may be compromised

Data is highly heterogeneous across users

Examples: Google Gboard, Apple Siri, next-word prediction

Cross-silo FL

Tens to hundreds of organisations, large data per silo

Participants: 2 to a few hundred hospitals, banks, or enterprises

Dataset per participant: millions to billions of examples

Participation: all organisations join every round (synchronous)

Each silo participates in every training round

Communication: lower latency acceptable, dedicated connections

Trust: organisations vetted but potentially competing

Legal agreements govern participation (data sharing agreements)

Examples: MELLODDY drug discovery, financial fraud, healthcare AI

Section 04

The threat model

Federated learning was designed to address one threat: the centralised aggregator learning participants' raw training data. It does this well. But it introduced a new threat surface that centralised training does not have: participants can send malicious updates. And it did not fully solve the original threat: gradients leak information about training data even when raw data is not shared.

There are four distinct attack classes. They require different attacker capabilities and have different defences. Understanding which attacks are relevant to a given deployment is the first step in designing the right set of controls.

Data poisoning · High severity

A participant trains on locally poisoned data, injecting a backdoor trigger into the global model. Trigger-specific inputs produce attacker-chosen outputs. The model behaves normally on all non-triggered inputs.

Attacker controls: their own local training data

Model poisoning · High severity

A participant submits crafted model updates regardless of what their local training produced. The crafted updates can target specific inputs, degrade global accuracy, or introduce hidden behaviour while staying within normal gradient norms to evade detection.

Attacker controls: the gradient computation itself, not just local data

Gradient inversion · High severity

The aggregator (or a network eavesdropper) reconstructs training data from the gradient updates. Demonstrated to produce near-exact image reconstruction from single-batch gradients. Does not require any participant to be malicious.

Attacker controls: the aggregator or the communication channel

Free-rider attack · Medium severity

A participant downloads the global model every round but contributes random or zero updates instead of honestly trained ones. They receive a high-quality model trained by others without contributing useful training signal.

Attacker controls: their own update submission, not data or aggregator

Section 05

Data and model poisoning

Poisoning attacks in federated learning exploit the fact that the aggregator cannot inspect participants' training data or verify that training was performed honestly. They come in two variants that require different attacker capabilities and produce different types of harm.

Data poisoning requires only that the attacker controls what data is used for local training. The attacker adds examples with a specific trigger pattern to their local training set, pairing each trigger with a target wrong label. After enough training rounds, this backdoor is incorporated into the global model: any input containing the trigger will produce the attacker's chosen output, while all inputs without the trigger behave normally.

A practical example from the research literature: in a medical imaging federated learning system, an attacker adds a small watermark to a subset of their X-ray images and labels them as normal, even when the underlying condition is present. After training, the global model classifies any X-ray containing that watermark as normal, regardless of its actual clinical content. All other X-rays are classified correctly. The backdoor is difficult to detect in standard evaluation because the trigger is not present in the test set.

Model poisoning is more powerful because the attacker controls the gradient computation directly, not just the input data. The attacker can compute gradients that would correspond to data they do not actually have, scale their malicious update to dominate the aggregation, or craft updates that pass norm-based anomaly detection by staying within the typical update norm range.

Bhagoji, Chakraborty, Mittal, and Calo (2019) demonstrated that a single malicious participant in a 100-participant federated system could achieve a near-100% attack success rate using model poisoning. The attacker boosts their update by the inverse of their share of the aggregate (a factor of about 100 when all participants are weighted equally) so that it survives averaging, and the attack still succeeds when the boosted update is further adjusted to evade detection.
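The arithmetic behind boosting is easy to check numerically. The sketch below assumes unweighted averaging over 100 participants and a hypothetical target model `w_target`; it illustrates the scaling argument only and is not the attack from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100                                    # total participants, equal weights assumed
w_global = np.zeros(10)                    # current global model
# 99 honest participants return weights close to the global model.
honest = [w_global + 0.01 * rng.normal(size=10) for _ in range(n - 1)]
w_target = np.full(10, 5.0)                # hypothetical model the attacker wants

# Unboosted: the malicious weights are diluted by a factor of n.
w_naive = np.mean(honest + [w_target], axis=0)

# Boosted: scale the malicious delta by n so it survives the 1/n averaging.
malicious = w_global + n * (w_target - w_global)
w_boosted = np.mean(honest + [malicious], axis=0)

print(np.round(w_naive[:3], 3))            # stays near the honest weights
print(np.round(w_boosted[:3], 3))          # lands near w_target
print(np.linalg.norm(malicious) / np.mean([np.linalg.norm(h) for h in honest]))
```

The last line shows why naive boosting is detectable: the boosted update has a far larger norm than any honest one, which is what norm clipping and robust aggregation exploit.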

Backdoor attacks are hard to detect with standard evaluation. A model that has been backdoored will show normal performance on any test set that does not include the trigger. The trigger is chosen to be rare in natural data. Standard accuracy metrics will not reveal the backdoor. Detection requires either auditing training data (not possible in federated learning by definition) or running specialised backdoor detection algorithms on the model weights.

2019 Bhagoji, Chakraborty, Mittal, Calo · Analyzing Federated Learning through an Adversarial Lens (ICML 2019)

Section 06

Gradient inversion

Gradient inversion attacks show that the privacy of federated learning is not guaranteed by the protocol itself. Just because raw data is not shared does not mean training data cannot be recovered. Gradients carry information about the data that produced them, and that information can be inverted.

Zhu, Liu, and Han (2019) published "Deep Leakage from Gradients," demonstrating that given the gradient updates from a single training batch, an attacker can reconstruct the original training images to pixel-level accuracy. The attack initialises dummy inputs and dummy labels, computes the gradients those dummy inputs would produce, and iteratively adjusts the dummy inputs to minimise the difference between their gradients and the observed gradients. When convergence is achieved, the dummy inputs are visually nearly identical to the real training images. The follow-up iDLG work (2020) showed that the true label can be read directly from the gradient, which makes the optimisation more reliable.

Gradient inversion: deep leakage from gradients (Zhu et al. 2019)

Participant trains on: private image (e.g. medical scan of patient Alice)

→

Sends gradient update: dL/dw (weight gradient), no raw data

→

Attacker (aggregator) runs inversion: optimise dummy inputs to match observed gradient, reconstructs Alice's scan

The attack works because gradients encode information about the input. The reconstruction quality is highest for small batch sizes (batch of 1 gives near-exact reconstruction) and degrades with larger batches, higher model depth, and added noise. DP-FedAvg (Section 10) addresses this by clipping updates and adding noise, which makes inversion progressively harder as the noise level rises.

The practical severity depends on batch size and gradient compression. With a batch size of 1, reconstruction is near-exact. With a batch size of 64, reconstruction quality degrades but is still meaningfully above random. In cross-device FL, batch sizes are small by necessity (limited device memory), which makes this attack particularly relevant for consumer deployments.
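To see why gradients leak inputs at all, consider the simplest case: a single example passed through one linear layer with a bias and a softmax cross-entropy loss. This is an analytic special case, not the iterative optimisation used by Deep Leakage from Gradients, and the layer sizes and values below are arbitrary assumptions. Here dL/dW is the outer product of dL/dz and x, and dL/db equals dL/dz, so dividing a row of the weight gradient by the matching bias gradient returns the input exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Victim side: one private example, one linear layer z = W x + b.
x = rng.uniform(size=8)                    # private input (e.g. flattened pixels)
y = 2                                      # private label
W = rng.normal(size=(4, 8))
b = rng.normal(size=4)

z = W @ x + b
p = np.exp(z - z.max())
p /= p.sum()                               # softmax probabilities
dz = p.copy()
dz[y] -= 1.0                               # dL/dz for softmax cross-entropy
grad_W = np.outer(dz, x)                   # dL/dW, shared with the aggregator
grad_b = dz                                # dL/db, shared with the aggregator

# Attacker side: only grad_W and grad_b are visible.
i = int(np.argmax(np.abs(grad_b)))         # pick a row with a non-zero bias gradient
x_rec = grad_W[i] / grad_b[i]              # recovers x exactly
y_rec = int(np.argmin(grad_b))             # the only negative entry marks the label

print(np.allclose(x_rec, x), y_rec == y)
```

With a batch larger than one, the gradients of the individual examples are summed, so this division no longer isolates a single input. That is the same reason reconstruction quality drops as the batch size grows.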

2020 Zhao, Mopuri, Bilen · iDLG: Improved Deep Leakage from Gradients (arXiv 2001.02610)

2019 Zhu, Liu, Han · Deep Leakage from Gradients (NeurIPS 2019)

Section 07

Free-rider attacks

A free-rider attack is an integrity problem rather than a privacy problem. The attacker wants the benefits of the federated model without contributing honest training work. They participate in every training round but send either random updates, zero updates, or copies of the most recently received global model weights (which look like a valid update but carry no new information).

Free-riding matters for two reasons. First, in a cross-silo setting where participation is based on data contribution, a free-rider is effectively stealing the model. An organisation with no relevant data of their own can join a federated consortium and gain access to a model trained on the combined data of all other participants. Second, if enough participants free-ride, the global model degrades because too few honest training contributions are reaching the aggregator.

Detection is not straightforward. A participant who sends the current global model weights back unchanged looks identical to a participant who trained on data that was already very similar to the global model. Gradient contributions alone cannot distinguish these cases. Detection methods typically look for consistently low loss on the participant's claimed local evaluation metrics, or use verifiable computation approaches that require participants to prove their training was performed honestly.

Section 08

Byzantine fault tolerance

A Byzantine fault is a failure mode where a component behaves arbitrarily rather than simply failing or producing the wrong answer. In federated learning, Byzantine participants can send any update they choose, including updates specifically crafted to maximise damage to the global model.

Standard FedAvg is not Byzantine-robust. A single participant who sends an update with a very large norm can dominate the weighted average and corrupt the global model in a single round. Byzantine-robust aggregation replaces simple averaging with statistical estimators that are resistant to outliers.

FedAvg (standard)

Computes the weighted mean of all submitted updates. One malicious participant with a large-norm update can shift the average arbitrarily far from the honest updates.

Not Byzantine-robust · Efficient · Easy to implement

Krum

Introduced by Blanchard, El Mhamdi, Guerraoui, and Stainer (2017). Selects the single update that has the smallest sum of squared distances to its n-f-2 nearest neighbours, where f is the assumed number of Byzantine participants. Only one update is used per round.

Byzantine-robust (known f) · Loses information (uses only 1 update) · Slow with many participants

Coordinate-wise median

Takes the median of each model parameter independently across all submitted updates. The median stays robust as long as fewer than half the updates are Byzantine. More computationally efficient than Krum and uses all participants' updates.

Robust to fewer than n/2 Byzantine · Efficient · May lose accuracy vs mean on non-IID data

Bulyan

Combines a Krum-like selection step to identify a set of likely honest updates, then computes a coordinate-wise trimmed mean of those selected updates. Achieves stronger Byzantine tolerance than either Krum or median alone with less accuracy loss.

Strong Byzantine tolerance · Higher computational cost · Better accuracy than Krum

FLTrust

The aggregator maintains a small clean root dataset and computes its own gradient as a reference. Each participant's update is scored by its cosine similarity to the root gradient and scaled accordingly. Updates pointing away from the root direction are downweighted or rejected.

Effective against adaptive attacks · Requires aggregator to hold clean data · Works under non-IID data

All robust aggregation methods assume f is known or bounded. If more than the assumed fraction of participants are Byzantine, the guarantees break down. In practice, this means Byzantine-robust FL works best in cross-silo settings where the number of participants is small and each is authenticated. In cross-device settings with millions of devices, Byzantine robustness is harder to guarantee.
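A small comparison makes the difference between the plain mean and two of the robust rules concrete. The sketch below follows the descriptions given above for coordinate-wise median and Krum; the function names, the synthetic updates, and the choice of 8 honest and 2 Byzantine participants are illustrative assumptions.

```python
import numpy as np

def coordinate_median(updates):
    """Median of each parameter across participants (rows are updates)."""
    return np.median(updates, axis=0)

def krum(updates, f):
    """Return the update with the smallest summed squared distance
    to its n - f - 2 nearest neighbours."""
    n = len(updates)
    if n < 2 * f + 3:
        raise ValueError("Krum needs n >= 2f + 3")
    diffs = updates[:, None, :] - updates[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)          # pairwise squared distances
    scores = []
    for i in range(n):
        others = np.delete(sq_dist[i], i)          # exclude distance to itself
        scores.append(np.sort(others)[: n - f - 2].sum())
    return updates[int(np.argmin(scores))]

rng = np.random.default_rng(0)
honest = 1.0 + 0.1 * rng.normal(size=(8, 5))       # honest updates cluster near 1.0
byzantine = np.full((2, 5), 1000.0)                # two large-norm malicious updates
updates = np.vstack([honest, byzantine])

mean_agg = updates.mean(axis=0)                    # dragged far away by the outliers
median_agg = coordinate_median(updates)            # stays near the honest cluster
krum_agg = krum(updates, f=2)                      # returns one honest update

print(np.round(mean_agg, 1))
print(np.round(median_agg, 2))
print(np.round(krum_agg, 2))
```

The attack here is deliberately crude. Updates crafted to sit just inside the honest cluster are much harder for any of these rules to reject, which is why the assumed bound f matters.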

2017 Blanchard, El Mhamdi, Guerraoui, Stainer · Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent (NeurIPS 2017)

Section 09

Secure aggregation

Secure aggregation (SecAgg) is a cryptographic protocol that allows the aggregator to compute the sum of all participants' model updates without seeing any individual participant's update. It is the primary defence against gradient inversion by a malicious aggregator.

The protocol was introduced by Bonawitz, Ivanov, Kreuter, Marcedone, McMahan, Patel, Ramage, Segal, and Seth at Google in 2017. It is used in production in Google's cross-device federated learning deployments.

Secure aggregation protocol phases (Bonawitz et al. 2017)

Step 1: Key agreement

Each pair of participants performs a Diffie-Hellman key exchange to establish a shared secret. With n participants, each participant establishes n-1 pairwise secrets.

Step 2: Mask generation

From each pairwise secret, each participant derives a pseudorandom mask vector. For each pair (i, j), participant i uses +mask(i,j) and participant j uses -mask(i,j). The masks are equal and opposite.

Step 3: Masked update submission

Each participant adds all their pairwise masks to their local model update and submits the masked update. From each participant i, the aggregator receives update_i + mask_i_total, where mask_i_total is the sum of participant i's signed pairwise masks.

Step 4: Cancellation at aggregation

When the aggregator sums all masked updates, every +mask(i,j) is cancelled by the corresponding -mask(i,j) from participant j. The sum of all masks is exactly zero. The result is the true sum of all model updates.

Step 5: Aggregator receives only the sum

The aggregator computes the correct aggregate update without learning any individual participant's contribution. Gradient inversion requires seeing individual updates, not their sum, so the attack is blocked.
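The mask cancellation can be demonstrated with integer vectors and modular arithmetic. This is a minimal sketch of the pairwise-masking idea only. It assumes a random per-pair seed in place of a real Diffie-Hellman exchange, uses NumPy's non-cryptographic generator as the mask PRG, and omits the dropout recovery and other safeguards of the full Bonawitz et al. protocol, so it must not be treated as secure.

```python
import numpy as np

R = 2 ** 32                                # all arithmetic is done modulo R
n, dim = 4, 6                              # participants, update length
rng = np.random.default_rng(0)

# Quantised (integer) model updates, one row per participant.
updates = rng.integers(0, 1000, size=(n, dim), dtype=np.int64)

# Stand-in for key agreement: one shared seed per unordered pair (i, j).
pair_seed = {(i, j): int(rng.integers(0, 2 ** 31))
             for i in range(n) for j in range(i + 1, n)}

def mask(i, j):
    """Pseudorandom mask both members of the pair can derive from their shared seed."""
    seed = pair_seed[(min(i, j), max(i, j))]
    return np.random.default_rng(seed).integers(0, R, size=dim, dtype=np.int64)

def masked_update(i):
    """Participant i adds +mask(i, j) for j > i and -mask(i, j) for j < i."""
    y = updates[i].copy()
    for j in range(n):
        if j == i:
            continue
        m = mask(i, j)
        y = (y + m) % R if i < j else (y - m) % R
    return y

# The aggregator sees only the masked vectors.
masked = np.stack([masked_update(i) for i in range(n)])
aggregate = masked.sum(axis=0) % R         # every +mask meets its -mask

print(masked[0])                           # looks random, unlike updates[0]
print(aggregate)
print(updates.sum(axis=0))                 # identical to the aggregate
```

If one participant's row is left out of `masked`, the sum no longer matches, which is the dropout problem in miniature.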

Secure aggregation has a dropout problem. If a participant drops out mid-protocol (their device goes offline), their pairwise masks are left uncancelled in the sum and the aggregate cannot be recovered directly. The Bonawitz et al. protocol includes a dropout recovery mechanism using secret sharing, but handling dropout at scale adds significant communication overhead. This overhead is one reason why secure aggregation is easier to operate in cross-silo settings with reliable participants than in cross-device settings, where dropout is routine.

2017 Bonawitz et al. · Practical Secure Aggregation for Privacy-Preserving Machine Learning (CCS 2017)

Section 10

DP-FedAvg

DP-FedAvg applies differential privacy to federated learning by clipping model updates and adding Gaussian noise to them. This makes gradient inversion much harder and bounds the information any single participant's update carries about their training data.

Geyer, Klein, and Nabi (2017) and McMahan, Ramage, Talwar, and Zhang (2018) formalised this combination. There are two places where noise can be added: at the participant (local noise) or at the aggregator after receiving all updates (central noise).

User-level privacy protects entire participants. No participant's presence in the federated system should be detectable from the global model. This requires clipping updates to a maximum norm C and adding Gaussian noise with standard deviation proportional to C. The privacy guarantee is over which participants joined, not which individual data records they trained on.

Record-level privacy extends DP to individual training records within each participant's local dataset. This is essentially DP-SGD (from D3) applied locally, with the resulting differentially private gradient updates sent to the aggregator. The privacy guarantee covers individual records across the whole federation.

The tradeoff is familiar from D3: more noise means stronger privacy but worse accuracy. The noise is added on top of the communication and aggregation overhead already present in federated learning. In practice, DP-FedAvg with reasonable epsilon values (1 to 10) achieves accuracy within a few percentage points of non-private FedAvg on large cross-device deployments. Google deploys DP-FedAvg in production for Gboard.
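The clip-then-noise step for user-level privacy can be sketched as follows. The example assumes central noise (added by the aggregator to the sum of clipped updates) and uses hypothetical names `clip_norm` and `noise_multiplier`. It shows the mechanism only: it does no privacy accounting, so it does not by itself yield any particular epsilon.

```python
import numpy as np

def clip_update(delta, clip_norm):
    """Scale an update down so its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(delta)
    return delta * min(1.0, clip_norm / max(norm, 1e-12))

def dp_aggregate(deltas, clip_norm, noise_multiplier, rng):
    """Average clipped updates after adding Gaussian noise to their sum.
    The noise standard deviation is proportional to the clipping norm."""
    clipped = np.stack([clip_update(d, clip_norm) for d in deltas])
    total = clipped.sum(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(deltas)

rng = np.random.default_rng(0)
# Three participants whose updates differ widely in magnitude (synthetic data).
deltas = [rng.normal(size=20) * scale for scale in (0.1, 1.0, 10.0)]

noisy_mean = dp_aggregate(deltas, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
print([round(float(np.linalg.norm(clip_update(d, 1.0))), 3) for d in deltas])
print(np.round(noisy_mean[:5], 3))
```

Clipping bounds how much any one participant can move the aggregate, which is what lets a fixed noise level hide that participant's presence. With only three participants the noise dominates the signal; the accuracy figures quoted above depend on averaging over very large cohorts.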

2018 McMahan, Ramage, Talwar, Zhang · Learning Differentially Private Recurrent Language Models (ICLR 2018)

Section 11

Defence summary

No single defence addresses all four attack classes. A production federated learning system typically combines several controls, each targeting different threats. This table maps defences to the attacks they address.

Defence	Data poisoning	Model poisoning	Gradient inversion	Free-rider
Byzantine-robust aggregation (Krum, median, Bulyan)	Partial	Strong	None	Partial
Secure aggregation (Bonawitz et al. 2017)	None	None	Strong	None
DP-FedAvg (gradient noise)	Partial	Partial	Strong	None
Norm clipping + anomaly detection	Partial	Partial	None	None
Reputation and audit systems	Partial	Partial	None	Strong
Backdoor detection (model inspection)	Strong	Partial	None	None
Authentication and vetting of participants	Partial	Strong	None	Strong

The recommended production stack for cross-silo FL combines Byzantine-robust aggregation (Bulyan or FLTrust), secure aggregation, DP-FedAvg with epsilon of 1 to 10, and participant authentication with legal agreements. This covers all four attack classes with strong or partial protection. For gradient inversion specifically, secure aggregation is the more dependable defence: DP-FedAvg is strong at low epsilon, but at high epsilon values it makes inversion harder rather than impossible.

Section 12

Production deployments

Federated learning has moved well beyond research. Major technology companies and healthcare consortiums run production federated learning systems at significant scale. These deployments illustrate both the practical benefits and the engineering challenges.

Google

Gboard next-word prediction

Trains next-word prediction models on text typed on Android devices. No typed text ever leaves the device. Runs on millions of devices. Uses DP-FedAvg and secure aggregation in production. First major production deployment of federated learning in 2017.

Cross-device · DP-FedAvg · SecAgg

Apple

Siri and QuickType improvements

Uses on-device learning for Siri voice recognition and QuickType keyboard improvements. Combines federated learning with local DP. Apple calls their system Private Federated Learning. Training happens only when the device is locked, charging, and on Wi-Fi.

Cross-device · Local DP · On-device

MELLODDY Consortium

Drug discovery AI

Ten pharmaceutical companies trained shared molecular property prediction models using federated learning. No company shared their proprietary compound databases. Coordinated by MELLODDY (Machine Learning Ledger Orchestration for Drug Discovery). Demonstrated that cross-silo FL can outperform any individual company's model.

Cross-silo · Healthcare · 10 orgs

Financial Services

Cross-bank fraud detection

Multiple banks in consortiums (WeBank in China, and multiple European banks via the Federated AI Technology Enabler, FATE) train fraud detection models on transaction patterns without sharing customer transaction data. Individual banks cannot share account data across jurisdictions; federated learning provides a compliant path to cross-institution signal.

Cross-silo · Finance · Regulatory compliance

HealthChain (France)

Medical imaging AI

French hospital network trained breast cancer detection models across five hospitals using federated learning. Each hospital kept patient imaging data on-site. The federated model showed comparable performance to a model trained on all pooled data, while satisfying French data protection requirements under GDPR.

Cross-silo · Healthcare · GDPR compliant

Intel and Partners

OpenFL medical imaging

Intel's OpenFL framework powers multiple medical imaging consortiums, including a 71-institution study on brain tumour segmentation. Demonstrated that federated models trained across institutions with diverse imaging equipment can outperform models trained on any single site's data.

Cross-silo · OpenFL · 71 institutions

Section 13

Frameworks

Several mature open-source frameworks implement federated learning. They differ in their target deployment scenario, programming model, and built-in security features. Choosing the right framework depends on whether you are running cross-device or cross-silo, your existing ML stack, and what security controls you need out of the box.

TensorFlow Federated (TFF)

Google · Python / TensorFlow

Google's open-source FL framework. Supports both simulation and production deployment. Has built-in DP-FedAvg and secure aggregation. The same framework used in Google's production Gboard deployment. Strong tooling for privacy accounting and differential privacy integration via TensorFlow Privacy.

Cross-device · DP built-in · SecAgg · Production-grade

Flower (flwr)

Adap / Open-source · Python

Framework-agnostic: works with PyTorch, TensorFlow, JAX, and scikit-learn. Focuses on simplicity and interoperability. Supports custom aggregation strategies including Krum, FedAvg, and FedProx. Popular in research and cross-silo deployments. Straightforward to integrate custom Byzantine-robust aggregators.

Framework-agnostic · Cross-silo · Research-friendly

FATE

WeBank · Python

Federated AI Technology Enabler, developed by WeBank (China). Designed specifically for cross-silo enterprise FL with strong emphasis on privacy-preserving ML. Includes implementations of secure aggregation, homomorphic encryption, and secret sharing. Used in production in Chinese financial services FL deployments.

Cross-silo · Enterprise · HE built-in · Financial services

PySyft

OpenMined · Python / PyTorch

Research-focused framework from OpenMined emphasising privacy-preserving ML. Supports federated learning, differential privacy, and secure computation (SMPC). Used heavily in academic research into privacy-preserving AI. Less production-ready than TFF or FATE but provides a complete stack for experimenting with all techniques in this track.

Research · DP + SMPC · PyTorch · Educational

Section 14

Frequently asked questions

What is federated learning and how does it differ from centralised training?

In centralised training, all raw data is collected on a single server for training. In federated learning, the raw data stays on each participant's device or server. Participants train local model updates on their own data and send only the updates to a central aggregator. The aggregator combines updates into a new global model and sends it back. No raw data ever leaves any participant's environment. This enables training across organisations whose data is too sensitive to share directly.

What is the FedAvg algorithm?

FedAvg (Federated Averaging), introduced by McMahan et al. in 2017, aggregates local model updates by computing a weighted average of participants' updated model weights, weighted by each participant's dataset size. Each round: the aggregator selects a subset of participants, distributes the current global model, each participant trains locally for several epochs, sends updated weights back, and the aggregator computes the weighted average to produce the next global model. The key insight is that multiple local SGD steps before each communication round dramatically reduces the number of communication rounds needed.

What is a model poisoning attack in federated learning?

In a model poisoning attack, a malicious participant sends crafted model updates rather than honestly trained ones. Unlike data poisoning, the attacker directly controls the gradient computation and can submit any gradient they choose, not just the gradient that honest training on poisoned data would produce. This makes model poisoning more powerful: the attacker can craft updates to introduce specific backdoors, degrade accuracy on particular inputs, or boost the impact of their update while staying within normal gradient norms to evade detection.

How does gradient inversion work in federated learning?

Gradient inversion reconstructs training data from gradient updates. The attacker initialises dummy inputs and iteratively adjusts them to minimise the difference between the gradients they would produce and the observed gradients from the victim participant. Zhu et al. (2019) demonstrated near-exact image reconstruction from single-batch gradients. The attack is most effective with small batch sizes and degrades with larger batches. Secure aggregation (where the aggregator sees only the sum of updates, not individual updates) is the primary defence, because the attack requires access to individual participant gradients.

What is secure aggregation and how does it protect against gradient inversion?

Secure aggregation uses pairwise masking derived from Diffie-Hellman key exchange to ensure the aggregator can compute the sum of all updates without seeing any individual update. Each pair of participants generates equal and opposite masks. Each participant adds their masks to their update before sending. When the aggregator sums all masked updates, all masks cancel out perfectly, leaving only the true sum. Gradient inversion requires individual participant gradients, not their sum, so secure aggregation blocks the attack at the protocol level.

What is Byzantine fault tolerance in federated learning?

Byzantine fault tolerance means the training process produces a correct global model even when some fraction of participants send arbitrary, malicious, or faulty updates. Standard FedAvg is not Byzantine-robust. Byzantine-robust aggregation methods replace simple averaging: Krum (Blanchard et al. 2017) selects the single update closest to all others. Coordinate-wise median takes the median of each parameter across participants. Bulyan combines Krum-like filtering with a coordinate-wise trimmed mean on the filtered set. These methods tolerate up to a bounded fraction of malicious participants without significantly degrading model quality.
