Secure Cross-Cloud Federated Learning: A Layered Defense Architecture
Description
While traditional Machine Learning requires centralizing data on a single server, Federated Learning (FL) preserves privacy by keeping data on local devices during the training process. However, as this paradigm expands into Cross-Cloud Federated Learning (CCFL) across multiple providers, it introduces significant risks such as Sybil attacks, model poisoning, and inference-based data leakage.
A Layered Defense Strategy
This research proposes a six-layered security architecture designed to provide end-to-end protection for heterogeneous cloud environments:
- Local Differential Privacy (LDP): Prevents inference attacks by masking gradients with calibrated noise.
- Authentic Computation with ZKP: Utilizes Zero-Knowledge Proofs to verify the integrity of computations without exposing private data.
- Sybil Attack Detection: Identifies fake or colluding clients through similarity clustering and reputation scoring.
- Poisoning Resistance: Filters anomalous updates using Z-score statistical analysis to protect the global model.
- Secure Aggregation: Employs Paillier homomorphic encryption to enable model averaging while gradients remain encrypted.
- Blockchain-Based Logging: Ensures auditability and accountability across providers through a tamper-evident ledger.
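As a rough illustration of two of these layers, the sketch below shows a client masking its update with noise and a server discarding outliers by Z-score before averaging. It is a minimal sketch under stated assumptions, not the architecture's actual implementation: the function names (`ldp_perturb`, `zscore_filter`, `average`), the choice of the Laplace mechanism with L1 clipping, the use of update L2 norms as the Z-score statistic, and the threshold of 2.0 are all hypothetical choices made for the example.

```python
import random
import statistics


def ldp_perturb(update, clip=1.0, epsilon=1.0, rng=random):
    """Client side: clip the update to L1 norm <= clip, then add Laplace noise.

    Assumption: Laplace mechanism. Any two clipped updates differ by at most
    2*clip in L1 norm, so the per-coordinate noise scale is 2*clip/epsilon.
    """
    l1 = sum(abs(x) for x in update)
    factor = min(1.0, clip / l1) if l1 > 0 else 1.0
    scale = 2.0 * clip / epsilon
    # The difference of two Exp(1) draws is a standard Laplace sample.
    return [
        x * factor + scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        for x in update
    ]


def l2_norm(vector):
    """Euclidean norm of a plain list of floats."""
    return sum(x * x for x in vector) ** 0.5


def zscore_filter(updates, threshold=2.0):
    """Server side: return indices of updates whose norm Z-score is in range.

    Assumption: the Z-score is computed over update L2 norms; the source does
    not say which statistic is used.
    """
    norms = [l2_norm(u) for u in updates]
    mean = statistics.fmean(norms)
    std = statistics.pstdev(norms)
    if std == 0:
        # All norms are identical, so nothing can be flagged as anomalous.
        return list(range(len(updates)))
    return [i for i, n in enumerate(norms) if abs(n - mean) / std <= threshold]


def average(updates):
    """Coordinate-wise mean of the accepted updates."""
    count = len(updates)
    return [sum(column) / count for column in zip(*updates)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Nine honest clients with small updates and one poisoned client.
    honest = [[0.1 + 0.01 * i, 0.1] for i in range(9)]
    updates = honest + [[50.0, -50.0]]
    kept = zscore_filter(updates, threshold=2.0)
    print("accepted clients:", kept)
    print("filtered average:", average([updates[i] for i in kept]))
    print("one noised update:", ldp_perturb(honest[0], epsilon=1.0, rng=rng))
```

The demo runs the filter on un-noised updates so the effect of the outlier is easy to see. In a deployment the two steps would be chained, and both the privacy budget `epsilon` and the Z-score threshold would need tuning against the accuracy figures the architecture targets.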
Key Performance Insights
- Resilience: Under a 50% adversarial participation scenario, the proposed architecture cuts accuracy loss from 60% to 12%, a fivefold reduction relative to standard FL.
- Robustness: The system maintains a global model accuracy of 81% even when subjected to a 30% poisoning rate.
- Feasibility: The security layers add approximately 3.2 MB of communication overhead per round, a cost that remains practical given the added confidentiality and auditability.
This work establishes a verifiable "defense-in-depth" roadmap for secure collaboration in sensitive sectors such as digital health and finance, where data remains siloed across diverse administrative domains.