completed federated-learning-security

Federated Healthcare Fraud Defense (SignGuard)

A secure federated learning framework that lets US states collaboratively train healthcare fraud detection models without sharing sensitive patient data.

Started: January 1, 2026
Completed: March 1, 2026
Records Processed: 227M+
Jurisdictions: 54

Tags

federated-learning healthcare security fraud-detection ecdsa

Technologies

Python PyTorch Cryptography

Architecture

SignGuard Defense Pipeline

Multi-layered defense against adversarial attacks in federated learning environments

Views: Normal Flow, Under Attack

  • Client: Legitimate user submitting signature
  • Sign Module: Signs valid data with private key
  • Verify: Verifies signature integrity
  • Detect: Monitors for anomalies
  • Reputation: Maintains trust scores
  • Aggregate: Aggregates valid contributions
Attack Detection Rate: 94.5%

Overview

Machine learning is a critical tool for identifying anomalous provider behavior in national healthcare systems. However, centralizing highly sensitive claims data (such as Medicare/Medicaid claims) poses severe compliance and privacy risks under HIPAA. Federated Learning (FL) offers a way to collaboratively train fraud detection models across decentralized institutional silos (US states) without transferring raw patient histories.

Vanilla FL, however, is highly vulnerable to Byzantine failures and to data poisoning through Sybil networks. This repository contains the implementation of SignGuard, a hybrid cryptographic defense mechanism evaluated on the largest public healthcare dataset to date: 227 million real-world HHS Medicaid provider claims partitioned across 54 United States jurisdictions.

The SignGuard Architecture

SignGuard strictly decouples identity non-repudiation from gradient validation.

1. Cryptographic Identity (ECDSA NIST P-256)

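Every participant generates an elliptic-curve key pair and signs each model update with its private key. The aggregation server verifies the signature against the participant's public key and drops any update that fails verification, which neutralizes network spoofing.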
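The sign-then-verify step can be sketched as follows. This is a minimal illustration using the Python `cryptography` package, not the project's actual code: the `serialize_update` wire format, the function names, and the choice of SHA-256 as the digest are all assumptions made for the example.

```python
import struct

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec


def serialize_update(update):
    """Pack a flat list of floats into bytes (hypothetical wire format)."""
    return struct.pack(f"<{len(update)}d", *update)


def sign_update(private_key, update):
    """Client side: sign the serialized update with ECDSA (SHA-256 assumed)."""
    return private_key.sign(serialize_update(update), ec.ECDSA(hashes.SHA256()))


def verify_update(public_key, update, signature):
    """Server side: return True only if the signature matches this exact update."""
    try:
        public_key.verify(signature, serialize_update(update), ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False


# SECP256R1 is the curve also known as NIST P-256.
client_key = ec.generate_private_key(ec.SECP256R1())
impostor_key = ec.generate_private_key(ec.SECP256R1())

update = [0.12, -0.03, 0.40]
signature = sign_update(client_key, update)

# A genuine update verifies against the client's public key.
accepted = verify_update(client_key.public_key(), update, signature)
# An update modified in transit no longer matches the signature.
tampered = verify_update(client_key.public_key(), [0.12, -0.03, 9.99], signature)
# A signature from a different key pair does not verify either.
spoofed = verify_update(client_key.public_key(), update, sign_update(impostor_key, update))
```

In this sketch the server would drop the update whenever `verify_update` returns `False`, before any statistical checks run.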

2. Statistical Validation & Reputation

Accepted identities have their parameter gradients mathematically audited:

  • L2 Norm Thresholding: Rejects updates whose L2 norm exceeds a threshold, which catches the magnitude blow-up characteristic of gradient-ascent poisoning.
  • Cosine Alignment: Measures the angle between each update and the historical global memory vector to detect coordinated subversion.
  • Reputation Ledger (EMA): Each rejection decays the participant's trust score through an exponential moving average, permanently isolating Sybil participants.
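The three checks above can be sketched in a few lines of plain Python. The thresholds (`max_norm`, `min_cosine`), the smoothing factor `alpha`, and all function names are hypothetical values chosen for illustration; the source does not state the ones SignGuard uses.

```python
import math


def l2_norm(vec):
    """Euclidean length of an update vector."""
    return math.sqrt(sum(x * x for x in vec))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = l2_norm(a) * l2_norm(b)
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0


def passes_checks(update, reference, max_norm=5.0, min_cosine=0.0):
    """Accept an update only if it is bounded and roughly aligned with the reference."""
    return l2_norm(update) <= max_norm and cosine_similarity(update, reference) >= min_cosine


def update_reputation(score, accepted, alpha=0.5):
    """EMA step: move the score toward 1.0 on acceptance, toward 0.0 on rejection."""
    return (1 - alpha) * score + alpha * (1.0 if accepted else 0.0)


reference = [1.0, 0.5, -0.2]           # stands in for the historical global vector
honest = [0.9, 0.6, -0.1]              # small and well aligned
exploded = [90.0, 60.0, -10.0]         # aligned, but the norm is far too large
reversed_update = [-0.9, -0.6, 0.1]    # small, but points the opposite way

honest_ok = passes_checks(honest, reference)
exploded_ok = passes_checks(exploded, reference)
reversed_ok = passes_checks(reversed_update, reference)

# Three consecutive rejections halve the trust score each round.
score = 1.0
for _ in range(3):
    score = update_reputation(score, accepted=False)
```

Note that a plain EMA lets a score recover after later acceptances. Permanent isolation would additionally need a rule such as never re-admitting a participant whose score has fallen below a cutoff, which is an assumption here rather than something the source specifies.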

Empirical Evaluation Matrix

  • Random Poisoning (Sensory failures)
  • Model Poisoning (Gradient Ascent / Targeted degradation)
  • Label Flipping (Collusion data-poisoning)
  • Free Riding (Compute theft)
  • Sybil Networks (Coordinated identity spoofing)

SignGuard neutralizes these attacks, both structural manipulation and Sybil injection, while keeping AUPRC within 1.8% of the theoretically optimal centralized baselines.
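To make the attack classes listed above concrete, the sketch below shows one simple way each could be simulated on a toy update vector. These are illustrative assumptions only: the function names, the `boost` factor, and the noise scale are hypothetical, and the source does not describe how the evaluation generated its attacks.

```python
import random


def random_poisoning(update, rng, scale=1.0):
    """Random poisoning: replace the update with Gaussian noise."""
    return [rng.gauss(0.0, scale) for _ in update]


def gradient_ascent(update, boost=10.0):
    """Model poisoning: flip the update's direction and amplify it."""
    return [-boost * x for x in update]


def flip_labels(labels):
    """Label flipping: invert binary fraud labels before local training."""
    return [1 - y for y in labels]


def free_ride(update):
    """Free riding: submit an all-zero update without doing any training."""
    return [0.0 for _ in update]


def sybil_copies(update, n_identities):
    """Sybil network: submit the same update under several fabricated client IDs."""
    return {f"sybil_{i}": list(update) for i in range(n_identities)}


honest = [0.2, -0.1, 0.05]
rng = random.Random(0)  # fixed seed so the noise is reproducible

noisy = random_poisoning(honest, rng)
ascent = gradient_ascent(honest)
flipped = flip_labels([0, 1, 1, 0])
idle = free_ride(honest)
sybils = sybil_copies(ascent, n_identities=3)
```

In this toy setup, the amplified `ascent` vector has ten times the norm of the honest update and points in the opposite direction, which is the kind of signal that norm and alignment checks are meant to catch.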

Evolution & Core Research

The real-world implementation applied to Medicaid stems from the core theoretical research on ECDSA-Based Federated Learning Defense.

The Underlying Security Proof

Federated Learning (FL) is vulnerable to Byzantine attacks where malicious clients submit poisoned model updates. Existing solutions (Krum, Multi-Krum) lack cryptographic verification of client identity.

SignGuard introduces three core algorithmic components under the hood:

  1. Signature Generator: Clients sign their model updates with ECDSA private keys (SECP256R1, the same curve as NIST P-256), providing identity non-repudiation.
  2. Verification Engine: The central aggregator verifies each ECDSA signature in ~1.2 ms per client before running any compute-heavy aggregation logic, and drops unauthenticated updates immediately.
  3. Reputation Manager: Verified updates undergo statistical anomaly detection. Each anomalous update decays the sender's trust exponentially, isolating Sybil participants permanently over multiple rounds.
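One way the final aggregation step could consume the reputation scores is sketched below. Both the reputation-weighted mean and the `min_reputation` cutoff are assumptions made for illustration; the source states only that trust scores are maintained and that valid contributions are aggregated.

```python
def aggregate(updates, reputations, min_reputation=0.2):
    """Reputation-weighted mean of updates from clients at or above the trust cutoff.

    `updates` maps client ID to a flat update vector, `reputations` maps client ID
    to a trust score in [0, 1]. Returns None when no client is trusted.
    """
    trusted = {
        cid: vec
        for cid, vec in updates.items()
        if reputations.get(cid, 0.0) >= min_reputation
    }
    total = sum(reputations[cid] for cid in trusted)
    if not trusted or total == 0:
        return None
    dim = len(next(iter(trusted.values())))
    return [
        sum(reputations[cid] * vec[i] for cid, vec in trusted.items()) / total
        for i in range(dim)
    ]


updates = {
    "state_a": [1.0, 2.0],
    "state_b": [3.0, 4.0],
    "sybil": [100.0, -100.0],
}
reputations = {"state_a": 1.0, "state_b": 1.0, "sybil": 0.05}

# The low-reputation client is excluded, so only the two states are averaged.
global_update = aggregate(updates, reputations)
```

Because the cutoff is applied before averaging, a participant whose trust has decayed contributes nothing to the global update, regardless of how large its submitted vector is.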