TopoReformer: Mitigating Adversarial Attacks Using Topological Purification in OCR Models

Bhagyesh Kumar, A S Aravinthakashan, Akshat Satyanarayan, Ishaan Gakhar, Ujjwal Verma

TL;DR

This work introduces TopoReformer, a topology-guided purification pipeline that defends OCR models from adversarial perturbations in a model-agnostic manner. By embedding persistent-homology–based losses into a topology-preserving autoencoder, the method achieves manifold-level robustness without adversarial training, and a Freeze-Flow training scheme with an auxiliary pathway further boosts performance under adaptive attacks. Results on MNIST, EMNIST, and OCR data show substantial gains against classical and adaptive attacks, including FAWA, while preserving or exceeding baseline accuracy on clean inputs. The approach offers a scalable defense that emphasizes global geometric consistency over pixel-level denoising, with practical implications for secure document processing and OCR-enabled systems.

Abstract

Adversarially perturbed images of text can cause sophisticated OCR systems to produce misleading or incorrect transcriptions from changes that are nearly invisible to humans. Some of these perturbations even survive physical capture, posing security risks to high-stakes applications such as document processing, license plate recognition, and automated compliance systems. Existing defenses, such as adversarial training, input preprocessing, or post-recognition correction, are often model-specific and computationally expensive, and they can degrade performance on unperturbed inputs while remaining vulnerable to unseen or adaptive attacks. To address these challenges, this work introduces TopoReformer, a model-agnostic reformation pipeline that mitigates adversarial perturbations while preserving the structural integrity of text images. Topology studies properties of shapes and spaces that remain unchanged under continuous deformations, focusing on global structures such as connectivity, holes, and loops rather than exact distances. Leveraging these topological features, TopoReformer employs a topological autoencoder to enforce manifold-level consistency in latent space and improve robustness without explicit gradient regularization. The proposed method is benchmarked on EMNIST and MNIST against standard adversarial attacks (FGSM, PGD, Carlini-Wagner), adaptive attacks (EOT, BDPA), and an OCR-specific watermark attack (FAWA).
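To make the idea of a persistent-homology-based loss more concrete, the following is a minimal NumPy sketch of one common way such a loss is built for topological autoencoders. It is not taken from the paper, and the actual TopoReformer loss may differ. It relies on the fact that the 0-dimensional persistence pairs of a Vietoris-Rips filtration correspond to the edges of a minimum spanning tree over the pairwise distances, and it penalizes disagreement between input-space and latent-space distances on those edges. The function names `pairwise_distances`, `mst_edges`, and `topo_signature_loss` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def mst_edges(dist):
    """Edges of a minimum spanning tree via Kruskal's algorithm.

    For a Vietoris-Rips filtration, these edges are the 0-dimensional
    persistence pairs: each edge merges two connected components.
    """
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(dist[rows, cols], kind="stable")
    edges = []
    for idx in order:
        a, b = int(rows[idx]), int(cols[idx])
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            edges.append((a, b))
    return edges


def topo_signature_loss(x, z):
    """Illustrative topological loss between inputs x and latent codes z.

    Assumption: a symmetric squared difference of distances, evaluated on
    the MST edges selected in each space. Not the paper's exact loss.
    """
    dist_x = pairwise_distances(x)
    dist_z = pairwise_distances(z)
    edges_x = mst_edges(dist_x)
    edges_z = mst_edges(dist_z)

    def select(dist, edges):
        return np.array([dist[a, b] for a, b in edges])

    # Distances on input-space edges should be preserved in latent space.
    loss_x_to_z = 0.5 * np.sum((select(dist_x, edges_x) - select(dist_z, edges_x)) ** 2)
    # Distances on latent-space edges should match the input space.
    loss_z_to_x = 0.5 * np.sum((select(dist_z, edges_z) - select(dist_x, edges_z)) ** 2)
    return loss_x_to_z + loss_z_to_x
```

In a training setup, a term like this would typically be computed per mini-batch in a differentiable framework and added to the reconstruction loss with a weighting factor. The NumPy version here only illustrates the computation and does not provide gradients.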

Paper Structure

This paper contains 21 sections, 3 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Schematic of the TopoReformer pipeline and the freeze-flow paradigm. The Topological Autoencoder (TopoAE), Auxiliary Module, and Reformer diagrams are illustrative and do not represent the actual neural network architectures.
  • Figure 2: Comparison of topological latent visualizations.
  • Figure 3: Comparison of Grad-CAM visualizations. (a)–(b) Unperturbed images, (c)–(d) adversarial images. Reformers project inputs toward topology-consistent manifolds, leading to more focused and confident class-discriminative regions.