Enhancing 3D Robotic Vision Robustness by Minimizing Adversarial Mutual Information through a Curriculum Training Approach

Nastaran Darabi, Dinithi Jayasuriya, Devashri Naik, Theja Tulabandhula, Amit Ranjan Trivedi

TL;DR

This work proposes a training objective that jointly minimizes prediction loss and mutual information (MI) under adversarial perturbations, which contains the upper bound on misprediction error. It also integrates curriculum advisors into the training setup; these advisors gradually introduce the adversarial objectives.

Abstract

Adversarial attacks exploit vulnerabilities in a model's decision boundaries through small, carefully crafted perturbations that lead to significant mispredictions. In 3D vision, the high dimensionality and sparsity of data greatly expand the attack surface, making 3D vision particularly vulnerable for safety-critical robotics. To enhance 3D vision's adversarial robustness, we propose a training objective that simultaneously minimizes prediction loss and mutual information (MI) under adversarial perturbations to contain the upper bound of misprediction errors. This approach simplifies handling adversarial examples compared to conventional methods, which require explicit searching and training on adversarial samples. However, minimizing prediction loss conflicts with minimizing MI, leading to reduced robustness and catastrophic forgetting. To address this, we integrate curriculum advisors in the training setup that gradually introduce adversarial objectives to balance training and prevent models from being overwhelmed by difficult cases early in the process. The advisors also enhance robustness by encouraging training on diverse MI examples through entropy regularizers. We evaluated our method on ModelNet40 and KITTI using PointNet, DGCNN, SECOND, and PointTransformers, achieving 2-5% accuracy gains on ModelNet40 and a 5-10% mAP improvement in object detection. Our code is publicly available at https://github.com/nstrndrbi/Mine-N-Learn.
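The joint objective described in the abstract can be illustrated with a minimal sketch. It assumes the MI term is estimated with a Donsker-Varadhan lower bound, as in MINE-style estimators, and that the two terms are combined with a single scalar weight. The function names, the `mi_weight` parameter, and the use of precomputed critic scores are hypothetical choices for illustration. They are not taken from the paper or its repository.

```python
import math

def dv_mi_lower_bound(joint_scores, marginal_scores):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)].

    joint_scores: critic outputs T on paired (joint) samples.
    marginal_scores: critic outputs T on shuffled (marginal) samples.
    Both are plain lists of floats here (an assumption for illustration).
    """
    mean_joint = sum(joint_scores) / len(joint_scores)
    # log-mean-exp with max subtraction for numerical stability
    m = max(marginal_scores)
    mean_exp = sum(math.exp(s - m) for s in marginal_scores) / len(marginal_scores)
    return mean_joint - (m + math.log(mean_exp))

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])

def combined_objective(probs, label, joint_scores, marginal_scores, mi_weight):
    """Prediction loss plus a weighted MI estimate (hypothetical weighting)."""
    mi = dv_mi_lower_bound(joint_scores, marginal_scores)
    return cross_entropy(probs, label) + mi_weight * mi
```

With `mi_weight` set to 0 the objective reduces to the ordinary prediction loss, which makes the conflict between the two terms easy to study by sweeping a single value.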

Paper Structure (15 sections, 18 equations, 4 figures, 4 tables)

This paper contains 15 sections, 18 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: 3D vision sensors used in safety-critical robotics are vulnerable to various adversarial attacks. For instance, sensor data can be manipulated by adding or removing points, or environments can be altered by placing reflective surfaces that create ghost objects. Non-idealities of 3D sensor circuits can also be exploited, for example by leveraging their harmonics to create false objects in the sensor’s field of view.
  • Figure 2: Directly training with the proposed MI-based objective leads to catastrophic forgetting. See AT + MINE results in (a) for PointTransformer and (b) for PointNet on ModelNet40. This is due to the conflicting goals of minimizing the prediction loss, which leverages gradient sensitivity, and minimizing the adversarial loss, which reduces sensitivity to adversarial perturbations. We address this by integrating curriculum training with the proposed MI-based approach (AT + MINE + CT results).
  • Figure 3: MI distribution for (a) natural and (b) adversarial perturbations on ModelNet40 with PointNet-based predictions. The MI for natural data clusters around the mean with a tail toward lower values, while adversarial data shows a skewed distribution, peaking at lower MI values and extending higher based on perturbation strength.
  • Figure 4: Overall training framework: (a) A parametric perturbation generates perturbed data from clean data. Mutual information is extracted for both natural and adversarial data, while the prediction model receives both inputs. Optimization focuses on minimizing prediction loss and MI. After batch-wise training, summary statistics of natural and adversarial MI are sent to the curriculum advisor, which disentangles adversarial cases, clusters them, and generates entropy-based regularization. The curriculum advisor's parameters are learned through a closed-loop process, following the loss in Eq. 19. (b) Representative profiles of LiDAR point cloud manipulations under adversarial drop and IFGM that were studied to characterize the framework.
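
The captions for Figures 2 and 4 describe two ideas that a small sketch can make concrete: adversarial objectives are introduced gradually, and an entropy term rewards training on diverse MI values. The sketch below replaces the paper's learned curriculum advisor with a fixed linear ramp and replaces its clustering step with a plain histogram. Both are simplifying assumptions, and all names and parameters are hypothetical.

```python
import math

def ramp_weight(epoch, warmup_epochs, max_weight):
    """Linearly increase the adversarial weight from 0 to max_weight.

    A fixed schedule standing in for the learned advisor (assumption).
    """
    if warmup_epochs <= 0:
        return max_weight
    return max_weight * min(1.0, epoch / warmup_epochs)

def bin_entropy(mi_values, num_bins, lo, hi):
    """Shannon entropy (in nats) of a histogram of MI values over [lo, hi].

    Higher entropy means the batch covers a more diverse range of MI values.
    Values outside [lo, hi] are clamped into the first or last bin.
    """
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in mi_values:
        idx = min(num_bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = len(mi_values)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

In a training loop, `ramp_weight` would scale the adversarial MI term per epoch, and `bin_entropy` over the batch's adversarial MI values could be subtracted from the loss as a diversity bonus. How the actual framework combines these quantities is specified by the paper's own loss, not by this sketch.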