Marginalized Generalized IoU (MGIoU): A Unified Objective Function for Optimizing Any Convex Parametric Shapes

Duy-Tho Le, Trung Pham, Jianfei Cai, Hamid Rezatofighi

TL;DR

MGIoU introduces a unified, differentiable loss for optimizing convex parametric shapes by projecting shapes onto a set of normals and computing a one-dimensional GIoU per direction, then averaging to form MGIoU. It extends to MGIoU$^+$ for unstructured shapes with a convexity regularizer and MGIoU$^-$ for minimizing overlaps in trajectory prediction, creating a general framework across 2D/3D, rotated geometries, and temporal sequences. Empirically, MGIoU and its variants outperform strong baselines across 2D oriented detection, monocular 3D 6-DoF recognition, quadrangle detection, and collision-avoidant trajectory prediction, while offering substantial latency reductions (10–40x) and satisfying core metric properties, including scale invariance. The approach provides a practical, unified tool for shape optimization with broad applicability and improved robustness in real-world tasks.

Abstract

Optimizing the similarity between parametric shapes is crucial for numerous computer vision tasks, where Intersection over Union (IoU) stands as the canonical measure. However, existing optimization methods exhibit significant shortcomings: regression-based losses like L1/L2 lack correlation with IoU, IoU-based losses are unstable and limited to simple shapes, and task-specific methods are computationally intensive and do not generalize across domains. As a result, the current landscape of parametric shape objective functions has become scattered, with each domain proposing distinct IoU approximations. To address this, we unify the parametric shape optimization objective functions by introducing Marginalized Generalized IoU (MGIoU), a novel loss function that overcomes these challenges by projecting structured convex shapes onto their unique shape Normals to compute one-dimensional normalized GIoU. MGIoU offers a simple, efficient, fully differentiable approximation strongly correlated with IoU. We then extend MGIoU to MGIoU$^+$, which supports optimizing unstructured convex shapes. Together, MGIoU and MGIoU$^+$ unify parametric shape optimization across diverse applications. Experiments on standard benchmarks demonstrate that MGIoU and MGIoU$^+$ consistently outperform existing losses while reducing loss computation latency by 10–40x. Additionally, MGIoU and MGIoU$^+$ satisfy metric properties and scale invariance, ensuring robustness as an objective function. We further propose MGIoU$^-$ for minimizing overlaps in tasks like collision-free trajectory prediction. Code is available at https://ldtho.github.io/MGIoU
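To make the projection idea concrete, the following is a minimal NumPy sketch for 2D convex polygons, not the authors' implementation. It assumes the standard GIoU definition restricted to intervals (the paper's "normalized" one-dimensional form may differ), uses every edge normal of both shapes without deduplicating parallel ones, and returns the similarity score rather than a loss. The function names `edge_normals`, `giou_1d`, and `mgiou` are hypothetical.

```python
import numpy as np

def edge_normals(poly):
    """Unit normals of the edges of a convex polygon given as an (N, 2) vertex array."""
    edges = np.roll(poly, -1, axis=0) - poly
    normals = np.stack([-edges[:, 1], edges[:, 0]], axis=1)
    return normals / np.linalg.norm(normals, axis=1, keepdims=True)

def giou_1d(a_min, a_max, b_min, b_max):
    """Standard GIoU of two intervals: IoU minus the empty fraction of their hull."""
    inter = max(0.0, min(a_max, b_max) - max(a_min, b_min))
    union = (a_max - a_min) + (b_max - b_min) - inter
    hull = max(a_max, b_max) - min(a_min, b_min)
    return inter / union - (hull - union) / hull

def mgiou(pred, gt):
    """Average the 1D GIoU of the two shapes' projections over all edge normals.

    Assumption: normals of both shapes are used, duplicates included.
    """
    scores = []
    for n in np.concatenate([edge_normals(pred), edge_normals(gt)]):
        p, g = pred @ n, gt @ n  # scalar projections of every vertex onto n
        scores.append(giou_1d(p.min(), p.max(), g.min(), g.max()))
    return float(np.mean(scores))

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
print(round(mgiou(square, square), 3))               # 1.0   (identical shapes)
print(round(mgiou(square, square + [0.5, 0.0]), 3))  # 0.667 (half-overlapping)
print(round(mgiou(square, square + [3.0, 0.0]), 3))  # 0.25  (disjoint)
```

In this sketch the disjoint pair still scores above zero because the projections onto the vertical normals coincide; only the horizontal direction is penalised. The score keeps changing as the shapes move further apart, which is the behaviour a GIoU-style objective needs in order to provide a training signal when shapes do not overlap.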

Paper Structure

This paper contains 15 sections, 4 theorems, 36 equations, 4 figures, 4 tables, 2 algorithms.

Key Result

Lemma 1

For any two structured convex shapes $P$ and $G$,

Figures (4)

  • Figure 1: Computation of MGIoU and its variants. Predicted ($P$, red) and ground-truth ($G$, blue) shapes are projected onto their unique Normals ($P_i, G_i$) to calculate one-dimensional overlaps, measuring intersection (purple segments) relative to union (green segments). To reduce visual clutter, some panels depict projections onto the unique Normals of only one shape. Examples: MGIoU$^+$ (A: unstructured quadrilaterals, B: general polygons), MGIoU (C: ellipses, D: 3D cuboids), and MGIoU$^-$ (E: rotated rectangles). Best viewed zoomed in.
  • Figure 2: Qualitative visualisation (test-set images) comparing MGIoU with the KLD loss on the DOTA dataset, and MGIoU$^+$ with the OKS distance on the ICDAR2017 dataset. MGIoU and MGIoU$^+$ capture orientation (in 2D oriented detection) and vertices (in quadrilateral detection) more accurately.
  • Figure 3: Qualitative visualisation on the Omni3D dataset.
  • Figure 4: Qualitative visualisation on the Waymo dataset. We visualise the predicted future bounding boxes of road agents over the next 8 seconds (80 timesteps), with and without MGIoU$^-$ during training. With MGIoU$^-$ incorporated during training, the model has a better understanding of the physical world and produces safer interactions between road agents. A) Two cars avoid a collision at an intersection, B) a pedestrian stops and waits, C) smooth interaction among three vehicles without collisions, D) a car appropriately yields to a pedestrian.

Theorems & Definitions (8)

  • Lemma 1: Symmetry of MGIoU
  • Proof 1
  • Lemma 2: Identity Property of MGIoU
  • Proof 2
  • Lemma 3: Scale-Invariance of MGIoU
  • Proof 3
  • Proposition 1: Properties of $\mathcal{L}_{\text{MGIoU}}$
  • Proof 4
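
The properties named in the lemma titles above (symmetry, identity, scale invariance) can be spot-checked numerically. The sketch below is an illustration under assumptions, not the paper's code or proofs: the lemma statements are not reproduced here, so the checks are read off the titles alone, and `mgiou` is a hypothetical vectorised formulation that averages the standard interval GIoU over the edge normals of both polygons. `rotated_rect` is likewise a hypothetical helper.

```python
import numpy as np

def mgiou(p, g):
    """Mean 1D GIoU of two convex polygons ((N, 2) arrays) over all edge normals."""
    def normals(poly):
        e = np.roll(poly, -1, axis=0) - poly
        n = np.stack([-e[:, 1], e[:, 0]], axis=1)
        return n / np.linalg.norm(n, axis=1, keepdims=True)

    axes = np.concatenate([normals(p), normals(g)])  # (K, 2) projection directions
    pp, gg = p @ axes.T, g @ axes.T                  # vertex projections, one column per axis
    p_lo, p_hi = pp.min(axis=0), pp.max(axis=0)
    g_lo, g_hi = gg.min(axis=0), gg.max(axis=0)
    inter = np.clip(np.minimum(p_hi, g_hi) - np.maximum(p_lo, g_lo), 0.0, None)
    union = (p_hi - p_lo) + (g_hi - g_lo) - inter
    hull = np.maximum(p_hi, g_hi) - np.minimum(p_lo, g_lo)
    return float(np.mean(inter / union - (hull - union) / hull))

def rotated_rect(cx, cy, w, h, theta):
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    corners = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return corners @ np.array([[c, s], [-s, c]]) + np.array([cx, cy])

P = rotated_rect(0.0, 0.0, 4.0, 2.0, 0.3)
G = rotated_rect(1.0, 0.5, 3.0, 2.0, -0.4)

# Symmetry: swapping the arguments only reorders the set of normals.
print(np.isclose(mgiou(P, G), mgiou(G, P)))
# Identity (one direction only): a shape compared with itself scores 1.
print(np.isclose(mgiou(P, P), 1.0))
# Scale invariance: scaling both shapes scales every projection by the same factor.
print(np.isclose(mgiou(5.0 * P, 5.0 * G), mgiou(P, G)))
```

A passing check on one pair of rectangles is evidence rather than proof; the converse of the identity property (a score of 1 implying equal shapes) is not tested here.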