In The Wild Ellipse Parameter Estimation for Circular Dining Plates and Bowls

Akil Pathiranage, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong

TL;DR

We tackle the problem of robust ellipse parameter estimation for circular diningware rims in unconstrained food images. We introduce WildEllipseFit, a multi-stage pipeline that fuses GroundingDINO-based semantic detections with edge-driven contour extraction and grouping to produce accurate ellipse parameters. We contribute the Yummly-ellipse dataset with manually annotated rim ellipses and show that WildEllipseFit achieves a lower Chamfer distance with fewer predictions than a prior ellipse-fitting method, which highlights the effectiveness of zero-shot semantic guidance. This approach enables more reliable camera-angle estimation and portion-size inference in real-world food imagery.
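To make the evaluation metric concrete, the sketch below computes a symmetric Chamfer distance between two ellipses by sampling points on each boundary. It is an illustration only: the exact Chamfer formulation, the sampling density, and the parameterization used in the paper are not given here, so the function names `sample_ellipse` and `chamfer_distance`, the `(cx, cy, a, b, theta)` parameter order, and the choice of averaging both directions are all assumptions.

```python
import math


def sample_ellipse(cx, cy, a, b, theta, n=90):
    """Return n boundary points of an ellipse (hypothetical parameterization).

    (cx, cy) is the centre, a and b are the semi-axes, and theta is the
    rotation of the major axis in radians.
    """
    points = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        x, y = a * math.cos(t), b * math.sin(t)
        # Rotate by theta, then translate to the centre.
        points.append((cx + x * math.cos(theta) - y * math.sin(theta),
                       cy + x * math.sin(theta) + y * math.cos(theta)))
    return points


def chamfer_distance(p, q):
    """Symmetric Chamfer distance: mean nearest-neighbour distance, both ways."""
    def one_way(src, dst):
        return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)

    # Assumption: the two directed terms are averaged; other papers sum them.
    return 0.5 * (one_way(p, q) + one_way(q, p))


if __name__ == "__main__":
    predicted = sample_ellipse(100.0, 80.0, 60.0, 35.0, 0.1)
    ground_truth = sample_ellipse(102.0, 81.0, 62.0, 34.0, 0.12)
    print(chamfer_distance(predicted, ground_truth))
```

A lower value means the predicted rim lies closer to the annotated rim; identical ellipses sampled the same way give a distance of zero.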

Abstract

Ellipse estimation is an important topic in food image processing because it can be leveraged to parameterize plates and bowls, which in turn can be used to estimate camera view angles and food portion sizes. Automatically detecting the elliptical rim of plates and bowls and estimating their ellipse parameters for "in-the-wild" data is challenging: images may be captured from diverse camera angles, plates vary in shape, backgrounds are noisy, and a single image may contain multiple non-uniform plates and bowls. Recent advancements in foundational models offer promising capabilities for zero-shot semantic understanding and object segmentation. However, the output mask boundaries for plates and bowls generated by these models often lack consistency and precision compared to traditional ellipse fitting methods. In this paper, we combine ellipse fitting with semantic information extracted by zero-shot foundational models and propose WildEllipseFit, a method to detect and estimate the elliptical rim of plates and bowls. Evaluation on the proposed Yummly-ellipse dataset demonstrates its efficacy and zero-shot capability in real-world scenarios.
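As a rough illustration of how rim ellipse parameters relate to the camera view angle, the sketch below uses a standard geometric fact: under an orthographic (weak-perspective) approximation, a circle whose normal is tilted by an angle θ from the viewing direction projects to an ellipse whose minor-to-major axis ratio is cos θ. This is not the paper's procedure; the function name `view_angle_from_ellipse` is hypothetical, and the code assumes a truly circular rim and negligible perspective distortion.

```python
import math


def view_angle_from_ellipse(semi_major: float, semi_minor: float) -> float:
    """Estimate the tilt (degrees) between the viewing direction and the plate normal.

    Assumes a circular rim under orthographic projection, so that
    semi_minor / semi_major = cos(tilt). 0 degrees is a top-down view;
    90 degrees is a view along the table plane.
    """
    if semi_major <= 0 or semi_minor < 0:
        raise ValueError("semi_major must be positive and semi_minor non-negative")
    # Clamp to 1.0 in case noisy fitting returns a minor axis longer than the major axis.
    ratio = min(semi_minor / semi_major, 1.0)
    return math.degrees(math.acos(ratio))


if __name__ == "__main__":
    # A rim fitted with semi-axes of 120 px and 60 px gives a tilt of about 60 degrees.
    print(view_angle_from_ellipse(120.0, 60.0))
```

With real photographs, perspective effects make this only a first approximation, particularly for plates close to the camera.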

Paper Structure (10 sections, 11 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: (a): the input image containing soup in a bowl. (b): the bowl segmented using GroundedSAM, where the elliptical rim has been lost. (c): candidate ellipses generated by our method. (d): final prediction (red) and the ground truth ellipse (green).
  • Figure 2: A block diagram for the ellipse fitting process with an example image.
  • Figure 3: Example of the curve extraction process. Top left: the original image. Middle: the result of applying the Canny edge detection algorithm to the entire image. Right: the contours extracted by the curve extraction process.
  • Figure 4: Plate filtering.
  • Figure 5: Food distance filtering.
  • ...and 2 more figures