MeFEm: Medical Face Embedding model

Yury Borets, Stepan Botman

TL;DR

MeFEm, a vision model based on a modified Joint Embedding Predictive Architecture (JEPA) for biometric and medical analysis from facial images, outperforms strong baselines like FaRL and Franca on core anthropometric tasks despite using significantly less data.

Abstract

We present MeFEm, a vision model based on a modified Joint Embedding Predictive Architecture (JEPA) for biometric and medical analysis from facial images. Key modifications include an axial stripe masking strategy to focus learning on semantically relevant regions, a circular loss weighting scheme, and the probabilistic reassignment of the CLS token for high-quality linear probing. Trained on a consolidated dataset of curated images, MeFEm outperforms strong baselines like FaRL and Franca on core anthropometric tasks despite using significantly less data. It also shows promising results on Body Mass Index (BMI) estimation, evaluated on a novel, consolidated closed-source dataset that addresses the domain bias prevalent in existing data. Model weights are available at https://huggingface.co/boretsyury/MeFEm, offering a strong baseline for future work in this domain.

Paper Structure (17 sections, 6 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Visual examples from the datasets comprising the training set.
  • Figure 2: Schematic representation of different masking strategies: multiblock (a), quadrant (b), and axial stripes (c). Colored regions represent the source mask; the remaining part is the target.
  • Figure 3: Loss weights matrix.
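
To make the axial stripe strategy of Figure 2 (c) more concrete, the following is a minimal sketch of one way such a mask could be built on a patch grid. It is not the authors' implementation: the function name `axial_stripe_mask`, the 14x14 grid, the stripe width of 3, and the choice of a single full-width or full-height stripe are all assumptions made for illustration. Following the caption's convention, the stripe marks the source patches and everything else is the target.

```python
import numpy as np


def axial_stripe_mask(grid=14, stripe_width=3, axis=None, rng=None):
    """Return a boolean (grid, grid) array; True marks source (context) patches.

    Hypothetical helper: one stripe spanning the whole grid along one axis.
    The grid size and stripe width are illustrative, not taken from the paper.
    """
    if not 0 < stripe_width <= grid:
        raise ValueError("stripe_width must be in 1..grid")
    rng = np.random.default_rng() if rng is None else rng
    if axis is None:
        # 0 -> horizontal stripe (full rows), 1 -> vertical stripe (full columns)
        axis = int(rng.integers(0, 2))
    # Upper bound is exclusive, so the stripe always fits inside the grid.
    start = int(rng.integers(0, grid - stripe_width + 1))
    mask = np.zeros((grid, grid), dtype=bool)
    if axis == 0:
        mask[start:start + stripe_width, :] = True
    else:
        mask[:, start:start + stripe_width] = True
    return mask


# Source patches form the stripe; the target is the complement.
source = axial_stripe_mask(grid=14, stripe_width=3, axis=0,
                           rng=np.random.default_rng(0))
target = ~source
```

In a JEPA-style setup, the encoder would see only the patches where `source` is true, and the predictor would be trained to match the embeddings of the patches where `target` is true. How the stripe position and width are actually sampled in MeFEm is not stated in this section.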