Weakly-supervised Autism Severity Assessment in Long Videos
Abid Ali, Mahmoud Ali, Jean-Marc Odobez, Camilla Barbini, Séverine Dubuisson, Francois Bremond, Susanne Thümmler
TL;DR
This paper tackles autism severity assessment from long, untrimmed videos under weak supervision by learning typical and atypical behavior patterns that serve as biomarkers. It introduces a three-stage architecture: (1) a Visual Encoder that extracts VideoMAE V2/DINOv2 features; (2) a WTAL-based (weakly-supervised temporal action localization) Outlier Embedder and Cross-Temporal Scale Transformer with a Detector that identifies ASD segments; and (3) a shallow TCN-MLP (temporal convolutional network followed by a multilayer perceptron) severity regressor that maps the learned biomarkers to ADOS-based severity scores. On real-world clinical data, the approach discriminates typical from ASD behavior patterns better than several temporal action localization (TAL) baselines and produces automatic severity estimates. By leveraging weak supervision and biomarkers drawn from long videos, the method offers a scalable, non-invasive aid for clinicians in early ASD detection and ongoing assessment. The work thus promotes objective, multi-biomarker analysis of untrimmed videos, with potential for broader clinical impact.
Abstract
Autism Spectrum Disorder (ASD) is a diverse collection of neurobiological conditions marked by challenges in social communication and reciprocal interaction, as well as repetitive and stereotypical behaviors. Atypical behavior patterns in a long, untrimmed video can serve as biomarkers for children with ASD. In this paper, we propose a video-based weakly-supervised method that uses spatio-temporal features extracted from long videos to learn typical and atypical behaviors for autism detection. In addition, we propose a shallow TCN-MLP network that estimates the severity score from the learned behaviors. We evaluate our method on real clinical evaluation videos of children with autism, collected and annotated with severity scores by clinical professionals. Experimental results demonstrate the effectiveness of the behavioral biomarkers, which could help clinicians in autism spectrum analysis.
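To make the weak-supervision setting concrete, the sketch below shows one common way a single video-level label can train per-segment scores: top-k mean pooling, as used in many multiple-instance and WTAL pipelines. It is an illustration under stated assumptions, not the authors' implementation. The function names (`video_score`, `video_level_bce`, `detect_segments`), the `k_ratio` value, and the 0.5 threshold are all hypothetical, and the per-segment logits are assumed to come from some upstream feature extractor and scorer.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def video_score(segment_logits, k_ratio=0.125):
    """Pool per-segment atypicality logits into one video-level probability.

    Assumption: top-k mean pooling. Only the k highest-scoring segments
    contribute, so a few atypical segments can drive the video-level score.
    """
    if not segment_logits:
        raise ValueError("segment_logits must not be empty")
    k = max(1, int(len(segment_logits) * k_ratio))
    top_k = sorted(segment_logits, reverse=True)[:k]
    return sigmoid(sum(top_k) / k)


def video_level_bce(segment_logits, video_label, k_ratio=0.125):
    """Binary cross-entropy between the pooled score and the video-level label.

    Only a video-level label (1 = atypical, 0 = typical) is needed; no
    segment-level annotation is used, which is the weak-supervision part.
    """
    eps = 1e-7
    p = min(max(video_score(segment_logits, k_ratio), eps), 1.0 - eps)
    return -(video_label * math.log(p) + (1 - video_label) * math.log(1.0 - p))


def detect_segments(segment_logits, threshold=0.5):
    """Return indices of segments whose probability reaches the threshold."""
    return [i for i, z in enumerate(segment_logits) if sigmoid(z) >= threshold]


if __name__ == "__main__":
    # Hypothetical logits for an 8-segment video; segments 2 and 3 look atypical.
    logits = [-3.0, -2.5, 4.0, 3.0, -3.0, -2.0, -4.0, -3.5]
    print(detect_segments(logits))
    print(video_level_bce(logits, 1) < video_level_bce(logits, 0))
```

At inference time the same per-segment scores can be thresholded to localize candidate atypical segments, even though training only ever saw video-level labels.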
