Involution Fused ConvNet for Classifying Eye-Tracking Patterns of Children with Autism Spectrum Disorder

Md. Farhadul Islam, Meem Arafat Manab, Joyanta Jyoti Mondal, Sarah Zabeen, Fardin Bin Rahman, Md. Zahidul Hasan, Farig Sadeque, Jannatun Noor

TL;DR

This work tackles the challenging task of diagnosing Autism Spectrum Disorder (ASD) from eye-tracking gaze patterns. It introduces a hybrid Involution-Convolution neural network that places three involution layers before the convolutional blocks to learn location-specific spatial cues in gaze data. Evaluated on two public datasets with augmentation, the model achieves near state-of-the-art accuracy (about 99.4% on Dataset 1 and about 96.8% on Dataset 2) with a small footprint (about 1.36 MB). The results indicate that combining involution with convolution yields strong performance in a compact model, which could enable edge deployment for ASD screening from eye-tracking data.

Abstract

Autism Spectrum Disorder (ASD) is a complex neurological condition that is challenging to diagnose. Numerous studies show that children diagnosed with autism struggle to maintain attention and have less focused vision. Eye-tracking technology has drawn special attention in the context of ASD because anomalies in gaze have long been acknowledged as a defining feature of autism. Deep Learning (DL) approaches coupled with eye-tracking sensors offer additional capabilities for advancing diagnosis and its applications. By learning intricate nonlinear input-output relations, DL can accurately recognize varied gaze and eye-tracking patterns and adapt to the data. Convolutions alone are insufficient to capture the important spatial information in gaze or eye-tracking patterns. Involution, a dynamic kernel-based operation, can improve the efficiency of classifying gaze or eye-tracking data. In this paper, we use two different image-processing operations to study how each learns eye-tracking patterns. Since these patterns are primarily based on spatial information, we combine involution with convolution into a hybrid, which adds location-specific capability to a deep learning model. Our proposed model is simple yet effective, which makes it easier to apply in real-life settings. We investigate why our approach works well for classifying eye-tracking patterns. For comparative analysis, we experiment with two separate datasets as well as a combined version of both. The results show that the Involution-Convolution (IC) hybrid with three involution layers outperforms previous approaches.
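
To make the "location-specific" property of involution concrete, the following is a minimal NumPy sketch of a single involution operation. It is not the authors' implementation. It assumes a common formulation in which each pixel's kernel is generated from that pixel's own feature vector by a small two-layer mapping (reduce, ReLU, span), with a single kernel shared across all channels. The function name `involution2d` and the parameters `w_reduce`, `w_span`, and `b_span` are hypothetical names chosen for illustration.

```python
import numpy as np

def involution2d(x, w_reduce, w_span, b_span, kernel_size=3):
    """Illustrative involution over one feature map (assumed formulation).

    x        : (H, W, C) input feature map
    w_reduce : (C, C_r) weights that compress each pixel's channels
    w_span   : (C_r, K*K) weights that expand to one K x K kernel per pixel
    b_span   : (K*K,) bias added to every generated kernel
    """
    h, w, c = x.shape
    k = kernel_size
    pad = k // 2

    # Kernel generation: every spatial position gets its own K x K kernel,
    # computed from the feature vector at that position.
    hidden = np.maximum(x @ w_reduce, 0.0)          # (H, W, C_r), ReLU
    kernels = (hidden @ w_span + b_span).reshape(h, w, k, k)

    # Zero-pad the spatial dimensions so the output keeps the input size.
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))

    # Apply each position's kernel to its K x K neighbourhood.
    # The same kernel is shared by all channels (a single group).
    out = np.zeros((h, w, c), dtype=float)
    for u in range(k):
        for v in range(k):
            out += kernels[:, :, u, v, None] * xp[u:u + h, v:v + w, :]
    return out
```

In contrast to convolution, where one kernel is shared across all positions and differs per channel, the kernel here varies per position and is shared across channels. A hybrid of the kind described above would apply layers like this before ordinary convolutional blocks.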

Paper Structure

This paper contains 16 sections, 3 equations, 13 figures, 7 tables.

Figures (13)

  • Figure 1: Sample images from dataset-1 and dataset-2.
  • Figure 2: Bar chart of data frequency of both augmented and non-augmented versions.
  • Figure 3: Mean images of each class. These images were generated by computing the mean pixel value over the images in each class.
  • Figure 4: Visualization of the kernel production of convolution.
  • Figure 5: Visualization of the kernel production of involution.
  • ...and 8 more figures