A Clustering-Based Method for Automatic Educational Video Recommendation Using Deep Face-Features of Lecturers

Paulo R. C. Mendes, Eduardo S. Vieira, Álan L. V. Guedes, Antonio J. G. Busson, Sérgio Colcher

TL;DR

This work tackles educational video discovery by leveraging lecturers' visual presence rather than textual metadata. It proposes an unsupervised, two-phase method. First, each video is represented by lecturer centroids obtained from clustering face embeddings. Second, videos are related and ranked by shared lecturer presence across the dataset, using centroid-based clustering and a similarity score weighted by presence time. The approach preserves privacy by not identifying lecturers and enables lecturer-based timelines for targeted content retrieval. Empirical evaluation on 98 YouTube videos demonstrates strong ranking performance, with a mean Average Precision of about 0.99, and shows the method can scale to large video collections while providing actionable timelines for content access.

Abstract

Discovering and accessing specific content within educational video bases is a challenging task, mainly because of the abundance and diversity of video content. Recommender systems are often used to help users find and select content. However, recommendation mechanisms, especially those based on textual information, have limitations: they are prone to errors from manually created keywords and from imprecise speech recognition. This paper presents a method for generating educational video recommendations using deep face-features of lecturers without identifying them. More precisely, we use an unsupervised face clustering mechanism to create relations among the videos based on the lecturers' presence. Then, for a selected educational video taken as a reference, we recommend the videos in which the same lecturers are detected. Moreover, we rank these recommended videos by the amount of time the referenced lecturers are present. For this task, we achieved a mAP value of 99.165%.
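
The recommendation and ranking step described in the abstract can be illustrated with a minimal sketch. It assumes face clustering has already been done and that each video is summarized as a mapping from an anonymous lecturer cluster id to seconds of screen time. The `presence` structure, the `recommend` function, and the scoring rule (summing the candidate video's presence time for lecturers shared with the reference) are illustrative assumptions, not the paper's exact formulation.

```python
from typing import Dict, List, Tuple

# Hypothetical input: video id -> {anonymous lecturer cluster id: seconds on screen}.
Presence = Dict[str, Dict[int, float]]


def recommend(reference: str, presence: Presence) -> List[Tuple[str, float]]:
    """Rank videos that share at least one lecturer cluster with the reference.

    Assumed score: total time the shared lecturers are present in the candidate.
    """
    ref_lecturers = set(presence[reference])
    ranked = []
    for video, lecturers in presence.items():
        if video == reference:
            continue
        shared = ref_lecturers & set(lecturers)
        if not shared:
            continue  # no lecturer in common, so the video is not recommended
        score = sum(lecturers[cluster] for cluster in shared)
        ranked.append((video, score))
    # Highest presence time first; ties broken by video id for determinism.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Toy data, invented for illustration only.
presence = {
    "v1": {0: 300.0, 1: 120.0},
    "v2": {0: 200.0},
    "v3": {1: 50.0, 2: 400.0},
    "v4": {2: 600.0},
}
print(recommend("v1", presence))
```

No lecturer is ever named in this sketch: the cluster ids only say "the same face appears here", which is what keeps the approach identity-free.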

Paper Structure

This paper contains 10 sections, 5 equations, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: Lecturers representation process in video. This process receives a video file and returns the centroids of the clusters that ideally represent each of the lecturers present in the video file.
  • Figure 2: Video Recommendation based on Lecturers Centroids Clustering. This pipeline receives the centroids of lecturers from all the videos in the dataset, then it creates relationships among videos that share the presence of the same lecturers. Finally, it performs ranking of recommended videos for each of the videos in the dataset. This ranking is based on the number of lecturers shared and their time presence.
  • Figure 3: Centroids images correction in VideoFacesTool. On the top, each image represents one lecturer. When a lecturer is selected, the tool displays all appearances (centroids) of that lecturer in different videos. The user can then mark each of these appearances as correct or wrong.
  • Figure 4: Examples of wrong face centroids: (a) part of an icon, (b) a hand, and (c) face centroids that are not from the same lecturer.
  • Figure 5: Example of how the Average Precision (AP) is computed for a reference video and its recommended videos.
  • ...and 1 more figure
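
Figure 5 concerns how Average Precision (AP) is computed for one reference video, and the reported mAP is the mean of AP over reference videos. The sketch below uses the standard information-retrieval definition, normalizing by the number of relevant items found in the ranked list. That normalization and the function names are assumptions; the paper may define AP slightly differently.

```python
from typing import List


def average_precision(relevance: List[bool]) -> float:
    """AP for one ranked list; relevance[i] is True if the item at rank i+1 is correct."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[List[bool]]) -> float:
    """Mean of AP over all reference videos (one ranked list per reference)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Toy example: the first list has a wrong recommendation at rank 2.
rankings = [[True, False, True], [True, True]]
print(mean_average_precision(rankings))
```

For the first list, AP is (1/1 + 2/3) / 2, and for the second it is 1.0, so a single misplaced recommendation lowers the mean noticeably on small lists.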