Selfie Taking with Facial Expression Recognition Using Omni-directional Camera
Kazutaka Kiuchi, Shimpei Imamura, Norihiko Kawai
TL;DR
The paper addresses group selfies for visually impaired users. The user records a few seconds of video with an omni-directional camera; the method detects faces in every frame, removes false detections and interpolates missed ones using consistency across frames, and applies facial expression recognition to select the frame in which happiness is uniformly high across participants, scored as $H = M - cV$. A 3D-to-2D transformation then maps the selected omni-directional frame to a perspective projection image whose angle of view contains all participants. The key contributions are a mean-shift based approach to prune false faces and interpolate missing ones, a frame-selection criterion based on facial expressions, and a geometric transformation that ensures all faces appear in the resulting image. Experiments on scenes with different numbers of people are used to demonstrate the method's effectiveness as a practical approach to accessible group selfies with omni-directional imaging.
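The frame-selection score $H = M - cV$ can be illustrated with a minimal sketch. The summary does not define $M$, $V$, or $c$; the code below assumes $M$ is the mean and $V$ the population variance of the per-face happiness values in a frame, and that $c$ is a non-negative weight. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def happiness_score(face_scores, c=1.0):
    """Score one frame as H = M - c*V.

    Assumption: M is the mean and V the population variance of the
    per-face happiness values; c weights the penalty for uneven expressions.
    """
    return mean(face_scores) - c * pvariance(face_scores)


def select_frame(frames, c=1.0):
    """Return the index of the frame with the highest H."""
    return max(range(len(frames)), key=lambda i: happiness_score(frames[i], c))


# Hypothetical per-face happiness values (one list per frame).
frames = [
    [0.9, 0.1, 0.8],  # two happy faces, one not: high variance
    [0.7, 0.7, 0.6],  # everyone moderately happy: low variance
    [0.2, 0.3, 0.2],  # uniformly low happiness
]
best = select_frame(frames)
print(best)
```

Under these assumptions the variance term penalizes frames in which only some participants look happy, so the second frame is preferred over the first even though their means are close.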
Abstract
Recent studies have shown that visually impaired people want to take selfies, just as sighted people do, to record photos of themselves and share them with others. Although support applications using sound and vibration have been developed to help visually impaired people take selfies with smartphone cameras, it is still difficult to capture everyone within the angle of view, and it is also difficult to confirm that everyone has a good expression in the photo. To mitigate these issues, we propose a method for taking selfies with multiple people using an omni-directional camera. Specifically, a user takes a few seconds of video with an omni-directional camera, and face detection is then performed on all frames. The proposed method then eliminates false face detections and fills in undetected faces by considering consistency across all frames. After performing facial expression recognition on all frames, the proposed method extracts the frame in which the participants are happiest and generates, from that omni-directional frame, a perspective projection image in which all participants are within the angle of view. In experiments, we use several scenes with different numbers of people to demonstrate the effectiveness of the proposed method.
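One way to picture the final step is to choose a viewing direction and field of view that contain every detected face. The sketch below is an illustration only and is not the paper's transformation: it assumes each face position is given as a (longitude, latitude) pair in radians on the viewing sphere, takes the normalized sum of the face directions as the view axis, and adds a hypothetical angular margin. All names are introduced for this example.

```python
import math


def to_unit_vector(lon, lat):
    """Convert spherical angles (radians) to a 3D unit direction."""
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def view_for_faces(face_dirs, margin=math.radians(10)):
    """Return (axis, fov): a view axis and full field of view in radians
    that contain all face directions, with an assumed safety margin."""
    vecs = [to_unit_vector(lon, lat) for lon, lat in face_dirs]
    sx, sy, sz = (sum(v[i] for v in vecs) for i in range(3))
    norm = math.sqrt(sx * sx + sy * sy + sz * sz)
    if norm < 1e-9:
        raise ValueError("faces surround the camera; no single view axis")
    axis = (sx / norm, sy / norm, sz / norm)
    # Largest angle between the axis and any face direction.
    max_angle = max(
        math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(axis, v)))))
        for v in vecs
    )
    fov = 2 * (max_angle + margin)
    if fov >= math.pi:
        raise ValueError("faces too spread out for one perspective image")
    return axis, fov


# Hypothetical example: two faces 30 degrees left and right of the camera front.
axis, fov = view_for_faces([(math.radians(-30), 0.0), (math.radians(30), 0.0)])
print(axis, math.degrees(fov))
```

A perspective image cannot cover 180 degrees or more, so this sketch raises an error when the participants are spread too widely around the camera rather than guessing how such a case should be handled.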
