Learning-based Axial Video Motion Magnification

Kwon Byung-Ki; Oh Hyun-Bin; Kim Jun-Seong; Hyunwoo Ha; Tae-Hyun Oh

TL;DR

This work proposes a new concept, axial motion magnification, which magnifies decomposed motions along a user-specified direction, and a novel Motion Separation Module that disentangles and magnifies the motion representation along the axes of interest.

Abstract

Video motion magnification amplifies small, invisible motions until they are perceptible, giving humans a spatially dense and holistic understanding of small motions in the scene of interest. This rests on the premise that magnifying small motions makes them more legible. In the real world, however, vibrating objects are often convoluted systems with complex natural frequencies, modes, and directions. Existing motion magnification often fails to improve legibility because intricate motions retain their complex characteristics even after being magnified, which can distract from analyzing them. In this work, we focus on improving legibility by proposing a new concept, axial motion magnification, which magnifies decomposed motions along a user-specified direction. Axial motion magnification applies to tasks where motions along specific axes are critical, because it provides simplified and easily readable motion information. To achieve this, we propose a novel Motion Separation Module that disentangles and magnifies the motion representation along the axes of interest. Furthermore, we build a new synthetic training dataset for the axial motion magnification task. Our proposed method improves the legibility of the resulting motions along chosen axes by adding a new feature: user controllability. Axial motion magnification is the more general concept; thus, our method can be directly adapted to generic motion magnification, where it achieves favorable performance against competing methods.
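To make the idea of magnifying decomposed motions along a user-specified direction concrete, the following is a minimal NumPy sketch on a single 2D displacement vector, not the paper's network. The function name, the argument names, and the convention that a magnified displacement equals (1 + alpha) times the input displacement are assumptions made for illustration.

```python
import numpy as np

def axial_magnify_displacement(d, phi, alpha_phi, alpha_perp=0.0):
    """Magnify a 2D displacement d = (dx, dy) along the axis at angle phi (radians).

    Hypothetical helper: alpha_phi scales the component along the axis of
    interest, alpha_perp scales the perpendicular component.
    """
    d = np.asarray(d, dtype=float)
    c, s = np.cos(phi), np.sin(phi)
    # Rows are the unit vectors of the axis of interest and of its perpendicular.
    R = np.array([[c, s], [-s, c]])
    d_axial = R @ d  # components along phi and phi_perp
    # Assumed convention: magnified component = (1 + alpha) * original component.
    gains = np.array([1.0 + alpha_phi, 1.0 + alpha_perp])
    return R.T @ (gains * d_axial)  # rotate back to x, y coordinates

# Magnify only the horizontal component of a small displacement.
print(axial_magnify_displacement([0.1, 0.2], phi=0.0, alpha_phi=10.0))
```

Under this convention, setting both factors to the same value scales the whole displacement uniformly, which is one way to read the statement that axial magnification generalizes generic motion magnification.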

Paper Structure (38 sections, 11 equations, 20 figures, 1 table)

Introduction
Related Work
Learning-based Axial Motion Magnification
Preliminary -- Generic Motion Magnification
Axial Motion Magnification
Problem Definition
Relationship with Generic Motion Magnification
Neural Networks and Training
Network Architecture
Training Data Generation
Experiments
Implementation Details
Evaluation Setup
Axial Motion Magnification
Qualitative Results
...and 23 more sections

Figures (20)

Figure 1: Importance of axial motion magnification. When identifying faults in rotating machinery, analysis of the vulnerable axial vibration is critical [luo2021analysis]. Existing learning-based methods [oh2018learning, lado2023stb, pan2024self, singh2023multi] amplify motions along all axes, which yields artifacts and hinders the analysis of the vulnerable axial vibration. This motivates our axial motion magnification, which magnifies decomposed motions along a user-specified axis. We magnify the axial vibration only, achieving artifact-free results and legible critical motions. For visualization purposes, we overlay sample trajectories obtained from the Kanade-Lucas-Tomasi (KLT) Tracker [lucas1981iterative].
Figure 2: Proposed architecture. (a) The Encoder extracts features from the input images, and the features are fed to the Texture branch and the Motion Separation Module (MSM). (b) Using weight-shared 1D convolutions, the Shape branch extracts shape representations along the $x$- and $y$-axes. These representations are fed to the projection layer $P^{\phi}$, which generates axial shape representations, i.e., $\textbf{S}^{\phi}_{t}$ and $\textbf{S}^{\phi_\perp}_{t}$. (c) The Manipulator amplifies them by the axial magnification factors, and the inverse projection layer $P^{-\phi}$ re-projects them onto the $x$- and $y$-axes. Finally, the Decoder predicts the axially magnified image from the outputs of both the Texture branch and the MSM.
Figure 3: Synthetic data generation pipeline for axial motion magnification. From the sampled background and foregrounds, each with its own segmentation mask, we compose the previous layer images $\{{\mathbf{L}}^k_{1}\}_{k=1}^{K}$ and masks $\{\boldsymbol{\Omega}^k_{1}\}_{k=1}^{K}$. To generate the next layer images $\{{\mathbf{L}}^k_{2}\}_{k=1}^{K}$ and masks $\{\boldsymbol{\Omega}^k_{2}\}_{k=1}^{K}$, we apply random translations to $\{{\mathbf{L}}^k_{1}\}_{k=1}^{K}$ and $\{\boldsymbol{\Omega}^k_{1}\}_{k=1}^{K}$. Axially magnified layer images $\{\hat{{\mathbf{L}}}^{\phi,k}\}_{k=1}^{K}$ and masks $\{\hat{\boldsymbol{\Omega}}^{\phi,k}\}_{k=1}^{K}$ are also synthesized by translations, but with the axially magnified translation parameters. These images and masks are then superimposed into single images to yield $\textbf{I}_{1}$, $\textbf{I}_{2}$, and $\hat{\textbf{I}}^{\phi}$, respectively. The dataset also includes the angles $\phi$ and the object-wise magnification maps $\boldsymbol{\Lambda}$, generated by superimposing $\{\boldsymbol{\alpha}^{k}\}_{k=1}^{K}$ with $\{\boldsymbol{\Omega}_{1}^k\}_{k=1}^{K}$.
Figure 4: [Left] Imposing an imbalance on a rotor. [Right] Qualitative results in the axial motion magnification scenario. We attach weights to a rotor to impose an imbalance and acquire a rotor imbalance sequence, which exhibits axial vibrations. Then, we amplify only the motion along the rotor's axial direction with the magnification factor $\alpha = 40$, using our method and a modified phase-based method. We also show the magnified result of DMM [oh2018learning] as a reference for generic motion magnification. Our method generates magnified frames without artifacts, and its $x$-$t$ slice shows clearly legible axial vibrations, while the modified phase-based method and DMM both suffer from severe artifacts and show unclear axial vibrations in the $x$-$t$ slice.
Figure 5: Quantitative results in the axial motion magnification scenario. (a) In the subpixel test, our method achieves higher SSIM than the modified phase-based method across all input motion amounts, ranging from $0.04$ to $1.0$. (b) In the noise test, with an input motion of 0.05 pixels, the SSIM gap between our method and the phase-based approach widens as the noise factor rises.
...and 15 more figures
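
The data generation pipeline described in the Figure 3 caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the helper names `axial_translations` and `composite` are hypothetical, layers are grayscale 2D arrays, the magnified translation is assumed to add alpha times the component along the axis of interest, and integer `np.roll` shifts (which wrap around at the border) stand in for the subpixel translations a real dataset would need.

```python
import numpy as np

def axial_translations(t, phi, alpha):
    """Return the next-frame translation and its axially magnified version.

    t is (tx, ty); only the component of t along angle phi is amplified.
    """
    t = np.asarray(t, dtype=float)
    u = np.array([np.cos(phi), np.sin(phi)])  # unit vector of the axis of interest
    t_mag = t + alpha * np.dot(t, u) * u      # assumed magnification convention
    return t, t_mag

def composite(layers, masks, shifts):
    """Superimpose translated layers back to front (integer shifts only)."""
    out = np.zeros_like(layers[0], dtype=float)
    for layer, mask, (dx, dy) in zip(layers, masks, shifts):
        sx, sy = int(round(dx)), int(round(dy))
        shifted_layer = np.roll(layer, (sy, sx), axis=(0, 1))
        shifted_mask = np.roll(mask, (sy, sx), axis=(0, 1))
        # Alpha-composite the shifted layer over what has been drawn so far.
        out = shifted_mask * shifted_layer + (1 - shifted_mask) * out
    return out

# One foreground layer: a single bright pixel moving by (1, 1), magnified along x.
layer = np.zeros((7, 7)); layer[3, 3] = 1.0
mask = layer.copy()
t, t_mag = axial_translations((1, 1), phi=0.0, alpha=2.0)
frame_1 = composite([layer], [mask], [(0, 0)])
frame_2 = composite([layer], [mask], [t])
frame_mag = composite([layer], [mask], [t_mag])
```

Using a separate `alpha` per layer would correspond to the object-wise magnification maps mentioned in the caption.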
