Making a long story short: A Multi-Importance fast-forwarding egocentric videos with the emphasis on relevant objects
Special Issue on Egocentric Vision and Lifelogging Tools of the Journal of Visual Communication and Image Representation (JVCI)
Abstract
The emergence of low-cost, high-quality personal wearable cameras, combined with the increasing storage capacity of video-sharing websites, has evoked a growing interest in first-person videos. Most of these videos are composed of long-running unedited streams, which are usually tedious and unpleasant to watch. State-of-the-art fast-forward methods currently face the challenge of providing an adequate balance between smoothness in visual flow and emphasis on the relevant parts. In this work, we present the Multi-Importance Fast-Forward (MIFF), a fully automatic methodology for fast-forwarding egocentric videos that addresses these challenges. The dilemma of defining what constitutes the semantic information of a video is addressed by a learning process based on the preferences of the user. Results show that the proposed method retains over 3 times more semantic content than state-of-the-art fast-forward methods. Finally, we discuss the need for video stabilization techniques tailored to fast-forwarded egocentric videos.
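To make the multi-importance idea concrete, the sketch below is a minimal illustration, not the authors' implementation: the per-frame relevance weights, the segmentation threshold, and the emphasis factor are all simplified assumptions. It scores each frame by its detected semantic content, splits the stream into semantic and non-semantic segments, and assigns the semantic segments a lower speed-up while rescaling so the whole video still meets the required overall speed-up.

import numpy as np

def semantic_scores(detections_per_frame, weights):
    # Sum confidence-weighted detections (e.g., faces, pedestrians) per
    # frame; `weights` is a hypothetical stand-in for the relevance
    # model learned from the user's preferences.
    return np.array([
        sum(weights.get(label, 0.0) * conf for label, conf in frame)
        for frame in detections_per_frame
    ])

def split_segments(scores, threshold):
    # Group consecutive frames with the same semantic/non-semantic
    # status into contiguous segments: (start, end, is_semantic).
    semantic = scores > threshold
    segments, start = [], 0
    for i in range(1, len(scores) + 1):
        if i == len(scores) or semantic[i] != semantic[start]:
            segments.append((start, i, bool(semantic[start])))
            start = i
    return segments

def allocate_speedups(segments, overall, emphasis=3.0):
    # Semantic segments get a speed-up `emphasis` times lower than
    # non-semantic ones. With rate_i = k * base_i, the output length is
    # sum(len_i / rate_i); solve for k so it equals sum(len_i) / overall.
    lengths = np.array([end - start for start, end, _ in segments], float)
    base = np.array([1.0 if sem else emphasis for _, _, sem in segments])
    k = overall * np.sum(lengths / base) / np.sum(lengths)
    return k * base

def select_frames(segments, rates):
    # Sample each segment at its assigned rate (never below 1, i.e.,
    # never slower than the original video).
    keep = []
    for (start, end, _), rate in zip(segments, rates):
        step = max(rate, 1.0)
        keep.extend(int(t) for t in np.arange(start, end, step))
    return keep

In the paper itself, frame selection is posed as a graph problem over candidate frame transitions rather than the uniform sampling used above, which also means the achieved speed-up is exact rather than approximate under rounding.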
Citation
@article{Silva2018,
  title = {Making a long story short: A Multi-Importance fast-forwarding egocentric videos with the emphasis on relevant objects},
  author = {Michel M. Silva and Washington L. S. Ramos and Felipe C. Chamone and João P. K. Ferreira and Mario F. M. Campos and Erickson R. Nascimento},
  journal = {Journal of Visual Communication and Image Representation},
  volume = {53},
  pages = {55--64},
  year = {2018},
  issn = {1047-3203},
  doi = {10.1016/j.jvcir.2018.02.013}
}
Baselines
We compare the proposed methodology against the following methods:
- EgoSampling – Poleg et al., EgoSampling: Fast-forward and stereo for egocentric videos, CVPR 2015.
- Microsoft Hyperlapse – Joshi et al., Real-time hyperlapse creation via optimal frame selection, ACM Trans. Graph. 2015.
- Stabilized Semantic Fast-Forward (SSFF) – Silva et al., Towards semantic fast-forward and stabilized egocentric videos, EPIC@ECCV 2016.
Datasets
We conducted the experimental evaluation using the following datasets:
- EgoSequences – Poleg et al., EgoSampling: Fast-forward and stereo for egocentric videos, CVPR 2015.
- Semantic Dataset – Silva et al., Towards semantic fast-forward and stabilized egocentric videos, EPIC@ECCV 2016.