Video synchronization is a fundamental step for many applications in computer vision, ranging from video morphing to motion analysis. We present a novel method for synchronizing action videos in which a similar action is performed by different people at different times and in different locations, with different local speed changes, e.g., in sports such as weightlifting and baseball pitching, or in dance. Our approach extends the popular “snapping” tool of video editing software and allows users to automatically snap action videos together in a timeline based on their content. Since the action can take place at different locations, existing appearance-based methods are not appropriate. Our approach leverages motion information and computes a nonlinear synchronization of the input videos to establish frame-to-frame temporal correspondences. We demonstrate that our approach can be applied to video synchronization, video annotation, and action snapshots. Our approach has been successfully evaluated with ground-truth data and a user study.
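To make the notion of a nonlinear synchronization concrete, the following is a minimal sketch using classical dynamic time warping (DTW) on per-frame feature vectors. The abstract does not specify the alignment algorithm, so DTW here is a stand-in and not necessarily the authors' method. The function name `dtw_path` and the idea of feeding it one motion descriptor per frame are assumptions made for illustration.

```python
import numpy as np


def dtw_path(a, b):
    """Align two sequences of per-frame descriptors with classical DTW.

    a, b: array-likes with one row (or scalar) per frame. These are
    hypothetical motion descriptors, not the features used in the paper.
    Returns a list of (i, j) pairs: frame i of `a` matches frame j of `b`.
    """
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    n, m = len(a), len(b)

    # Pairwise Euclidean distance between every frame of a and every frame of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] = minimal accumulated cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # advance both videos
                acc[i - 1, j],      # advance a only (b is locally slower)
                acc[i, j - 1],      # advance b only (a is locally slower)
            )

    # Backtrack from the last frame pair to the first to recover the path.
    i, j = n, m
    path = [(n - 1, m - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return path


# Toy example: the second sequence performs the same "action" with pauses.
fast = [0, 1, 2, 3]
slow = [0, 0, 1, 2, 2, 3]
correspondences = dtw_path(fast, slow)
```

The returned path is monotone and covers every frame of both inputs, so repeated indices on one side mark where that video is locally faster than the other. Because the sketch compares descriptors rather than raw pixels, it only reflects the abstract's motivation if the descriptors themselves are motion-based and insensitive to scene appearance.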