Deep Generative Video Compression
The use of deep generative models for image compression has led to impressive performance gains over classical codecs, while neural video compression is still in its infancy. Here, we propose an end-to-end, deep generative modeling approach to compress temporal sequences with a focus on video. Our approach builds upon variational autoencoder (VAE) models for sequential data and combines them with recent work on neural image compression.
Towards a Natural Motion Generator: a Pipeline to Control a Humanoid based on Motion Data
Light Field Video Synthesis Using Inexpensive Surveillance Camera Systems
We present a light field video synthesis technique that can achieve accurate reconstruction given a low-cost, wide-baseline camera rig. Our system, called INDiuM, integrates optical flow in a novel way with methods for rectification, disparity estimation, and feature extraction, which we then feed to a neural network view synthesis solver with wide-baseline capability. A new bi-directional warping approach resolves reprojection ambiguities that would result from either backward or forward warping only. The system and method enable the use of off-the-shelf surveillance camera hardware in a simplified and expedited capture workflow. A thorough analysis of the refinement process and of the resulting view synthesis accuracy relative to the state of the art is provided.
Normalized Cut Loss for Weakly-supervised CNN Segmentation
Our normalized cut loss approach to segmentation brings the quality of weakly-supervised training significantly closer to fully supervised methods.
Multi-Spectral Material Classification in Landscape Scenes Using Commodity Hardware
We investigate the advantages of a stereo, multi-spectral acquisition system for material classification in ground-level landscape images.
Underwater 3D Capture using a Low-Cost Commercial Depth Camera
This paper presents underwater 3D capture using a commercial depth camera.
Motion Fields to Predict Play Evolution in Dynamic Sport Scenes
Videos of multi-player team sports provide a challenging domain for dynamic scene analysis. Player actions and interactions are complex as they are driven by many factors, such as the short-term goals of the individual player, the overall team strategy, the rules of the sport, and the current context of the game.
Jointly Summarizing Large-Scale Web Images and Videos for the Storyline Reconstruction
In this paper, we address the problem of jointly summarizing large-scale Flickr images and YouTube user videos.
Distinguishing Texture Edges from Object Boundaries in Video
We address the ambiguity between texture edges and object boundaries by introducing a simple, low-level, patch-consistency assumption that leverages the extra information present in video data.