We present several mixed-reality remote collaboration settings built with consumer head-mounted displays: an AR system linked with another AR system, a VR system with a virtual body, a VR system without a virtual body, and a desktop computer.
In this paper, we propose a depth image and video codec based on block compression that exploits typical characteristics of depth streams, drawing inspiration from S3TC texture compression and geometric wavelets.
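The paper's codec is not specified here, but the S3TC-style block idea it cites can be illustrated with a minimal sketch: each 4x4 depth tile stores only a min/max endpoint pair plus a small per-pixel index interpolating between them (similar to BC4 single-channel compression). The function names and the 3-bit index width are illustrative assumptions, not the paper's design.

```python
import numpy as np

def compress_block(block, bits=3):
    """Compress a 4x4 depth tile BC4/S3TC-style: keep the (min, max)
    endpoints and a per-pixel index quantizing each depth between them.
    Illustrative sketch only; not the codec proposed in the paper."""
    lo, hi = float(block.min()), float(block.max())
    levels = (1 << bits) - 1
    if hi == lo:
        idx = np.zeros(block.shape, dtype=np.uint8)
    else:
        idx = np.round((block - lo) / (hi - lo) * levels).astype(np.uint8)
    return lo, hi, idx

def decompress_block(lo, hi, idx, bits=3):
    """Reconstruct depths by linear interpolation between the endpoints."""
    levels = (1 << bits) - 1
    return lo + idx.astype(np.float32) / levels * (hi - lo)

# A smooth 4x4 depth tile, typical of depth streams (piecewise-smooth data).
depth = np.array([[1.00, 1.02, 1.05, 1.10],
                  [1.01, 1.03, 1.06, 1.11],
                  [1.02, 1.04, 1.08, 1.12],
                  [1.03, 1.05, 1.09, 1.14]], dtype=np.float32)
lo, hi, idx = compress_block(depth)
recon = decompress_block(lo, hi, idx)
max_err = float(np.abs(recon - depth).max())
```

Because depth tiles are usually smooth, the min/max range is narrow and the quantization error stays below half a quantization step.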
We propose Paxel, a generic framework for modeling the interaction between a projector and a surface bearing a high-frequency pattern.
We propose an end-to-end solution for presenting movie-quality animated graphics to the user while preserving the sense of presence afforded by free-viewpoint head motion.
In this paper, we propose to learn patch-level appearance measures that are combined using deformable models.
We evaluated the proposed methods on more than ten multi-projection datasets, ranging from a toy castle setup with three cameras and one projector to a half-dome display system with more than 30 devices.
The proposed one-shot learning method achieves performance competitive with fully supervised methods while using only a single example rather than the hundreds such methods require.
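The specific one-shot method is not given here; the simplest instance of the idea is a nearest-exemplar classifier, where each class is represented by the embedding of its single labeled example and a query is matched by cosine similarity. The function name, the toy 3-dimensional embeddings, and the class labels below are illustrative assumptions.

```python
import numpy as np

def one_shot_classify(query, exemplars):
    """Nearest-exemplar one-shot baseline: each class is defined by ONE
    labeled embedding; the query takes the label of the most cosine-similar
    exemplar. Illustrative sketch, not the paper's method."""
    def unit(v):
        return v / (np.linalg.norm(v) + 1e-9)
    q = unit(query)
    best_label, best_sim = None, -np.inf
    for label, ex in exemplars.items():
        sim = float(q @ unit(ex))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label

# One embedding per class -- the "single example" of the one-shot setting.
exemplars = {"cat": np.array([1.0, 0.1, 0.0]),
             "dog": np.array([0.0, 1.0, 0.2])}
pred = one_shot_classify(np.array([0.9, 0.2, 0.1]), exemplars)
```

With good embeddings, this single-example baseline is exactly the regime the sentence contrasts against hundreds of supervised examples.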
We propose a weakly supervised approach that takes image-sentence pairs as input and learns to visually ground (i.e., localize) arbitrary linguistic phrases in the form of spatial attention masks.
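The grounding architecture is not detailed here, but the output it describes, a spatial attention mask over an image for a phrase, can be sketched with a toy head: score every spatial cell of a feature map against a phrase embedding by dot product, then softmax over positions. The feature-map size, the planted alignment at cell (2, 3), and the function name are assumptions for illustration.

```python
import numpy as np

def ground_phrase(feature_map, phrase_vec):
    """Toy grounding head: dot-product score of each spatial cell against
    the phrase embedding, softmaxed over all positions to give a spatial
    attention mask that sums to 1. Illustrative, not the paper's model."""
    h, w, d = feature_map.shape
    scores = feature_map.reshape(h * w, d) @ phrase_vec
    scores -= scores.max()              # numerical stability for softmax
    mask = np.exp(scores)
    mask /= mask.sum()
    return mask.reshape(h, w)

rng = np.random.default_rng(0)
fmap = rng.normal(size=(4, 4, 8)).astype(np.float32)
fmap[2, 3] *= 3.0                       # plant one cell aligned with the phrase
phrase = fmap[2, 3].copy()              # phrase embedding matching cell (2, 3)
mask = ground_phrase(fmap, phrase)
peak = np.unravel_index(int(mask.argmax()), mask.shape)
```

The mask peaks at the planted cell, which is the localization behavior the sentence describes; a weakly supervised model would have to learn such alignment from image-sentence pairs alone.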
In this paper, we study non-linear tensor factorization methods based on deep variational autoencoders.
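A full variational autoencoder is beyond a short sketch, but the core idea, per-mode latent factors combined through a nonlinear decoder rather than a multilinear product, can be shown in miniature: fit X ≈ sigmoid(U Vᵀ) by gradient descent with hand-derived gradients. The sizes, learning rate, and sigmoid decoder are assumptions; a VAE would add an inference network and a KL regularizer.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Synthetic data: low-rank factors pushed through a sigmoid nonlinearity.
m, n, r = 8, 6, 2
U_true = rng.normal(size=(m, r))
V_true = rng.normal(size=(n, r))
X = sigmoid(U_true @ V_true.T)

def loss(U, V):
    """Mean squared reconstruction error of the nonlinear factorization."""
    return float(np.mean((sigmoid(U @ V.T) - X) ** 2))

# Gradient descent on X ~ sigmoid(U V^T); gradients derived by hand.
U = 0.1 * rng.normal(size=(m, r))
V = 0.1 * rng.normal(size=(n, r))
lr = 10.0
loss_init = loss(U, V)
for _ in range(5000):
    S = sigmoid(U @ V.T)
    G = 2.0 * (S - X) * S * (1.0 - S) / (m * n)   # dLoss / d(U V^T)
    U, V = U - lr * (G @ V), V - lr * (G.T @ U)   # simultaneous update
loss_final = loss(U, V)
```

Replacing the fixed sigmoid decoder with a deep network, and the point estimates of the factors with variational posteriors, yields the VAE-based nonlinear tensor factorization the sentence refers to.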