In this paper, we propose recurrent neural network (RNN) based incremental processing for the spoken language understanding (SLU) task of intent detection. The proposed methodology offers lower latency than a typical SLU system, without any significant reduction in accuracy. We introduce and analyze different recurrent neural network architectures for incremental, online processing of ASR transcripts and compare them to existing offline systems.
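A minimal sketch of the incremental idea (not the authors' implementation; vocabulary size, intent count, and layer sizes are placeholders): a GRU consumes ASR tokens one at a time and emits an intent distribution after every token, so a decision can be taken before the utterance is complete.

```python
# Toy sketch of incremental intent detection with an RNN (PyTorch).
import torch
import torch.nn as nn

class IncrementalIntentRNN(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_intents=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRUCell(embed_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_intents)
        self.hidden_dim = hidden_dim

    def init_state(self):
        return torch.zeros(1, self.hidden_dim)

    def step(self, token_id, state):
        # Consume one ASR token; return the updated state and the current intent posterior.
        emb = self.embed(torch.tensor([token_id]))
        state = self.rnn(emb, state)
        return state, self.classifier(state).softmax(dim=-1)

model = IncrementalIntentRNN()
state = model.init_state()
for token_id in [12, 45, 7, 301]:            # partial ASR transcript, token by token
    state, intent_probs = model.step(token_id, state)
    # An early decision could be made here once intent_probs is confident enough.
```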
The use of deep generative models for image compression has led to impressive performance gains over classical codecs, while neural video compression is still in its infancy. Here, we propose an end-to-end deep generative modeling approach to compressing temporal sequences, with a focus on video. Our approach builds on variational autoencoder (VAE) models for sequential data and combines them with recent work on neural image compression.
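As an illustration only (a heavily simplified stand-in for a sequential VAE codec, with no KL term or entropy model shown, and layer sizes chosen arbitrarily): each frame is encoded into a latent, quantized by rounding, and decoded conditioned on the previous frame's latent so temporal redundancy can be exploited.

```python
# Toy sketch of sequential latent-variable video compression (PyTorch).
import torch
import torch.nn as nn

class SequentialVAECodec(nn.Module):
    def __init__(self, frame_channels=3, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(frame_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_dim, 4, stride=2, padding=1),
        )
        # The previous frame's latent is concatenated as temporal conditioning.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(2 * latent_dim, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, frame_channels, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        self.latent_dim = latent_dim

    def compress_frame(self, frame, prev_latent):
        code = torch.round(self.encoder(frame))          # hard quantization at test time
        recon = self.decoder(torch.cat([code, prev_latent], dim=1))
        return recon, code

codec = SequentialVAECodec()
video = torch.rand(8, 3, 64, 64)                          # toy clip: 8 RGB frames of 64x64
prev = torch.zeros(1, codec.latent_dim, 16, 16)           # latent grid after two stride-2 convs
for frame in video:
    recon, prev = codec.compress_frame(frame.unsqueeze(0), prev)
```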
We present a system for fast and robust handovers with a robot character, together with a user study investigating the effect of robot speed and reaction time on perceived interaction quality. The system can match and exceed human speeds, and the study confirms that users prefer human-level timing.
We present a light field video synthesis technique that achieves accurate reconstruction from a low-cost, wide-baseline camera rig. Our system, INDiuM, integrates optical flow with methods for rectification, disparity estimation, and feature extraction, which we then feed to a neural network view synthesis solver with wide-baseline capability. A new bi-directional warping approach resolves reprojection ambiguities that would result from either backward or forward warping alone. The system and method enable the use of off-the-shelf surveillance camera hardware in a simplified and expedited capture workflow. A thorough analysis of the refinement process and the resulting view synthesis accuracy over the state of the art is provided.
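A hedged sketch of the bi-directional warping idea (function and variable names are ours, not from the paper): the target view is synthesized from both a backward-warped and a forward-warped source, and a per-pixel reprojection error decides which candidate wins where one of them is ambiguous, e.g. at disocclusions.

```python
# Toy per-pixel fusion of backward- and forward-warped candidates (NumPy).
import numpy as np

def fuse_bidirectional(backward_warp, forward_warp, backward_err, forward_err):
    """Pick, per pixel, the warped candidate with the lower reprojection error."""
    use_backward = backward_err <= forward_err
    return np.where(use_backward[..., None], backward_warp, forward_warp)

h, w = 4, 6
bw = np.random.rand(h, w, 3)      # candidate image from backward warping
fw = np.random.rand(h, w, 3)      # candidate image from forward warping
bw_err = np.random.rand(h, w)     # per-pixel reprojection error of each candidate
fw_err = np.random.rand(h, w)
fused = fuse_bidirectional(bw, fw, bw_err, fw_err)
```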
Recognition of emotional expressions is important for enabling decision making in autonomous agents and systems designed to interact with humans. In this paper, we present our experience in developing a software component for smile intensity detection in multiparty interaction. First, the deep learning architecture and training process are described in detail. This is followed by an analysis of the results obtained from testing the trained network. Finally, we outline the steps taken to implement and visualize this network in a real-time software component.
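A minimal sketch, assuming a small convolutional regressor as a stand-in for the described network (the actual architecture and preprocessing are not reproduced here): a face crop is mapped to a smile-intensity score in [0, 1], the kind of per-frame output a real-time visualization component would consume.

```python
# Toy smile-intensity regressor (PyTorch).
import torch
import torch.nn as nn

class SmileIntensityNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, face_crop):
        return self.head(self.features(face_crop))

net = SmileIntensityNet().eval()
with torch.no_grad():
    face = torch.rand(1, 3, 64, 64)           # one cropped, normalized face per participant
    intensity = net(face).item()              # scalar in [0, 1], refreshed every frame
```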
We present an event-based interactive storytelling system for virtual 3D environments that aims to offer free-form user experiences while constraining the narrative to follow author intent.
We developed PICA: a proactive intelligent conversational agent for interactive narratives that can guide users through such experiences.