As a challenging task of high-level video understanding, weakly supervised temporal action localization has attracted more attention recently. With only video-level category labels, this task shoul…
In the task of pedestrian trajectory prediction, social interaction could be one of the most complicated factors since it is difficult to be interpreted through simple rules. Recent studies have sh…
Multi-scale representations deeply learned via convolutional neural networks have shown tremendous importance for various pixel-level prediction problems. In this paper we present a novel approach …
Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods. The commonly occurred misalignment come…
Spatio-temporal action localization consists of three levels of tasks: spatial localization, action classification, and temporal localization. In this work, we propose a new progressive cross-strea…
Traditional video compression approaches build upon the hybrid coding framework with motion-compensated prediction and residual transform coding. In this paper, we propose the first end-to-end deep…
This paper proposes a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN), which uses the domain-collaborative and domain-adversarial learning strategies …