Learning a good image prior is a long-term goal for image restoration and manipulation. While existing methods like deep image prior (DIP) capture low-level image statistics, there are still gaps t…
State-of-the-art methods for driving-scene LiDAR-based perception (including point cloud semantic segmentation, panoptic segmentation and 3D detection, etc .) often project the point clouds to 2D s…
Feature reassembly, i.e. feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e.g., residual networks and feature pyramids. Its design …
We present a general and flexible video-level framework for learning action models in videos. This method, called temporal segment network (TSN), aims to model long-range temporal structure with a …