Temporal action localization, which requires a machine to recognize the location as well as the category of action instances in videos, has long been researched in computer vision. The main challen…
The explosive growth in video streaming requires video understanding at high accuracy and low computation cost. Conventional 2D CNNs are computationally cheap but cannot capture temporal relationsh…
Recently, substantial research effort has focused on how to apply CNNs or RNNs to better capture temporal patterns in videos, so as to improve the accuracy of video classification. In this paper, w…