Abstract: Scene classification in very high-resolution (VHR) remote sensing (RS) images is a challenging task due to the complex and diverse content of the images. Recently, convolutional neural ...
Abstract: Dense video captioning requires localization and description of multiple events in long videos. Prior works detect events in videos relying solely on the visual content and completely ignore ...