Robotics and Autonomous Systems Series

From Scene Understanding to Vision-Language Joint Understanding

25th May 2023, 13:00, Meeting Room 101, Ashton Building, 1st Floor
Guangliang Cheng
Department of Computer Science, University of Liverpool

Abstract

The field of computer vision has experienced remarkable advancements in recent years, as efforts have been made to enable machines to comprehend visual scenes and derive meaningful information from images and videos. However, traditional approaches to scene understanding have primarily focused on isolated visual analysis, disregarding the wealth of semantic knowledge and contextual understanding that can be acquired by incorporating natural language processing. Consequently, there has been a paradigm shift towards the emerging research area of vision-language joint understanding, which aims to bridge the gap between visual and linguistic modalities in order to achieve a more comprehensive understanding of visual content.

During this presentation, I will provide a brief overview of my previous research on scene understanding and its applications in autonomous driving systems. I will also summarize our most recent research advancements in the field of vision-language joint understanding, highlighting some pioneering works.
