Finetuning for Object Level Open Vocabulary Image Retrieval

Event: AutoSens Europe
Session date: Thursday 9th October, 2025

Hear from:

Guy Heller
Researcher, General Motors

Modern ADAS and autonomous driving systems generate terabytes of sensor data, creating a need for intelligent systems that can retrieve images containing objects of interest described by natural language queries. Such systems have varied applications, including rare-object mining from large-scale unlabeled datasets, targeted data annotation, and streamlined system evaluation. The previous leading approach aggregates OpenAI CLIP features without any adaptation to the target domain, which limits its performance. Our work, “FOR: Finetuning for Object Level Open Vocabulary Image Retrieval” (WACV 2025), addresses this limitation by fine-tuning on a target dataset using closed-set labels while preserving the visual-language association that is crucial for open-set retrieval. FOR combines dedicated CLIP-based architecture elements with a multi-objective training framework. Together, these design choices yield a significant accuracy improvement over the previous state of the art across multiple datasets.
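To make the retrieval task concrete, here is a minimal sketch of object-level ranking over precomputed embeddings. It is not the FOR method or the prior approach described in the session: the toy 3-dimensional vectors stand in for real text and image features, and the function names, the `index` layout, and the max-over-patches scoring rule are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def image_score(query_vec, patch_vecs):
    """Score an image by its best-matching local (patch) embedding.

    Taking the max over patches is an assumption made for illustration:
    it lets a small object drive the score even if it covers little of
    the image.
    """
    return max(cosine(query_vec, p) for p in patch_vecs)


def retrieve(query_vec, index, top_k=2):
    """Return the top_k image ids, ranked by descending object-level score."""
    ranked = sorted(
        index,
        key=lambda img_id: image_score(query_vec, index[img_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical index: image id -> list of local embeddings (toy values).
index = {
    "frame_001": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "frame_002": [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0]],
    "frame_003": [[0.0, 0.0, 1.0], [0.0, 0.9, 0.4]],
}
query = [0.0, 1.0, 0.0]  # hypothetical text embedding of the query
results = retrieve(query, index)
```

In a real pipeline the query vector would come from a text encoder and the per-image vectors from an image encoder that share an embedding space; the ranking step itself would look much the same.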
