PRAM: Place Recognition Anywhere Model for Efficient Visual Localization

University of Cambridge
arXiv 2024

Localization by sparse self-defined landmark recognition
and fast landmark-wise 2D-3D verification

Features

1. Self-defined 3D landmarks - not limited to classic semantic labels
2. Sparse landmark recognition - faster than per-pixel classification
3. Localization by recognition and landmark-wise registration
4. Landmark-wise 3D map sparsification
5. Automatic inlier/outlier identification
6. Flexible support for multi-modality input
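To make the idea of self-defined landmarks (feature 1) concrete, here is a minimal sketch that groups 3D map points into landmarks purely from their geometry. It uses plain k-means as a stand-in; the function name `define_landmarks`, its parameters, and the choice of k-means are assumptions for illustration, not the landmark definition strategy used by PRAM.

```python
import numpy as np

def define_landmarks(points_3d, num_landmarks, num_iters=20, seed=0):
    """Assign each 3D map point to one of `num_landmarks` clusters.

    Hypothetical helper: plain k-means on point coordinates, used here
    only to illustrate landmarks that need no semantic labels.
    """
    points_3d = np.asarray(points_3d, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise cluster centres from distinct map points.
    init_idx = rng.choice(len(points_3d), num_landmarks, replace=False)
    centres = points_3d[init_idx].copy()

    def assign(centres):
        # Distance from every point to every centre: shape (N, K).
        diffs = points_3d[:, None, :] - centres[None, :, :]
        return np.linalg.norm(diffs, axis=2).argmin(axis=1)

    for _ in range(num_iters):
        labels = assign(centres)
        for k in range(num_landmarks):
            members = points_3d[labels == k]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centres[k] = members.mean(axis=0)
    # Final assignment against the updated centres.
    return assign(centres), centres
```

Each returned label acts as a landmark ID for its 3D point, so the label set is defined by the map itself rather than by a fixed list of object classes.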

Abstract

Humans localize themselves efficiently in known environments by first recognizing landmarks defined on certain objects and their spatial relationships, and then verifying the location by aligning detailed structures of recognized objects with those in memory. Inspired by this, we propose the place recognition anywhere model (PRAM) to perform visual localization as efficiently as humans do. PRAM consists of two main components: recognition and registration. First, a self-supervised map-centric landmark definition strategy is adopted, making places in either indoor or outdoor scenes act as unique landmarks. Then, sparse keypoints extracted from images are used as the input to a transformer-based deep neural network for landmark recognition; these keypoints enable PRAM to recognize hundreds of landmarks with high time and memory efficiency. Keypoints, along with their recognized landmark labels, are further used for registration between query images and the 3D landmark map. Unlike previous hierarchical methods, PRAM discards global and local descriptors and reduces storage by over 90%. Since PRAM uses recognition and landmark-wise verification to replace global reference search and exhaustive matching respectively, it runs 2.4 times faster than prior state-of-the-art approaches. Moreover, PRAM opens new directions for visual localization, including multi-modality localization, map-centric feature learning, and hierarchical scene coordinate regression.
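The saving from replacing exhaustive matching with landmark-wise verification can be illustrated by counting candidate 2D-3D pairs. The sketch below is an illustration only: the function `candidate_pairs` and the toy label lists are hypothetical, and it counts candidates rather than performing any matching or pose estimation.

```python
from collections import Counter

def candidate_pairs(query_labels, map_labels):
    """Return (exhaustive, landmark_wise) counts of candidate 2D-3D pairs.

    query_labels: predicted landmark label of each query keypoint.
    map_labels:   landmark label of each 3D map point.
    """
    # Exhaustive matching considers every keypoint against every map point.
    exhaustive = len(query_labels) * len(map_labels)
    # Landmark-wise verification only pairs items that share a label.
    q_counts = Counter(query_labels)
    m_counts = Counter(map_labels)
    landmark_wise = sum(n * m_counts[label] for label, n in q_counts.items())
    return exhaustive, landmark_wise

# Toy example (made-up labels): 4 query keypoints, 6 map points.
query_labels = [0, 0, 1, 2]
map_labels = [0, 1, 1, 1, 3, 3]
print(candidate_pairs(query_labels, map_labels))  # (24, 5)
```

In this toy case the candidate set shrinks from 24 pairs to 5, because keypoints are only compared with map points of the same recognized landmark; the actual gain depends on how evenly points are spread across landmarks.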

Overview


Self-supervised 3D landmark definition

Map sparsification based on 3D landmarks

Sparse landmark recognition robust to long-term changes

BibTeX

        
@article{xue2024pram,
  author  = {Fei Xue and Ignas Budvytis and Roberto Cipolla},
  title   = {PRAM: Place Recognition Anywhere Model for Efficient Visual Localization},
  journal = {arXiv preprint arXiv:2404.07785},
  year    = {2024}
}