Image Descriptors
Datasets
Copyright Notice
The datasets available for download on this page are published under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License. This means you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not use the material for commercial purposes.
Kinect 1 Sequences (38 MB) : This dataset contains six real-world objects under varying deformation levels and illumination changes. The RGB-D images were acquired at 640 x 480 resolution with a Kinect 1 sensor. Each image has approximately 50 manually annotated keypoints.
Simulation (26 MB) : This dataset is composed of simulated RGB-D sequences (640 x 480 pixels) generated with a cloth physics engine. Several textured cloths are subjected to challenging non-rigid deformations, illumination, rotation, and scale changes. The keypoints in these sequences are selected by Harris score in the first (reference) texture image, and their exact correspondences over time are tracked in the simulation, as sketched below.
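For illustration only, the keypoint selection step can be approximated with OpenCV's Harris-based corner ranking on the reference texture image. The file name and the keypoint budget below are assumptions, not the values used to build the dataset.

import cv2

# Rough reproduction of the selection step described above: pick the
# strongest Harris corners in the first (reference) texture image.
# File name and parameter values are illustrative assumptions.
ref = cv2.imread("simulation/texture_000.png", cv2.IMREAD_GRAYSCALE)
corners = cv2.goodFeaturesToTrack(
    ref,
    maxCorners=50,           # illustrative keypoint budget
    qualityLevel=0.01,       # relative Harris-score threshold
    minDistance=10,          # enforce spatial spread between keypoints
    useHarrisDetector=True,  # rank candidates by Harris score
    k=0.04,
)
keypoints = [tuple(pt.ravel()) for pt in corners]  # list of (u, v) coordinates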
Extended Dataset (Proposed in CVIU’22):
Kinect 2 Sequences (1.1 GB) : This dataset contains five additional real-world objects acquired with a Kinect 2 sensor at 1920 x 1080 resolution. We provide image sequences for each of the five objects with three levels of deformation: light, medium, and heavy. Eighty accurate pointwise correspondences are automatically obtained with a motion capture system.
Dataset File Format: All datasets follow the same format: color images are stored as 8-bit PNG files, and depth images are stored as 16-bit PNG files with values in millimetres. The intrinsics.xml file contains the intrinsic parameters of the camera, allowing the reconstruction of the point cloud. Each image also has a corresponding .csv file, where each line consists of a keypoint number (ID), its 2D image coordinates, and a boolean flag indicating whether the keypoint is visible in the current frame. The keypoints are selected in the reference image; therefore, all keypoints are visible in the reference frame.
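The sketch below shows one possible way to load a frame with OpenCV and NumPy: back-projecting the 16-bit depth image to a point cloud using the provided intrinsics and parsing the per-frame keypoint .csv. The file names, the intrinsics.xml node name, and the exact .csv column order are assumptions made for illustration.

import csv
import cv2
import numpy as np

# Illustrative paths; actual file names depend on the sequence layout.
color = cv2.imread("sequence/color_000.png", cv2.IMREAD_COLOR)      # 8-bit color
depth = cv2.imread("sequence/depth_000.png", cv2.IMREAD_UNCHANGED)  # 16-bit, millimetres

# intrinsics.xml is assumed readable with OpenCV's FileStorage; the node
# name "intrinsics" is a guess and may differ in the released files.
fs = cv2.FileStorage("sequence/intrinsics.xml", cv2.FILE_STORAGE_READ)
K = fs.getNode("intrinsics").mat()
fs.release()
fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

# Back-project every valid depth pixel to a 3D point (in metres).
v, u = np.nonzero(depth > 0)
z = depth[v, u].astype(np.float32) / 1000.0  # mm -> m
x = (u - cx) * z / fx
y = (v - cy) * z / fy
cloud = np.stack([x, y, z], axis=1)          # (N, 3) point cloud

# Keypoint annotations: one row per keypoint, assumed to be
# (id, u, v, visible) with no header row.
keypoints = []
with open("sequence/color_000.csv") as f:
    for row in csv.reader(f):
        kp_id, ku, kv, visible = int(row[0]), float(row[1]), float(row[2]), int(row[3])
        keypoints.append((kp_id, ku, kv, bool(visible)))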
Dense TPS warps: In addition to the pixel-accurate landmarks contained in the .csv files, we also provide a dense thin-plate spline (TPS) warp from the reference image to each target frame. The dense TPS warps are obtained by first using the landmarks as control points for a coarse TPS estimate, which is then progressively refined by minimizing a photometric cost. This script demonstrates how to use the TPS warp files to generate ground-truth correspondences for SIFT-detected keypoints; a minimal usage sketch is also shown after the download links below.
Kinect 1 (TPS files, 2.9 MB) | Kinect 2 (TPS files, 43 MB) | Simulation (TPS files, 4.0 MB)
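Assuming the TPS warp has been converted to a dense per-pixel coordinate map (the actual file format is defined by the script linked above), ground-truth correspondences for SIFT keypoints could be generated roughly as follows. The warp file name and its (H, W, 2) layout are assumptions.

import cv2
import numpy as np

# Hypothetical loader: we assume the warp has been exported as a dense
# (H, W, 2) array mapping each reference pixel (u, v) to (u', v') in the
# target frame. This is NOT the released TPS file format.
warp = np.load("tps/ref_to_frame_010.npy")

ref = cv2.imread("sequence/color_000.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
kps = sift.detect(ref, None)

# Ground-truth correspondence for each detected keypoint: look up the
# dense warp at the (rounded) keypoint location in the reference image.
matches = []
for kp in kps:
    u, v = int(round(kp.pt[0])), int(round(kp.pt[1]))
    if 0 <= v < warp.shape[0] and 0 <= u < warp.shape[1]:
        u_t, v_t = warp[v, u]
        matches.append(((kp.pt[0], kp.pt[1]), (float(u_t), float(v_t))))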
Publications
[NeurIPS 2021] Guilherme Potje, Renato Martins, Felipe Chamone, and Erickson R. Nascimento. Extracting Deformation-Aware Local Features by Learning to Deform. Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS), 2021.
Visit the page for more information and paper access.
[ICCV 2019] Erickson R. Nascimento, Guilherme Potje, Renato Martins, Felipe Chamone, Mario F. M. Campos, and Ruzena Bajcsy. GEOBIT: A Geodesic-Based Binary Descriptor Invariant to Non-Rigid Deformations for RGB-D Images. IEEE International Conference on Computer Vision (ICCV), 2019.
Visit the page for more information and paper access.
Acknowledgments
This project is supported by CAPES, CNPq, and FAPEMIG.