Welcome



Welcome to the Computer Vision Group at RWTH Aachen University!

The Computer Vision group was established at RWTH Aachen University in the context of the Cluster of Excellence "UMIC - Ultra High-Speed Mobile Information and Communication" and is associated with the Chair Computer Sciences 8 - Computer Graphics, Computer Vision, and Multimedia. The group focuses on computer vision applications for mobile devices and for robotic or automotive platforms. Our main research areas are visual object recognition, tracking, self-localization, 3D reconstruction, and, in particular, combinations of these topics.

We offer lectures and seminars about computer vision and machine learning.

You can browse through all our publications and the projects we are working on.

News

CVPR'22

We have two papers accepted at the Conference on Computer Vision and Pattern Recognition (CVPR) 2022. Both are selected for oral presentations! Check them out:

Moreover, we have one paper accepted at the Transformers 4 Vision Workshop at CVPR'22. It is selected as a spotlight presentation. Check it out:

March 30, 2022

3DV'21

We have one paper accepted at the International Conference on 3D Vision (3DV) 2021:

Oct. 11, 2021

CVPR'21

Our work on 3D multi-object reconstruction from a single image was accepted at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021. Check it out:

June 12, 2021

IJCV'20

We are excited to share that our paper HOTA: A Higher Order Metric for Evaluating Multi-object Tracking has been accepted for publication in the International Journal of Computer Vision (IJCV'20).

Nov. 3, 2020

WACV'21

We have one paper accepted at the 2021 Winter Conference on Applications of Computer Vision (WACV ’21).

Nov. 2, 2020

We won the ECCV2020 "3D Poses in the Wild" Challenge!

For our approach, see our MeTRAbs paper, accepted for publication in the IEEE T-BIOM special journal issue "Selected Best works on Automatic Face and Gesture Recognition 2020", and check out the code on GitHub.

Aug. 23, 2020

Recent Publications

HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 (Oral)

Existing state-of-the-art methods for Video Object Segmentation (VOS) learn low-level pixel-to-pixel correspondences between frames to propagate object masks across video. This requires a large amount of densely annotated video data, which is costly to annotate, and largely redundant since frames within a video are highly correlated. In light of this, we propose HODOR: a novel method that tackles VOS by effectively leveraging annotated static images for understanding object appearance and scene context. We encode object instances and scene information from an image frame into robust high-level descriptors which can then be used to re-segment those objects in different frames. As a result, HODOR achieves state-of-the-art performance on the DAVIS and YouTube-VOS benchmarks compared to existing methods trained without video annotations. Without any architectural modification, HODOR can also learn from video context around single annotated video frames by utilizing cyclic consistency, whereas other methods rely on dense, temporally consistent annotations.

Opening up Open World Tracking

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 (Oral)

Tracking and detecting any object, including ones never seen before during model training, is a crucial but elusive capability of autonomous systems. An autonomous agent that is blind to never-seen-before objects poses a safety hazard when operating in the real world, and yet this is how almost all current systems work. One of the main obstacles towards advancing tracking any object is that this task is notoriously difficult to evaluate. A benchmark that would allow us to perform an apples-to-apples comparison of existing efforts is a crucial first step towards advancing this important research field. This paper addresses this evaluation deficit and lays out the landscape and evaluation methodology for detecting and tracking both known and unknown objects in the open-world setting. We propose a new benchmark, TAO-OW: Tracking Any Object in an Open World, analyze existing efforts in multi-object tracking, and construct a baseline for this task while highlighting future challenges. We hope to open a new front in multi-object tracking research that will bring us a step closer to intelligent systems that can operate safely in the real world.

M2F3D: Mask2Former for 3D Instance Segmentation

Transformers For Vision Workshop
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 (Spotlight)

In this work, we show that the top-performing Mask2Former approach for image-based segmentation tasks works surprisingly well when adapted to the 3D scene understanding domain. Current 3D semantic instance segmentation methods rely largely on predicting centers followed by clustering approaches, and little progress has been made in applying transformer-based approaches to this task. We show that with small modifications to the Mask2Former approach for 2D, we can create a 3D instance segmentation approach without the need for highly 3D-specific components or carefully hand-engineered hyperparameters. Initial experiments on the ScanNet benchmark are very promising and set a new state of the art on the ScanNet test set (+0.4 mAP50).
