Welcome to the Computer Vision Group at RWTH Aachen University!

The Computer Vision group was established at RWTH Aachen University in the context of the Cluster of Excellence "UMIC - Ultra High-Speed Mobile Information and Communication" and is associated with the Chair Computer Sciences 8 - Computer Graphics, Computer Vision, and Multimedia. The group focuses on computer vision applications for mobile devices and robotic or automotive platforms. Our main research areas are visual object recognition, tracking, self-localization, and 3D reconstruction, and in particular combinations of these topics.

We offer lectures and seminars about computer vision and machine learning.

You can browse through all our publications and the projects we are working on.

We have one paper accepted at the ECCV'18, GMDL Workshop.

Oct. 4, 2018

We won the 1st Large-scale Video Object Segmentation Challenge at ECCV 2018

You can find details on our approach in our short paper.

Sept. 1, 2018

We won the 2018 ECCV PoseTrack Challenge on 3D human pose estimation.

You can find details on our approach in our short paper.

Sept. 1, 2018

We won the DAVIS Challenge on Video Object Segmentation at CVPR 2018

You can find details on our approach in our short paper.

June 18, 2018

We have one paper accepted at the IEEE International Conference on Computer Vision (ICCV) 2017, 3DRMS Workshop.

Sept. 18, 2017

We have two papers accepted at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017.

June 15, 2017

Recent Publications

Know What Your Neighbors Do: 3D Semantic Segmentation of Point Clouds

IEEE European Conference on Computer Vision (ECCV'18), GMDL Workshop

In this paper, we present a deep learning architecture which addresses the problem of 3D semantic segmentation of unstructured point clouds. Compared to previous work, we introduce grouping techniques which define point neighborhoods in the initial world space and the learned feature space. Neighborhoods are important because they make it possible to compute local or global point features, depending on the spatial extent of the neighborhood. Additionally, we incorporate dedicated loss functions to further structure the learned point feature space: the pairwise distance loss and the centroid loss. We show how to apply these mechanisms to the task of 3D semantic segmentation of point clouds and report state-of-the-art performance on indoor and outdoor datasets.
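As a rough illustration of what a neighborhood in world space versus feature space can mean, the sketch below uses a plain k-nearest-neighbor search. This is an assumption made for illustration only: the abstract does not specify the grouping techniques, and the function name `k_nearest_neighbors` and the toy data are hypothetical.

```python
import math


def k_nearest_neighbors(vectors, query_index, k):
    """Return the indices of the k vectors closest to vectors[query_index].

    Assumption: Euclidean distance and brute-force search; the grouping
    used in the paper may differ.
    """
    query = vectors[query_index]
    distances = [
        (math.dist(query, vector), index)
        for index, vector in enumerate(vectors)
        if index != query_index
    ]
    distances.sort()
    return [index for _, index in distances[:k]]


# Hypothetical toy data: 3D positions and 2D "learned" features for four points.
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
features = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1), (0.0, 0.9)]

# The same search applied to two different spaces yields different neighbors.
world_neighbors = k_nearest_neighbors(positions, 0, 1)
feature_neighbors = k_nearest_neighbors(features, 0, 1)
```

In this toy example, point 0 is spatially closest to point 1 but closest in feature space to point 2, which is far away in the scene. This shows how the two kinds of neighborhood can group points differently.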


PReMVOS: Proposal-generation, Refinement and Merging for the DAVIS Challenge on Video Object Segmentation 2018

The 2018 DAVIS Challenge on Video Object Segmentation - CVPR Workshops

We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground truth annotations. Towards this goal, we present the PReMVOS algorithm (Proposal-generation, Refinement and Merging for Video Object Segmentation). This method involves generating coarse object proposals using a Mask R-CNN-like object detector, followed by a refinement network that produces accurate pixel masks for each proposal. We then select and link these proposals over time using a merging algorithm that takes into account an objectness score, the optical flow warping, and a Re-ID feature embedding vector for each proposal. We adapt our networks to the target video domain by fine-tuning on a large set of augmented images generated from the first-frame ground truth. Our approach surpasses all previous state-of-the-art results on the DAVIS 2017 video object segmentation benchmark and achieves first place in the DAVIS 2018 Video Object Segmentation Challenge with a mean J & F score of 74.7.


How Robust is 3D Human Pose Estimation to Occlusion?

IEEE/RSJ Int. Conference on Intelligent Robots and Systems (IROS'18) Workshops

Occlusion is commonplace in realistic human-robot shared environments, yet its effects are not considered in standard 3D human pose estimation benchmarks. This leaves the question open: how robust are state-of-the-art 3D pose estimation methods against partial occlusions? We study several types of synthetic occlusions over the Human3.6M dataset and find a method with state-of-the-art benchmark performance to be sensitive even to low amounts of occlusion. Addressing this issue is key to progress in applications such as collaborative and service robotics. We take a first step in this direction by improving occlusion-robustness through training data augmentation with synthetic occlusions. This also turns out to be an effective regularizer that is beneficial even for non-occluded test cases.
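Training data augmentation with synthetic occlusions can be pictured with a minimal sketch like the one below. It assumes a single filled rectangle pasted at a random position, which is only one possible occlusion type and not necessarily one used in the paper. The function `occlude`, its parameters, and the list-of-rows image representation are hypothetical choices made to keep the example dependency-free.

```python
import random


def occlude(image, max_fraction=0.3, fill=0, rng=random):
    """Return a copy of `image` with one random rectangle set to `fill`.

    `image` is a list of rows of pixel values (an assumed toy format).
    The rectangle covers at most `max_fraction` of each image dimension.
    """
    height, width = len(image), len(image[0])
    rect_h = rng.randint(1, max(1, int(height * max_fraction)))
    rect_w = rng.randint(1, max(1, int(width * max_fraction)))
    top = rng.randint(0, height - rect_h)
    left = rng.randint(0, width - rect_w)

    occluded = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(top, top + rect_h):
        for x in range(left, left + rect_w):
            occluded[y][x] = fill
    return occluded


# Usage: occlude a uniform 8x8 grayscale image with a seeded generator.
image = [[255] * 8 for _ in range(8)]
augmented = occlude(image, rng=random.Random(0))
```

Applying such a function to each training image, while leaving the pose labels unchanged, is one simple way to expose a model to partially hidden bodies during training.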
