Browsing by Subject "multi-object tracking"

  • Pelvo, Nasti (2024)
    Object detection and multi-object tracking are crucial components of computer vision systems aiming for comprehensive scene understanding and reliable autonomous decision-making. While methods developed for visual input data are widely studied, they are susceptible to environmental factors such as poor lighting and weather conditions. Thermal imaging, on the other hand, is robust against most adverse environmental conditions and thus presents an intriguing alternative to visual photography. Due to the characteristics of thermal images, current state-of-the-art object detection and tracking methods perform poorly when presented with thermal input. Open-source thermal data for training large neural network models is not widely available: existing datasets are small and homogeneous, and the resulting models lack the generalizability required for their application to real-world input data. The effect is especially relevant for transformer-based methods, which exhibit a lack of visual inductive bias and thus require large-scale training. This thesis presents the first in-depth literature review and experimental study into transformer-based object detection and tracking on challenging thermal and aerial data. By conducting an analysis of existing transformer-based multi-object tracking methods, we argue for the application of the joint detection and tracking paradigm, where multi-object tracking is treated as an end-to-end problem. Our experiments on two transformer-based multi-object tracking models confirm that fully exploiting multi-frame input can increase the stability of object detection and enforce robustness against the domain issues prevalent in thermal images. Due to the high training data requirement of transformers, the methods are, however, held back by the lack of open-source training data. We thus introduce two novel data augmentation techniques which aim to supplement and diversify existing training data, and thus improve the transferability of detection and tracking methods between the visual and thermal domains.