Abstract
Multi-Object Tracking (MOT) is mostly implemented as a complex, multi-step tracking-by-detection pipeline that performs object detection, feature extraction, and data association in separate stages. The main challenges of Unmanned Aerial Vehicle (UAV) tracking are complex backgrounds and mutual occlusion. In this work, we propose a simple online UAV tracking method based on the Transformer. The method is an online joint-detection-and-tracking pipeline built on the encoder-decoder mechanism, a new Transformer-based structure that simplifies the complex, multi-step components of previous methods. A set of object queries introduced into the pipeline detects the objects in the current frame, including new targets, accurately, while the object feature queries from the previous frame associate these current objects with the previous ones. The proposed method applies well to small targets such as UAVs and achieves a competitive 64.1% MOTA on the MOT15 challenge dataset.
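The per-frame flow of the two query types described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Query`, `Output`, `track_step`, and `keep_thresh` are hypothetical, the Transformer encoder-decoder is replaced by an arbitrary callable passed in as `decoder`, and the score threshold of 0.5 is an assumed value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h); assumed box format


@dataclass
class Query:
    """One decoder query. track_id is None for an object query (no identity yet)."""
    embedding: List[float]
    track_id: Optional[int] = None


@dataclass
class Output:
    """Assumed decoder output for one query: a box, a confidence, an updated feature."""
    box: Box
    score: float
    embedding: List[float]


# Stand-in for the encoder-decoder: maps queries to one output per query.
Decoder = Callable[[Sequence[Query]], List[Output]]


def track_step(
    decoder: Decoder,
    object_queries: Sequence[Query],
    track_queries: Sequence[Query],
    next_id: Iterator[int],
    keep_thresh: float = 0.5,  # assumed confidence threshold
) -> Tuple[Dict[int, Box], List[Query]]:
    """Process one frame: detect new objects and continue existing tracks jointly."""
    # Previous-frame object features and fresh object queries are decoded together.
    queries = list(track_queries) + list(object_queries)
    outputs = decoder(queries)

    boxes: Dict[int, Box] = {}
    next_track_queries: List[Query] = []
    for query, out in zip(queries, outputs):
        if out.score < keep_thresh:
            continue  # lost track or object query that matched nothing
        # A track query keeps its identity; an object query starts a new track.
        track_id = query.track_id if query.track_id is not None else next(next_id)
        boxes[track_id] = out.box
        # The output feature becomes the track query for the next frame.
        next_track_queries.append(Query(out.embedding, track_id))
    return boxes, next_track_queries
```

In this sketch, association needs no separate matching stage: an identity is carried forward simply because the same query slot is decoded again in the next frame, which is the simplification over multi-step pipelines that the abstract refers to.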