Real-Time Multiple Object Tracking with Hierarchical Attention

Bashar, Mk; Islam, Samia; Hussain, Kashifa Kawaakib

URI: http://hdl.handle.net/123456789/2060

Date: 2023-05-30

Abstract:

Multiple object tracking (MOT) is a crucial task in computer vision, with applications in fields such as surveillance, robotics, and autonomous systems. Accurate MOT is essential for maintaining situational awareness in complex environments, which requires detecting and tracking objects accurately in real time. In this paper, we present a novel approach to MOT that combines joint detection and embedding (JDE), which detects and identifies multiple objects simultaneously, with a Swin Transformer for multi-scale feature extraction. The Swin Transformer, a variant of the popular Transformer architecture, extracts rich, multi-scale features from the input in linear time complexity, enabling our method to handle objects of varying sizes and shapes. We attached prediction heads to every stage of Swin blocks to obtain multi-scale features. We also increased the number of Swin blocks in the first stage to detect objects accurately from large receptive fields. We evaluated our approach on a test set drawn from our self-defined MIX dataset and achieved an accuracy of 84.9%. While this result is promising, there is room for improvement, for example by improving the re-identification component or modifying the MLP layers of the Swin blocks.
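
The linear-complexity claim can be illustrated with the attention cost estimates given in the original Swin Transformer paper (Liu et al., 2021): global self-attention over an h x w token map with C channels costs roughly 4hwC^2 + 2(hw)^2C operations, while window-based attention with window size M costs roughly 4hwC^2 + 2M^2hwC. The sketch below only evaluates these two formulas. The function names are hypothetical, and the values C = 96 and M = 7 are the Swin-T defaults used here as illustrative assumptions; they are not taken from this thesis.

```python
def global_msa_cost(h, w, c):
    """Approximate cost of global self-attention on an h x w token map.

    The second term grows with the square of the token count h * w.
    """
    n = h * w
    return 4 * n * c**2 + 2 * n**2 * c


def window_msa_cost(h, w, c, m=7):
    """Approximate cost of window-based self-attention with m x m windows.

    Every term is proportional to the token count h * w, so the cost is
    linear in image size for a fixed window size m.
    """
    n = h * w
    return 4 * n * c**2 + 2 * m**2 * n * c


if __name__ == "__main__":
    channels = 96  # assumed channel width (Swin-T default)
    for side in (56, 112, 224):
        g = global_msa_cost(side, side, channels)
        wv = window_msa_cost(side, side, channels)
        print(f"{side}x{side} tokens: global={g:.3e}, window={wv:.3e}")
```

Doubling the side length quadruples the number of tokens. Under these formulas the windowed cost then grows by exactly four times, whereas the global cost grows by more than four times because of its quadratic term.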

Description:

Supervised by Prof. Dr. Md. Hasanul Kabir; Co-supervisor: Mr. Md. Bakhtiar Hasan, Assistant Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh.
