IEEE SPS VIP Cup 2025: Multimodal UAV Detection, Tracking & Payload Identification
Built a real-time multimodal RGB–IR system for UAV detection, tracking, and payload classification for IEEE SPS VIP Cup 2025. The pipeline combines fusion-based detection, robust multi-object tracking, and a payload classifier to improve performance under low-light and adverse conditions.

Project Overview
This project is Team Cogniview’s solution for the IEEE Signal Processing Society VIP Cup 2025. We developed a real-time system to detect, track, and classify UAVs and their payloads using multimodal fusion of RGB and infrared (IR) imagery. The goal was to improve robustness in challenging conditions such as low light, fog, occlusion, and motion blur.
Supervisor
Dr. Wageesha N. Manamperi
What the System Does
The solution covers three main tasks:
- RGB + IR UAV detection (including drone vs bird discrimination)
- Multi-object tracking with persistent IDs and motion behavior analysis (approaching or receding)
- Payload classification using RGB–IR information
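One simple way to derive the approaching/receding label in the tracking task is to watch how a track's bounding box changes size over its history. The sketch below is an illustrative assumption, not the team's implementation: the function names, the (x1, y1, x2, y2) box format, and the 10% area-change threshold are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes give 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def motion_label(history: List[Box], rel_threshold: float = 0.10) -> str:
    """Label a track by the relative change in box area across its history.

    A growing box is taken to mean the UAV is approaching the camera,
    a shrinking box that it is receding. The threshold is an assumed value.
    """
    if len(history) < 2:
        return "unknown"
    first = box_area(history[0])
    last = box_area(history[-1])
    if first <= 0.0:
        return "unknown"
    change = (last - first) / first
    if change > rel_threshold:
        return "approaching"
    if change < -rel_threshold:
        return "receding"
    return "steady"


# Usage: one track's boxes, oldest first.
track = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 12.0, 12.0)]
label = motion_label(track)
```

Apparent size is only a proxy for range. A real system would likely smooth the area over several frames and might combine it with the tracker's velocity estimate.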
Technical Approach
Key components of the pipeline include:
- Multimodal fusion using both early fusion (transformer-based feature fusion) and late fusion (decision-level merging with NMS)
- Object detection models trained for RGB-only, IR-only, and fused RGB–IR inputs
- Robust tracking using BoT-SORT with Kalman filtering and history buffers for smoother trajectories
- Payload classification using an RGB–IR fusion classifier
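Decision-level (late) fusion can be sketched as pooling the RGB and IR detections and running greedy NMS over the combined set. This is a minimal illustration rather than the project's actual code. It assumes both modalities are registered to the same pixel coordinates and that there is a single object class. The function names, the tuple layout, and the 0.5 IoU threshold are assumptions.

```python
from typing import List, Tuple

# Hypothetical detection format: (x1, y1, x2, y2, score).
Detection = Tuple[float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0


def late_fusion_nms(
    rgb_dets: List[Detection],
    ir_dets: List[Detection],
    iou_threshold: float = 0.5,
) -> List[Detection]:
    """Merge per-modality detections, keeping the highest-scoring box
    among any group that overlaps by at least iou_threshold."""
    candidates = sorted(rgb_dets + ir_dets, key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in candidates:
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept


# Usage: the IR box that overlaps the RGB box is suppressed,
# while the IR-only detection survives.
rgb = [(0.0, 0.0, 10.0, 10.0, 0.9)]
ir = [(1.0, 1.0, 11.0, 11.0, 0.8), (50.0, 50.0, 60.0, 60.0, 0.7)]
fused = late_fusion_nms(rgb, ir)
```

The appeal of this kind of merging is that a target visible in only one modality, such as a drone seen in IR at night, still reaches the tracker.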
Results
Reported performance highlights include:
- Drone detection (RGB–IR fusion): F1-score 0.9846
- Payload classification: F1-score > 0.99, mAP@50–95 0.9947
- Tracking persistence: 90%+ under occlusion and distortions
- Real-time speed: 25–30 FPS on GPU (about 33–40 ms per frame)
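For reference, an F1-score such as those above is the harmonic mean of precision and recall. The sketch below shows the standard computation from true-positive, false-positive, and false-negative counts. The function name and the counts in the usage line are purely illustrative and are not the project's evaluation data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall).

    Returns 0.0 when there are no true positives, which avoids
    division by zero in the degenerate cases.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Usage with made-up counts (illustration only).
example = f1_score(tp=8, fp=2, fn=2)
```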
Role and Team
Team Cogniview (University of Moratuwa). I served as the team leader and contributed to the development and integration of the multimodal detection and fusion pipeline, along with overall coordination, experimentation, and results consolidation.