Abstract: To address the challenge of balancing accuracy and real-time performance in front-end target recognition and localization for drones operating in complex battlefield environments with limited onboard resources, a front-end target recognition and localization method for drone operations was developed. Using a "backbone-neck-head" structure as the basic network architecture, a non-local attention expansion module, a global multi-scale decoupled network, and a lightweight bottleneck module were introduced, and Focal Loss combined with DIoU Loss was employed as the loss function to strengthen feature modeling and multi-scale detection, thereby improving feature-capture capability and detection accuracy. On this basis, a collaborative lightweighting strategy built on dependency-graph structured pruning and channel-wise knowledge distillation was proposed, effectively reducing model complexity and improving embedded deployability. Experiments show that the method improved mAP@0.5, mAP@0.75, and mAP@0.5:0.95 by 6.0%, 7.2%, and 5.9%, respectively, while reducing model parameters and GFLOPs to 17.1% and 12.0% of the original, with the accuracy loss kept within 4.1%. Finally, deployment validation on embedded hardware demonstrated a frame rate of 34 fps, meeting the accuracy and real-time requirements for front-end target recognition and localization during drone operations.
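The abstract names Focal Loss and DIoU Loss as the combined loss function but does not give the paper's exact formulation or weighting. As a minimal sketch, the standard single-sample forms of the two terms can be written as follows; the function names, the hyperparameters `alpha` and `gamma`, and the corner-format box representation are illustrative assumptions, not the authors' implementation.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one binary classification sample.

    p: predicted probability of the positive class; y: label in {0, 1}.
    The (1 - p_t)^gamma factor down-weights easy, well-classified samples.
    """
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))

def diou_loss(box_a, box_b):
    """Distance-IoU loss for two boxes given as (x1, y1, x2, y2) corners.

    DIoU = 1 - IoU + d^2 / c^2, where d is the distance between box
    centers and c is the diagonal of the smallest enclosing box.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas for the IoU term.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    return 1.0 - iou + (d2 / c2 if c2 > 0 else 0.0)
```

In a detector these terms would typically be summed over the classification and regression branches with a balancing weight; the DIoU center-distance penalty gives a useful gradient even when the predicted and ground-truth boxes do not overlap, which plain IoU loss lacks.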