
Multispectral-Object-Detection


Intro

Official Code for Cross-Modality Fusion Transformer for Multispectral Object Detection.

Multispectral Object Detection with Transformer and YOLOv5

Abstract

Multispectral image pairs can provide combined information, making object detection applications more reliable and robust in the open world. To fully exploit the different modalities, we present a simple yet effective cross-modality feature fusion approach, named Cross-Modality Fusion Transformer (CFT), in this paper. Unlike prior CNN-based works, guided by the Transformer scheme, our network learns long-range dependencies and integrates global contextual information in the feature extraction stage. More importantly, by leveraging the self-attention of the Transformer, the network can naturally carry out simultaneous intra-modality and inter-modality fusion, and robustly capture the latent interactions between the RGB and thermal domains, thereby significantly improving the performance of multispectral object detection. Extensive experiments and ablation studies on multiple datasets demonstrate that our approach is effective and achieves state-of-the-art detection performance.

Demo

Night Scene

Day Scene

Overview

Installation

Python>=3.6.0 is required, with all dependencies in requirements.txt installed, including PyTorch>=1.7 (the same as YOLOv5: https://github.com/ultralytics/yolov5).

Clone the repo:

$ git clone https://github.com/DocF/multispectral-object-detection

Install requirements:

$ pip install -r requirements.txt

Dataset

- [FLIR] [Google Drive] [Baidu Drive] (extraction code: qwer) — a new aligned version.

- [LLVIP] download

- [VEDAI] download

You need to convert all annotations to YOLOv5 format.

Refer: https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
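The conversion target is YOLOv5's label format: one text file per image, with one line per object holding the class id and the normalized box center/size. A minimal sketch for converting a single Pascal VOC-style box (the function name is illustrative, not part of this repo):

```python
# Convert one VOC-style box (xmin, ymin, xmax, ymax in absolute pixels)
# into a YOLOv5 label line: "class cx cy w h", all coordinates normalized
# to [0, 1] by the image width/height.
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h, cls_id):
    cx = (xmin + xmax) / 2.0 / img_w   # normalized box center x
    cy = (ymin + ymax) / 2.0 / img_h   # normalized box center y
    w = (xmax - xmin) / img_w          # normalized box width
    h = (ymax - ymin) / img_h          # normalized box height
    return f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

print(voc_to_yolo(100, 50, 300, 250, 640, 512, 0))
# → 0 0.312500 0.292969 0.312500 0.390625
```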

Run

Download the pretrained weights.

YOLOv5 weights (pre-trained):

- [yolov5s] google drive

- [yolov5m] google drive

- [yolov5l] google drive

- [yolov5x] google drive

CFT weights:

- [LLVIP] google drive

- [FLIR] google drive

Change the data cfg

Some examples are in data/multispectral/.

Change the model cfg

Some examples are in models/transformer/.

Note: we used xxxx_transformerx3_dataset.yaml in our paper.
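A data cfg presumably follows YOLOv5's dataset-YAML layout, extended with paths for the second modality. A hypothetical sketch (the key names for the thermal paths and the class list are assumptions; consult the real examples in data/multispectral/):

```yaml
# Hypothetical multispectral data cfg in YOLOv5's dataset-YAML style.
train:  ./FLIR/visible/train    # RGB training images
train2: ./FLIR/infrared/train   # thermal training images (assumed key name)
val:    ./FLIR/visible/val
val2:   ./FLIR/infrared/val

nc: 3                           # number of classes
names: ['person', 'car', 'bicycle']
```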

Results

| Dataset | CFT | mAP50 | mAP75 | mAP |
|---------|-----|-------|-------|-----|
| FLIR    |     | 73.0 | 32.0 | 37.4 |
| FLIR    | ✔️  | 78.7 (Δ5.7) | 35.5 (Δ3.5) | 40.2 (Δ2.8) |
| LLVIP   |     | 95.8 | 71.4 | 62.3 |
| LLVIP   | ✔️  | 97.5 (Δ1.7) | 72.9 (Δ1.5) | 63.6 (Δ1.3) |
| VEDAI   |     | 79.7 | 47.7 | 46.8 |
| VEDAI   | ✔️  | 85.3 (Δ5.6) | 65.9 (Δ18.2) | 56.0 (Δ9.2) |

LLVIP

Log Average Miss Rate

| Model | Log Average Miss Rate |
|-------|----------------------|
| YOLOv3-RGB | 37.70% |
| YOLOv3-IR | 17.73% |
| YOLOv5-RGB | 22.59% |
| YOLOv5-IR | 10.66% |
| Baseline (Ours) | 6.91% |
| CFT (Ours) | 5.40% |

Miss Rate - FPPI curve

Train, Test, and Detect

For YOLOv3/YOLOv5, use the train.py, train_multi_modal.py, test.py, and predict.py scripts.

For YOLOv6/YOLOv8/YOLOv9/YOLOv11, use the train_v11.py, train_multi_modal_v11.py, test_v11.py, and predict_v11.py scripts.

The Model class in models/yolo is the single-modality model; the Model class in models/yolotest is the multi-modality model.

There are two fusion options for the multi-modality model:

1. Element-wise addition, e.g. yolov5s_fusion_add.yaml and PC2f_MPF_yolov8.yaml.

2. Transformer (CFT) fusion, e.g. yolov5s_fusion_transformer.yaml.
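The two options can be sketched as PyTorch modules (illustrative only, not code from this repo; module names are hypothetical). Addition simply sums same-shape feature maps, while a CFT-style fusion runs self-attention over the concatenated token sequences of both modalities, so intra- and inter-modality interactions are modeled jointly:

```python
import torch
import torch.nn as nn

class AddFusion(nn.Module):
    """Option 1: element-wise sum of same-shape RGB and thermal feature maps."""
    def forward(self, rgb, thermal):
        return rgb + thermal

class CFTFusion(nn.Module):
    """Option 2 (sketch): joint self-attention over both modalities' tokens."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, rgb, thermal):
        b, c, h, w = rgb.shape
        t_rgb = rgb.flatten(2).transpose(1, 2)        # (B, H*W, C)
        t_th = thermal.flatten(2).transpose(1, 2)     # (B, H*W, C)
        tokens = torch.cat([t_rgb, t_th], dim=1)      # (B, 2*H*W, C)
        # one self-attention pass covers intra- and inter-modality pairs
        fused, _ = self.attn(tokens, tokens, tokens)
        f_rgb, f_th = fused.split(h * w, dim=1)       # split modalities back
        f_rgb = f_rgb.transpose(1, 2).reshape(b, c, h, w)
        f_th = f_th.transpose(1, 2).reshape(b, c, h, w)
        return f_rgb, f_th

rgb = torch.randn(1, 32, 8, 8)
thermal = torch.randn(1, 32, 8, 8)
added = AddFusion()(rgb, thermal)           # (1, 32, 8, 8)
f_rgb, f_th = CFTFusion(32)(rgb, thermal)   # both (1, 32, 8, 8)
```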

DDP multi-node, multi-GPU training

This repo already supports multi-node, multi-GPU training of the detection models, launched via torchrun. To check whether NCCL is working, run the test_dist.py script.
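Independent of test_dist.py, a quick torch.distributed smoke test can be run on a single CPU-only machine (this sketch uses the gloo backend instead of NCCL so it needs no GPUs; on a real cluster you would launch the repo's scripts with torchrun and NCCL):

```python
# Single-process torch.distributed sanity check using the gloo (CPU) backend.
import os
import torch
import torch.distributed as dist

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

t = torch.ones(4)
dist.all_reduce(t)   # with world_size=1 the sum is a no-op
print(t.tolist())    # → [1.0, 1.0, 1.0, 1.0]

dist.destroy_process_group()
```

A multi-node launch with torchrun typically looks like `torchrun --nnodes=2 --nproc_per_node=4 --rdzv_endpoint=HOST:PORT train_multi_modal.py ...` (node counts and endpoint here are illustrative).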
