DACAT: Dual-stream Adaptive Clip-aware Time Modeling for Robust Online Surgical Phase Recognition


DACAT consists of two main branches: (i) a Frame-wise Branch (FWB) that processes frame-wise features, and (ii) an Adaptive Clip-aware Branch (ACB) that reads out the clip most relevant to the current frame from a pre-trained feature cache and integrates its frame-wise features into an adaptive clip-aware feature through a cross-attention (CA) module. By enhancing the relevant context and filtering out interference for the current frame, DACAT reduces the complexity of temporal processing and yields more accurate phase identification.
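
To make the cross-attention read-out concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single attention head, omits the learned query/key/value projections, and takes the clip as given rather than showing how the ACB selects it. The function name `cross_attention` and the toy features are hypothetical.

```python
import math


def cross_attention(query, clip_feats):
    """Single-head scaled dot-product attention of one query over a clip.

    query:      list of d floats (feature of the current frame)
    clip_feats: list of lists of d floats (cached features of the clip frames)
    Returns (output, weights), where output is the weighted sum of clip_feats.
    """
    d = len(query)
    # Similarity between the current frame and every cached frame.
    scores = [
        sum(q * k for q, k in zip(query, feat)) / math.sqrt(d)
        for feat in clip_feats
    ]
    # Numerically stable softmax over the clip dimension.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Clip-aware feature: attention-weighted average of the cached features.
    output = [
        sum(w * feat[i] for w, feat in zip(weights, clip_feats))
        for i in range(d)
    ]
    return output, weights


# Toy example (hypothetical values): the query aligns with the first feature.
clip_aware, attn = cross_attention([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]])
```

In the toy example the query aligns with the first cached feature, so that feature receives the larger attention weight and dominates the resulting clip-aware feature.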

Result


1. Preparation

Step 1:

Download the Cholec80, M2CAI16, and AutoLaparo datasets
  • Access can be requested for each dataset: Cholec80, M2CAI16, AutoLaparo.
  • Download the videos for each dataset and extract frames at 1 fps. E.g., for video01.mp4 with ffmpeg, run:
mkdir -p /<PATH_TO_THIS_FOLDER>/data/frames_1fps/01/
ffmpeg -hide_banner -i /<PATH_TO_VIDEOS>/video01.mp4 -r 1 -start_number 0 /<PATH_TO_THIS_FOLDER>/data/frames_1fps/01/%08d.jpg
  • We also provide a shell script for frame extraction here
  • The final dataset structure should look like this:
Cholec80/
	data/
		frames_1fps/
			01/
				00000001.jpg
				00000002.jpg
				00000003.jpg
				00000004.jpg
				...
			02/
				...
			...
			80/
				...
		phase_annotations/
			video01-phase.txt
			video02-phase.txt
			...
			video80-phase.txt
		tool_annotations/
			video01-tool.txt
			video02-tool.txt
			...
			video80-tool.txt
	output/
	train_scripts/
	predict.sh
	train.sh
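
Once frames are extracted, it can help to check the layout before training. The sketch below is a hypothetical helper, not part of the repository: `ffmpeg_command` rebuilds the extraction command from Step 1 for a given video number, and `missing_entries` lists expected paths from the tree above that are absent. It assumes zero-padded two-digit video folders and the Cholec80 count of 80 videos; `num_videos` and `check_tools` are illustrative parameters for adapting it to other datasets.

```python
from pathlib import Path


def ffmpeg_command(video_path, frames_root, video_id):
    """Return the 1 fps extraction command from Step 1 as an argument list."""
    out_dir = Path(frames_root) / f"{video_id:02d}"
    return [
        "ffmpeg", "-hide_banner",
        "-i", str(video_path),
        "-r", "1",
        "-start_number", "0",
        str(out_dir / "%08d.jpg"),
    ]


def missing_entries(root, num_videos=80, check_tools=True):
    """List expected files/folders under `root` (e.g. Cholec80/) that are absent."""
    data = Path(root) / "data"
    missing = []
    for i in range(1, num_videos + 1):
        vid = f"{i:02d}"
        frame_dir = data / "frames_1fps" / vid
        # A frame folder counts as present only if it holds at least one .jpg.
        if not frame_dir.is_dir() or not any(frame_dir.glob("*.jpg")):
            missing.append(str(frame_dir))
        expected = [data / "phase_annotations" / f"video{vid}-phase.txt"]
        if check_tools:
            expected.append(data / "tool_annotations" / f"video{vid}-tool.txt")
        missing.extend(str(p) for p in expected if not p.is_file())
    return missing
```

The list returned by `ffmpeg_command` can be passed to `subprocess.run(cmd, check=True)` once the output folder exists. An empty list from `missing_entries("Cholec80")` means every expected entry was found.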

Step 2:

Download the pretrained ConvNeXt V2-T model
  • Download the ConvNeXt V2-T weights and place them at: .../train_scripts/convnext/convnextv2_tiny_1k_224_ema.pt

Step 3:

Environment Requirements

See requirements.txt.

2. Train

2.1 Train Feature Cache

source .../Cholec80/train.sh

After training, copy the best checkpoint .../output/checkpoints/phase/YourTrainNameXXX/models/checkpoint_best_acc.pth.tar to .../train_scripts/newly_opt_ykx/LongShortNet/long_net_convnextv2.pth.tar, i.e., save it under the new name long_net_convnextv2.pth.tar.
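
The copy-and-rename step can also be scripted. This is a minimal sketch, not a script shipped with the repository: the function name is hypothetical, `root` stands for the directory that contains output/ and train_scripts/ (Cholec80/ in the dataset tree), and `train_name` is whatever run name replaces YourTrainNameXXX.

```python
import shutil
from pathlib import Path


def install_feature_cache_checkpoint(root, train_name):
    """Copy the best feature-cache checkpoint to the path used in step 2.2."""
    root = Path(root)
    src = (root / "output" / "checkpoints" / "phase" / train_name
           / "models" / "checkpoint_best_acc.pth.tar")
    dst = (root / "train_scripts" / "newly_opt_ykx" / "LongShortNet"
           / "long_net_convnextv2.pth.tar")
    if not src.is_file():
        raise FileNotFoundError(f"checkpoint not found: {src}")
    # Create the destination folder if needed, then copy (the original is kept).
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return dst
```

Copying rather than moving keeps the original checkpoint in output/, so the training run can still be inspected or re-exported later.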

2.2 Train DACAT

Edit .../Cholec80/train.sh so that the python3 train_longshort.py command is active, then run:

source .../Cholec80/train.sh

3. Infer

Set the model path in .../Cholec80/predict.sh, then run:

source .../Cholec80/predict.sh

Our trained checkpoints can be downloaded from Google Drive.

4. Evaluate

4.1 Cholec80

Use the Matlab file.

4.2 M2CAI16

Use the Matlab file.

4.3 AutoLaparo

Use the Python file.
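
For a quick sanity check alongside the provided evaluation files, the frame-wise metrics commonly reported for phase recognition (accuracy, plus per-phase precision, recall, and Jaccard) can be computed as below. This is a minimal sketch under assumptions, not a reimplementation of the Matlab or Python evaluation files: it uses strict frame-by-frame matching with no boundary relaxation, and the function name and the integer phase encoding are hypothetical, so its numbers may differ from those produced by the provided files.

```python
def phase_metrics(pred, gt, num_phases):
    """Frame-wise accuracy and per-phase precision/recall/Jaccard for one video.

    pred, gt: equal-length sequences of integer phase labels in [0, num_phases).
    A per-phase metric is None when its denominator is zero (undefined).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    if len(gt) == 0:
        raise ValueError("empty sequences")
    accuracy = sum(p == g for p, g in zip(pred, gt)) / len(gt)
    per_phase = {}
    for c in range(num_phases):
        tp = sum(p == c and g == c for p, g in zip(pred, gt))
        fp = sum(p == c and g != c for p, g in zip(pred, gt))
        fn = sum(p != c and g == c for p, g in zip(pred, gt))
        per_phase[c] = {
            "precision": tp / (tp + fp) if tp + fp else None,
            "recall": tp / (tp + fn) if tp + fn else None,
            "jaccard": tp / (tp + fp + fn) if tp + fp + fn else None,
        }
    return accuracy, per_phase


# Toy example with hypothetical labels for a five-frame video.
acc, stats = phase_metrics([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_phases=3)
```

Averaging the per-phase values over the phases where they are defined, and then over videos, gives video-level summary scores; which averaging convention to use is a choice this sketch leaves open.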

Reference

Citations

If you find this repository useful, please consider citing our paper:

@article{yang2024dacat,
  title={DACAT: Dual-stream Adaptive Clip-aware Time Modeling for Robust Online Surgical Phase Recognition},
  author={Yang, Kaixiang and Li, Qiang and Wang, Zhiwei},
  journal={arXiv preprint arXiv:2409.06217},
  year={2024}
}