Initial commit
egeozsoy committed Jun 15, 2023
Commit 37d1910 (0 parents)
Showing 102 changed files with 613,920 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@

21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Ege Özsoy

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
80 changes: 80 additions & 0 deletions README.md
@@ -0,0 +1,80 @@
# LABRAD-OR: Lightweight Memory Scene Graphs for Accurate Bimodal Reasoning in Dynamic Operating Rooms
<img align="right" src="figures/teaser.jpg" alt="teaser" width="30%" style="margin-left: 10px">
Official code of the paper LABRAD-OR: Lightweight Memory Scene Graphs for Accurate Bimodal Reasoning in Dynamic Operating Rooms (https://arxiv.org/abs/2303.13293) to be published at MICCAI 2023. LABRAD-OR introduces a novel way of using temporal information for more accurate and consistent holistic OR modeling.
Specifically, we introduce memory scene graphs, where the scene graphs of previous time steps act as the temporal representation guiding the current prediction. We design an end-to-end architecture
that intelligently fuses the temporal information of our lightweight memory scene graphs with the visual information from point clouds and images. We evaluate our method on the 4D-OR dataset and
demonstrate that integrating temporality leads to more accurate and consistent results, achieving a +5% increase in macro F1 and a new SOTA of 0.88.
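
At a high level, the network consumes the scene graphs of previous time steps as a compact memory representation and fuses their embedding with the current visual features before predicting relations. A minimal, hypothetical sketch of this fusion step (module names and dimensions are illustrative, not taken from the repository; the actual architecture lives in `scene_graph_prediction`):

```python
import torch
import torch.nn as nn


class MemoryFusionSketch(nn.Module):
    """Hypothetical sketch: fuse an embedding of the memory scene graphs
    (previous time steps) with the visual features of the current scan."""

    def __init__(self, visual_dim=256, memory_dim=128, hidden_dim=256, num_relations=14):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(visual_dim + memory_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_relations),  # one logit per relation label
        )

    def forward(self, visual_feat, memory_feat):
        # visual_feat: point cloud / image features, shape (B, visual_dim)
        # memory_feat: embedding of previous scene graphs, shape (B, memory_dim)
        return self.fuse(torch.cat([visual_feat, memory_feat], dim=-1))
```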


**Authors**: [Ege Özsoy][eo], [Tobias Czempiel][tc], [Felix Holm][fh], [Chantal Pellegrini][cp], [Nassir Navab][nassir]

[eo]:https://www.cs.cit.tum.de/camp/members/ege-oezsoy/

[tc]:https://www.cs.cit.tum.de/camp/members/tobias-czempiel/

[fh]:https://www.cs.cit.tum.de/camp/members/felix-holm/

[cp]:https://www.cs.cit.tum.de/camp/members/chantal-pellegrini/

[nassir]:https://www.cs.cit.tum.de/camp/members/cv-nassir-navab/nassir-navab/

```
@inproceedings{Özsoy2023_LABRAD_OR,
    title={LABRAD-OR: Lightweight Memory Scene Graphs for Accurate Bimodal Reasoning in Dynamic Operating Rooms},
    author={Ege Özsoy and Tobias Czempiel and Felix Holm and Chantal Pellegrini and Nassir Navab},
    booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
    year={2023},
    organization={Springer}
}
@inproceedings{Özsoy2022_4D_OR,
    title={4D-OR: Semantic Scene Graphs for OR Domain Modeling},
    author={Ege Özsoy and Evin Pınar Örnek and Ulrich Eck and Tobias Czempiel and Federico Tombari and Nassir Navab},
    booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
    year={2022},
    organization={Springer}
}
@inproceedings{Özsoy2021_MSSG,
    title={Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures},
    author={Ege Özsoy and Evin Pınar Örnek and Ulrich Eck and Federico Tombari and Nassir Navab},
    booktitle={arXiv preprint},
    year={2021}
}
```

## What is not included in this repository?

The 4D-OR dataset itself, the human and object pose prediction methods, and the downstream task of role prediction are not part of this repository. Please refer to the
original [4D-OR](https://github.com/egeozsoy/4D-OR) repository for information on downloading the dataset and on running 2D and 3D human pose prediction, 3D object pose prediction, and the
downstream role prediction task.

## Create Scene Graph Prediction Environment

- Recommended PyTorch version: `pytorch==1.10.0`
- `conda create --name labrad-or python=3.7`
- `conda activate labrad-or`
- `conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge`
- `cd` into `scene_graph_prediction` and run `pip install -r requirements.txt`
- Run `wget ` and unzip # TODO modify link
- (Optional) To use the pretrained models, move the files ending with `.ckpt` from the unzipped directory into `scene_graph_prediction/scene_graph_helpers/paper_weights`
- `cd` into `pointnet2_dir` and run `CUDA_HOME=/usr/local/cuda-11.3 pip install pointnet2_ops_lib/.`
- Run `pip install torch-scatter==2.0.9 torch-sparse==0.6.12 torch-cluster==1.5.9 torch-spline-conv==1.2.1 torch-geometric==2.0.2 -f https://data.pyg.org/whl/torch-1.10.0+cu113.html`
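
After these steps, a quick sanity check with standard PyTorch and PyG calls (not repository code) confirms the versions the instructions above expect:

```python
import torch
import torch_geometric

# Expected for this setup: torch 1.10.0 with CUDA 11.3, torch-geometric 2.0.2.
print(torch.__version__)            # e.g. '1.10.0'
print(torch.version.cuda)           # e.g. '11.3'
print(torch.cuda.is_available())    # True if the CUDA setup succeeded
print(torch_geometric.__version__)  # e.g. '2.0.2'
```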

## Scene Graph Prediction
<img src="figures/visual_abstract.pdf" alt="pipeline" width="75%"/>

As we build upon https://github.com/egeozsoy/4D-OR, the code is structured similarly.

- `cd` into `scene_graph_prediction`
- To train a new visual-only model that uses only point clouds, run `python -m scene_graph_prediction.main --config visual_only.json`.
- To train a new visual-only model that uses point clouds and images, run `python -m scene_graph_prediction.main --config visual_only_with_images.json`.
- To train LABRAD-OR using only point clouds, run `python -m scene_graph_prediction.main --config labrad_or.json`. This requires the pretrained visual-only model to be present.
- To train LABRAD-OR using point clouds and images, run `python -m scene_graph_prediction.main --config labrad_or_with_images.json`. This also requires the pretrained visual-only model to be present.
- We provide all four pretrained models TODO link. You can simply use them instead of training your own models, as described in the environment setup.
- To evaluate either a model you trained or one of our pretrained models, change the mode to `evaluate` in main.py and rerun using the same commands as before.
- If you want to replicate the results from the paper, you can hardcode the corresponding weight checkpoint as `checkpoint_path` in main.py.
- To infer on the test set, change the mode to `infer` and again run one of the four corresponding commands. As before, you can hardcode the corresponding weight checkpoint as `checkpoint_path` in main.py.
- By default, evaluation is done on the validation set and inference on the test set, but both can be changed.
- You can also evaluate on the test set by uploading your inferred predictions to https://bit.ly/4D-OR_evaluator. Be aware that, unlike the evaluation in the paper, this evaluation does not require human poses to be available and can therefore slightly overestimate the results: we get a macro F1 of 0.76 instead of 0.75 on the test set.
- If you want to continue with role prediction, please refer to the original 4D-OR repository. You can use the inferred scene graphs from the previous step as input for role prediction.
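
The inferred relation files (e.g. `scan_relations_labrad-or_with_images_val.json`) map each scan id to a list of relation triplets, as read by `consistency_calculator.py` below. A minimal inspection sketch (file name taken from that script; adjust to your own run):

```python
import json

with open('scan_relations_labrad-or_with_images_val.json') as f:
    data = json.load(f)

# Scan ids are '<take>_<scan>'; each value is a list of
# (subject, predicate, object) triplets.
for scan_id, triplets in sorted(data.items())[:3]:
    print(scan_id, triplets)
```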
64 changes: 64 additions & 0 deletions consistency_calculator.py
@@ -0,0 +1,64 @@
import json

from helpers.configurations import TAKE_SPLIT


def main():
    # Inferred relation files produced by the four model variants:
    # json_file_name = 'scan_relations_visual_only_val.json'  # visual_only
    # json_file_name = 'scan_relations_visual_only_with_images_val.json'  # visual_only_with_images
    # json_file_name = 'scan_relations_labrad-or_val.json'  # labrad-or
    json_file_name = 'scan_relations_labrad-or_with_images_val.json'  # labrad-or_with_images
    print(f'Calculating consistency for {json_file_name}...')
    with open(json_file_name) as f:
        data = json.load(f)
    macro_rel_consistency = []
    for take in TAKE_SPLIT['val']:
        take_rel_consistencies = []
        last_rels = None
        # Scan ids are '<take>_<scan>'; iterate over the scans of this take in order.
        for scan_id in sorted([key for key in data.keys() if int(key.split('_')[0]) == take]):
            unique_rels = {pred for _, pred, _ in data[scan_id]}
            if last_rels is not None:
                if len(unique_rels) == 0 and len(last_rels) == 0:
                    rel_consistency = 1
                else:
                    # Jaccard similarity between consecutive relation sets.
                    rel_consistency = len(unique_rels.intersection(last_rels)) / len(unique_rels.union(last_rels))
                if rel_consistency < 1:
                    print(f'{scan_id}: {unique_rels.symmetric_difference(last_rels)}')
                take_rel_consistencies.append(rel_consistency)

            last_rels = unique_rels
        take_rel_consistency = sum(take_rel_consistencies) / len(take_rel_consistencies)
        print(f'Take {take} rel consistency: {take_rel_consistency:.4f}')
        macro_rel_consistency.append(take_rel_consistency)

    print(f'Macro rel consistency: {sum(macro_rel_consistency) / len(macro_rel_consistency):.4f}')


def main_gt():
    # Same computation on the ground-truth annotations, for reference.
    json_file_name = 'data/relationships_validation.json'
    print(f'Calculating consistency for GT {json_file_name}...')
    with open(json_file_name) as f:
        data = json.load(f)
    macro_rel_consistency = []
    for take in TAKE_SPLIT['val']:
        take_rel_consistencies = []
        last_rels = None
        for scan in sorted([scan for scan in data['scans'] if scan['take_idx'] == take], key=lambda x: x['scan']):
            unique_rels = {pred for _, _, _, pred in scan['relationships']}
            if last_rels is not None:
                if len(unique_rels) == 0 and len(last_rels) == 0:
                    rel_consistency = 1
                else:
                    rel_consistency = len(unique_rels.intersection(last_rels)) / len(unique_rels.union(last_rels))
                take_rel_consistencies.append(rel_consistency)

            last_rels = unique_rels
        take_rel_consistency = sum(take_rel_consistencies) / len(take_rel_consistencies)
        print(f'Take {take} rel consistency: {take_rel_consistency:.4f}')
        macro_rel_consistency.append(take_rel_consistency)

    print(f'Macro rel consistency: {sum(macro_rel_consistency) / len(macro_rel_consistency):.4f}')


if __name__ == '__main__':
    main_gt()
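
For reference, the per-scan-pair consistency computed above is the Jaccard similarity between the relation sets of consecutive scans. A toy example with made-up relation sets:

```python
# Jaccard similarity between consecutive relation sets, as in the script above.
prev_rels = {'Holding', 'CloseTo', 'Sawing'}
curr_rels = {'Holding', 'CloseTo', 'Suturing'}

consistency = len(curr_rels & prev_rels) / len(curr_rels | prev_rels)
print(f'{consistency:.4f}')  # 2 shared / 4 in the union -> 0.5000
```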
12 changes: 12 additions & 0 deletions data/classes.txt
@@ -0,0 +1,12 @@
Patient
anesthesia_equipment
human_0
human_1
human_2
human_3
human_4
human_5
instrument
instrument_table
operating_table
secondary_table
14 changes: 14 additions & 0 deletions data/relationships.txt
@@ -0,0 +1,14 @@
Assisting
Cementing
Cleaning
CloseTo
Cutting
Drilling
Hammering
Holding
LyingOn
Operating
Preparing
Sawing
Suturing
Touching
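
These two files enumerate the 12 entity classes and 14 relation (predicate) labels of the dataset. One plausible way to load them as label vocabularies (illustrative only, not code from the repository):

```python
# Illustrative only: map class and relation names to integer indices.
with open('data/classes.txt') as f:
    classes = [line.strip() for line in f if line.strip()]
with open('data/relationships.txt') as f:
    relations = [line.strip() for line in f if line.strip()]

class_to_idx = {name: i for i, name in enumerate(classes)}
rel_to_idx = {name: i for i, name in enumerate(relations)}
print(len(classes), len(relations))  # 12 classes, 14 relation labels
```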