update: TextReID model upload
chaews0327 committed Mar 31, 2024
1 parent 8e4809e commit 798b509
Showing 55 changed files with 3,685 additions and 0 deletions.
13 changes: 13 additions & 0 deletions ai/TextReID/.gitignore
@@ -0,0 +1,13 @@
.ipynb_checkpoints
*.pyc
*.ipynb
*.npy

output/
datasets/
pretrained/
__pycache__/
condor_log/
.cache/
.nv/
docker_stderror
33 changes: 33 additions & 0 deletions ai/TextReID/.pre-commit-config.yaml
@@ -0,0 +1,33 @@
repos:
  - repo: https://github.com/psf/black
    rev: 20.8b1  # Replace by any tag/version: https://github.com/psf/black/tags
    hooks:
      - id: black
        language_version: python3  # Should be a command that runs python3.6+

  # isort
  - repo: https://github.com/timothycrosley/isort
    rev: 5.6.4
    hooks:
      - id: isort

  # flake8
  - repo: https://github.com/PyCQA/flake8
    rev: 3.8.3
    hooks:
      - id: flake8
        args: ["--config=setup.cfg", "--ignore=W504, W503, E501, E203, E741, F821"]

  # pre-commit-hooks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace  # Trim trailing whitespace
      - id: check-merge-conflict  # Check for files that contain merge conflict strings
      - id: end-of-file-fixer  # Make sure files end in a newline and only a newline
      - id: requirements-txt-fixer  # Sort entries in requirements.txt and remove incorrect entry for pkg-resources==0.0.0
      - id: fix-encoding-pragma  # Remove the coding pragma: # -*- coding: utf-8 -*-
        args: ["--remove"]
      - id: mixed-line-ending  # Replace or check mixed line ending
        args: ["--fix=lf"]
93 changes: 93 additions & 0 deletions ai/TextReID/README.md
@@ -0,0 +1,93 @@
# Text-Based Person Search with Limited Data

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/text-based-person-search-with-limited-data/nlp-based-person-retrival-on-cuhk-pedes)](https://paperswithcode.com/sota/nlp-based-person-retrival-on-cuhk-pedes?p=text-based-person-search-with-limited-data)

This is the codebase for our [BMVC 2021 paper](https://arxiv.org/abs/2110.10807).

Slides and video for the online presentation are now available at [BMVC 2021 virtual conference website](https://www.bmvc2021-virtualconference.com/conference/papers/paper_0044.html).

## Updates
- (10/12/2021) Added download links for the trained models.
- (06/12/2021) Refactored the code for easier reproduction.
- (20/10/2021) Code released!

## Abstract
Text-based person search (TBPS) aims at retrieving a target person from an image gallery with a descriptive text query.
Solving such a fine-grained cross-modal retrieval task is challenging, and it is further hampered by the lack of large-scale datasets.
In this paper, we present a framework with two novel components to handle the problems brought by limited data.
Firstly, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch.
Secondly, we propose to transfer knowledge learned from existing coarse-grained large-scale datasets containing image-text pairs from drastically different problem domains to compensate for the lack of TBPS training data.
A transfer learning method is designed so that useful information can be transferred despite the large domain gap.
Armed with these components, our method achieves new state of the art on the CUHK-PEDES dataset with significant improvements over the prior art in terms of Rank-1 and mAP.
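To make the momentum-contrast idea concrete, here is a minimal PyTorch sketch of a cross-modal InfoNCE loss over a feature queue; the function names, shapes, and hyper-parameters are illustrative and not the repository's actual implementation:

```python
import torch
import torch.nn.functional as F


def cross_modal_moco_loss(img_feat, txt_feat, txt_queue, temperature=0.07):
    """Pull each image toward its paired text and push it away from queued text negatives."""
    img_feat = F.normalize(img_feat, dim=1)    # (B, D) image embeddings from the query encoder
    txt_feat = F.normalize(txt_feat, dim=1)    # (B, D) paired text embeddings from the momentum encoder
    txt_queue = F.normalize(txt_queue, dim=1)  # (K, D) text features queued from earlier mini-batches

    l_pos = (img_feat * txt_feat).sum(dim=1, keepdim=True)  # (B, 1) positive logits
    l_neg = img_feat @ txt_queue.t()                         # (B, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(img_feat.size(0), dtype=torch.long, device=img_feat.device)  # positives sit at index 0
    return F.cross_entropy(logits, labels)


@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    # EMA update of the key (momentum) encoder, as in MoCo.
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data.mul_(m).add_(p_q.data, alpha=1.0 - m)
```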

## Results
![image](https://user-images.githubusercontent.com/37724292/144879635-86ab9c7b-0317-4b42-ac46-a37b06853d18.png)

## Installation
### Setup environment
```bash
conda create -n txtreid-env python=3.7
conda activate txtreid-env
git clone https://github.com/BrandonHanx/TextReID.git
cd TextReID
pip install -r requirements.txt
pre-commit install
```
### Get CUHK-PEDES dataset
- Request the images from [Dr. Shuang Li](https://github.com/ShuangLI59/Person-Search-with-Natural-Language-Description).
- Download the pre-processed captions we provide from [Google Drive](https://drive.google.com/file/d/1V4d8OjFket5SaQmBVozFFeflNs6f9e1R/view?usp=sharing).
- Organize the dataset as follows (a small sanity-check sketch follows the tree):
```bash
datasets
└── cuhkpedes
    ├── annotations
    │   ├── test.json
    │   ├── train.json
    │   └── val.json
    ├── clip_vocab_vit.npy
    └── imgs
        ├── cam_a
        ├── cam_b
        ├── CUHK01
        ├── CUHK03
        ├── Market
        ├── test_query
        └── train_query
```
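A small, illustrative check that the layout above is in place before training; the paths come from the tree, but the script itself is not part of the repository:

```python
import os

root = "datasets/cuhkpedes"
expected = [
    "annotations/train.json",
    "annotations/val.json",
    "annotations/test.json",
    "clip_vocab_vit.npy",
    "imgs",
]
for rel in expected:
    path = os.path.join(root, rel)
    # Print one status line per expected entry.
    print(("ok       " if os.path.exists(path) else "MISSING  ") + path)
```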

### Download CLIP weights
```bash
mkdir pretrained/clip/
cd pretrained/clip
wget https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt
wget https://openaipublic.azureedge.net/clip/models/8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599/RN101.pt
cd -
```

### Train
```bash
python train_net.py \
--config-file configs/cuhkpedes/moco_gru_cliprn50_ls_bs128_2048.yaml \
--use-tensorboard
```
### Inference
```bash
python test_net.py \
--config-file configs/cuhkpedes/moco_gru_cliprn50_ls_bs128_2048.yaml \
--checkpoint-file output/cuhkpedes/moco_gru_cliprn50_ls_bs128_2048/best.pth
```
You can download our trained models (with CLIP RN50 and RN101) from [Google Drive](https://drive.google.com/drive/folders/1MoceVsLiByg3Sg8_9yByGSvR3ru15hJL?usp=sharing).

## TODO
- [ ] Try larger pre-trained CLIP models.
- [ ] Fix the multi-GPU running bug.
- [ ] Add dataloader for [ICFG-PEDES](https://github.com/zifyloo/SSAN).

## Citation
If you find this project useful for your research, please use the following BibTeX entry.
```
@inproceedings{han2021textreid,
title={Text-Based Person Search with Limited Data},
author={Han, Xiao and He, Sen and Zhang, Li and Xiang, Tao},
booktitle={BMVC},
year={2021}
}
```
41 changes: 41 additions & 0 deletions ai/TextReID/configs/cuhkpedes/baseline_gru_cliprn101_ls_bs128.yaml
@@ -0,0 +1,41 @@
MODEL:
  WEIGHT: "imagenet"
  FREEZE: False
  VISUAL_MODEL: "m_resnet101"
  TEXTUAL_MODEL: "bigru"
  NUM_CLASSES: 11003
  GRU:
    ONEHOT: "clip_vit"
    EMBEDDING_SIZE: 512
    NUM_UNITS: 512
    VOCABULARY_SIZE: 512
    DROPOUT_KEEP_PROB: 1.0
    MAX_LENGTH: 100
  RESNET:
    RES5_STRIDE: 1
  EMBEDDING:
    EMBED_HEAD: 'simple'
    FEATURE_SIZE: 256
    DROPOUT_PROB: 0.0
    EPSILON: 0.1
INPUT:
  HEIGHT: 384
  WIDTH: 128
  USE_AUG: True
  PIXEL_MEAN: [0.48145466, 0.4578275, 0.40821073]
  PIXEL_STD: [0.26862954, 0.26130258, 0.27577711]
DATASETS:
  TRAIN: ("cuhkpedes_train", )
  TEST: ("cuhkpedes_test", )
SOLVER:
  IMS_PER_BATCH: 128
  NUM_EPOCHS: 80
  BASE_LR: 0.0001
  WEIGHT_DECAY: 0.00004
  CHECKPOINT_PERIOD: 40
  LRSCHEDULER: 'step'
  STEPS: (40, 70)
  WARMUP_FACTOR: 0.1
  WARMUP_EPOCHS: 5
TEST:
  IMS_PER_BATCH: 128
41 changes: 41 additions & 0 deletions ai/TextReID/configs/cuhkpedes/baseline_gru_cliprn50_ls_bs128.yaml
@@ -0,0 +1,41 @@
MODEL:
  WEIGHT: "imagenet"
  FREEZE: False
  VISUAL_MODEL: "m_resnet50"
  TEXTUAL_MODEL: "bigru"
  NUM_CLASSES: 11003
  GRU:
    ONEHOT: "clip_vit"
    EMBEDDING_SIZE: 512
    NUM_UNITS: 512
    VOCABULARY_SIZE: 512
    DROPOUT_KEEP_PROB: 1.0
    MAX_LENGTH: 100
  RESNET:
    RES5_STRIDE: 1
  EMBEDDING:
    EMBED_HEAD: 'simple'
    FEATURE_SIZE: 256
    DROPOUT_PROB: 0.0
    EPSILON: 0.1
INPUT:
  HEIGHT: 384
  WIDTH: 128
  USE_AUG: True
  PIXEL_MEAN: [0.48145466, 0.4578275, 0.40821073]
  PIXEL_STD: [0.26862954, 0.26130258, 0.27577711]
DATASETS:
  TRAIN: ("cuhkpedes_train", )
  TEST: ("cuhkpedes_test", )
SOLVER:
  IMS_PER_BATCH: 128
  NUM_EPOCHS: 80
  BASE_LR: 0.0001
  WEIGHT_DECAY: 0.00004
  CHECKPOINT_PERIOD: 40
  LRSCHEDULER: 'step'
  STEPS: (40, 70)
  WARMUP_FACTOR: 0.1
  WARMUP_EPOCHS: 5
TEST:
  IMS_PER_BATCH: 128
39 changes: 39 additions & 0 deletions ai/TextReID/configs/cuhkpedes/baseline_gru_rn50_ls_bs128.yaml
@@ -0,0 +1,39 @@
MODEL:
  WEIGHT: "imagenet"
  FREEZE: False
  VISUAL_MODEL: "resnet50"
  TEXTUAL_MODEL: "bigru"
  NUM_CLASSES: 11003
  GRU:
    ONEHOT: "yes"
    EMBEDDING_SIZE: 512
    NUM_UNITS: 512
    VOCABULARY_SIZE: 12000
    DROPOUT_KEEP_PROB: 1.0
    MAX_LENGTH: 100
  RESNET:
    RES5_STRIDE: 1
  EMBEDDING:
    EMBED_HEAD: 'simple'
    FEATURE_SIZE: 256
    DROPOUT_PROB: 0.0
    EPSILON: 0.1
INPUT:
  HEIGHT: 384
  WIDTH: 128
  USE_AUG: True
DATASETS:
  TRAIN: ("cuhkpedes_train", )
  TEST: ("cuhkpedes_test", )
SOLVER:
  IMS_PER_BATCH: 128
  NUM_EPOCHS: 80
  BASE_LR: 0.0001
  WEIGHT_DECAY: 0.00004
  CHECKPOINT_PERIOD: 40
  LRSCHEDULER: 'step'
  STEPS: (40, 70)
  WARMUP_FACTOR: 0.1
  WARMUP_EPOCHS: 5
TEST:
  IMS_PER_BATCH: 128
44 changes: 44 additions & 0 deletions ai/TextReID/configs/cuhkpedes/moco_gru_cliprn101_ls_bs128_2048.yaml
@@ -0,0 +1,44 @@
MODEL:
  WEIGHT: "imagenet"
  FREEZE: False
  VISUAL_MODEL: "m_resnet101"
  TEXTUAL_MODEL: "bigru"
  NUM_CLASSES: 11003
  GRU:
    ONEHOT: "clip_vit"
    EMBEDDING_SIZE: 512
    NUM_UNITS: 512
    VOCABULARY_SIZE: 512
    DROPOUT_KEEP_PROB: 1.0
    MAX_LENGTH: 100
  RESNET:
    RES5_STRIDE: 1
  EMBEDDING:
    EMBED_HEAD: 'moco'
    FEATURE_SIZE: 256
    DROPOUT_PROB: 0.0
    EPSILON: 0.1
  MOCO:
    FC: False
    K: 2048
INPUT:
  HEIGHT: 384
  WIDTH: 128
  USE_AUG: True
  PIXEL_MEAN: [0.48145466, 0.4578275, 0.40821073]
  PIXEL_STD: [0.26862954, 0.26130258, 0.27577711]
DATASETS:
  TRAIN: ("cuhkpedes_train", )
  TEST: ("cuhkpedes_test", )
SOLVER:
  IMS_PER_BATCH: 128
  NUM_EPOCHS: 80
  BASE_LR: 0.0001
  WEIGHT_DECAY: 0.00004
  CHECKPOINT_PERIOD: 40
  LRSCHEDULER: 'step'
  STEPS: (40, 70)
  WARMUP_FACTOR: 0.1
  WARMUP_EPOCHS: 5
TEST:
  IMS_PER_BATCH: 128
44 changes: 44 additions & 0 deletions ai/TextReID/configs/cuhkpedes/moco_gru_cliprn50_ls_bs128_2048.yaml
@@ -0,0 +1,44 @@
MODEL:
  WEIGHT: "imagenet"
  FREEZE: False
  VISUAL_MODEL: "m_resnet50"
  TEXTUAL_MODEL: "bigru"
  NUM_CLASSES: 11003
  GRU:
    ONEHOT: "clip_vit"
    EMBEDDING_SIZE: 512
    NUM_UNITS: 512
    VOCABULARY_SIZE: 512
    DROPOUT_KEEP_PROB: 1.0
    MAX_LENGTH: 100
  RESNET:
    RES5_STRIDE: 1
  EMBEDDING:
    EMBED_HEAD: 'moco'
    FEATURE_SIZE: 256
    DROPOUT_PROB: 0.0
    EPSILON: 0.1
  MOCO:
    FC: False
    K: 2048
INPUT:
  HEIGHT: 384
  WIDTH: 128
  USE_AUG: True
  PIXEL_MEAN: [0.48145466, 0.4578275, 0.40821073]
  PIXEL_STD: [0.26862954, 0.26130258, 0.27577711]
DATASETS:
  TRAIN: ("cuhkpedes_train", )
  TEST: ("cuhkpedes_test", )
SOLVER:
  IMS_PER_BATCH: 128
  NUM_EPOCHS: 80
  BASE_LR: 0.0001
  WEIGHT_DECAY: 0.00004
  CHECKPOINT_PERIOD: 40
  LRSCHEDULER: 'step'
  STEPS: (40, 70)
  WARMUP_FACTOR: 0.1
  WARMUP_EPOCHS: 5
TEST:
  IMS_PER_BATCH: 128
37 changes: 37 additions & 0 deletions ai/TextReID/encoding.py
@@ -0,0 +1,37 @@
import json
import re


def encode(query):
    """Encode a free-form text query into the one-hot word indices used by CUHK-PEDES."""
    file_path = "./datasets/cuhkpedes/annotations/test.json"
    with open(file_path, "r") as file:
        data = json.load(file)

    word_dict = {}  # word -> one-hot index
    max_onehot = -1

    # Build the word-to-index dictionary from every annotated sentence.
    for annotation in data["annotations"]:
        words = re.sub(r"[^a-zA-Z0-9\s]", "", annotation["sentence"]).split()
        for word, onehot in zip(words, annotation["onehot"]):
            if onehot > max_onehot:
                max_onehot = onehot
            if word.lower() not in word_dict:
                word_dict[word.lower()] = onehot

    # Encode the query; words unseen in the annotations are reported and mapped to "None".
    output = []
    query = re.sub(r"[^a-zA-Z0-9\s]", "", query)
    for w in query.split():
        try:
            output.append(word_dict[w.lower()])
        except KeyError as e:
            print("Key %s not found in the dictionary." % e.args[0])
            # Disabled fallback: extend the vocabulary with the unseen word instead.
            # word_dict[max_onehot + 1] = e.args[0]
            # word_dict[e.args[0]] = max_onehot + 1
            # output.append(word_dict[w.lower()])
            output.append("None")
            max_onehot += 1

    # print(word_dict)
    return output
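A brief usage sketch for the helper above; it assumes the CUHK-PEDES annotations already sit at the path hard-coded in `encode`, and the query string is purely illustrative:

```python
from encoding import encode

# Encode a free-form description into one-hot indices drawn from test.json.
ids = encode("A woman wearing a red dress and white shoes")
print(ids)  # list of indices; words absent from the annotations come back as "None"
```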
Empty file added ai/TextReID/lib/__init__.py
Empty file.
3 changes: 3 additions & 0 deletions ai/TextReID/lib/config/__init__.py
@@ -0,0 +1,3 @@
from .defaults import _C as cfg

__all__ = ["cfg"]
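For orientation, a minimal sketch of how the exported `cfg` is typically combined with one of the YAML files above; `merge_from_file` and `freeze` are standard yacs `CfgNode` calls, but treat the snippet as illustrative rather than the project's actual entry point:

```python
from lib.config import cfg

# Overlay an experiment file on top of the defaults, then freeze to catch typos later.
cfg.merge_from_file("configs/cuhkpedes/moco_gru_cliprn50_ls_bs128_2048.yaml")
cfg.freeze()

print(cfg.MODEL.VISUAL_MODEL)    # "m_resnet50"
print(cfg.SOLVER.IMS_PER_BATCH)  # 128
```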