Seeing ValueError: not enough values to unpack (expected 4, got 3) in ppocr/losses/det_db_loss.py when fine-tuning detection model ch_PP-OCRv3 with ch_PP-OCRv3_det_student.yml #13995

Open
3 tasks done
nicolaskodak opened this issue Oct 14, 2024 · 0 comments

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (问题描述)

To fine-tune the detection model, I created a config file (modified from ch_PP-OCRv3_det_student.yml) and downloaded ICDAR2015 following this readme.

  • a config file has been created: configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.v2.yml
  • data and annotations have been placed accordingly: train_data/icdar2015/train_icdar2015_label.txt, train_data/icdar2015/ch4_training_images, train_data/icdar2015/test_icdar2015_label.txt, train_data/icdar2015/ch4_test_images.

The content of the config file is shown below:

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/ch_PP-OCR_V3_det_v2/
  save_epoch_step: 10
  eval_batch_step:
  - 0
  - 40 
  cal_metric_during_train: false
  pretrained_model: /data1/image/models/paddle/ch_PP-OCRv3_det_distill_train/student.pdparams
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true

Architecture:
  model_type: det
  algorithm: DB
  Transform:
  Backbone:
    name: MobileNetV3
    scale: 0.5
    model_name: large
    disable_se: True
  Neck:
    name: RSEFPN
    out_channels: 96
    shortcut: True
  Head:
    name: DBHead
    k: 50

Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0001 ### 0.001 ### edited by kota
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 5.0e-05
PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5
Metric:
  name: DetMetric
  main_indicator: hmean
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/
    label_file_list:
      - ./train_data/icdar2015/train_icdar2015_label.txt
    ratio_list: [1.0]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 960
        - 960
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 4 # 8
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/
    label_file_list:
      - ./train_data/icdar2015/test_icdar2015_label.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest: null
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image 
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1 # 1
    num_workers: 2

To execute fine-tuning, I've run

python tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.v2.yml

During fine-tuning, when training reached the 40th step (eval_batch_step is [0, 40]), an evaluation using data from valid_dataloader was run; however, an exception was thrown and the process terminated with the traceback below:

Traceback (most recent call last):
  File "tools/train.py", line 208, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 180, in main
    program.train(config, train_dataloader, valid_dataloader, device, model,
  File "/home/kota/ocr/PaddleOCR/tools/program.py", line 387, in train
    cur_metric = eval(
  File "/home/kota/ocr/PaddleOCR/tools/program.py", line 548, in eval
    eval_loss = loss_class( preds, batch)['loss']
  File "/home/kota/py38ocr/lib/python3.8/site-packages/paddle/nn/layer/layers.py", line 1254, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/kota/ocr/PaddleOCR/ppocr/losses/det_db_loss.py", line 58, in forward
    label_threshold_map, label_threshold_mask, label_shrink_map, label_shrink_mask = labels[
ValueError: not enough values to unpack (expected 4, got 3)

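As far as I've investigated, the valid_dataloader doesn't yield a batch of 5 elements (the image plus the four label maps), as the train dataloader does. It yields 4 elements instead, matching the Eval keep_keys in the config (image, shape, polys, ignore_tags), so only 3 items remain after the image, and their shapes don't look like label_threshold_map, label_threshold_mask, label_shrink_map, label_shrink_mask.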
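To make the mismatch concrete, here is a minimal sketch in plain Python (no Paddle needed) that mimics the four-way unpack from the traceback. The key lists are copied from the KeepKeys entries in the config above. The helper names (make_batch, unpack_db_labels, try_unpack) are hypothetical, and the assumption that the truncated `labels[` expression skips the image at index 0 is mine; it is not confirmed by the traceback itself.

```python
# Key lists copied from the KeepKeys entries in the config above.
TRAIN_KEEP_KEYS = ["image", "threshold_map", "threshold_mask",
                   "shrink_map", "shrink_mask"]
EVAL_KEEP_KEYS = ["image", "shape", "polys", "ignore_tags"]


def make_batch(keep_keys):
    """Hypothetical stand-in for a dataloader batch: one placeholder per kept key."""
    return [f"<{key}>" for key in keep_keys]


def unpack_db_labels(batch):
    """Mimic the four-way unpack shown in the traceback.

    Assumption: the truncated `labels[` expression drops the image at index 0.
    """
    threshold_map, threshold_mask, shrink_map, shrink_mask = batch[1:]
    return threshold_map, threshold_mask, shrink_map, shrink_mask


def try_unpack(keep_keys):
    """Return None if the unpack succeeds, otherwise the ValueError message."""
    try:
        unpack_db_labels(make_batch(keep_keys))
    except ValueError as exc:
        return str(exc)
    return None


print("train:", try_unpack(TRAIN_KEEP_KEYS))
print("eval: ", try_unpack(EVAL_KEEP_KEYS))
```

Under that assumption, the train key list unpacks cleanly while the eval key list fails in the same way as the traceback, which suggests the evaluation batch simply does not carry the maps that DBLoss expects.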

Could anyone shed some light on this? Many thanks.

🏃‍♂️ Environment (运行环境)

OS

Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

Device

device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.2, Runtime API Version: 11.7, cuDNN Version: 8.5.

Paddle-related packages are shown below:

numpy==1.22.0
paddle-bfloat==0.1.7
paddle-serving-app @ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_app-0.8.3-py3-none-any.whl
paddle-serving-client @ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_client-0.8.3-cp38-none-any.whl
paddle-serving-server @ https://paddle-serving.bj.bcebos.com/test-dev/whl/paddle_serving_server-0.8.3-py3-none-any.whl
paddle2onnx==1.0.9
paddlefsl==1.1.0
paddlenlp==2.5.2
paddleocr==2.7.0.3
paddlepaddle==2.5.0
paddlepaddle-gpu==2.5.0

🌰 Minimal Reproducible Example (最小可复现问题的Demo)

python tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_student.v2.yml