-
I'm using PaddleOCR for a project and I want to limit the resources it uses. The CPU usage of PaddleOCR is tied to the image it scans; I want to limit it, but how? In many cases the CPU usage reaches 100% when doing full-screen OCR. These are my specs. I tried setting the flags but saw no change, and I don't know if it's because I set the flags wrong or because this simply isn't what I should expect.

```python
import tkinter as tk
import paddle
from paddleocr import PaddleOCR

# Custom DebugPaddleOCR class to include debugging capabilities
class DebugPaddleOCR(PaddleOCR):
    ...

# Configure PaddlePaddle memory and CPU parameters
paddle.set_flags({
    ...
})

# Initialize the DebugPaddleOCR instance with additional parameters
ocr = DebugPaddleOCR(lang='id', cpu_threads=1)  # Set language to Indonesian and use 1 CPU thread

def run_ocr():
    ...

# Create the GUI application
root = tk.Tk()

# Add a button to select a file
select_button = tk.Button(root, text="Select Image", command=run_ocr)

# Add a textbox to display OCR results
text_box = tk.Text(root, wrap=tk.WORD, width=80, height=20)

# Run the application
root.mainloop()
```
Replies: 2 comments 2 replies
-
To reduce PaddleOCR's CPU or GPU usage while maintaining performance, you can make the following adjustments:

**1. Adjust CPU Threads**

The number of CPU threads used by PaddleOCR directly affects resource usage. By default, PaddleOCR uses multiple threads, which can spike CPU usage. You can limit this by reducing the number of threads:

```python
ocr = PaddleOCR(cpu_threads=1)  # Restrict to 1 thread
```

In your script, you've already set `cpu_threads=1`.

**2. Optimize PaddlePaddle Flags**

The `paddle.set_flags` call lets you constrain PaddlePaddle's memory behavior:

```python
paddle.set_flags({
    "FLAGS_fraction_of_cpu_memory_to_use": 0.5,  # Limit to 50% of CPU memory
    "FLAGS_use_pinned_memory": False,            # Disable pinned memory for a lower CPU load
})
```
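As a complement to `cpu_threads`, the OpenMP/MKL thread pools used by CPU inference kernels can usually be capped with environment variables. Whether a given PaddlePaddle build honors these variables is an assumption, and they must be set before the framework is imported, so treat this as a sketch:

```python
import os

# Set these BEFORE importing paddle/paddleocr, otherwise the thread
# pools may already have been sized. Whether they are honored depends
# on how PaddlePaddle was built (OpenMP / MKL) -- an assumption.
os.environ["OMP_NUM_THREADS"] = "1"   # OpenMP worker threads
os.environ["MKL_NUM_THREADS"] = "1"   # MKL math-library threads
```

Only after setting these should the script do `from paddleocr import PaddleOCR`.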
**3. Use PaddleLite for Lightweight Inference**

If you want to further reduce resource usage, consider using PaddleLite, which is optimized for edge devices and constrained environments. PaddleLite is a lightweight inference engine that can significantly reduce both CPU and GPU usage. You can convert your PaddleOCR model to a PaddleLite model and run it with optimized performance.

**4. Reduce Model Size**

If high accuracy isn't critical, you can use a lighter OCR model. PaddleOCR supports multiple lightweight models. For example:

```python
ocr = PaddleOCR(rec_model_dir='path/to/lite_model', det_model_dir='path/to/lite_model', lang='en')
```

You can download these models from the PaddleOCR Model Zoo.

**5. Limit Image Size**

The size of the input image directly impacts resource usage. If you're performing OCR on large images, resizing them to a smaller resolution can significantly reduce computation time and memory usage. For example:

```python
from PIL import Image

def preprocess_image(file_path, max_width=1000, max_height=1000):
    image = Image.open(file_path)
    # Image.ANTIALIAS was removed in Pillow 10; Resampling.LANCZOS is the equivalent filter
    image.thumbnail((max_width, max_height), Image.Resampling.LANCZOS)
    return image
```

Resize the image with this helper before passing it to `ocr.ocr()`.

**6. Batch Process Small Regions**

Instead of performing OCR on the entire screen at once, divide the image into smaller regions and process them sequentially. This reduces peak CPU usage:

```python
def split_image(image, num_splits=4):
    width, height = image.size
    split_height = height // num_splits
    regions = [image.crop((0, i * split_height, width, (i + 1) * split_height))
               for i in range(num_splits)]
    return regions
```

Process each region one at a time with `ocr.ocr()`.

**7. Switch to GPU (if available)**

Your hardware includes an AMD Radeon GPU. PaddleOCR typically supports NVIDIA GPUs via CUDA, so with an AMD GPU you might not get native GPU acceleration. You can still try enabling it:

```python
ocr = PaddleOCR(use_gpu=True)
```

Verify first that your PaddlePaddle build actually supports your GPU.

**8. Avoid Multiprocessing Overhead**

The linked discussion highlights that using multiprocessing with PaddleOCR can increase processing time due to overhead. Stick to single-threaded or multithreaded execution unless multiprocessing is truly necessary. If you must use it, make sure the PaddleOCR object is initialized within each worker process.

**9. Measure and Monitor Resource Usage**

To debug and optimize further, monitor CPU and memory usage with a tool like `psutil`:

```python
import psutil

print("CPU usage:", psutil.cpu_percent())
print("Memory usage:", psutil.virtual_memory().percent)
```

**Additional Reference**

The linked GitHub discussion reports that multiprocessing can increase processing times by up to 5x due to initialization overhead. Avoid multiprocessing unless you're processing multiple independent images concurrently and the initialization overhead is acceptable.

By combining these optimizations, you should be able to significantly limit the CPU and GPU usage of PaddleOCR while maintaining reasonable performance.

Response generated by feifei-bot | chatgpt-4o-latest
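As a concrete illustration of points 5 and 6 above, here is a minimal sketch of the split-then-scan loop. `run_ocr_on_region` is a hypothetical stand-in for the real `ocr.ocr()` call so the example stays self-contained:

```python
from PIL import Image

def split_image(image, num_splits=4):
    # Cut the image into horizontal strips of equal height
    width, height = image.size
    split_height = height // num_splits
    return [image.crop((0, i * split_height, width, (i + 1) * split_height))
            for i in range(num_splits)]

def run_ocr_on_region(region):
    # Hypothetical stand-in for ocr.ocr(np.array(region));
    # here it just returns the region's size for demonstration
    return region.size

screenshot = Image.new("RGB", (1920, 1080), "white")  # dummy full-screen image
results = [run_ocr_on_region(r) for r in split_image(screenshot)]
# Each strip is a quarter of the full frame, so peak work per call drops accordingly
```

Processing the strips sequentially means only one quarter-frame is in flight at a time, which is what caps the CPU spike.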
-
@GreatV I am sorry for the very late response...

1. Adjust CPU threads
2. Optimize PaddlePaddle flags
3. PaddleLite for lightweight inference
4. Reduce model size
5. Limit image size
6. Batch process small regions
7. Switch to GPU (I don't have one)
9. Measure and monitor resource usage
Oh yeah, I used psutil to limit the process usage instead. Works like a charm, though only for the CPU; I can't do the same for RAM.
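One way to do this with psutil (a sketch of one possible approach, not necessarily the exact code used in this reply) is to pin the process to a subset of cores and lower its scheduling priority. As noted, this caps CPU only; psutil can observe memory usage but cannot cap it:

```python
import os
import psutil

proc = psutil.Process(os.getpid())

# Pin the process to a single core so OCR can never saturate more than
# one CPU (cpu_affinity is available on Linux/Windows, not macOS)
if hasattr(proc, "cpu_affinity"):
    proc.cpu_affinity([0])

# Lower scheduling priority so other processes win CPU contention
proc.nice(10)
```

With the affinity set to one core, even a 100%-busy OCR pass is bounded to a single core's worth of CPU time.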
```python
import os
import time
import psutil
from PyQt6.QtCore import QThread, pyqtSignal
from paddleocr import PaddleOCR

class OcrWorker(QThread):
    finished = pyqtSignal(list, list, float, float, float)
```