[ENG-2320] Refactor browse method to use worker pattern #121

kanapuli · 2025-01-31T11:15:06Z

Description

Benchmark results for the old recursive approach

Benchmark results for the new worker pattern

Summary by CodeRabbit

Refactor
- Implemented a worker pool pattern for node processing.
- Removed level parameter from the Browse function.
- Introduced a new NodeTask type for better node management.
- Added ServerMetrics struct for monitoring OPCUA server performance.
Bug Fixes
- Improved concurrency and error handling in node browsing.
- Enhanced node processing efficiency.
Chores
- Updated function signatures across multiple files.
- Renamed browse function to Browse for consistency.

coderabbitai · 2025-01-31T11:15:13Z

Walkthrough

The pull request introduces a significant refactoring of the OPC UA browsing functionality in the opcua_plugin package. The primary change is the implementation of a worker pool pattern for processing nodes, which aims to control the number of goroutines created during node browsing. The Browse function's signature has been modified to remove the level parameter, and a new NodeTask structure has been introduced to manage node processing more efficiently. The changes enhance the concurrency model of the browsing operation while maintaining the existing functionality.

Changes

File	Change Summary
`opcua_plugin/browse.go`	- Refactored `Browse` and `browse` functions to use worker pool pattern - Removed `level` parameter from function signatures - Added `NodeTask` type to encapsulate node browsing parameters - Implemented concurrent node processing with a dynamic number of workers - Added `browseChildren` helper function
`opcua_plugin/browse_frontend.go`, `opcua_plugin/server_info.go`	- Renamed `browse` function to `Browse`
`opcua_plugin/opcua_opc-plc_test.go`, `opcua_plugin/opcua_unittest_test.go`	- Updated `Browse` function calls to remove `level` parameter

Assessment against linked issues

Objective	Addressed	Explanation
Control number of Goroutines [ENG-2320]	✅

Possibly related PRs

Eng 2008 benthos umh struggles to read out arrays in opc ua v2 #84: The changes in the main PR are related to those in the retrieved PR as both involve modifications to the browse and Browse functions in the opcua_plugin/browse.go file, specifically in how parameters are handled and the overall browsing logic.
[ENG-2320] Reduce cyclomatic complexity of the browse function #119: The changes in the main PR are related to the modifications in the browse function within the retrieved PR, as both involve significant alterations to the same function's signature and logic in opcua_plugin/browse.go.
[ENG-2074] Remove duplicate browse logic #91: The changes in the main PR are related to the modifications in the Browse and browse functions in the retrieved PR, as both involve updates to their signatures and the handling of parameters, specifically the removal of the level parameter in favor of a new structure or channel.

Suggested reviewers

JeremyTheocharis
led0nk
Scarjit

Poem

🐰 Browsing nodes with worker might,
Goroutines dancing, oh so light!
A pool of threads, efficient and neat,
OPC UA's performance now a treat!
Code refactored with rabbit's delight! 🌟

✨ Finishing Touches

📝 Generate Docstrings (Beta)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (7)

opcua_plugin/browse.go (6)

65-66: Clarify intended usage or visibility

Although documented as "avoid using this function directly," it is exported. Consider renaming or adding explicit comments if its usage is truly meant to be restricted to test scenarios.

69-75: Document struct fields

The NodeTask struct plays a crucial role in controlling recursion depth and mapping parent IDs. Consider adding inline doc comments for each field to enhance maintainability and readability.

77-79: Reduce parameter overload

The browse function’s signature has many parameters. Grouping them into a dedicated configuration struct would improve clarity and maintainability.

81-84: Make worker count configurable

Hardcoding numWorkers = 100 may limit flexibility. Consider allowing a configurable value or an adaptive strategy based on workload.

110-121: Simplify worker signature

Similar to browse, the worker function accepts numerous parameters. You could encapsulate them in a WorkerContext struct to reduce parameter churn.

323-323: Check concurrency impact

Each browse invocation spawns a worker pool of 100 goroutines. Repeated calls in a loop can multiply resource usage. Consider a shared pool or dynamic scaling if performance is a concern.

opcua_plugin/server_info.go (1)

40-42: Potential worker oversubscription

Each parallel Browse call starts a pool of 100 workers, possibly leading to hundreds of concurrent goroutines. Consider running them sequentially or unifying them under one task pool if resource constraints become an issue.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78dd8fe and 10d85b3.

📒 Files selected for processing (5)

opcua_plugin/browse.go (3 hunks)
opcua_plugin/browse_frontend.go (1 hunks)
opcua_plugin/opcua_opc-plc_test.go (1 hunks)
opcua_plugin/opcua_unittest_test.go (8 hunks)
opcua_plugin/server_info.go (1 hunks)

🔇 Additional comments (15)

opcua_plugin/browse.go (6)

86-89: Worker pool initialization

This concurrency pattern matches a standard worker-pool approach and appears implemented correctly.

92-98: Initial task scheduling

Enqueuing the start node with level 0 is clear and logically sound.

101-108: Graceful shutdown logic

Closing taskChan after taskWg.Wait() and then waiting for workerWg ensures an orderly teardown.

222-251: Child task creation

The recursive scheduling of child nodes via browseChildren is well-structured and ensures deeper levels can be explored.

399-399: Heartbeat browsing

Spawning a separate goroutine for the heartbeat node is consistent with the existing approach. No issues detected.

122-220: Non-blocking send can drop data

Using a default case when writing to opcuaBrowserChan means data could be discarded if the channel is full. Verify whether dropping node definitions is acceptable. If not, remove the default to block until the channel is available.
-   select {
-   case opcuaBrowserChan <- browserDef:
-   default:
-       logger.Debugf("Worker %%d: opcuaBrowserChan blocked, skipping", id)
-   }
+   opcuaBrowserChan <- browserDef
opcua_plugin/browse_frontend.go (1)

44-44: Consistent usage of Browse

Invoking Browse here aligns with the refactored concurrency model. No additional concerns apparent.

opcua_plugin/opcua_unittest_test.go (7)

106-106: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

133-133: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

171-171: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

211-211: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

251-251: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

294-294: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

695-695: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

opcua_plugin/opcua_opc-plc_test.go (1)

1433-1433: LGTM!

The function call has been correctly updated to match the new signature by removing the level parameter.

opcua_plugin/browse.go

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

opcua_plugin/browse.go (2)
77-108: Consider making worker pool size configurable.

While the implementation is solid, hardcoding the number of workers and buffer size might not be optimal for all scenarios. Consider making these values configurable based on system resources or user requirements.
-const numWorkers = 100 // Adjust based on your needs
+// WorkerPoolConfig holds worker pool configuration
+type WorkerPoolConfig struct {
+    NumWorkers int
+    BufferSize int
+}
+
+// DefaultWorkerPoolConfig returns default configuration
+func DefaultWorkerPoolConfig() WorkerPoolConfig {
+    return WorkerPoolConfig{
+        NumWorkers: 100,
+        BufferSize: 1000,
+    }
+}
138-142: Consider making max depth configurable.

The maximum browse depth is hardcoded to 25 levels. Consider making this configurable to accommodate different OPC UA server hierarchies.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10d85b3 and ef6a8e7.

📒 Files selected for processing (1)

opcua_plugin/browse.go (3 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: go-test-opcua-plc

🔇 Additional comments (4)

opcua_plugin/browse.go (4)

65-66: LGTM! Clean wrapper implementation.

The public wrapper function is well-documented and correctly implements the new worker pattern interface.

69-75: LGTM! Well-structured task encapsulation.

The NodeTask struct effectively encapsulates all necessary parameters for node processing, making the code more maintainable and easier to understand.

234-252: LGTM! Well-implemented helper functions.

The helper functions are well-structured with proper error handling and task management. The browseChildren function effectively manages the hierarchical browsing process.

187-193: Verify potential data loss in opcuaBrowserChan.

The non-blocking send to opcuaBrowserChan could silently drop nodes if the channel is full. Consider logging metrics about dropped nodes or implementing a backpressure mechanism.

✅ Verification successful

Current implementation of opcuaBrowserChan is appropriate

The non-blocking send with debug logging is an intentional design choice in this worker pool architecture. The implementation:

Prevents worker deadlocks

Provides visibility through debug logging

Operates in a browse context where skipping occasional nodes is preferable to blocking workers
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for other instances of opcuaBrowserChan usage to understand the impact
rg -A 5 "opcuaBrowserChan.*<-"
Length of output: 333

Script:
#!/bin/bash
# Find channel declaration and initialization
rg -A 2 "opcuaBrowserChan\s*:?=.*chan"

# Find consumer/receiver code
rg -B 2 "<-\s*opcuaBrowserChan"

# Get broader context of browse operation
ast-grep --pattern 'func browse($$$) {
  $$$
}'
Length of output: 2739

opcua_plugin/browse.go

JeremyTheocharis · 2025-02-04T06:58:43Z

opcua_plugin/browse.go

-		return
-	}
+// browse uses a worker pool pattern to process nodes
+func browse(ctx context.Context, startNode NodeBrowser, startPath string, logger Logger, parentNodeId string,


what happens if someone has multiple "parent nodes"

@JeremyTheocharis It is handled already inside the constructNodeHierarchy function. As you can see in the image, I have modelled an OPCUA server with a node called Temperature (nodeid= ns=2;i=2) has two parents nodes 1. DMK , 2. Boiler. The Browse operation correctly shows the temperature node then under their corresponding parent nodes.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (7)

opcua_plugin/browse.go (3)
64-68: Enhance function documentation.

The function documentation should be expanded to:

Explain the purpose of each parameter

Document the worker pool pattern implementation

Clarify why direct usage should be avoided

70-76: Add field-level documentation to NodeTask struct.

Each field in the struct should be documented to explain its purpose and any constraints.
 type NodeTask struct {
+    // node represents the OPC UA node to be processed
     node         NodeBrowser
+    // path represents the hierarchical path to this node
     path         string
+    // level represents the current depth in the node hierarchy
     level        int
+    // parentNodeId represents the ID of the parent node
     parentNodeId string
 }
157-281: Consider breaking down the worker function for better maintainability.

The worker function handles multiple responsibilities. Consider extracting these into separate functions:

Node attribute processing

Node class handling

Task processing logic
+func processNodeTask(task NodeTask, def *NodeDef) error {
+    // Extract node attribute processing logic
+}

+func handleNodeClass(def NodeDef, task NodeTask) error {
+    // Extract node class handling logic
+}

 func worker(
     ctx context.Context,
     id uuid.UUID,
     // ... other parameters ...
 ) {
     // ... existing setup ...
     
     for {
         select {
         case task, ok := <-taskChan:
             if !ok {
                 return
             }
             
-            // Current implementation
+            if err := processNodeTask(task, &def); err != nil {
+                sendError(ctx, err, errChan, logger)
+                taskWg.Done()
+                continue
+            }
+            
+            if err := handleNodeClass(def, task); err != nil {
+                sendError(ctx, err, errChan, logger)
+            }
         }
     }
 }
opcua_plugin/browse_workers.go (4)
14-14: Correct the spelling in the comment.

There's a minor typo in the comment for SampleSize: "requsts" should be "requests".

Apply this diff to fix the typo:
-	SampleSize     = 5 // Number of requsts to measure response time
+	SampleSize     = 5 // Number of requests to measure response time
56-84: Consider preserving partial history of response times.

Currently, the responseTimes slice is cleared after each adjustment, discarding historical data. Retaining a small rolling window (instead of resetting to an empty slice) could improve the stability of your average measurements, preventing abrupt changes.

75-82: Refine debug messages for clarity.

Although these debug logs are developer-focused (rather than user-facing error messages), you might consider adding minimal guidance or referencing next steps for developers, e.g., “Consider analyzing network conditions or server load.” This aligns with best practices for actionable logging.

92-101: Optimize the sample buffer handling.

The recordResponseTime function discards only the oldest measurement when capacity is reached. For high-throughput systems, consider using a ring buffer or specialized data structure to manage these durations more efficiently and reduce slice overhead.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef6a8e7 and 1d84ab2.

📒 Files selected for processing (3)

opcua_plugin/browse.go (4 hunks)
opcua_plugin/browse_frontend.go (1 hunks)
opcua_plugin/browse_workers.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

opcua_plugin/browse_frontend.go

🧰 Additional context used

📓 Path-based instructions (2)

`**/*`: When reviewing an error message, ensure it meets the...

**/*: When reviewing an error message, ensure it meets these guidelines. If a message is lacking (e.g., missing clear next steps or context), provide an example of the correct format to help improve clarity and usability.

Quick Guidelines for Writing Effective Error Messages

Structure Your Message with Three Key Parts
• Context:
Explain what the system was doing.
Example: “Couldn’t parse config file: /etc/sample-config.properties.”
• Error Description:
Clearly state what failed without generic wording.
Example: “Invalid snapshot mode ‘nevr’ provided.”
• Mitigation/Next Steps:
Tell the user what to do next. For complex issues, include an error code or documentation link.
Example: “Please update the mode; refer to our docs for valid options.”

Clarity & Conciseness
• Use plain, simple language and avoid unnecessary jargon (unless targeting technical users).
• Aim for short sentences (ideally 8–14 words) for better understanding.

Consistent Tone & Style
• Uniformity: Use the same voice, terminology, and punctuation across all messages.
• Empathy: Maintain a neutral, supportive tone; avoid blaming the user.
• Avoid overly casual language (skip “oops”/“whoops” and excessive exclamation points).

Visibility & Placement (User-Facing Messages)
• Proximity: Place error messages close to the related UI element to reduce cognitive load.
• Design: Use high-contrast text, icons, or subtle animations; follow accessibility guidelines (don’t rely solely on color).

Actionability & Efficiency
• Clearly indicate the next steps (e.g., “Retry later”, “Log in again”, “Check your input”).
• Preserve user input when possible to minimize frustration.
• Where relevant, provide error codes or links for further troubleshooting.

Security & Privacy Considerations
• User-Facing: Avoid exposing internal details or sensitive data.
• Developer/Logging: Include detailed context (like file paths or exception chains) for debugging, but be cautious about sensitive information.

Logging & Exception Best Practices (for Developer/Operations Messages)
• Log the full context—including exception chains—to aid troubleshooting.
• Use structured logging (e.g., JSON) for filtering and analysis.
• Choose a consistent method: either raise an exception or log an error (not both) to avoid redundancy.
• Don’t make error messages an API contract; expose machine-readable error codes or types instead.

Additional Best Practices
• Fail Early: Report errors as soon as possible (e.g., during input validation or startup).
• One Concept, One Term: Use consistent terminology across messages and documentation.
• Localization: For internal logs, use English; if localizing for users, follow a vetted process.

opcua_plugin/browse.go
opcua_plugin/browse_workers.go

`**/*`: Whenever in a new PR a new function or code is intro...

**/*: Whenever in a new PR a new function or code is introduced, ensure that there is at least somewhere as a code comment, in the README or wherer ever explained the "why" behind the decision. So e.g., why something was preferred over other options, or why a certain loop needs to be executed.

opcua_plugin/browse.go
opcua_plugin/browse_workers.go

🔇 Additional comments (5)

opcua_plugin/browse.go (2)

94-96: Address TODO comment about backpressure mechanism.

The current implementation uses a large buffer to avoid blocking, but this might lead to memory issues under high load. Consider implementing a backpressure mechanism using:

A semaphore to limit concurrent tasks

A sliding window approach

Rate limiting based on server response times

Would you like me to propose a backpressure implementation?

283-323: LGTM! Well-documented helper functions.

The helper functions are well-implemented with:

Clear documentation

Proper error handling

Good separation of concerns

opcua_plugin/browse_workers.go (3)

18-25: Explain the rationale behind ServerMetrics.

This newly introduced struct benefits from a brief explanation of “why” the worker-based concurrency approach is chosen, along with any design tradeoffs or alternatives considered. Such context helps maintainers and contributors understand the long-term maintainability and reasoning behind collecting these metrics.

Would you like to add a code comment or README snippet explaining your decision to implement ServerMetrics?

36-44: Verify behavior when adding a worker with an existing ID.

The addWorker method overwrites the stop channel if the given ID already exists in the map, potentially leading to a stale channel or worker confusion. Consider checking for an existing entry before overwriting.

45-54: Removal logic looks good.

The removeWorker function correctly closes the associated channel and removes it from the map, preventing potential leaks. No changes needed.

Refactor browse method to use worker pattern

e6e6c6d

kanapuli self-assigned this Jan 31, 2025

kanapuli added 7 commits January 31, 2025 12:18

Fix calling sync.WaitGroup counters

8f42586

Fix browse_frontend.go

7954492

Fix opcua_opc-plc_test.go

875f6c1

Fix sync.WaitGroup mismatch

9a25eef

Merge branch 'master' into eng-2320-control-go-routines

0171c38

Remove debug logs

a96edf6

Add 100 workers for browse operation

10d85b3

kanapuli marked this pull request as ready for review January 31, 2025 13:17

kanapuli requested a review from a team January 31, 2025 13:17

coderabbitai bot reviewed Jan 31, 2025

View reviewed changes

Scarjit reviewed Jan 31, 2025

View reviewed changes

opcua_plugin/browse.go Outdated Show resolved Hide resolved

opcua_plugin/browse.go Show resolved Hide resolved

opcua_plugin/browse.go Show resolved Hide resolved

Address PR comments

ef6a8e7

kanapuli requested a review from Scarjit January 31, 2025 14:41

Scarjit approved these changes Jan 31, 2025

View reviewed changes

coderabbitai bot reviewed Jan 31, 2025

View reviewed changes

kanapuli requested a review from JeremyTheocharis January 31, 2025 15:26

kanapuli marked this pull request as draft January 31, 2025 15:54

JeremyTheocharis requested changes Feb 4, 2025

View reviewed changes

kanapuli added 8 commits February 4, 2025 15:24

Merge branch 'master' into eng-2320-control-go-routines

45ee2f8

Merge branch 'master' into eng-2320-control-go-routines

34fec62

Fix docs for browseChildren

0ad19c5

Use dynamic worker counts to browse nodes

68523cc

Set defined sample size

ea09c67

Use uuid worker ids for browse workers to keep them unique

ae76955

Increase the taskChan buffer

d86059f

Add Todo for backpressure mechanism

1d84ab2

kanapuli requested a review from JeremyTheocharis February 7, 2025 14:11

kanapuli marked this pull request as ready for review February 7, 2025 14:11

coderabbitai bot reviewed Feb 7, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENG-2320] Refactor browse method to use worker pattern #121

[ENG-2320] Refactor browse method to use worker pattern #121

kanapuli commented Jan 31, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 31, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot left a comment

JeremyTheocharis Feb 4, 2025

kanapuli Feb 7, 2025

coderabbitai bot left a comment

[ENG-2320] Refactor browse method to use worker pattern #121

Are you sure you want to change the base?

[ENG-2320] Refactor browse method to use worker pattern #121

Conversation

kanapuli commented Jan 31, 2025 • edited by coderabbitai bot Loading

Description

Benchmark results for the old recursive approach

Benchmark results for the new worker pattern

Summary by CodeRabbit

Summary by CodeRabbit

coderabbitai bot commented Jan 31, 2025 • edited Loading

Walkthrough

Changes

Assessment against linked issues

Possibly related PRs

Suggested reviewers

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

JeremyTheocharis Feb 4, 2025

Choose a reason for hiding this comment

kanapuli Feb 7, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

kanapuli commented Jan 31, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 31, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)