Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kdtree option in knnsearch and rangesearch functions is VERY slow #151

Open
pr0m1th3as opened this issue Jun 26, 2024 · 2 comments
Open

Comments

@pr0m1th3as
Copy link
Member

The implementation of the kdtree nearest search method is extremely slow. I have disable it in the predict method of the ClassificationKNN class until it is fixed.

@pr0m1th3as pr0m1th3as changed the title kdtree option in knnsearch and rangesearch functions is **very** slow kdtree option in knnsearch and rangesearch functions is VERY slow Jun 26, 2024
@Sanj-bot
Copy link

/assign

@Sonu0305
Copy link

Analysis of Slow Performance in kdtree's rangesearch and knnsearch Functions

Hi @pr0m1th3as,

After a thorough analysis of the situation, I've identified a few potential reasons why the kdtree method is performing slowly in the rangesearch and knnsearch functions. Below are my observations along with some possible solutions to address these issues:


1. Method of Construction of the kdtree in rangesearch and knnsearch

Objective:
Ensure that the construction of the kdtree doesn't involve unnecessary recursion or redundant data structure usage.

Proposed Solution:

  • To avoid recomputing the entire tree for every query, we can consider storing intermediate kdtree nodes during tree construction. This can help speed up query operations by avoiding redundant work.

2. Unnecessary Computations and Traversals

Objective:
Reduce the search space during kdtree traversal and avoid unnecessary computations.

Proposed Solution:

  • Implement early stopping conditions during traversal to prune unnecessary branches.

    For example, we can add a check for the distance between the current node and the query point:

    if (current_squared_distance > best_squared_distance)
        % Skip this branch
    end
  • Avoid the recomputation of squared Euclidean distances where the square root is unnecessary. This can improve the speed of distance comparisons.


3. Difficulty in Analyzing Bottlenecks

Objective:
Identify areas where performance can be improved by pinpointing slow functions or excessive time spent in specific parts of the code.

Proposed Solution:

  • To better analyze the performance bottlenecks, we can use profiling tools to measure the time spent in different function calls.

To enable profiling, we can use the following code:

profile on
<code/function call>
profile off
p = profile("info");
p.FunctionTable
for i = 1:length(p.FunctionTable)
    fprintf('Function: %s\n', p.FunctionTable(i).FunctionName);
    fprintf('Total Time: %.4f seconds\n', p.FunctionTable(i).TotalTime);
    fprintf('Number of Calls: %d\n\n', p.FunctionTable(i).NumCalls);
end

This will help us gather insights on which parts of the code are consuming the most time. Below is an example of what the profiling tool summary might look like for a knnsearch function call:

image


This is my initial assessment based on the current understanding of the situation. Implementing the proposed solutions above should help improve the performance of kdtree's rangesearch and knnsearch functions.

I would appreciate any guidance or additional insights you may have on this matter.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants