Filtering in data explorer causes column profiles to be computed twice instead of only once #5150

Closed
wesm opened this issue Oct 23, 2024 · 1 comment
Labels: area: data explorer, bug


wesm commented Oct 23, 2024

On the current main branch, as discussed in #5148, the get_column_profiles method for the visible summary stats is being called twice (rather than only once as is needed) after applying a filter:

Screencast.from.2024-10-23.17-18-20.mp4

I'm using a 25x larger version of flights.parquet:

import pandas as pd

flights = pd.read_parquet('/home/wesm/code/testingstuff/flights.parquet')
# Stack 25 copies end to end and rebuild the index from 0
flights_big = pd.concat([flights] * 25, ignore_index=True)
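
For anyone without the original file, here is a minimal sketch of what the `pd.concat([...] * 25, ignore_index=True)` step does. The tiny `small` frame and its column names are made up for illustration and are not the real flights schema.

```python
import pandas as pd

# Hypothetical stand-in for flights.parquet (the real file is much larger).
small = pd.DataFrame({"carrier": ["AA", "UA", "DL"], "dep_delay": [5, -2, 11]})

# [small] * 25 is a list holding the same frame 25 times; concat stacks the
# copies row-wise, and ignore_index=True replaces the repeated 0..2 labels
# with a fresh 0..n-1 index.
big = pd.concat([small] * 25, ignore_index=True)

print(len(small), len(big))
print(big.index)
```

The result has 25 times as many rows with the same columns, which should be enough to make any repeated profile computation easy to notice.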

Here are the console logs:

https://gist.githubusercontent.com/wesm/dc4f9c3195d37ee34c2dced21b7cfce8/raw/186381aeb327eb6cf36d3a592553d7b7c8e20d78/gistfile1.txt

You can see that get_column_profiles is computed once at 2024-10-23 17:18:22.983 and then again 5 seconds later at 2024-10-23 17:18:27.992 (after the cache is cleared and the sparklines are blanked out).
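
As a quick check of the gap between the two calls, the two timestamps quoted above can be parsed and subtracted. This is only arithmetic on those two values; the format string is an assumption about how the timestamps are written, not something taken from the log itself.

```python
from datetime import datetime

# Assumed layout of the timestamps quoted in this issue.
FMT = "%Y-%m-%d %H:%M:%S.%f"

first = datetime.strptime("2024-10-23 17:18:22.983", FMT)
second = datetime.strptime("2024-10-23 17:18:27.992", FMT)

# Seconds elapsed between the first and second get_column_profiles call.
gap = (second - first).total_seconds()
print(f"{gap:.3f} s between calls")
```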

wesm added the bug and area: data explorer labels Oct 23, 2024
wesm added this to the 2024.11.0 Pre-Release milestone Oct 24, 2024
wesm self-assigned this Oct 24, 2024
wesm added a commit that referenced this issue Oct 24, 2024
#5156)

Attempts to address #5150. In the backend-state-updated event handler,
the profiles were being refreshed just before `fetchData` was called
with an argument that invalidates all caches, including the profiles, so
the profiles had to be recomputed immediately. This resolves the double
computation that I observed in #5148.
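
The ordering problem described in this commit message can be illustrated with a toy model. This is a rough sketch in Python, not the actual Positron frontend code (which is not shown in this issue). Every class and method name below is hypothetical apart from `get_column_profiles`, and the "after" handler is only one plausible way to avoid the double call, not necessarily what #5156 does.

```python
class ToyDataExplorer:
    """Toy model of a frontend with a profile cache; names are hypothetical."""

    def __init__(self):
        self.profile_cache = None
        self.backend_calls = 0

    def get_column_profiles(self):
        # Stand-in for the expensive backend request.
        self.backend_calls += 1
        return {"dep_delay": "profile"}

    def refresh_profiles(self):
        # Only goes to the backend when nothing is cached.
        if self.profile_cache is None:
            self.profile_cache = self.get_column_profiles()

    def fetch_data(self, invalidate_caches):
        if invalidate_caches:
            # Discards any profiles computed earlier in the handler.
            self.profile_cache = None
        # The visible summary stats still need profiles, so refill the cache.
        self.refresh_profiles()

    def on_state_updated_before(self):
        self.refresh_profiles()   # first computation
        self.fetch_data(True)     # cache cleared, second computation

    def on_state_updated_after(self):
        self.fetch_data(True)     # one invalidation, one computation


def calls_for(handler_name):
    """Run one state-updated event and report how many backend calls it made."""
    explorer = ToyDataExplorer()
    getattr(explorer, handler_name)()
    return explorer.backend_calls


print(calls_for("on_state_updated_before"), calls_for("on_state_updated_after"))
```

In this model the refresh that runs before the invalidating fetch is wasted work, because its result is thrown away before anything can use it.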
@testlabauto

Verified Fixed

Positron Version(s) : 2024.11.0-111
OS Version          : OSX

Test scenario(s)

Only one call to get_column_profiles appears in the logs after applying a filter
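
A check like this could be scripted by counting log lines that mention the method. The sketch below is an assumption about how one might do it. The sample log lines are made up for illustration and do not reflect Positron's real log format.

```python
def count_calls(log_text, method="get_column_profiles"):
    """Count the log lines that mention the given method name."""
    return sum(1 for line in log_text.splitlines() if method in line)


# Made-up log excerpts; the real console output looks different.
log_before_fix = "\n".join([
    "[t0] some_other_request",
    "[t1] get_column_profiles",
    "[t2] get_column_profiles",
])
log_after_fix = "\n".join([
    "[t0] some_other_request",
    "[t1] get_column_profiles",
])

print(count_calls(log_before_fix), count_calls(log_after_fix))
```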

Link(s) to TestRail test cases run or created:
