Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support __getitem__ with JobsCursor objects #1017

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

janbridley
Copy link

@janbridley janbridley commented Jan 28, 2025

Description

Implemented __getitem__ for JobsCursors, allow for indexing and slicing after calling project.find_jobs()

Motivation and Context

JobsCursors are iterable through the _JobsCursorIterator API, but slicing and indexing was not supported. This is atypical for Python types and requires users to unpack the cursor to index it.

This implementation returns a single Job if an integer index is requested, or a _JobsCursorIterator if a slice is requested. New tests have been added to cover the new functionality.

Checklist:

@janbridley janbridley added enhancement New feature or request python Pull requests that update Python code labels Jan 28, 2025
Copy link

codecov bot commented Jan 28, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.25%. Comparing base (82a9c69) to head (bd53150).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1017      +/-   ##
==========================================
+ Coverage   86.24%   86.25%   +0.01%     
==========================================
  Files          20       20              
  Lines        3503     3508       +5     
  Branches      662      663       +1     
==========================================
+ Hits         3021     3026       +5     
  Misses        331      331              
  Partials      151      151              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@janbridley
Copy link
Author

Note that docs are failing due to the ReadTheDocs configuration change - that will need to be addressed in a separate PR.

@janbridley janbridley changed the title Feature/jobscursor getitem Support __getitem__ with JobsCursor objects Jan 28, 2025
@janbridley janbridley marked this pull request as ready for review January 28, 2025 17:34
@janbridley janbridley requested review from a team as code owners January 28, 2025 17:34
@janbridley janbridley requested review from bdice and tommy-waltmann and removed request for a team January 28, 2025 17:34
@bdice
Copy link
Member

bdice commented Jan 28, 2025

This part of the code is usually benchmarked thoroughly before changes are accepted. @janbridley Would you be able to run the benchmark suite and share results? Here is a guide: https://github.com/glotzerlab/signac/blob/main/doc/support.rst#benchmarking

@janbridley
Copy link
Author

The asv suite does not seem to run either on my local machine (osx-arm) or a remote computer (Rocky Linux 8.10 x86). It could be a problem with micromamba, although I did try with a uv environment as well. If this is a known behavior please let me know - otherwise I can troubleshoot later this week.

· Creating environments
· Discovering benchmarks
·· Uninstalling from virtualenv-py3.12
·· Building 82a9c691 <main> for virtualenv-py3.12....
·· Installing 82a9c691 <main> into virtualenv-py3.12
·· Failed to build the project and import the benchmark suite.

@bdice
Copy link
Member

bdice commented Feb 4, 2025

@janbridley I was able to reproduce the failure you're seeing. I tried $ asv run main..feature/jobscursor-getitem -v but the verbose output didn't help. I think this is the same as airspeed-velocity/asv#1453.

@bdice
Copy link
Member

bdice commented Feb 4, 2025

@janbridley I found that using asv==0.6.1 works but newer versions do not work. I reported this upstream. airspeed-velocity/asv#1453 (comment)

@janbridley
Copy link
Author

janbridley commented Feb 6, 2025

ASV results. This always failed if a micromamba environment was active, but a clean asv==0.6.1 install with no mamba/conda/uv did work.

· Creating environments
· Discovering benchmarks
· Running 21 total benchmarks (3 commits * 1 environments * 7 benchmarks)
[ 0.00%] · For signac commit bd531503 <feature/jobscursor-getitem>:
[ 0.00%] ·· Benchmarking virtualenv-py3.12
[ 2.38%] ··· Running (benchmarks.ProjectBench.time_determine_len--)...
[ 9.52%] ··· Running (benchmarks.ProjectBench.time_iterate_single_pass--)...
[16.67%] ··· Running (benchmarks.ProjectRandomJobBench.time_select_by_id--).
[19.05%] ··· benchmarks.ProjectBench.time_determine_len                      ok
[19.05%] ··· ====== ===================== =================== ================ =============== =============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std               
             ------ --------------------- ------------------- ---------------- --------------- -------------
              100             10                   0                100               0           223±80μs  
              1000            10                   0                100               0         2.82±0.07ms 
             ====== ===================== =================== ================ =============== =============

[21.43%] ··· benchmarks.ProjectBench.time_iterate                            ok
[21.43%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         14.3±0.5ms 
              1000            10                   0                100               0          123±10ms  
             ====== ===================== =================== ================ =============== ============

[23.81%] ··· benchmarks.ProjectBench.time_iterate_load_sp                    ok
[23.81%] ··· ====== ===================== =================== ================ =============== ==========
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std            
             ------ --------------------- ------------------- ---------------- --------------- ----------
              100             10                   0                100               0         116±10ms 
              1000            10                   0                100               0         828±70ms 
             ====== ===================== =================== ================ =============== ==========

[26.19%] ··· ...hmarks.ProjectBench.time_iterate_single_pass                 ok
[26.19%] ··· ====== ===================== =================== ================ =============== =============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std               
             ------ --------------------- ------------------- ---------------- --------------- -------------
              100             10                   0                100               0         1.41±0.02ms 
              1000            10                   0                100               0          14.4±0.2ms 
             ====== ===================== =================== ================ =============== =============

[28.57%] ··· ...rojectRandomJobBench.time_search_lean_filter                 ok
[28.57%] ··· ====== ===================== =================== ================ =============== =============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std               
             ------ --------------------- ------------------- ---------------- --------------- -------------
              100             10                   0                100               0         1.17±0.02ms 
              1000            10                   0                100               0           123±30ms  
             ====== ===================== =================== ================ =============== =============

[30.95%] ··· ...rojectRandomJobBench.time_search_rich_filter                 ok
[30.95%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         5.59±0.2ms 
              1000            10                   0                100               0          152±10ms  
             ====== ===================== =================== ================ =============== ============

[33.33%] ··· ...arks.ProjectRandomJobBench.time_select_by_id                 ok
[33.33%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0          14.1±7μs  
              1000            10                   0                100               0         7.43±0.9μs 
             ====== ===================== =================== ================ =============== ============

[33.33%] · For signac commit 79f9a157 <feature/jobscursor-getitem~1>:
[33.33%] ·· Building for virtualenv-py3.12
[33.33%] ·· Benchmarking virtualenv-py3.12
[35.71%] ··· Running (benchmarks.ProjectBench.time_determine_len--)...
[42.86%] ··· Running (benchmarks.ProjectBench.time_iterate_single_pass--)...
[50.00%] ··· Running (benchmarks.ProjectRandomJobBench.time_select_by_id--).
[52.38%] ··· benchmarks.ProjectBench.time_determine_len                      ok
[52.38%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0          277±80μs  
              1000            10                   0                100               0         2.75±0.5ms 
             ====== ===================== =================== ================ =============== ============

[54.76%] ··· benchmarks.ProjectBench.time_iterate                            ok
[54.76%] ··· ====== ===================== =================== ================ =============== ==========
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std            
             ------ --------------------- ------------------- ---------------- --------------- ----------
              100             10                   0                100               0         13.7±2ms 
              1000            10                   0                100               0         129±30ms 
             ====== ===================== =================== ================ =============== ==========

[57.14%] ··· benchmarks.ProjectBench.time_iterate_load_sp                    ok
[57.14%] ··· ====== ===================== =================== ================ =============== ===========
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std             
             ------ --------------------- ------------------- ---------------- --------------- -----------
              100             10                   0                100               0          110±8ms  
              1000            10                   0                100               0         962±200ms 
             ====== ===================== =================== ================ =============== ===========

[59.52%] ··· ...hmarks.ProjectBench.time_iterate_single_pass                 ok
[59.52%] ··· ====== ===================== =================== ================ =============== =============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std               
             ------ --------------------- ------------------- ---------------- --------------- -------------
              100             10                   0                100               0         1.42±0.02ms 
              1000            10                   0                100               0           13.8±4ms  
             ====== ===================== =================== ================ =============== =============

[61.90%] ··· ...rojectRandomJobBench.time_search_lean_filter                 ok
[61.90%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         1.43±0.1ms 
              1000            10                   0                100               0          112±3ms   
             ====== ===================== =================== ================ =============== ============

[64.29%] ··· ...rojectRandomJobBench.time_search_rich_filter                 ok
[64.29%] ··· ====== ===================== =================== ================ =============== ==========
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std            
             ------ --------------------- ------------------- ---------------- --------------- ----------
              100             10                   0                100               0         9.53±5ms 
              1000            10                   0                100               0         134±10ms 
             ====== ===================== =================== ================ =============== ==========

[66.67%] ··· ...arks.ProjectRandomJobBench.time_select_by_id                 ok
[66.67%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         7.54±0.1μs 
              1000            10                   0                100               0         7.38±0.2μs 
             ====== ===================== =================== ================ =============== ============

[66.67%] · For signac commit 2824cb13 <feature/jobscursor-getitem~2>:
[66.67%] ·· Building for virtualenv-py3.12
[66.67%] ·· Benchmarking virtualenv-py3.12
[69.05%] ··· Running (benchmarks.ProjectBench.time_determine_len--)...
[76.19%] ··· Running (benchmarks.ProjectBench.time_iterate_single_pass--)...
[83.33%] ··· Running (benchmarks.ProjectRandomJobBench.time_select_by_id--).
[85.71%] ··· benchmarks.ProjectBench.time_determine_len                      ok
[85.71%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0          288±40μs  
              1000            10                   0                100               0         2.62±0.4ms 
             ====== ===================== =================== ================ =============== ============

[88.10%] ··· benchmarks.ProjectBench.time_iterate                            ok
[88.10%] ··· ====== ===================== =================== ================ =============== ==========
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std            
             ------ --------------------- ------------------- ---------------- --------------- ----------
              100             10                   0                100               0         14.1±4ms 
              1000            10                   0                100               0         108±5ms  
             ====== ===================== =================== ================ =============== ==========

[90.48%] ··· benchmarks.ProjectBench.time_iterate_load_sp                    ok
[90.48%] ··· ====== ===================== =================== ================ =============== ===========
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std             
             ------ --------------------- ------------------- ---------------- --------------- -----------
              100             10                   0                100               0          112±20ms 
              1000            10                   0                100               0         738±100ms 
             ====== ===================== =================== ================ =============== ===========

[92.86%] ··· ...hmarks.ProjectBench.time_iterate_single_pass                 ok
[92.86%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         1.41±0.2ms 
              1000            10                   0                100               0          13.9±3ms  
             ====== ===================== =================== ================ =============== ============

[95.24%] ··· ...rojectRandomJobBench.time_search_lean_filter                 ok
[95.24%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         1.18±0.7ms 
              1000            10                   0                100               0          117±20ms  
             ====== ===================== =================== ================ =============== ============

[97.62%] ··· ...rojectRandomJobBench.time_search_rich_filter                 ok
[97.62%] ··· ====== ===================== =================== ================ =============== ============
               N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std              
             ------ --------------------- ------------------- ---------------- --------------- ------------
              100             10                   0                100               0         5.26±0.5ms 
              1000            10                   0                100               0          143±6ms   
             ====== ===================== =================== ================ =============== ============

[100.00%] ··· ...arks.ProjectRandomJobBench.time_select_by_id                 ok
[100.00%] ··· ====== ===================== =================== ================ =============== ==========
                N     num_statepoint_keys   num_document_keys   data_size_mean   data_size_std            
              ------ --------------------- ------------------- ---------------- --------------- ----------
               100             10                   0                100               0         7.33±2μs 
               1000            10                   0                100               0         7.37±1μs 
              ====== ===================== =================== ================ =============== ==========


@bdice
Copy link
Member

bdice commented Feb 10, 2025

@janbridley I had a bit of trouble interpreting the benchmarks. Were all the numbers present for both before and after? If you can confirm there is no performance change, I will approve.

@janbridley
Copy link
Author

@janbridley I had a bit of trouble interpreting the benchmarks. Were all the numbers present for both before and after? If you can confirm there is no performance change, I will approve.

I updated the benchmark results - I'm not sure why the previous output tested the same commit twice. All large benchmarks (N=1000) are within standard deviation of main

@bdice
Copy link
Member

bdice commented Feb 11, 2025

Great. That’s enough to approve this PR.

If you would like to work on improving performance in the future I recommend running benchmarks with N=10,000 or 100,000 but that’s beyond the scope of what this PR requires.

@bdice
Copy link
Member

bdice commented Feb 11, 2025

@janbridley Would you be able to fix the ReadTheDocs error here?

@joaander
Copy link
Member

@janbridley Would you be able to fix the ReadTheDocs error here?

It is fixed on main: 21f3d64. Merge main into this branch to get the green check.

@janbridley janbridley enabled auto-merge (squash) February 12, 2025 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants