Bug in squidpy metrics computation #197

Closed
jimmymathews opened this issue Aug 24, 2023 · 5 comments · Fixed by #185 or #249

Comments

@jimmymathews
Collaborator

The number of "clusters" in the AnnData object we create and pass to squidpy functions is sometimes 1 rather than the expected 2. Here is a relevant error log:

08-24 00:28:59 [  INFO   ] spt ondemand start: Request: b'neighborhood enrichment\x1dMelanoma CyTOF ICI - measurement\x1dCD8A\x1eCD3\x1eCD45RA\x1d\x1dSOX10\x1dCD3\x1eMS4A1\x1ePECAM1\x1ePTPRC'
08-24 00:28:59 [  INFO   ] spt ondemand start: Request: b'neighborhood enrichment\x1dMelanoma CyTOF ICI - measurement\x1dCD8A\x1eCD3\x1eCD45RA\x1d\x1dSOX10\x1dCD3\x1eMS4A1\x1ePECAM1\x1ePTPRC'
08-24 00:28:59 [  DEBUG  ] spt ondemand start:149: ['Melanoma CyTOF ICI - measurement', 'CD8A\x1eCD3\x1eCD45RA', '', 'SOX10', 'CD3\x1eMS4A1\x1ePECAM1\x1ePTPRC']
08-24 00:28:59 [  DEBUG  ] spt ondemand start:149: ['Melanoma CyTOF ICI - measurement', 'CD8A\x1eCD3\x1eCD45RA', '', 'SOX10', 'CD3\x1eMS4A1\x1ePECAM1\x1ePTPRC']
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:27: Requesting computation.
08-24 00:28:59 [  DEBUG  ] ondemand.providers.squidpy_provider:64: Creating feature with specifiers: (Melanoma CyTOF ICI - measurement) ["(('CD8A', 'CD3', 'CD45RA'), ())", "(('SOX10',), ('CD3', 'MS4A1', 'PECAM1', 'PTPRC'))"]
08-24 00:28:59 [  DEBUG  ] workflow.common.export_features:302: Inserting specification 276, data_analysis_study Melanoma CyTOF ICI - ondemand computed features
08-24 00:28:59 [  DEBUG  ] workflow.common.export_features:316: Inserting specifier: ('276', "(('CD8A', 'CD3', 'CD45RA'), ())", '1')
08-24 00:28:59 [  DEBUG  ] workflow.common.export_features:316: Inserting specifier: ('276', "(('SOX10',), ('CD3', 'MS4A1', 'PECAM1', 'PTPRC'))", '2')
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:85: Number of values possible to be computed: 72
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:111: Actual number computed: 0
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:37: Not already pending.
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:39: Starting background task.
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:42: Background task just started, is pending.
08-24 00:29:00 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_14_0, 0.9999999966019605
08-24 00:29:01 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_62_0, 0.9999999999983203
/usr/local/lib/python3.11/site-packages/squidpy/gr/_nhood.py:188: RuntimeWarning: invalid value encountered in divide
  zscore = (count - perms.mean(axis=0)) / perms.std(axis=0)
08-24 00:29:02 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_48_0, 0.9997325365091907
08-24 00:29:02 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_67_0, 0.9999999965373167
08-24 00:29:02 [  DEBUG  ] workflow.common.cell_df_indexer:36: Some KeyError. (1, 1, 1)
08-24 00:29:02 [  DEBUG  ] workflow.common.cell_df_indexer:36: Some KeyError. (1, 0, 0, 0, 0)
Exception in thread Thread-16 (have_feature_computed):
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/ondemand/providers/squidpy_provider.py", line 151, in have_feature_computed
    value = compute_squidpy_metric_for_one_sample(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/workflow/common/squidpy.py", line 45, in compute_squidpy_metric_for_one_sample
    return _summarize_neighborhood_enrichment(_nhood_enrichment(adata))
                                              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/workflow/common/squidpy.py", line 133, in _nhood_enrichment
    result = nhood_enrichment(adata, 'cluster', copy=True, seed=128, show_progress_bar=False)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/squidpy/gr/_nhood.py", line 174, in nhood_enrichment
    _test = _create_function(n_cls, parallel=numba_parallel)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/squidpy/gr/_nhood.py", line 86, in _create_function
    raise ValueError(f"Expected at least `2` clusters, found `{n_cls}`.")
ValueError: Expected at least `2` clusters, found `1`.

The usage that triggered this was a request for the neighborhood enrichment metric with a certain pair of phenotypes on the Moldoveanu dataset. Here is the HTTP request:

https://<apiserver>/request-spatial-metrics-computation-custom-phenotypes/?study=Melanoma%20CyTOF%20ICI&feature_class=neighborhood%20enrichment&positive_marker=CD8A&positive_marker=CD3&positive_marker=CD45RA&negative_marker=&positive_marker2=SOX10&negative_marker2=CD3&negative_marker2=MS4A1&negative_marker2=PECAM1&negative_marker2=PTPRC
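
For reference, here is a minimal sketch that decodes the query string above and splits the request bytes shown in the log, to confirm that both carry the same markers. It uses only the Python standard library. The variable names are illustrative, and the field layout of the logged request is inferred from the log output, not taken from the server code.

```python
from urllib.parse import parse_qs

# Query string copied from the HTTP request above (everything after the "?").
query_string = (
    "study=Melanoma%20CyTOF%20ICI&feature_class=neighborhood%20enrichment"
    "&positive_marker=CD8A&positive_marker=CD3&positive_marker=CD45RA&negative_marker="
    "&positive_marker2=SOX10&negative_marker2=CD3&negative_marker2=MS4A1"
    "&negative_marker2=PECAM1&negative_marker2=PTPRC"
)
# keep_blank_values=True preserves the empty negative_marker parameter.
query = parse_qs(query_string, keep_blank_values=True)

# Request bytes copied from the first log line.
logged = (
    b'neighborhood enrichment\x1dMelanoma CyTOF ICI - measurement'
    b'\x1dCD8A\x1eCD3\x1eCD45RA\x1d\x1dSOX10\x1dCD3\x1eMS4A1\x1ePECAM1\x1ePTPRC'
)
# \x1d separates fields; \x1e separates the markers within a field.
fields = [
    field.split('\x1e') if field else []
    for field in logged.decode().split('\x1d')
]

print(query['positive_marker'], fields[2])
print(query['negative_marker2'], fields[5])
```

Both the URL and the logged request name the same two phenotypes: CD8A+ CD3+ CD45RA+ with no negative markers, and SOX10+ with CD3, MS4A1, PECAM1, and PTPRC negative.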
@jimmymathews jimmymathews added the bug Something isn't working label Aug 24, 2023
@CarlinLiao CarlinLiao self-assigned this Aug 31, 2023
@CarlinLiao
Collaborator

This will happen if both of the phenotypes you pass have identical values for all cells in the sample. I've updated squidpy.py to account for this by raising a Python warning when only one cluster could be made and returning None when co_occurrence errors because of this.

def create_and_transcribe_one_sample(
    sample: str,
    df: DataFrame,
    channel_symbols_by_column_name: dict[str, str],
    feature_uploader: ADIFeaturesUploader,
) -> None:
    for column, symbol in channel_symbols_by_column_name.items():
        criteria = PhenotypeCriteria(positive_markers=[column], negative_markers=[])
        value = compute_squidpy_metric_for_one_sample(df, [criteria], 'spatial autocorrelation')
        if value is None:
            # Skip the upload when the metric could not be computed for this sample.
            continue
        feature_uploader.stage_feature_value((symbol,), sample, value)

Note that the db function skips uploading a record when no co_occurrence value is returned. We should consider uploading something like a NaN value instead, to indicate that the value cannot be computed. Either way, this will need to be handled on the frontend.
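
A minimal sketch of the NaN alternative, for comparison with skipping the record. `StagingUploader` and `stage_or_mark_missing` are hypothetical names invented for illustration; only the `stage_feature_value(specifiers, sample, value)` call shape is taken from the snippet above. The sample names and values are placeholders. Whether the database column and the frontend can accept NaN is an open question here.

```python
import math


class StagingUploader:
    """Hypothetical stand-in for the real uploader; it only records staged values."""

    def __init__(self):
        self.staged = []

    def stage_feature_value(self, specifiers, sample, value):
        self.staged.append((specifiers, sample, value))


def stage_or_mark_missing(uploader, specifiers, sample, value):
    """Stage the value, substituting NaN when the metric could not be computed."""
    uploader.stage_feature_value(specifiers, sample, math.nan if value is None else value)


uploader = StagingUploader()
stage_or_mark_missing(uploader, ('CD3',), 'Mold_63_0', 0.25)  # placeholder value
stage_or_mark_missing(uploader, ('CD3',), 'Mold_64_0', None)  # metric not computable

for specifiers, sample, value in uploader.staged:
    print(sample, 'not computable' if math.isnan(value) else value)
```

With this approach every sample gets a record, so a consumer can tell "not computable" apart from "not yet computed".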

@CarlinLiao CarlinLiao linked a pull request Sep 7, 2023 that will close this issue
@jimmymathews
Collaborator Author

This bug was happening with neighborhood enrichment, not co-occurrence, and it happened when 2 distinct phenotypes were used.
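
One way two distinct phenotype definitions could still yield a single cluster is if neither phenotype is present in a given sample, so both membership columns are all False. The sketch below illustrates that possibility only. The helper names and the toy cells are assumptions made for illustration, and the check is not the actual logic in squidpy.py.

```python
def matches(cell, positives, negatives):
    """True when the cell expresses every positive marker and no negative marker."""
    return all(cell[m] == 1 for m in positives) and all(cell[m] == 0 for m in negatives)


def phenotype_columns(cells, phenotypes):
    """One boolean membership column per phenotype."""
    return [[matches(cell, pos, neg) for cell in cells] for pos, neg in phenotypes]


def columns_identical(columns):
    """True when every phenotype has the same value for every cell."""
    return all(column == columns[0] for column in columns)


# The two phenotypes from the request in this issue.
phenotypes = [
    (('CD8A', 'CD3', 'CD45RA'), ()),
    (('SOX10',), ('CD3', 'MS4A1', 'PECAM1', 'PTPRC')),
]
markers = ['CD8A', 'CD3', 'CD45RA', 'SOX10', 'MS4A1', 'PECAM1', 'PTPRC']


def make_cell(*expressed):
    """Toy cell: 1 for each expressed marker, 0 otherwise."""
    return {marker: int(marker in expressed) for marker in markers}


# Toy samples, not real data.
mixed_sample = [make_cell('CD8A', 'CD3', 'CD45RA'), make_cell('SOX10'), make_cell('MS4A1')]
absent_sample = [make_cell('MS4A1'), make_cell('PECAM1'), make_cell('CD3')]

print(columns_identical(phenotype_columns(mixed_sample, phenotypes)))
print(columns_identical(phenotype_columns(absent_sample, phenotypes)))
```

In the second toy sample no cell matches either phenotype, so the two columns are identical even though the phenotype definitions differ.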

@jimmymathews jimmymathews reopened this Sep 12, 2023
@jimmymathews
Collaborator Author

Also I would like to check that the production instance no longer exhibits this issue before we close.

@jimmymathews
Collaborator Author

This was partly resolved by changes made since the original issue (the warning does appear), but the current behavior is still pretty much the same. If this 1-cluster issue is encountered during computation of a feature, the error halts all further computation, and the feature remains permanently in a pending computation state.
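
To illustrate the stuck pending state in general terms: if a background thread marks a feature as pending and then dies on an uncaught exception, nothing clears the mark. A generic pattern is to clear it in a `finally` clause. The sketch below uses hypothetical names (`pending`, `run_feature`, `failing_computation`) and is not the provider's actual code.

```python
import logging
import threading

logger = logging.getLogger('pending_sketch')

# Hypothetical registry of features whose computation is in progress.
pending = set()


def run_feature(feature_id, compute):
    """Run the computation and always clear the pending mark, even on error."""
    try:
        compute()
    except Exception:
        logger.exception('Computation of feature %s failed.', feature_id)
    finally:
        pending.discard(feature_id)


def failing_computation():
    # Same message as the ValueError in the traceback above.
    raise ValueError("Expected at least `2` clusters, found `1`.")


pending.add('276')
thread = threading.Thread(target=run_feature, args=('276', failing_computation))
thread.start()
thread.join()

print('276' in pending)
```

This only addresses the pending mark. It does not let the remaining samples be computed, which requires handling the error per sample.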

@jimmymathews
Collaborator Author

Fixed by issue197 7158472.

Now when this issue is encountered it is logged, None is returned, and the computation proceeds to the next sample.

...
11-16 21:52:00 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_61_0, 0.08425675935957183
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_62_0, 0.9999999999983203
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_63_0, 1.9636129412073801e-90
/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/workflow/common/squidpy.py:158: UserWarning: All phenotypes provided had identical values. Only one cluster could be made.
  warn('All phenotypes provided had identical values. Only one cluster could be made.')
11-16 21:52:01 [  ERROR  ] workflow.common.squidpy:57: Got 1 cluster, need 2 to compute neighborhood enrichment. Presuming null.
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_64_0, None
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_65_0, 0.9993631557868359
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_66_0, 0.9282621463508228
...
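
The behavior shown in the log above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the fixing commit: `compute_metric` is a hypothetical stand-in for the squidpy call, the cluster counts per sample are made up, and 0.5 is a placeholder value.

```python
import logging

logger = logging.getLogger('squidpy_sketch')


def compute_metric(n_clusters):
    """Hypothetical stand-in for the squidpy call; raises as in the original traceback."""
    if n_clusters < 2:
        raise ValueError(f"Expected at least `2` clusters, found `{n_clusters}`.")
    return 0.5  # placeholder value


def compute_or_none(n_clusters):
    """Return the metric, or None when it cannot be computed for this sample."""
    try:
        return compute_metric(n_clusters)
    except ValueError as error:
        logger.error('Could not compute metric, presuming null: %s', error)
        return None


# Made-up cluster counts per sample; the middle sample has only one cluster.
samples = {'Mold_63_0': 2, 'Mold_64_0': 1, 'Mold_65_0': 2}
results = {sample: compute_or_none(n_clusters) for sample, n_clusters in samples.items()}
print(results)
```

Catching `ValueError` broadly is a simplification. A real implementation would probably check the cluster count before calling squidpy, so that unrelated errors are not silently converted to None.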
