Issue/438/cluster wl simple #461

Open · wants to merge 50 commits into base: master
Conversation

eduardojsbarroso
Contributor

@eduardojsbarroso eduardojsbarroso commented Oct 11, 2024

Description

Implementation of cluster shear profile analysis in Firecrown.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

The following checklist will make sure that you are following the code style and
guidelines of the project as described in the
contributing page.

  • I have run bash pre-commit-check and fixed any issues
  • I have added tests that prove my fix is effective or that my feature works
  • I have made corresponding changes to the documentation
  • I have 100% test coverage for my changes (please check this after the CI system has verified the coverage)

@@ -107,11 +107,11 @@ def _compute_theory_vector(self, tools: ModelingTools) -> TheoryVector:
if cl_property == ClusterProperty.COUNTS:
theory_vector_list += cluster_counts
continue

if cl_property == ClusterProperty.DELTASIGMA:
Contributor Author
Previously, this raised an error if the property was neither COUNTS nor MEAN_MASS. However, the way the code is structured now, I want to be able to pass two statistics to the likelihood, and this statistic will not do anything with the DELTASIGMA data type. It must not raise an error, because the other statistic will read this data (I have one SACC file with all three data types). Nonetheless, should we check here that all the provided properties are at least in the ClusterProperty module? Is there a good way to do it besides adding a for loop here? @m-aguena @vitenti
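One way to validate the requested properties up front, without scattering checks through the loop, is a single set difference. This is a minimal sketch: the `ClusterProperty` flag enum below is a stand-in for Firecrown's (member names are assumptions), and `validate_properties` is a hypothetical helper, not existing Firecrown API:

```python
from enum import Flag, auto


class ClusterProperty(Flag):
    # Stand-in for Firecrown's ClusterProperty flag enum; the real
    # enum's members may differ.
    NONE = 0
    COUNTS = auto()
    MASS = auto()
    DELTASIGMA = auto()


def validate_properties(requested: set) -> None:
    """Raise if any requested property name is not a ClusterProperty member.

    Hypothetical helper: one set difference replaces a per-item loop.
    """
    # `if p.value` skips the zero-valued NONE member on all Python versions.
    known = {p.name.lower() for p in ClusterProperty if p.value}
    unknown = requested - known
    if unknown:
        raise ValueError(f"Unknown cluster properties: {sorted(unknown)}")


validate_properties({"counts", "deltasigma"})  # valid names pass silently
```

With this check done once at construction time, the per-property loop can safely `continue` past properties a given statistic does not handle.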

@@ -63,7 +63,7 @@ def get_observed_data_and_indices_by_survey(
# pylint: disable=no-member
data_type = sacc.standard_types.cluster_mean_log_mass
else:
raise NotImplementedError(cluster_property)
continue
Contributor Author
Here is another example of my last comment. I want it to go through the data and simply do nothing if the property does not belong to the statistic. However, this should be replaced with an error if the provided properties are not from the ClusterProperty module.

@eduardojsbarroso
Contributor Author

Hi @vitenti, clmm is now on the conda-forge repository.

@vitenti
Collaborator

vitenti commented Nov 26, 2024

@eduardojsbarroso Hello Eduardo, I just updated the environment.yaml to include CLMM. You can now resume your work here, and once it is in good shape, please ask for a review.


codecov bot commented Dec 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (8d443c2) to head (31c12e1).

Additional details and impacted files
@@           Coverage Diff            @@
##           master     #461    +/-   ##
========================================
  Coverage   100.0%   100.0%            
========================================
  Files          54       60     +6     
  Lines        4167     4408   +241     
========================================
+ Hits         4167     4408   +241     
Files with missing lines Coverage Δ
firecrown/likelihood/binned_cluster.py 100.0% <100.0%> (ø)
...recrown/likelihood/binned_cluster_number_counts.py 100.0% <100.0%> (ø)
...elihood/binned_cluster_number_counts_deltasigma.py 100.0% <100.0%> (ø)
firecrown/modeling_tools.py 100.0% <100.0%> (ø)
firecrown/models/cluster/abundance.py 100.0% <100.0%> (ø)
firecrown/models/cluster/abundance_data.py 100.0% <100.0%> (ø)
firecrown/models/cluster/binning.py 100.0% <100.0%> (ø)
firecrown/models/cluster/cluster_data.py 100.0% <100.0%> (ø)
firecrown/models/cluster/deltasigma.py 100.0% <100.0%> (ø)
firecrown/models/cluster/deltasigma_data.py 100.0% <100.0%> (ø)
... and 4 more

@eduardojsbarroso eduardojsbarroso marked this pull request as draft December 20, 2024 13:40
@eduardojsbarroso eduardojsbarroso marked this pull request as ready for review December 20, 2024 13:41
@eduardojsbarroso
Contributor Author

Hi @vitenti, I think the code looks good so far. I will wait for the review before I create more pull requests to add complexity to the analysis.

Collaborator

@marcpaterno marcpaterno left a comment

We are not sure whether there is a problem in the recipe framework, which seems to be the reason you have introduced the base class BinnedCluster. Please look at the comments in firecrown/likelihood/binned_cluster.py for more details. We would like to discuss this with you.

We will send an invitation on Slack.

Collaborator

This file declares a main, but it doesn't seem to be an actual program that does something when invoked.
Is there something missing, or should the main be removed?
If it is removed, then the shebang line can also be removed.
The documentation also says this is a notebook -- it appears some of the documentation is stale and in need of review.



def convert_binned_profile_to_sacc(
cluster_counts,
Collaborator

Please add type hints for the function arguments, so that mypy can verify the use of this function.
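A hedged sketch of what the annotated signature could look like. Only `cluster_counts` appears in the snippet above; the other argument names and the array types are assumptions to be matched against the actual call sites, and the body is elided:

```python
import numpy as np
import numpy.typing as npt


def convert_binned_profile_to_sacc(
    cluster_counts: npt.NDArray[np.float64],
    mean_DeltaSigma: npt.NDArray[np.float64],  # hypothetical argument name
    std_DeltaSigma: npt.NDArray[np.float64],   # hypothetical argument name
) -> None:  # return type is an assumption; the real function likely
    # builds or fills a sacc.Sacc object.
    """Annotation-style stub only; the real body is unchanged."""
```

With annotations in place, `mypy` can check every call site of this function against the declared array types.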


# Use CLMM to create a mock DeltaSigma profile to add to the SACC file later
cosmo_clmm = Cosmology()
cosmo_clmm._init_from_cosmo(cosmo)
Collaborator

The method _init_from_cosmo is private to the Cosmology object cosmo_clmm, and so we should not call it here. Either there has to be a way to get what is needed from the public interface of cosmo_clmm, or we need to request a change in the clmm package to make this method part of the public API.


richness_inds = np.digitize(cluster_richness, richness_edges) - 1
z_inds = np.digitize(cluster_z, z_edges) - 1
mean_DeltaSigma = np.array(
Collaborator

This is a complex calculation. We suggest using scipy.stats.binned_statistic to do it. The result should be both efficient and easier to read (and thus to maintain).

for j in range(N_z)
]
)
std_DeltaSigma = np.array(
Collaborator

We expect that binned_statistic (noted above) would also simplify this calculation.
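Since the binning here is two-dimensional (richness and redshift), scipy.stats.binned_statistic_2d computes both statistics in one call each. A hedged sketch, assuming `cluster_richness`, `cluster_z` (names from the snippet above), and a per-cluster `delta_sigma` array (name is an assumption) are 1-D arrays of equal length; the mock data below is illustrative only:

```python
import numpy as np
from scipy.stats import binned_statistic_2d

# Illustrative per-cluster mock data; in the script these come from the
# generated catalog.
rng = np.random.default_rng(0)
cluster_richness = rng.uniform(10.0, 100.0, size=500)
cluster_z = rng.uniform(0.2, 0.8, size=500)
delta_sigma = rng.normal(1.0, 0.1, size=500)

richness_edges = np.linspace(10.0, 100.0, 4)  # 3 richness bins
z_edges = np.linspace(0.2, 0.8, 3)            # 2 redshift bins

# One call per statistic replaces the digitize + nested list comprehension.
mean_DeltaSigma, _, _, _ = binned_statistic_2d(
    cluster_richness, cluster_z, delta_sigma,
    statistic="mean", bins=[richness_edges, z_edges],
)
std_DeltaSigma, _, _, _ = binned_statistic_2d(
    cluster_richness, cluster_z, delta_sigma,
    statistic="std", bins=[richness_edges, z_edges],
)
# Each result has shape (n_richness_bins, n_z_bins) = (3, 2).
```

Bins in binned_statistic_2d follow the same half-open convention as np.digitize with the default `right=False`, so the results should match the hand-rolled version for in-range clusters.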

@@ -0,0 +1,292 @@
#!/usr/bin/env python
Collaborator

This script seems to share a great deal of functionality with generate_rich_deltatheta_r_sacc_data.py.
We suggest factoring out the common code to either gen_sacc_aux.py or to a new module, and using the code from there.

The comments about the environment use, the #noqa: E402 directives, and the use of private methods also apply to this module.

The comment about the use of binned_statistic made in the previous module also applies to this module.

Collaborator

You have introduced a base class BinnedCluster in order to differentiate between cluster counts and cluster $\Delta\Sigma$.

The original design intended for the cluster recipe to determine how to calculate the quantity associated with each cluster property (e.g. counts or deltasigma). Is there something in this original design that is inadequate for this case? If so, we'd like to try fixing that inadequacy rather than introducing an orthogonal solution to the same problem. If the cluster recipe design cannot be fixed to handle this, then it should be removed, because it is not doing what it was designed to do.

5 participants