Update mean and sum functions #643

aleexarias · 2025-01-13T18:13:29Z

Update mean and sum functions for FData, FDataGrid, FDataIrregular and FDataBasis to correctly handle NaN values in coefficients.

Fixes #642

Describe the proposed changes

Edit the mean function of FData so that it only performs the parameter checks, leaving the checks themselves as they are.
Add an auxiliary function in FDataGrid that serves mean, sum and var: depending on the skipna parameter it calls np.sum/np.nansum, np.mean/np.nanmean or np.var/np.nanvar. The mean and sum functions now go through this auxiliary function.
Add a mean function in FDataBasis that averages the coefficients of the functions that have no NaN values in their coefficients; functions with NaN coefficients are left out of the calculation.
Add a mean function in FDataIrregular that calculates the mean according to the min_count parameter and to whether skipna is set.
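
As an illustration of the auxiliary function described in the list above, here is a minimal sketch of the dispatch between the plain and NaN-aware NumPy reductions. The name `compute_aggregate`, its signature and the toy data are assumptions made for this example, not the code in the PR.

```python
import numpy as np

# Hypothetical lookup table; the PR's helper may be organised differently.
_AGGREGATES = {
    ("sum", False): np.sum,
    ("sum", True): np.nansum,
    ("mean", False): np.mean,
    ("mean", True): np.nanmean,
    ("var", False): np.var,
    ("var", True): np.nanvar,
}


def compute_aggregate(data_matrix, operation, skipna=False):
    """Aggregate over the sample axis (axis 0), optionally ignoring NaN."""
    agg_func = _AGGREGATES[(operation, skipna)]
    return agg_func(data_matrix, axis=0, keepdims=True)


# Two samples observed at three grid points; one value is missing.
data = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 6.0],
])

mean_skip = compute_aggregate(data, "mean", skipna=True)   # NaN ignored
mean_keep = compute_aggregate(data, "mean", skipna=False)  # NaN propagates
```

With `skipna=False` the missing value propagates to the result at that grid point; with `skipna=True` the mean there is taken over the remaining sample only.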

I have performed a self-review of my code
The code conforms to the style used in this package
The code is fully documented and typed (type-checked with Mypy)
I have added thorough tests for the new/changed functionality


vnmabus · 2025-02-14T13:04:48Z

skfda/representation/basis/_fdatabasis.py

+            A FDataBasis object with just one sample representing
+            the mean of all the samples in the original object.
+        """
+        super().mean(axis=axis, dtype=dtype, out=out, keepdims=keepdims, 


I am no longer sure that we want to do any validation in the abstract class. It is confusing. I would rather move the validation to the subclasses, or, if we do not want to repeat code, to a function in _utils or in a (maybe private for now) function in misc.validation.
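
A rough sketch of what moving the validation into a shared function could look like. The helper name `check_aggregation_args`, the set of rejected arguments and the `ToyData` class are all hypothetical; nothing here is taken from skfda's `_utils` or `misc.validation`.

```python
from typing import Optional


def check_aggregation_args(
    axis: Optional[int] = None,
    out: None = None,
    keepdims: bool = False,
) -> None:
    """Raise if an argument requests behaviour the aggregation lacks.

    Hypothetical helper: which arguments are rejected is an assumption.
    """
    if (axis is not None and axis != 0) or out is not None or keepdims:
        raise NotImplementedError(
            "Not implemented for that parameter combination",
        )


class ToyData:
    """Minimal stand-in for a concrete FData subclass."""

    def __init__(self, values):
        self.values = list(values)

    def mean(self, *, axis=None, out=None, keepdims=False):
        # Validation is called from the subclass, not from an abstract parent.
        check_aggregation_args(axis=axis, out=out, keepdims=keepdims)
        return sum(self.values) / len(self.values)
```

Each subclass calls the helper explicitly, so no `super().mean(...)` call is needed only for its side effect of validating arguments.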

vnmabus · 2025-02-14T15:15:05Z

skfda/representation/grid.py

+        if min_count > 0:
+            valid = ~np.isnan(self.data_matrix)
+            n_valid = np.sum(valid, axis=0)
+            data[n_valid < min_count] = np.nan


Wouldn't a conditional be more clear?

aleexarias · 2025-03-01

I do not understand where or how you are suggesting to use a conditional. The code seems clear to me, although as the author I might be biased.
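
One possible reading of the suggestion, and it is only a guess at what the reviewer meant, is to replace the masked assignment with an explicit elementwise conditional such as `np.where`. The sketch below uses toy 2-D data and shows that both forms give the same result.

```python
import numpy as np

# Toy data: 3 samples at 2 grid points (layout assumed for illustration).
data_matrix = np.array([
    [1.0, np.nan],
    [2.0, np.nan],
    [np.nan, 5.0],
])
min_count = 2

n_valid = np.sum(~np.isnan(data_matrix), axis=0)  # valid values per point
total = np.nansum(data_matrix, axis=0)

# Masked assignment, in the style of the diff above.
masked = total.copy()
masked[n_valid < min_count] = np.nan

# Explicit conditional: keep the value only where enough data is valid.
conditional = np.where(n_valid >= min_count, total, np.nan)
```

The second grid point has a single valid value, so both variants report NaN there and keep the sum at the first point.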

vnmabus · 2025-02-14T15:19:58Z

skfda/representation/grid.py

+        return self._compute_aggregate(operation='sum', skipna=skipna, 
+                                       min_count=min_count)


For multiline expressions, our style guide is to put each parameter on a line of its own, and the matching closing delimiter on its own line (at the same indentation level as the line in which it is opened):

Suggested change

-        return self._compute_aggregate(operation='sum', skipna=skipna,
-                                       min_count=min_count)
+        return self._compute_aggregate(
+            operation='sum',
+            skipna=skipna,
+            min_count=min_count,
+        )

Please, do the same in the other cases you edited.

vnmabus · 2025-02-14T16:01:04Z

skfda/representation/irregular.py

+        if skipna:
+            count_values = np.sum(~np.isnan(common_values), axis=0)
+        else:
+            count_values = np.full(sum_values.shape, self.n_samples)


Isn't this just self.n_samples?

aleexarias · 2025-03-01

It needs to be an array so that it can be combined with sum_values in the same way as in the branch where skipna is specified.
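
For the division itself, NumPy broadcasting makes the scalar and the array form interchangeable, as the small sketch below shows. The variable names mirror the diff, but the data and layout are invented for illustration.

```python
import numpy as np

# Toy values on a common grid: 3 samples, 2 points, no missing data.
common_values = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
])
n_samples = common_values.shape[0]

sum_values = np.sum(common_values, axis=0)

# Array form, as in the diff: one count per grid point.
count_array = np.full(sum_values.shape, n_samples)
# Scalar form, as the reviewer suggests: broadcast against sum_values.
count_scalar = n_samples

mean_from_array = sum_values / count_array
mean_from_scalar = sum_values / count_scalar
```

The array form mainly buys a uniform shape across the two branches, so later elementwise steps can treat `count_values` the same way whether or not `skipna` is set.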

vnmabus · 2025-02-14T16:04:44Z

skfda/representation/basis/_fdatabasis.py

+        out: None = None,
+        keepdims: bool = False,
+        skipna: bool = False,
+        min_count: int = 0,


It seems to me that min_count is not being used here. Why is that?

aleexarias · 2025-03-01

It is kept for compatibility with the mean functions of FDataIrregular and FDataGrid, but it does not make sense to use it here: there are no individual measurements for each observation, only the observations approximated by functions.
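
To make the point concrete, here is a sketch of a coefficient mean in which a sample is either used whole or dropped whole, so there is no per-point count for `min_count` to act on. The function `coefficient_mean` is hypothetical and only follows the behaviour described in the PR text.

```python
import numpy as np


def coefficient_mean(coefficients, skipna=False):
    """Average basis coefficients over samples (rows).

    Hypothetical helper: with skipna=True, any sample whose coefficient
    vector contains a NaN is dropped before averaging.
    """
    coefficients = np.asarray(coefficients, dtype=float)
    if skipna:
        complete = ~np.isnan(coefficients).any(axis=1)
        coefficients = coefficients[complete]
    return coefficients.mean(axis=0, keepdims=True)


# Three samples with two coefficients each; the second sample is incomplete.
coefs = np.array([
    [1.0, 2.0],
    [np.nan, 4.0],
    [3.0, 6.0],
])

mean_skip = coefficient_mean(coefs, skipna=True)
mean_keep = coefficient_mean(coefs, skipna=False)
```

With `skipna=True` the incomplete sample is excluded entirely, including its valid second coefficient; with `skipna=False` the NaN propagates to the first coefficient of the mean.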

vnmabus · 2025-02-14T16:05:33Z

skfda/representation/_functional_data.py

@@ -882,6 +882,7 @@ def mean(
        out: None = None,
        keepdims: bool = False,
        skipna: bool = False,
+        min_count: int = 0,


Why is min_count removed?

vnmabus · 2025-02-14T16:06:16Z

skfda/representation/grid.py

+
+        data = agg_func(self.data_matrix, axis=0, keepdims=True)
+
+        if min_count > 0:


This should only be done if skipna == True.
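
A minimal sketch of the guard being requested, assuming a plain 2-D array and a hypothetical free function `grid_sum` rather than the actual FDataGrid method:

```python
import numpy as np


def grid_sum(data_matrix, skipna=False, min_count=0):
    """Sum over samples; min_count is honoured only when skipna is True."""
    if not skipna:
        # NaN values propagate on their own, so min_count is ignored here.
        return np.sum(data_matrix, axis=0, keepdims=True)

    result = np.nansum(data_matrix, axis=0, keepdims=True)
    if min_count > 0:
        n_valid = np.sum(~np.isnan(data_matrix), axis=0, keepdims=True)
        result[n_valid < min_count] = np.nan
    return result


# Three samples at two grid points; the second point has one valid value.
data = np.array([
    [1.0, np.nan],
    [2.0, np.nan],
    [3.0, 5.0],
])

with_threshold = grid_sum(data, skipna=True, min_count=2)
without_threshold = grid_sum(data, skipna=True)
no_skip = grid_sum(data, skipna=False, min_count=5)
```

In the last call `min_count` exceeds the number of samples, yet the first grid point keeps its sum because the threshold is not applied when `skipna` is False.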

vnmabus · 2025-02-14T16:06:58Z

skfda/representation/irregular.py

+        else:
+            count_values = np.full(sum_values.shape, self.n_samples)
+
+        if min_count > 0:


This should only be done if skipna == True.

allcontributors bot and others added 6 commits January 13, 2025 19:01

update CONTRIBUTORS.md

dd69045

update .all-contributorsrc

c5df2f0

Updated mean an sum functions for FData, FDataGrid and FDataBasis to …

f76a037

…correctly handle NaN values in coefficients irreg Updated mean an sum functions for FData, FDataGrid, FDataBasis and FDataIrregular to correctly handle NaN values in coefficients

Merge branch 'GAA-UAM:develop' into skipna_issue_fixed

1e3914e

Merge branch 'GAA-UAM:develop' into skipna_issue_fixed

a675cd3

Merge branch 'develop' into skipna_issue_fixed

88e18b0

vnmabus changed the title ~~Update mean an sum functions~~ Update mean and sum functions Feb 14, 2025

vnmabus requested changes Feb 14, 2025


Changes suggested by author

e3911ad
