Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Fix errors in pandas.Series.argmax #32019

Merged
merged 9 commits into from
Feb 26, 2020

Conversation

farhanreynaldo
Copy link
Contributor

output of python scripts/validate_docstrings.py pandas.Series.argmax:

################################################################################
################################## Validation ##################################
################################################################################

@@ -942,7 +948,22 @@ def argmax(self, axis=None, skipna=True, *args, **kwargs):

See Also
--------
numpy.ndarray.argmax
numpy.ndarray.argmax : Returns the indices of the maximum values along an axis.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a pandas argmin that is worth adding here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See Also
--------
numpy.ndarray.argmax : Returns the indices of the maximum values along an axis.
pandas.Series.argmin : Return a ndarray of the minimum argument indexer.

Would this be sufficient?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'm not sure if the pandas. prefix is needed, may be you can check other docstrings and see.

And for the descriptions, it could make sense to say something like "Equivalent method for numpy arrays.", "Same, but returning the minimum". I think for users could be easier to understand what those methods do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've noticed through scripts/validate_docstrings.py that I could safely omit pandas. to match the guidelines.

I agree with your proposed descriptions, as current description is redundant with each function description

Examples
--------
>>> s = pd.Series(data=[1, None, 5, 4, 5],
... index=['A', 'B', 'C', 'D', 'E'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think you can find some data that looks more "real". In Series.combine we've got a small Series with animal speeds that I think could be more appropriate to illustrate this: pd.Series({'falcon': 345.0, 'eagle': 200.0, 'duck': 30.0})

Also, may be worth adding a note saying that the 2 means that largest value is in the third element, since it's zero-indexed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm gonna use cereal calories dataset as an example and adding that note.

@@ -942,7 +948,27 @@ def argmax(self, axis=None, skipna=True, *args, **kwargs):

See Also
--------
numpy.ndarray.argmax
numpy.ndarray.argmax : Equivalent method for numpy arrays.
Series.argmin : Similar method, but returning the minimum.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Series.idxmax, Series.idxmin might also make sense here

Consider dataset containing cereal calories

>>> s = pd.Series({'Corn Flakes': 100.0, 'Almond Delight': 110.0,
... 'Cinnamon Toast Crunch': 120.0, 'Cocoa Puff': 110.0})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@datapythonista do we care about getting an extra space before the quote here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what you're looking for?

>>> s = pd.Series({'Corn Flakes' : 100.0, 'Almond Delight' : 110.0,
...               'Cinnamon Toast Crunch' : 120.0, 'Cocoa Puff' : 110.0})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant an extra space after the ... on the second line, so that the single-quote in Cinnamon Toast Crunch aligned with the single-quote in Corn Flakes. Feel free to ignore, unless @datapythonista chimes in on the policy

@@ -929,11 +929,17 @@ def argmax(self, axis=None, skipna=True, *args, **kwargs):
"""
Return an ndarray of the maximum argument indexer.

If multiple values equal the maximum, the first row label with that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the maximum is achieved in multiple locations, the first such location is returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should go with location instead of row position?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"row position" sounds good. Definitely not "row label".

@@ -929,11 +929,17 @@ def argmax(self, axis=None, skipna=True, *args, **kwargs):
"""
Return an ndarray of the maximum argument indexer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this sentence accurate? it looks like we return an int, not ndarray

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also relevant for the Returns section)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this okay?

Return row position of the maximum argument indexer.

And also change for the Returns section:

Returns
-------
int
    row position of the maximum values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"maximum argument indexer" is awkward. how about "Returns int position of the largest value in the Series"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"maximum argument indexer" is awkward. how about "Returns int position of the largest value in the Series"

I think this one is easier to understand.

@datapythonista datapythonista merged commit 52a63ab into pandas-dev:master Feb 26, 2020
@datapythonista
Copy link
Member

Thanks for the nice docstring @farhanreynaldo

@farhanreynaldo farhanreynaldo deleted the argmax-doc branch February 27, 2020 07:31
@farhanreynaldo farhanreynaldo restored the argmax-doc branch February 27, 2020 07:35
roberthdevries pushed a commit to roberthdevries/pandas that referenced this pull request Mar 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants