Formatter: np.array formatting #8452
Replies: 8 comments 18 replies
-
Personally I would favour computing the minimum width for each column, as our arrays for Biopython are mainly in test cases, which rarely change.
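As a rough illustration of the per-column idea, here is a minimal pure-Python sketch. The function name `format_rows_per_column` is hypothetical, it assumes a non-empty rectangular list of rows, and it is not how ruff implements anything:

```python
def format_rows_per_column(rows):
    """Right-align each column to the width of its own widest cell.

    Assumes `rows` is a non-empty, rectangular list of lists.
    """
    # Render every cell once so widths are measured on the final text.
    cells = [[repr(value) for value in row] for row in rows]
    # zip(*cells) walks the table column by column.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    lines = [
        "[" + ", ".join(cell.rjust(width) for cell, width in zip(row, widths)) + "]"
        for row in cells
    ]
    # Continuation rows are indented by one space to sit under the first row.
    return "[" + ",\n ".join(lines) + "]"


print(format_rows_per_column([[1, 200, 3], [40, 5, 6000]]))
```

Each column only grows as wide as its own widest value, so a single long number does not widen every other column.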
-
Also I like the suggested layout:

    np.array([
        # rows here
    ])

i.e. putting the ( and [ on the same line so as not to waste vertical space.
-
Answering the Open Questions:
-
Should this also handle arrays >2D? I guess the innermost list can be formatted as rows? Note that the numpy array representation does something similar by default (each row on a new line, columns right-justified and all padded to the width of the maximum value).
The relevant function, I think, is https://numpy.org/doc/stable/reference/generated/numpy.array_repr.html, which has a lot of options and handling for different data types (see https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html#numpy.set_printoptions).
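To make the "innermost list as rows" idea concrete, here is a small pure-Python sketch that follows the description above (one row per line, every cell right-justified to the widest value). The names `max_cell_width` and `format_nested` are hypothetical, and this is not numpy's implementation, which has many more options:

```python
def max_cell_width(data):
    """Width of the widest scalar anywhere in a nested list."""
    if isinstance(data, list):
        return max((max_cell_width(item) for item in data), default=0)
    return len(repr(data))


def format_nested(data, width=None, indent=0):
    """Format nested lists so that each innermost list becomes one row."""
    if width is None:
        width = max_cell_width(data)
    if not isinstance(data, list):
        return repr(data).rjust(width)
    if all(not isinstance(item, list) for item in data):
        # Innermost list: keep it on one line with padded cells.
        return "[" + ", ".join(repr(item).rjust(width) for item in data) + "]"
    # Outer levels: one child per line, indented under the opening bracket.
    children = [format_nested(item, width, indent + 1) for item in data]
    separator = ",\n" + " " * (indent + 1)
    return "[" + separator.join(children) + "]"


print(format_nested([[[1, 2], [3, 4]], [[5, 6], [7, 10]]]))
```

The same function handles 2D input unchanged, which suggests that higher dimensions may not need a separate rule, only deeper indentation.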
-
Numpy isn't the only array library out there, although it is the most widespread one. For example, I work on Scipp, which adds a few extra properties to arrays. We face similar issues with formatting 2D arrays, so I was wondering if it is possible to find a common solution. Two solutions that come to my mind are

The latter might provide a better out-of-the-box experience. Conceptually, I don't see why a numpy array constructed from a list would be treated differently from a plain list. (Of course, there may be technical reasons why detecting lists of lists of numbers might not be feasible.)

For context, in Scipp, creating an array might look like this:

    sc.array(
        dims=['x', 'y'],
        values=[[1, 2, 3, 4, 5],
                [1, 2, 3, 4, 5],
                [1, 2, 3, 4, 5]],
        unit='m',
    )

and potentially even with two lists:

    sc.array(
        dims=['x', 'y'],
        values=[[1, 2, 3, 4, 5],
                [1, 2, 3, 4, 5],
                [1, 2, 3, 4, 5]],
        variances=[[1, 2, 3, 4, 5],
                   [1, 2, 3, 4, 5],
                   [1, 2, 3, 4, 5]],
        unit='m',
    )

In this case, all arguments are required to be keywords. And there are more functions that take lists like this as arguments.
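On the question of whether lists of lists of numbers can be detected at all, here is a minimal sketch using Python's standard `ast` module. The helper names are hypothetical and the rules (non-empty, rectangular, numeric literals only) are assumptions for illustration; ruff works on its own Rust AST, so this only shows that the check is syntactic and independent of the enclosing call:

```python
import ast


def is_number(node):
    """True for numeric literals, including signed ones such as -2."""
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
        node = node.operand
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float, complex))
        and not isinstance(node.value, bool)  # bool is a subclass of int
    )


def is_number_matrix(node):
    """True for a non-empty, rectangular list literal of numeric rows."""
    if not isinstance(node, ast.List) or not node.elts:
        return False
    rows = node.elts
    if not all(isinstance(row, ast.List) and row.elts for row in rows):
        return False
    if len({len(row.elts) for row in rows}) != 1:
        return False  # ragged rows are left to the generic list formatting
    return all(is_number(cell) for row in rows for cell in row.elts)


def find_number_matrices(source):
    """Return (line, column) positions of every 2D numeric list literal."""
    tree = ast.parse(source)
    return [
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if is_number_matrix(node)
    ]
```

Because the check looks only at the list literal, it would match both `values=` and `variances=` in the Scipp call above as well as a plain assignment, without knowing anything about `sc.array` or `np.array`.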
-
While

    # 2D with trailing comma
    np.array(
        [[1, 0],
         [0, 1]],
    )
    # 2D without trailing comma
    np.array([[1, 0],
              [0, 1]])
    # With dtype and trailing comma
    np.array(
        [[1, 0],
         [0, 1]],
        dtype='int64',
    )
    # With dtype and without trailing comma
    np.array([[1, 0],
              [0, 1]], dtype='int64')

This already gets us sensible behavior of array-like literals and keyword arguments in the same constructor. But we don't need to special case

    sc.array(
        dims=['x', 'y'],
        values=[[1, 2, 3, 4, 5],
                [1, 2, 3, 4, 5],
                [1, 2, 3, 4, 5]],
        variances=[[1, 2, 3, 4, 5],
                   [1, 2, 3, 4, 5],
                   [1, 2, 3, 4, 5]],
        unit='m',
    )

I know they're out of scope, but here is the extension to 3D and 4D arrays:

    # 3D
    array(
        [[[1., 0.],
          [0., 1.]],
         [[0., 1.],
          [1., 0.]]],
    )
    # np.arange(48).reshape((2, 3, 2, 4))
    array(
        [[[[ 0,  1,  2,  3],
           [ 4,  5,  6,  7]],
          [[ 8,  9, 10, 11],
           [12, 13, 14, 15]],
          [[16, 17, 18, 19],
           [20, 21, 22, 23]]],
         [[[24, 25, 26, 27],
           [28, 29, 30, 31]],
          [[32, 33, 34, 35],
           [36, 37, 38, 39]],
          [[40, 41, 42, 43],
           [44, 45, 46, 47]]]],
    )

Applying the rule to the arguments of a list literal gives us 3D array-likes almost by accident:

    numbers = [
        [[1, 2, 3, 4],
         [5, 6, 7, 8]],
        [[ 9, 10, 11, 12],
         [13, 14, 15, 16]],
    ]

And if we use parentheses to indicate that the top-level list is to be treated as a single ND array instead of two 2D arrays:

    numbers = (
        [[[ 1,  2,  3,  4],
          [ 5,  6,  7,  8]],
         [[ 9, 10, 11, 12],
          [13, 14, 15, 16]]]
    )

It's getting even more out of the scope of
-
Thank you all for the excellent feedback. I don't know when I'll get time to look into this more closely, but I certainly have a better understanding of the problem than before, and it might be worth trying to find a formatting that works for all multidimensional arrays. However, this may require more time to stabilize, which could justify special-casing scientific usages for now to remove the most common need for suppression comments in the short term.
-
The
I wonder if the same logic can be used by ruff to format list-of-lists.
-
Broader adoption of ruff's formatter in the scientific community is held back by its formatting of `np.array` (see this thread in the Beta Feedback discussion). The solution used by the community today is to suppress the formatting with `fmt:off`/`fmt:on` or `fmt:skip`. However, Ruff's lack of support for expression-level suppressions results in unnecessarily large suppressed ranges (see this comment in the Suppression Comments discussion).

The thread discusses two options:

- Disable formatting for `np.array`
- Use a dedicated formatting for `np.array`, but without an explicit proposal.

I prefer not to disable formatting because it means the formatter fails its main goal: guaranteed consistent code formatting. It also takes the pressure off us to come up with a style that works well for the majority of cases, leaving the ecosystem in a worse state overall.

I want to use this issue to discuss a concrete style for improving `np.array` formatting. I have never used `np.array`, so I'll make incorrect assumptions. This is why I would love to hear feedback on the proposal from people who have used `np.array`. Please also correct me if the goal needs refinement.

Goal: Improve formatting of the object passed to `np.array` calls if it is a two-dimensional array.

Non-Goal: Improve formatting for one-dimensional or three-and-more-dimensional arrays. One-dimensional list formatting may need an overhaul overall, but this is not specific to `np.array`.

## Proposal

### New Style Applicability

When should the new style apply (rather than using the generic list formatting)?

- `CallExpression` where the function is an `AttributeExpr(name=np, attribute=array)`
- `skip-magic-trailing-comma=false`: No row has a trailing comma (indicating that the values should be split on their own line)
- `skip-magic-trailing-comma=true`: Ignore the style regardless of whether a trailing comma is present or not

It is okay for:

### Open Questions

`np`?

### Proposed style

Example:

Note: The example assumes the preview "hug list preview style".

## Alternatives

Instead of padding all cells to the same width, compute the padding for each column.
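
For comparison with the per-column alternative, here is a minimal sketch of the uniform-width padding it is contrasted with: every cell is padded to the width of the single widest cell in the whole array. The function name `format_rows_uniform` is hypothetical and the input is assumed to be a non-empty list of non-empty rows; this is only an illustration, not the proposed implementation:

```python
def format_rows_uniform(rows):
    """Right-align every cell to the width of the widest cell overall.

    Assumes `rows` is a non-empty list of non-empty rows.
    """
    cells = [[repr(value) for value in row] for row in rows]
    # One width for the whole array, taken from the widest rendered cell.
    width = max(len(cell) for row in cells for cell in row)
    lines = [
        "[" + ", ".join(cell.rjust(width) for cell in row) + "]"
        for row in cells
    ]
    return "[" + ",\n ".join(lines) + "]"


print(format_rows_uniform([[1, 200, 3], [40, 5, 6000]]))
```

For `[[1, 200, 3], [40, 5, 6000]]` every cell is padded to 4 characters, whereas per-column padding would need only 2, 3 and 4 characters for the three columns. The trade-off is that uniform padding is simpler but produces wider rows whenever a single value is much longer than the rest.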

## Considerations