Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose capacity in an array #11507

Open
beta-ziliani opened this issue Nov 29, 2021 · 10 comments
Open

Expose capacity in an array #11507

beta-ziliani opened this issue Nov 29, 2021 · 10 comments

Comments

@beta-ziliani
Copy link
Member

If #11485 sees the light, it would be handy to have a way of knowing the capacity of the array. Currently, there is an undocumented public method remaining_capacity, which returns @capacity - @offset_to_buffer. This is not ideal, as it only calculates how many elements can be added at the end of the array. I suggest to deprecate it, and instead have a public getter for @capacity. Then, the number of elements that can be added to an array a can be calculated as a.capacity - a.size.

@straight-shoota
Copy link
Member

straight-shoota commented Nov 29, 2021

I'm actually not sure how much telling the raw capacity is without taking the complexity of the buffer offset into account.

#remaining_capacity is useful as remaining_capacity - size shows exactly how many elements I can append without requiring a realloc. Exposing @capacity is inconclusive in that regard. You don't know how much of the capacity is taken by the buffer offset, i.e. how the free capacity is divided between front and back. Free capacity at the front won't delay reallocation unless it's more than half the capacity. Then the contents are copied to the front, which isn't realloc but memcpy, so it has similar cost. Taking the offset into accoutn just adds a lot of complexity.

When I only care about adding elements to the back, remaining_capacity can be useful, @capacity is pretty useless. When I only care about adding elements to the front, I don't get any useful information from either remaining_capacity nor @capacity. Only together they can tell how much of the total capacity is available for unshift.
For use cases where you never shift/unshift, remaining_capacity and @capacity have the same value because the offset is always zero. For that it wouldn't matter which one you have.
@capacity replacing remaining_capacity doesn't improve any of these use cases.

I can't think of any new use case it could help solving, but I'm happy to hear about them.

@beta-ziliani
Copy link
Member Author

Note that remaining_capacity is broken as is: it doesn't say how many elements you can append; you still need to subtract @size to get that. One option is to fix it (that is, substract @size even if a breaking change), and then add another one remaining_capacity_for_unshift to get @offset_to_buffer.

@straight-shoota
Copy link
Member

I don't think it's broken. You can definitely use it. The definition of "remaining" is just ambiguous: It can be understood as the free capacity between last element and the end of the capacity. However, it can also be the part of the capacity that remains after subtracting the offset. The former is probably better because that's what you're most likely interested in. But both are possible and usable, it's just a matter of subtracting or adding size. The key is to define and document this properly.

I wouldn't oppose changing the defintion to mean the free space at the end. But I don't think we can do that in place, even in a major release. It would be a silent breaking change. If we change the definition, we need to rename the method.

@HertzDevil
Copy link
Contributor

HertzDevil commented Dec 3, 2021

I think the public API should return the free spaces before and after the elements in a 2-Tuple:

class Array(T)
  def free_capacity
    {@offset_to_buffer, @capacity - @offset_to_buffer - @size}
  end
end

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# realloc'd, 10 spaces to the right
x << 11
x << 12
# realloc'd, 20 spaces to the left
x.unshift 13
x.free_capacity # => {19, 8}

Then #trim_to_size could take arguments in both directions in a similar fashion.

I would personally deprecate #remaining_capacity even for the private API, due to the direction ambiguity.

@Fryguy
Copy link
Contributor

Fryguy commented Dec 3, 2021

@HertzDevil I think the issue there is that it exposes the internal implementation to the API, which most users don't even know about. Then, if we choose a different internal implementation in the future, then we're setting ourselves up for a breaking change on the API.

Perhaps it makes sense to name it more verbose with something like remaining_capacity_before_realloc, since avoiding reallocs seems to be one of the main motivations behind this?

@HertzDevil
Copy link
Contributor

HertzDevil commented Dec 3, 2021

My point is that any public API of Array should support both the left and right directions to the furthest extent possible, and not just one of them.

@beta-ziliani
Copy link
Member Author

I'm with @HertzDevil. If in the future we decide to not have @offset_to_buffer, then it will return 0 to the left and still be correct. remaining_capacity_before_realloc doesn't tell you how much space you have on either side, so you might end up misjudging what to do.

@Fryguy
Copy link
Contributor

Fryguy commented Dec 3, 2021

Yeah that's fair - if we remove the "left side" capacity in the future, then the API would just expose 0

@asterite
Copy link
Member

asterite commented Dec 3, 2021

Just my 2 cents, but please don't expose internal implementation details in the API. If you do that, you will never be able to change the internal implementation without introducing at least one breaking change.

I can't think of any use case where knowing the remaining capacity, or any capacity at all, helps you take a decision in order to X.

@rdp
Copy link
Contributor

rdp commented Dec 22, 2021

I could see having an ensureCapacity type method. Not sure how useful knowing which side is left vs. right would be...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants