
Use size_t for sequence sizes #1943

Open
ciaran2 opened this issue Oct 10, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@ciaran2

ciaran2 commented Oct 10, 2024

Proposal

The sizes and capacities of sequences like list, str, bytes, etc. are mostly implemented in C with type int. It would be better, if possible, to use size_t for such object size quantities.

Motivation

Use of size_t for sequence sizes would make interoperation with established C libraries more natural and avoid artificially limiting sequence sizes with a signed type that may be smaller than the real maximum supported by memory.

Alternatives

Casting to size_t when integrating with C libraries that expect it does work to an extent but has the limitations described above.
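As a minimal sketch of the difference, consider comparing two byte strings with memcmp, which takes its length as a size_t. The struct and function names below are hypothetical and are not taken from the Acton runtime; they only illustrate where the cast and its guard end up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout with an int length (illustration only,
   not Acton's actual runtime struct). */
struct bytes_int {
    int nbytes;
    const unsigned char *data;
};

/* The same hypothetical layout with a size_t length. */
struct bytes_size {
    size_t nbytes;
    const unsigned char *data;
};

/* With an int length, every call into a size_t API needs a cast,
   and the cast is only safe if the length is known to be non-negative. */
static int bytes_int_equal(const struct bytes_int *a, const struct bytes_int *b) {
    assert(a->nbytes >= 0 && b->nbytes >= 0);
    if (a->nbytes != b->nbytes) return 0;
    return memcmp(a->data, b->data, (size_t)a->nbytes) == 0;
}

/* With a size_t length, the value is passed through unchanged. */
static int bytes_size_equal(const struct bytes_size *a, const struct bytes_size *b) {
    if (a->nbytes != b->nbytes) return 0;
    return memcmp(a->data, b->data, a->nbytes) == 0;
}
```

The int version also caps the length at INT_MAX, which on common 64-bit platforms is far below what size_t can represent.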

ciaran2 added the enhancement label Oct 10, 2024
@plajjan
Contributor

plajjan commented Oct 11, 2024

@sydow can you comment on this? I think @ciaran2's proposal sounds reasonable but I really don't know this stuff very well. OK to switch to size_t or is there a good reason why we use int?

@nordlander
Contributor

Isn't the purpose of size_t to provide a reasonable size measurement type adapted to each platform? That means size_t is highly platform dependent as far as I can see, and we've tried to avoid such C types elsewhere in favor of explicit bit widths in order to get consistent behavior across all targets. So I don't know. I can see the merit of using size_t internally since that's the type used by many C library functions, but the type we choose will nevertheless have to relate to the Acton type that len() returns. And that type ought to have a platform-independent bit width, I think.
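The platform dependence can be checked directly. The following is a small sketch with hypothetical helper names: only uint64_t has a width guaranteed by its definition, while the widths of int and size_t are chosen by the target.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* uint64_t is exactly 64 bits wherever the type exists. */
static_assert(sizeof(uint64_t) * CHAR_BIT == 64, "uint64_t must be 64 bits");

/* Storage width in bits of each candidate length type on the current target.
   The results for int and size_t differ between platforms. */
static unsigned bits_of_int(void)      { return (unsigned)(sizeof(int) * CHAR_BIT); }
static unsigned bits_of_size_t(void)   { return (unsigned)(sizeof(size_t) * CHAR_BIT); }
static unsigned bits_of_uint64_t(void) { return (unsigned)(sizeof(uint64_t) * CHAR_BIT); }
```

On a typical 64-bit Linux target these helpers would be expected to report 32, 64 and 64 bits respectively, but only the last figure is fixed by the type itself.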

@nordlander
Contributor

But if the question is just size_t versus int, I'm all in favor of size_t. Though picking a fixed bit width type has additional merits, as it were.

@plajjan
Contributor

plajjan commented Oct 11, 2024

@nordlander Right, so since int is of platform-dependent size as well, size_t is probably better than int... but you're saying that perhaps uint64_t is even better, right? Using uint64_t would solve the problem of arbitrarily cutting the max limit in half, but it still means that whenever we do C lib integrations we probably have to keep casting stuff.

@nordlander
Contributor

Yep, that summarizes it well. And the casting that must remain isn't just a nuisance, it actually represents our responsibility to correctly map Acton sizes (as represented by a uint64_t, say) to whatever size_t means on each particular platform. And vice versa, of course.
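A sketch of what that mapping responsibility could look like in C, assuming the Acton-side length is a uint64_t as suggested above. The helper names len_to_size and size_to_len are hypothetical and do not exist in the Acton runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert an Acton-side length (assumed uint64_t)
   to the platform's size_t. Returns false if the value does not fit,
   which can happen on targets where size_t is narrower than 64 bits. */
static bool len_to_size(uint64_t len, size_t *out) {
    assert(out != NULL);
    if (len > SIZE_MAX) return false;
    *out = (size_t)len;
    return true;
}

/* Hypothetical helper for the reverse direction. The check only fails
   on a target where size_t is wider than 64 bits. */
static bool size_to_len(size_t n, uint64_t *out) {
    assert(out != NULL);
    if (n > UINT64_MAX) return false;
    *out = (uint64_t)n;
    return true;
}
```

Keeping the range check in one place means the call sites into C libraries do not each need their own unchecked cast.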

@sydow
Collaborator

sydow commented Oct 11, 2024

I don't have more to add; at some point we should go through everything and replace all int types with explicitly sized types, but size_t is probably an improvement for now.

A minor thing is that it disturbs me a little to use a 64-bit field to hold e.g. the length of a list. A 32-bit field is enough for lists of four billion elements, but I will not fight for it, since it is probably a disturbance due to starting as a programmer in the late sixties...


4 participants