Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Swift] Adds a way to store external data and use them directly within the ByteBuffer #8478

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mustiikhalil
Copy link
Collaborator

Addresses an issue where we keep copying the data instead of using the data from the already created objects. This would fix the issue of using too much memory since it would copy a big object while already having that object in memory.

The solution for this was retaining a reference to the object within the ByteBuffer, and then using a pointer to the underlying data.

@github-actions github-actions bot added the swift label Jan 10, 2025
@mustiikhalil mustiikhalil force-pushed the mmk/address-copy-on-data branch 2 times, most recently from 28ae97d to a6fa7e8 Compare January 10, 2025 18:56
@mustiikhalil mustiikhalil marked this pull request as draft January 10, 2025 18:57
@mustiikhalil mustiikhalil force-pushed the mmk/address-copy-on-data branch 2 times, most recently from cc99c16 to 7178cfe Compare January 10, 2025 19:34
@mustiikhalil mustiikhalil self-assigned this Jan 10, 2025
@mustiikhalil
Copy link
Collaborator Author

mustiikhalil commented Jan 10, 2025

@liuliu we are changing how one of the API's for the ByteBuffer is working, would be nice to validate it on your end too since you were the one who requested it. And if this is merged, before you reply and something breaks you can just open a PR/ticket

@mustiikhalil mustiikhalil force-pushed the mmk/address-copy-on-data branch from 02a8557 to 0367c11 Compare January 10, 2025 21:33
@liuliu
Copy link
Contributor

liuliu commented Jan 10, 2025

This seems to be good to me? My request (as implemented) is the assumingMemoryBound API that avoids the copy. And this change (seems to me) extends that to other types (Data, etc). Our original intention is to wrap SQLite returned blob pointer, which has a guaranteed lifetime for us (hence no need to wrap into Data and then use). As long as that API is kept, it seems great?

@mustiikhalil
Copy link
Collaborator Author

This seems to be good to me? My request (as implemented) is the assumingMemoryBound API that avoids the copy. And this change (seems to me) extends that to other types (Data, etc). Our original intention is to wrap SQLite returned blob pointer, which has a guaranteed lifetime for us (hence no need to wrap into Data and then use). As long as that API is kept, it seems great?

If that's the case all good! Now in theory it should not copy any of the data structures and simply retain them within the buffer & point towards their underlying memory.

Data, and arrays should be deallocated when the ByteBuffer is deallocated, but with assumingMemoryBound we will have to invoke the deallocate manually from the origin point of the pointer (which is the case currently too).

@mustiikhalil mustiikhalil marked this pull request as ready for review January 10, 2025 23:55
@liuliu
Copy link
Contributor

liuliu commented Jan 11, 2025

Yeah, makes a lot of sense. In our case, the lifetime of that pointer is managed by SQLite. The only gotcha of this implementation might be holding Data for arbitrarily long time, especially if Data is a memory-mapped of big file (a few GiB) and only part of these are flatbuffers encoded (you could imagine a zip file with some metadata in flatbuffers, and the whole file is loaded in Data), that could be troublesome. But for people have that problem, they can always explicitly make a copy of the subdata at higher level.

…teBuffer

Adds a way to store external data and use them direclty within them
ByteBuffer instead of copying memory into the internal memory of the
ByteBuffer

Removed unowned from the initializer

Fixes a bug where array's of bytes are still being copied
@mustiikhalil mustiikhalil force-pushed the mmk/address-copy-on-data branch from 4fbc8c0 to 8b96c52 Compare January 11, 2025 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants