Memory use: Use ref-counted buffers for images and meshes #1373

Closed
emilk opened this issue Feb 21, 2023 · 2 comments
Labels
📉 performance Optimization, memory use, etc ⛃ re_datastore affects the datastore itself

Comments

@emilk
Member

emilk commented Feb 21, 2023

We currently store all incoming log messages (so we can save them as .rrd files) while also indexing and storing their contents in the data store. This basically duplicates all data.

We should make sure that big components, such as meshes and tensors, use Arc under the hood so that the log messages and the store can share the memory.

Currently we use Vec when storing the contents of an image (TensorData::U8) and a mesh (RawMesh3D).

We want to use ref-counted types like Arc or arrow2::Buffer, otherwise we duplicate our memory use.
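To illustrate the sharing described above, here is a minimal sketch using only the standard library. `LogMsg`, `StoreEntry`, and `ingest` are hypothetical stand-ins invented for this example, not the actual Rerun types; the point is only that cloning an `Arc<[u8]>` bumps a reference count instead of copying the bytes.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a logged message carrying image bytes.
struct LogMsg {
    pixels: Arc<[u8]>,
}

/// Hypothetical stand-in for the data store's view of the same component.
struct StoreEntry {
    pixels: Arc<[u8]>,
}

/// Builds a message and a store entry that point at one shared allocation.
fn ingest(bytes: Vec<u8>) -> (LogMsg, StoreEntry) {
    // Copies the bytes once into a ref-counted slice.
    let shared: Arc<[u8]> = Arc::from(bytes);
    // Cloning the Arc only increments the reference count.
    let msg = LogMsg { pixels: Arc::clone(&shared) };
    let entry = StoreEntry { pixels: shared };
    (msg, entry)
}

fn main() {
    let (msg, entry) = ingest(vec![0u8; 1024]);
    // Both holders see the same memory, so the payload exists only once.
    assert!(Arc::ptr_eq(&msg.pixels, &entry.pixels));
    assert_eq!(Arc::strong_count(&msg.pixels), 2);
    println!("shared {} bytes", entry.pixels.len());
}
```

With `Vec<u8>` in both structs, the same function would have to clone the whole buffer to give each holder its own copy, which is the duplication this issue is about.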

The natural thing to use is arrow2::Buffer<u8> or Arc<[u8]>, but neither of those implements ArrowSerialize/ArrowDeserialize yet (see below).

See also:

An alternative solution is to not store LogMsgs at all and instead save the data store to file.

@nikolausWest
Member

While I definitely agree with these improvements, I think this issue also points to a higher-level question: should we really be keeping all the incoming log messages in memory at all?

First of all we should probably make it easier to set up ways to stream messages directly to a file, both from the SDK and as it hits the viewer / server.

Still, it's very useful to be able to decide to save a recording (.rrd file) after the fact. Shouldn't we be able to quite simply serialize and deserialize the data store to achieve the same ends?

@teh-cmc
Member

teh-cmc commented Apr 18, 2023

Fixed:

  • We don't store data LogMsgs anymore.
  • TensorData, EncodedMesh3D and RawMesh3D all use Buffers under the hood.

@teh-cmc teh-cmc closed this as completed Apr 18, 2023