-
Notifications
You must be signed in to change notification settings - Fork 485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Huge memory taken for each field when exporting #1240
Comments
cc @wgtmac , @stiga-huang , @coderex2522 |
@LouisClt To support the zero-copy mechanism, class BufferedOutputStream will have an internal data buffer. And the default capacity of the internal data buffer is 1MB. This default capacity size should be able to be modified, but here's a hint that if the buffer capacity is set too small, it may cause the buffer to expand and trigger memcpy function frequently. |
We may replace the DataBuffer by a new Buffer implementation with a much smarter memory management to automatically grow and shrink its size according to actual usage. This management can happen on the column basis. |
Thanks everyone for your answers. I understand the possible performances issues linked with lowering too much the size of the buffer (on my testing it was OK in my case though). |
I have created a JIRA to track the progress: https://issues.apache.org/jira/browse/ORC-1280 |
@dongjoon-hyun @wgtmac @LouisClt I will follow up on this issues(ORC-1280) and implement a much smarter memory management. |
Thank you, @coderex2522 . |
…OutputStream ### What changes were proposed in this pull request? This PR can solve the huge memory taken by BufferedOutputStream and refactor the write data logic in class CompressionBase. ### Why are the changes needed? This patch use BlockBuffer to replace DataBuffer of class BufferedOutputStream in order to solve the [issue](#1240). ### How was this patch tested? The UTs in TestBufferedOutputStream.cc and TestCompression.cc can cover this patch. Add the TestBlockBuffer.write_to UT. Closes #1275 from coderex2522/ORC-1280-PART2. Authored-by: coderex2522 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
Hello, it seems there were commits referencing this issue. Is this issue now fixed ? |
@LouisClt Thanks for your follow-up. We have implemented a block-based buffer called IMO, the next step is to use it to replace the input buffer of the |
To be precise, the I have created a JIRA to track it: https://issues.apache.org/jira/browse/ORC-1365 cc @coderex2522 |
Thanks for your reply @wgtmac and the implementation of the |
Hi, @LouisClt . FYI, according to the Apache ORC release cycle, newly developed features will be delivered via v1.9.0 on September 2023 (if they are merged to Apache ORC before.) |
Understood, and thanks for your answer ! |
I will work on it. |
…OutputStream ### What changes were proposed in this pull request? This PR can solve the huge memory taken by BufferedOutputStream and refactor the write data logic in class CompressionBase. ### Why are the changes needed? This patch use BlockBuffer to replace DataBuffer of class BufferedOutputStream in order to solve the [issue](apache#1240). ### How was this patch tested? The UTs in TestBufferedOutputStream.cc and TestCompression.cc can cover this patch. Add the TestBlockBuffer.write_to UT. Closes apache#1275 from coderex2522/ORC-1280-PART2. Authored-by: coderex2522 <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
Hello,
Using arrow adapter, I became aware that the memory (RAM) footprint of the export (exporting an orc file) was very huge for each field. For instance, exporting a table with 10000 fields can take up to 30Go, even if there is only 10 records.
Even for 100 fields, that could take 100Mo+.
The "issue" seems to be coming from here :
orc/c++/src/ColumnWriter.cc
Line 59 in 432a7aa
When we create a writer with the "createWriter" (
orc/c++/src/Writer.cc
Lines 681 to 684 in 432a7aa
Is there a reason the BufferedOutputStream initial capacity is that high ? I circumvented my problem by lowering it to 1Ko (it didn't change much the performance according to my testing, but it may depend on usecases). Could it be envisaged to put a global variable (or static one) to parametrize this to allow changing this hard coded parameter ?
Thanks
The text was updated successfully, but these errors were encountered: