[Enhancement] Limit memory used by ParallelIterable in Iceberg #54219

zhaohehuhu · 2024-12-23T08:20:49Z

Why I'm doing:

In Iceberg 1.6.0, the ConcurrentLinkedQueue used by ParallelIterable has no size limit. When the Iceberg table is large, the queue also grows very large, which causes an OOM in the FE.

As the screenshot above shows, the ConcurrentLinkedQueue is consuming more than 91% of the heap memory and causing the Frontend (FE) to go down. This indicates a significant memory management issue caused by unbounded growth of the queue.

This issue was resolved in Iceberg 1.6.1 (#10691).
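
To illustrate why an unbounded queue is a problem, here is a minimal sketch of the general idea of bounding a producer/consumer queue. It is not taken from Iceberg's code and is not how Iceberg 1.6.1 implements the fix; the class name, method name, and `CAPACITY` value are hypothetical, chosen only for this example. A bounded queue makes the producer wait when the consumer falls behind, so the number of buffered items never exceeds the capacity.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueSketch {
    // Hypothetical limit for this sketch; not an Iceberg or StarRocks setting.
    static final int CAPACITY = 100;

    // Produces `total` items on a background thread, consumes them on the
    // calling thread, and returns the largest queue size that was observed.
    static int runAndReportMaxQueueSize(int total) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(CAPACITY);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    // put() blocks while the queue is full, so a fast producer
                    // cannot buffer more than CAPACITY items in memory.
                    queue.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int maxSeen = 0;
        try {
            for (int consumed = 0; consumed < total; consumed++) {
                maxSeen = Math.max(maxSeen, queue.size());
                queue.take(); // blocks until the producer has added an item
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return maxSeen;
    }

    public static void main(String[] args) {
        int max = runAndReportMaxQueueSize(10_000);
        System.out.println("max queue size observed: " + max
                + " (capacity " + CAPACITY + ")");
    }
}
```

With a ConcurrentLinkedQueue in place of the bounded queue, `offer()` never blocks, so nothing stops the buffered items from growing with the size of the table metadata being scanned.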

What I'm doing:

As the title says: upgrade Iceberg to 1.6.1 to limit the memory used by ParallelIterable.

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Bugfix cherry-pick branch check:

zhaohehuhu · 2024-12-23T08:31:50Z

@stephen-shelby @Youngwb plz help review

github-actions · 2024-12-27T09:08:18Z

[FE Incremental Coverage Report]

✅ pass : 0 / 0 (0%)

stephen-shelby · 2025-01-02T02:48:04Z

How many manifest files and data files are there in your case? I think if you use version 3.3 or above, this case may go to the distributed plan, not the local plan.

zhaohehuhu · 2025-01-02T08:08:42Z

> How many manifest files and data files are there in your case? I think if you use version 3.3 or above, this case may go to the distributed plan, not the local plan.

The total size of the metadata is 439 MB, while the data files occupy 2 TB. Iceberg has already released 1.6.1 to fix this issue, so it would be nice for us to update the Iceberg version.

stephen-shelby · 2025-01-03T07:32:45Z

> > How many manifest files and data files are there in your case? I think if you use version 3.3 or above, this case may go to the distributed plan, not the local plan.
>
> The total size of the metadata is 439 MB, while the data files occupy 2 TB. Iceberg has already released 1.6.1 to fix this issue, so it would be nice for us to update the Iceberg version.

Could you try setting plan_mode=distributed and then retry this query? You can then observe whether memory usage is still high.

gengjun-git · 2025-01-03T09:29:30Z

@Mergifyio rebase

Signed-off-by: zhaohehuhu <[email protected]>

mergify · 2025-01-03T09:29:53Z

rebase

✅ Branch has been successfully rebased

sonarqubecloud · 2025-01-03T09:35:52Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

zhaohehuhu · 2025-01-03T09:43:12Z

> > > How many manifest files and data files are there in your case? I think if you use version 3.3 or above, this case may go to the distributed plan, not the local plan.
> >
> > The total size of the metadata is 439 MB, while the data files occupy 2 TB. Iceberg has already released 1.6.1 to fix this issue, so it would be nice for us to update the Iceberg version.
>
> Could you try setting plan_mode=distributed and then retry this query? You can then observe whether memory usage is still high.

OK, I will give it a try.

github-actions · 2025-01-03T10:08:09Z

[Java-Extensions Incremental Coverage Report]

✅ pass : 0 / 0 (0%)

github-actions · 2025-01-03T10:09:12Z

[BE Incremental Coverage Report]

✅ pass : 0 / 0 (0%)

zhaohehuhu · 2025-01-08T02:19:46Z

Can we merge this PR first? Iceberg has fixed some internal issues, so it's fine to update the Iceberg version, just like Trino did. @stephen-shelby @gengjun-git

zhaohehuhu · 2025-01-09T08:33:43Z

> set plan_mode=distributed

I could not use plan_mode=distributed because of this error: "Failed to open the off-heap table scanner."

stephen-shelby · 2025-01-09T11:58:47Z

> > set plan_mode=distributed
>
> I could not use plan_mode=distributed because of this error: "Failed to open the off-heap table scanner."

You can find a more detailed message in fe.log.

zhaohehuhu · 2025-01-09T12:20:22Z

> > > set plan_mode=distributed
> >
> > I could not use plan_mode=distributed because of this error: "Failed to open the off-heap table scanner."
>
> You can find a more detailed message in fe.log.

SQL Error [1064] [42000]: Failed to execute metadata collection job. Failed to open the off-heap table scanner. java exception details: java.lang.NoClassDefFoundError: Could not initialize class de.javakaffee.kryoserializers.UnmodifiableCollectionsSerializer
at com.starrocks.connector.iceberg.IcebergMetadataScanner.initSerializer(IcebergMetadataScanner.java:207)
at com.starrocks.connector.iceberg.IcebergMetadataScanner.open(IcebergMetadataScanner.java:132)

When the execution plan mode is switched to distributed, the error above occurs. This is a compatibility issue between JDK 17 and Kryo (it may be fixed by #55016).
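
For context on the error text: the JVM reports `NoClassDefFoundError: Could not initialize class ...` when a class's static initializer has already failed on an earlier access, so the original cause usually appears earlier in the log as an `ExceptionInInitializerError`. The sketch below shows only this generic JVM behaviour with a hypothetical `Fragile` class; it does not reproduce the Kryo issue itself, and the simulated failure is an assumption made for illustration.

```java
public class StaticInitFailureDemo {
    // Hypothetical stand-in for a class whose static initialization fails.
    static class Fragile {
        static final Object STATE = init();

        private static Object init() {
            throw new IllegalStateException("simulated failure in static initializer");
        }

        static void touch() {
            // Calling this triggers class initialization.
        }
    }

    // Tries to use Fragile and returns the simple name of whatever was thrown.
    static String tryTouch() {
        try {
            Fragile.touch();
            return "ok";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // First use: the initializer runs and fails (ExceptionInInitializerError).
        System.out.println(tryTouch());
        // Later uses: the class is marked as failed (NoClassDefFoundError).
        System.out.println(tryTouch());
    }
}
```

In practice this means the first occurrence in fe.log is the one worth looking for, since it carries the underlying exception as its cause.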

zhaohehuhu · 2025-01-22T07:26:34Z

Closed. Someone else has already done this.

github-actions bot added the title needs [type] label Dec 23, 2024

mergify bot assigned zhaohehuhu Dec 23, 2024

zhaohehuhu changed the title ~~Upgrade Iceberg to 1.6.1 to limit memory used by ParallelIterable~~ [Enhancement] Upgrade Iceberg to 1.6.1 to limit memory used by ParallelIterable Dec 23, 2024

github-actions bot added 3.4 3.3 and removed title needs [type] labels Dec 23, 2024

gengjun-git previously approved these changes Dec 24, 2024

zhaohehuhu dismissed gengjun-git’s stale review via c53d510 December 25, 2024 03:05

zhaohehuhu force-pushed the dev-1223 branch from 9d5e87d to c53d510 Compare December 25, 2024 03:05

zhaohehuhu requested a review from gengjun-git December 25, 2024 03:06

github-actions bot removed the 3.3 label Dec 25, 2024

gengjun-git approved these changes Dec 25, 2024

zhaohehuhu changed the title ~~[Enhancement] Upgrade Iceberg to 1.6.1 to limit memory used by ParallelIterable~~ [Enhancement] Limit memory used by ParallelIterable in Iceberg Dec 27, 2024

Upgrade Iceberg to 1.6.1 to limit memory used by ParallelIterable

def0f7d

Signed-off-by: zhaohehuhu <[email protected]>

gengjun-git force-pushed the dev-1223 branch from c53d510 to def0f7d Compare January 3, 2025 09:29

zhaohehuhu closed this Jan 22, 2025

zhaohehuhu deleted the dev-1223 branch January 22, 2025 07:26
