Troubleshooting
A component might need to store more data than the maximum heap size of the Squall/Storm process allows. This can manifest itself in several ways:
- OutOfMemoryError is thrown
- the topology takes longer to execute than usual because a worker is restarted. This happens when the garbage collection process is so inefficient that, when it is triggered, Storm treats the worker as unavailable. Tuple dropping may occur as well. This can be revealed by inspecting the logs of the Storm Master for "Reassigning ids[]" messages containing only one task in the brackets.
Note that scalability does not depend only on the database/window size, but also on the query. Some queries might have an operator which has to materialize almost the whole database.
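Since the underlying limit is the heap of the worker JVMs, raising that limit directly can also help. If the topology Config is assembled by hand, Storm's standard topology.worker.childopts setting controls the worker JVM options; the snippet below is only a minimal sketch of that Storm-level knob, with an arbitrary -Xmx value (not a Squall-specific recommendation):

```java
import backtype.storm.Config;

// Minimal sketch: give every worker JVM of this topology a larger heap.
// 4g is an example value; size it to the state your operators materialize
// and to the memory available on the supervisor machines.
Config conf = new Config();
conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx4g");
```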
To fix the problem, we should try one or more of the following:
- increase the parallelism of the components
- increase the DIP_NUM_ACKERS parameter (available only when Squall runs manually specified query plans)
- check the "Run out of memory?" section above
- Only in AckEveryTuple mode: the parameters SystemParameters.MAX_SPOUT_PENDING and SystemParameters.MESSAGE_TIMEOUT_SECS refer to the Storm parameters with the same name. In general, decreasing MAX_SPOUT_PENDING and increasing MESSAGE_TIMEOUT_SECS might prevent "failing tuples" exceptions from occurring (see the sketch after this list).
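These Squall parameters map onto Storm's own topology settings, so, purely as an illustration, the hedged Java sketch below shows the equivalent Storm-level calls. The concrete values are arbitrary examples, not recommendations, and Squall itself normally reads these parameters from its config file rather than from code like this:

```java
import backtype.storm.Config;

// Illustrative values only; tune them for your cluster and query.
Config conf = new Config();

// More ackers spread the acking work over more tasks
// (corresponds to DIP_NUM_ACKERS).
conf.setNumAckers(16);

// Fewer pending tuples per spout reduce the chance of tuples timing out
// (corresponds to SystemParameters.MAX_SPOUT_PENDING).
conf.setMaxSpoutPending(1000);

// A larger timeout gives slow workers more time before tuples are failed
// (corresponds to SystemParameters.MESSAGE_TIMEOUT_SECS).
conf.setMessageTimeoutSecs(120);
```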