GuolinTan edited this page Sep 26, 2016 · 15 revisions

Run out of memory?

A component might need to store more data than the maximum heap size of the Squall/Storm process allows. This can manifest in several ways:

  • An OutOfMemoryError is thrown.
  • The topology runs longer than usual because a worker is restarted. This happens when garbage collection is so inefficient that, once triggered, it stalls the worker long enough for Storm to treat it as unavailable. Tuples may be dropped as well. To confirm this, inspect the logs of the Storm master for "Reassigning ids[]" messages that contain only one task in the brackets.
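
The log check described in the second bullet can be automated. The following is a minimal sketch that assumes the message appears literally as "Reassigning ids[...]" with task ids separated by commas or spaces; the exact log layout, the sample lines, and the function name are assumptions made for illustration, not taken from Storm or Squall.

```python
import re

# Pattern built from the message text quoted above.
# The exact layout of the real log line is an assumption.
REASSIGN = re.compile(r"Reassigning ids\[([^\]]*)\]")


def single_task_reassignments(log_lines):
    """Return the lines whose brackets hold exactly one task id."""
    hits = []
    for line in log_lines:
        match = REASSIGN.search(line)
        if not match:
            continue
        # Split on commas or whitespace; the separator is an assumption.
        ids = [tok for tok in re.split(r"[,\s]+", match.group(1).strip()) if tok]
        if len(ids) == 1:
            hits.append(line.rstrip("\n"))
    return hits


# Hypothetical sample lines, not copied from a real log.
sample = [
    "2016-09-26 10:00:01 Reassigning ids[7]",
    "2016-09-26 10:00:05 Reassigning ids[3, 4, 5]",
    "2016-09-26 10:00:09 some unrelated message",
]

for hit in single_task_reassignments(sample):
    print(hit)
```

To scan a real file, pass an open file handle instead of `sample`, and adjust the pattern if the actual message differs.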

Note that scalability depends not only on the database/window size but also on the query. Some queries contain an operator that has to materialize almost the whole database.

To fix the problem, try one or more of the following:

  • increase the parallelism of the components
  • increase the DIP_NUM_ACKERS parameter (available only when Squall runs manually specified query plans)
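
As a rough guide to how much parallelism a component needs, the arithmetic below estimates the smallest number of tasks whose share of the state fits in memory. It is a back-of-the-envelope sketch, not part of Squall: it assumes the state is split evenly across tasks (no skew) and that only a fraction of each task's heap is usable for stored tuples. The function name and the default `usable_fraction` are hypothetical.

```python
import math


def min_parallelism(state_bytes, heap_bytes_per_task, usable_fraction=0.5):
    """Smallest task count such that each task's even share fits in usable heap.

    Assumes even partitioning of the state; usable_fraction is a guess at how
    much of the heap is left for stored tuples after other overheads.
    """
    usable = heap_bytes_per_task * usable_fraction
    if usable <= 0:
        raise ValueError("usable heap per task must be positive")
    return max(1, math.ceil(state_bytes / usable))


GIB = 1024 ** 3

# 12 GiB of state, 2 GiB of heap per task, half of it usable: 12 tasks.
print(min_parallelism(12 * GIB, 2 * GIB))
```

With a skewed key distribution the busiest task holds more than its even share, so treat the result as a lower bound.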

I got a "failing tuple" exception. What should I do?

  1. Check the "Run out of memory?" section above.
  2. Only in AckEveryTuple mode: the parameters SystemParameters.MAX_SPOUT_PENDING and SystemParameters.MESSAGE_TIMEOUT_SECS correspond to the Storm parameters of the same name. In general, decreasing MAX_SPOUT_PENDING and increasing MESSAGE_TIMEOUT_SECS may prevent "failing tuple" exceptions.
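
To see why these two adjustments help, recall that Storm fails a tuple that is not fully acknowledged within the message timeout (`topology.message.timeout.secs`), and that a spout task keeps at most `topology.max.spout.pending` tuples in flight. The sketch below is a simplified model, not Squall or Storm code: it assumes a steady acknowledgement rate per spout task, so a newly emitted tuple may wait roughly pending / rate seconds. The function names and the example numbers are hypothetical.

```python
def worst_case_latency_secs(max_spout_pending, acked_per_sec):
    """Approximate wait for a tuple emitted behind a full pending queue."""
    if acked_per_sec <= 0:
        raise ValueError("acked_per_sec must be positive")
    return max_spout_pending / acked_per_sec


def likely_to_fail(max_spout_pending, acked_per_sec, message_timeout_secs):
    """True if the estimated wait reaches or exceeds the message timeout."""
    latency = worst_case_latency_secs(max_spout_pending, acked_per_sec)
    return latency >= message_timeout_secs


# Hypothetical numbers: 100 tuples acknowledged per second per spout task.
print(likely_to_fail(5000, 100, 30))  # 50 s estimated wait vs. 30 s timeout
print(likely_to_fail(1000, 100, 30))  # fewer pending tuples: 10 s wait
print(likely_to_fail(5000, 100, 60))  # longer timeout: 50 s wait vs. 60 s
```

In this model, either lowering the pending limit or raising the timeout moves the configuration out of the failing range, which matches the advice in item 2.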

Where can I see the output in cluster mode?