This repository has been archived by the owner on Jan 8, 2019. It is now read-only.

Dashboard instant Gateway Timeout #1679

Open
gruselglatz opened this issue Nov 10, 2015 · 6 comments

@gruselglatz

Hi,

Opening a dashboard with many queries over a long time range leads to an instant Gateway Timeout until the queries have finished.

The only log information I get:

2015-11-10 15:25:35,016 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in pool-38-thread-1
Connection refused: /192.168.100.20:12900

2015-11-10 15:25:37,980 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-10 15:25:42,981 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-10 15:25:47,982 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-10 15:25:52,983 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900
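
For anyone trying to reproduce this: "Connection refused" means nothing accepted the TCP connection on that port, which is different from a slow response. Below is a minimal sketch for checking that by hand. The helper name probe_tcp is made up for illustration and is not part of Graylog; the addresses and port are the ones from the log lines above.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer within `timeout` seconds.
        return "timeout"
    except OSError:
        # Anything else, for example an unreachable network.
        return "error"


if __name__ == "__main__":
    # Addresses and port taken from the log output in this issue.
    for host in ("127.0.0.1", "192.168.100.20"):
        print(host, probe_tcp(host, 12900))
```

A result of "refused" while the server process is running would suggest that the REST API is not listening on that address at that moment, rather than that it is answering slowly.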

When the dashboard is opened, the server load is high, but the server still responds normally.
The HTTP timeouts have been changed to 30s.

I can't find a gateway timeout to configure. It looks like no timeout is applied at all, because the error pops up instantly.

Update: The widgets only show N/A with an error that the API call failed. After 2 seconds it says that the gateway timed out, and after a page refresh the widgets are filled with data.

Is there an option to set the widget timeout or the timeout between the web interface and the server?

Would it help not to trigger all widgets at the same time? I have no problem waiting a few seconds with a loading spinner rather than getting 20 Gateway Timeout messages or even getting thrown out to the /disconnect page...
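
To make the suggestion concrete, here is a rough sketch of what "not triggering all widgets at the same time" could look like: a fixed cap on how many widget requests are in flight. This is plain Python for illustration only, not Graylog's actual code; load_widgets, fetch and max_parallel are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def load_widgets(
    widget_ids: Iterable[str],
    fetch: Callable[[str], object],
    max_parallel: int = 3,
) -> Dict[str, object]:
    """Fetch widget values with at most `max_parallel` requests in flight.

    A failed fetch is stored as the exception object instead of aborting the rest.
    """
    ids = list(widget_ids)

    def safe_fetch(widget_id: str) -> object:
        try:
            return fetch(widget_id)
        except Exception as exc:  # report the failure per widget and keep going
            return exc

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so zip pairs each id with its result.
        return dict(zip(ids, pool.map(safe_fetch, ids)))


if __name__ == "__main__":
    def fake_fetch(widget_id: str) -> str:
        time.sleep(0.1)  # stand-in for a slow search query
        return f"value for {widget_id}"

    print(load_widgets([f"w{i}" for i in range(6)], fake_fetch, max_parallel=2))
```

With a cap like this, the remaining widgets simply wait their turn instead of all hitting the server at once, at the cost of a longer total load time.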

Update 2: If you run into the same issue, lb_recognition_period_seconds = 0 does the job. The web interface will then wait for the widgets to load.

BUT the problem of getting thrown to the /disconnect page remains.

Update 3: I run into the same problem every time. Here are some new stack traces and my server config:

2015-11-13 07:32:51,347 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in pool-234-thread-1
Connection refused: /192.168.100.20:12900

2015-11-13 07:32:52,195 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-13 07:32:52,399 - [INFO] - from play in Thread-5
Shutdown application default Akka system.

2015-11-13 07:33:01,485 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in main
Connection refused: /127.0.0.1:12900

2015-11-13 07:33:01,537 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-13 07:33:01,600 - [INFO] - from play in main
Application started (Prod)

2015-11-13 07:33:01,687 - [INFO] - from play in main
Listening for HTTP on /127.0.0.1:9000

2015-11-13 07:33:06,546 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-13 07:33:10,774 - [INFO] - from play in New I/O worker #18
Starting application default Akka system.

2015-11-13 07:33:11,549 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-13 07:33:16,565 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
Connection refused: /127.0.0.1:12900

2015-11-13 07:34:26,696 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in pool-11-thread-1
API call Interrupted
java.lang.InterruptedException: null
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039) ~[na:1.8.0_60]
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) ~[na:1.8.0_60]
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) ~[na:1.8.0_60]
        at com.ning.http.client.providers.netty.future.NettyResponseFuture.get(NettyResponseFuture.java:158) ~[com.ning.async-http-client-1.9.31.jar:na]
        at org.graylog2.restclient.lib.ApiClientImpl$ApiRequestBuilder.executeOnAll(ApiClientImpl.java:608) ~[org.graylog2.graylog2-rest-client--1.2.2-1.2.2.jar:na]
        at controllers.api.MetricsController$PollingJob.run(MetricsController.java:117) [graylog-web-interface.graylog-web-interface-1.2.2.jar:1.2.2]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_60]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_60]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]

2015-11-13 07:34:39,084 - [ERROR] - from org.graylog2.restclient.lib.ApiClient in servernodes-refresh-0
API call timed out
java.util.concurrent.TimeoutException: null
        at com.ning.http.client.providers.netty.future.NettyResponseFuture.get(NettyResponseFuture.java:159) ~[com.ning.async-http-client-1.9.31.jar:na]
        at org.graylog2.restclient.lib.ApiClientImpl$ApiRequestBuilder.executeOnAll(ApiClientImpl.java:608) ~[org.graylog2.graylog2-rest-client--1.2.2-1.2.2.jar:na]
        at org.graylog2.restclient.lib.ServerNodesRefreshService.resolveConfiguredNodes(ServerNodesRefreshService.java:97) [org.graylog2.graylog2-rest-client--1.2.2-1.2.2.jar:na]
        at org.graylog2.restclient.lib.ServerNodesRefreshService.access$400(ServerNodesRefreshService.java:42) [org.graylog2.graylog2-rest-client--1.2.2-1.2.2.jar:na]
        at org.graylog2.restclient.lib.ServerNodesRefreshService$1.run(ServerNodesRefreshService.java:126) [org.graylog2.graylog2-rest-client--1.2.2-1.2.2.jar:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_60]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_60]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_60]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]

Config:

is_master = true
node_id_file = /var/opt/graylog/graylog-server-node-id
root_timezone = Europe/Vienna
plugin_dir = /opt/graylog/plugin
rotation_strategy = time
elasticsearch_max_size_per_index = 1073741824
elasticsearch_max_time_per_index = 24h
elasticsearch_max_number_of_indices = 45
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 1
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = true
allow_highlighting = false
elasticsearch_cluster_name = graylog2
elasticsearch_http_enabled = false
elasticsearch_discovery_zen_ping_unicast_hosts = 127.0.0.1:9300
elasticsearch_cluster_discovery_timeout = 30000
elasticsearch_discovery_initial_state_timeout = 3s
elasticsearch_analyzer = standard
elasticsearch_request_timeout = 2m
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/opt/graylog/data/journal
message_journal_max_size = 1gb
async_eventbus_processors = 2
dead_letters_enabled = false
lb_recognition_period_seconds = 0
alert_check_interval = 60
mongodb_max_connections = 100
mongodb_threads_allowed_to_block_multiplier = 5
rules_file = /opt/graylog/rules/graylog.drl
http_connect_timeout = 30s
http_read_timeout = 30s
http_write_timeout = 30s
dashboard_widget_default_cache_time = 10s

@adrianlyjak

+1

@kroepke
Contributor

kroepke commented Dec 12, 2015

Currently the only real workaround is to increase the default timeout in web-interface.conf.
I believe the parameter is called timeout.DEFAULT and takes standard time values such as 5s. I'd increase it to 15s and see how it goes.
In the future we will solve this differently.

@gruselglatz
Author

I can't find a parameter called timeout.DEFAULT. In which .conf file should it be? I am on 1.2.2

@kroepke
Contributor

kroepke commented Dec 12, 2015

It's not listed in the default config file.
Put it into the web interface configuration file like:
timeout.DEFAULT = 15s
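
If you want to script the change, something like the following sketch could work. It only edits text: it replaces an existing timeout.DEFAULT line or appends one if there is none. The helper name set_conf_value is hypothetical, and the sketch assumes one simple key = value setting per line; it does not know where your web interface configuration file lives.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key = value` set.

    An existing, uncommented `key = ...` line is replaced; otherwise a new
    line is appended at the end.
    """
    # [ \t]* instead of \s* so the match never swallows a preceding blank line.
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key} = {value}"
    if pattern.search(conf_text):
        # A function replacement avoids backslash handling in the new value.
        return pattern.sub(lambda _m: new_line, conf_text, count=1)
    sep = "" if conf_text == "" or conf_text.endswith("\n") else "\n"
    return f"{conf_text}{sep}{new_line}\n"


if __name__ == "__main__":
    # Made-up existing content, only to show the append case.
    example = "some.other.setting = 1\n"
    print(set_conf_value(example, "timeout.DEFAULT", "15s"), end="")
```

Read the file, pass its content through the function, and write the result back. Commented-out lines are left alone, so a new line is appended in that case.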

@gruselglatz
Author

OK, thanks, that solved the problem. I've set it to 50s.

@reighnman

I was running into a timeout issue with system/indices, which required me to bump the timeout to 15s.

I have roughly 400 indices of 32 GB each, about 2400 shards.

This has made large dashboards way less clunky in terms of random errors as well. Thanks @kroepke
