fusion-classic-rest-service pod could not start #171

Open
cermakp opened this issue Apr 8, 2021 · 0 comments

Hi,

We are facing an issue where the fusion-classic-rest-service-0 pod cannot start and is restarted over and over:

fusion-admin-ui-85b6866fbb-hfsxl                                  1/1     Running                 0          26d
fusion-ambassador-8588f45b44-qs976                                1/1     Running                 0          36d
fusion-api-gateway-5d4cf67975-k56nd                               1/1     Running                 0          20d
fusion-argo-ui-8db6b5887-2b5tm                                    1/1     Running                 0          36d
fusion-auth-ui-555cfbbf54-qmzqc                                   1/1     Running                 0          26d
fusion-classic-rest-service-0                                     0/1     Init:CrashLoopBackOff   4699       23d
fusion-devops-ui-6f6c5466bd-r6fvs                                 1/1     Running                 0          26d
fusion-fusion-admin-59cd7d4c96-flxqc                              1/1     Running                 0          26d
fusion-fusion-indexing-54b6474f57-w7wl7                           1/1     Running                 26         26d
fusion-fusion-log-forwarder-66bc598c7-wpfss                       1/1     Running                 0          26d
fusion-insights-6d9cbc5769-p99cj                                  1/1     Running                 0          26d
fusion-job-launcher-5ccc758859-jklbc                              1/1     Running                 0          26d
fusion-job-rest-server-78897f8886-8kgt8                           1/1     Running                 0          26d
fusion-ml-model-service-5c4cffd47d-gq5bd                          1/1     Running                 0          26d
fusion-monitoring-grafana-7f9d5cccf8-6m7bw                        1/1     Running                 0          36d
fusion-monitoring-prometheus-kube-state-metrics-66f6cc4bb-k8pk7   1/1     Running                 0          36d
fusion-monitoring-prometheus-pushgateway-7996489596-r4rs7         1/1     Running                 0          36d
fusion-monitoring-prometheus-server-0                             2/2     Running                 0          36d
fusion-mysql-7b97f56bdc-9rw8s                                     1/1     Running                 0          36d
fusion-pm-ui-747576df49-qqsp6                                     1/1     Running                 0          26d
fusion-pulsar-bookkeeper-0                                        1/1     Running                 0          36d
fusion-pulsar-bookkeeper-1                                        1/1     Running                 0          36d
fusion-pulsar-bookkeeper-2                                        1/1     Running                 0          36d
fusion-pulsar-broker-0                                            1/1     Running                 0          36d
fusion-pulsar-broker-1                                            1/1     Running                 0          36d
fusion-query-pipeline-6dbbf8886c-qswsc                            1/1     Running                 0          26d
fusion-rest-service-6ffc8f9cc4-ndhw4                              1/1     Running                 0          26d
fusion-rpc-service-66b5c4885-cjn4j                                1/1     Running                 0          36d
fusion-rules-ui-9ccb6db59-m74sw                                   1/1     Running                 0          26d
fusion-solr-0                                                     1/1     Running                 0          36d
fusion-solr-exporter-6fccf89d5f-4pdxq                             1/1     Running                 0          36d
fusion-templating-c96f57955-gdh9f                                 1/1     Running                 0          26d
fusion-webapps-69cc458d47-847rj                                   1/1     Running                 0          26d
fusion-workflow-controller-ffc878cc-2scvd                         1/1     Running                 0          36d
fusion-zookeeper-0                                                1/1     Running                 0          36d
fusion-zookeeper-1                                                1/1     Running                 0          36d
fusion-zookeeper-2                                                1/1     Running                 0          36d
milvus-writable-588d6c755d-w2j8m                                  1/1     Running                 0          36d
seldon-controller-manager-86f68fbcd-dk6db                         1/1     Running                 6          36d

When I described the failing pod (fusion-classic-rest-service-0), I found that one of its init containers ("check-zk") is failing:

Init Containers:
  check-zk:
    Container ID:  containerd://86cbec8dd8bb25ea8239aaa551f28e7bc8164771f911cb29be0a61a79247e5cc
    Image:         lucidworks/check-fusion-dependency:v1.2.0
    Image ID:      docker.io/lucidworks/check-fusion-dependency@sha256:9829ccb6a0bea76ac92851b51f8fd8451b7f803019adf27865f093d168a6b19e
    Port:          <none>
    Host Port:     <none>
    Args:
      zookeeper
    State:          Running
      Started:      Thu, 08 Apr 2021 14:03:56 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 08 Apr 2021 13:56:55 +0200
      Finished:     Thu, 08 Apr 2021 13:58:55 +0200
    Ready:          False
    Restart Count:  4700
    Limits:
      cpu:     200m
      memory:  32Mi
    Requests:
      cpu:     200m
      memory:  32Mi
    Environment:
      ZOOKEEPER_CONNECTION_STRING:  fusion-zookeeper-0.fusion-zookeeper-headless:2181,fusion-zookeeper-1.fusion-zookeeper-headless:2181,fusion-zookeeper-2.fusion-zookeeper-headless:2181
      CHECK_INTERVAL:               5s
      CHECK_TIMEOUT:                2s
      TIMEOUT:                      2m
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from fusion-classic-rest-service-token-hr8sx (ro)

So I retrieved the log from the init container:

2021/04/08 12:03:56 Check returned error: dial tcp: lookup fusion-zookeeper-2.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:01 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:06 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:11 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:16 Check returned error: dial tcp: lookup fusion-zookeeper-2.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:21 Check returned error: dial tcp: lookup fusion-zookeeper-2.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:26 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:31 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:36 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:41 Check returned error: dial tcp: lookup fusion-zookeeper-2.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:46 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:51 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:56 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:01 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:06 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:11 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:16 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:21 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:26 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:31 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:36 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:41 Check returned error: dial tcp: lookup fusion-zookeeper-2.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:46 Check returned error: dial tcp: lookup fusion-zookeeper-0.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:51 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:05:56 Error checking zookeeper is running: Timed out waiting for check to complete successfully
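A note on reading these lines: in the error format of Go's standard `net` package, `lookup <host> on <address>` names the DNS resolver that was queried, so `10.237.48.10:53` here is a DNS server address (port 53), not a ZooKeeper address. The sketch below pulls both pieces out of the log. It is only an illustration: the `summarize` helper and the regular expression are hypothetical, and the two sample lines are copied from the log above.

```python
import re
from collections import Counter

# Two sample lines copied verbatim from the check-zk log.
LOG = """\
2021/04/08 12:03:56 Check returned error: dial tcp: lookup fusion-zookeeper-2.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
2021/04/08 12:04:01 Check returned error: dial tcp: lookup fusion-zookeeper-1.fusion-zookeeper-headless on 10.237.48.10:53: dial udp 10.237.48.10:53: connect: network is unreachable
"""

# "lookup <host> on <resolver-ip>:<port>:" is how a failed DNS query is reported.
LOOKUP_RE = re.compile(r"lookup (?P<host>\S+) on (?P<resolver>[\d.]+):(?P<port>\d+):")


def summarize(log: str) -> dict:
    """Collect the hostnames being resolved and the resolver(s) being queried."""
    hosts = Counter()
    resolvers = set()
    for line in log.splitlines():
        match = LOOKUP_RE.search(line)
        if match:
            hosts[match["host"]] += 1
            resolvers.add((match["resolver"], int(match["port"])))
    return {"hosts": dict(hosts), "resolvers": sorted(resolvers)}


summary = summarize(LOG)
print(summary)
```

Every failing line points at the same resolver, which suggests the init container cannot reach DNS at all, so it never gets as far as contacting ZooKeeper.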

Here is a list of all services

NAME                                              TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                               AGE
admin                                             ClusterIP      10.237.52.18    <none>         8765/TCP                              36d
admin-ui                                          ClusterIP      10.237.60.242   <none>         8080/TCP                              36d
auth-ui                                           ClusterIP      10.237.55.132   <none>         8080/TCP                              36d
connector-plugin-service                          ClusterIP      10.237.50.132   <none>         9020/TCP                              36d
connectors                                        ClusterIP      10.237.48.90    <none>         9010/TCP                              36d
connectors-classic                                ClusterIP      None            <none>         9000/TCP                              36d
connectors-rpc                                    ClusterIP      10.237.62.128   <none>         8771/TCP                              36d
devops-ui                                         ClusterIP      10.237.51.36    <none>         8080/TCP                              36d
fusion-ambassador                                 ClusterIP      10.237.60.104   <none>         80/TCP,443/TCP                        36d
fusion-argo-ui                                    ClusterIP      10.237.48.220   <none>         2746/TCP                              36d
fusion-monitoring-grafana                         ClusterIP      10.237.61.14    <none>         80/TCP                                36d
fusion-monitoring-prometheus-kube-state-metrics   ClusterIP      None            <none>         80/TCP,81/TCP                         36d
fusion-monitoring-prometheus-pushgateway          ClusterIP      10.237.49.238   <none>         9091/TCP                              36d
fusion-monitoring-prometheus-server               ClusterIP      10.237.62.189   <none>         80/TCP                                36d
fusion-monitoring-prometheus-server-headless      ClusterIP      None            <none>         80/TCP                                36d
fusion-mysql                                      ClusterIP      10.237.52.91    <none>         3306/TCP                              36d
fusion-pulsar-bookkeeper                          ClusterIP      None            <none>         3181/TCP,8000/TCP                     36d
fusion-pulsar-broker                              ClusterIP      None            <none>         8080/TCP,6650/TCP                     36d
fusion-solr-exporter                              ClusterIP      10.237.54.108   <none>         9983/TCP                              36d
fusion-solr-headless                              ClusterIP      None            <none>         8983/TCP                              36d
fusion-solr-svc                                   ClusterIP      10.237.61.138   <none>         8983/TCP                              36d
fusion-zookeeper                                  ClusterIP      10.237.55.91    <none>         2181/TCP,2281/TCP                     36d
fusion-zookeeper-headless                         ClusterIP      None            <none>         2181/TCP,3888/TCP,2888/TCP,2281/TCP   36d
indexing                                          ClusterIP      10.237.62.46    <none>         8765/TCP                              36d
insights                                          ClusterIP      10.237.53.178   <none>         8080/TCP                              36d
job-launcher                                      ClusterIP      10.237.50.233   <none>         8083/TCP                              36d
job-rest-server                                   ClusterIP      10.237.63.32    <none>         8081/TCP                              36d
milvus                                            ClusterIP      10.237.57.195   <none>         19530/TCP,19121/TCP                   36d
ml-model-grpc                                     ClusterIP      10.237.63.47    <none>         6565/TCP                              36d
ml-model-service                                  ClusterIP      10.237.56.36    <none>         8086/TCP                              36d
pm-ui                                             ClusterIP      10.237.61.241   <none>         8080/TCP                              36d
proxy                                             LoadBalancer   10.237.56.89    20.50.14.165   6764:31028/TCP                        36d
pulsar-broker                                     ClusterIP      None            <none>         8080/TCP,6650/TCP                     36d
query                                             ClusterIP      10.237.50.250   <none>         8787/TCP                              36d
rules-ui                                          ClusterIP      10.237.48.49    <none>         8080/TCP                              36d
seldon-webhook-service                            ClusterIP      10.237.53.165   <none>         443/TCP                               36d
templating                                        ClusterIP      10.237.54.124   <none>         5250/TCP                              36d
webapps                                           ClusterIP      10.237.61.72    <none>         8780/TCP                              36d

And a list of endpoints

NAME                                              ENDPOINTS                                                          AGE
admin                                             10.234.1.47:8765                                                   36d
admin-ui                                          10.234.1.55:8080                                                   36d
auth-ui                                           10.234.1.40:8080                                                   36d
connector-plugin-service                          <none>                                                             36d
connectors                                        10.234.0.132:9010                                                  36d
connectors-classic                                                                                                   36d
connectors-rpc                                    10.234.1.29:8771                                                   36d
devops-ui                                         10.234.0.146:8080                                                  36d
fusion-ambassador                                 10.234.0.151:8443,10.234.0.151:8080                                36d
fusion-argo-ui                                    10.234.1.32:2746                                                   36d
fusion-monitoring-grafana                         10.234.0.233:3000                                                  36d
fusion-monitoring-prometheus-kube-state-metrics   10.234.1.42:8081,10.234.1.42:8080                                  36d
fusion-monitoring-prometheus-pushgateway          10.234.0.139:9091                                                  36d
fusion-monitoring-prometheus-server               10.234.0.145:9090                                                  36d
fusion-monitoring-prometheus-server-headless      10.234.0.145:9090                                                  36d
fusion-mysql                                      10.234.1.53:3306                                                   36d
fusion-pulsar-bookkeeper                          10.234.0.140:8000,10.234.0.246:8000,10.234.1.49:8000 + 3 more...   36d
fusion-pulsar-broker                              10.234.0.141:6650,10.234.1.57:6650,10.234.0.141:8080 + 1 more...   36d
fusion-solr-exporter                              10.234.1.44:9983                                                   36d
fusion-solr-headless                              10.234.0.138:8983                                                  36d
fusion-solr-svc                                   10.234.0.138:8983                                                  36d
fusion-zookeeper                                  10.234.0.157:2181,10.234.0.241:2181,10.234.1.43:2181 + 3 more...   36d
fusion-zookeeper-headless                         10.234.0.157:2888,10.234.0.241:2888,10.234.1.43:2888 + 9 more...   36d
indexing                                          10.234.0.149:8765                                                  36d
insights                                          10.234.0.131:8080                                                  36d
job-launcher                                      10.234.0.235:8083                                                  36d
job-rest-server                                   10.234.0.231:8081                                                  36d
milvus                                            10.234.0.227:19530,10.234.0.227:19121                              36d
ml-model-grpc                                     10.234.0.249:6565                                                  36d
ml-model-service                                  10.234.0.249:8086                                                  36d
pm-ui                                             10.234.0.236:8080                                                  36d
proxy                                             10.234.0.230:6764                                                  36d
pulsar-broker                                     10.234.0.141:6650,10.234.1.57:6650,10.234.0.141:8080 + 1 more...   36d
query                                             10.234.1.45:8787                                                   36d
rules-ui                                          10.234.0.234:8080                                                  36d
seldon-webhook-service                            10.234.1.51:443                                                    36d
templating                                        10.234.0.133:5250                                                  36d
webapps                                           10.234.0.251:8780                                                  36d

And the description of the fusion-zookeeper-headless endpoint

Name:         fusion-zookeeper-headless
Namespace:    fusion
Labels:       app=zookeeper
              app.kubernetes.io/managed-by=Helm
              chart=zookeeper-2.4.2
              heritage=Helm
              release=fusion
              service.kubernetes.io/headless=
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-03-02T14:16:05Z
Subsets:
  Addresses:          10.234.0.157,10.234.0.241,10.234.1.43
  NotReadyAddresses:  <none>
  Ports:
    Name       Port  Protocol
    ----       ----  --------
    server     2888  TCP
    client     2181  TCP
    tlsclient  2281  TCP
    election   3888  TCP

Events:  <none>

And the description of the fusion-zookeeper-headless service

Name:              fusion-zookeeper-headless
Namespace:         fusion
Labels:            app=zookeeper
                   app.kubernetes.io/managed-by=Helm
                   chart=zookeeper-2.4.2
                   heritage=Helm
                   release=fusion
Annotations:       meta.helm.sh/release-name: fusion
                   meta.helm.sh/release-namespace: fusion
Selector:          app=zookeeper,release=fusion
Type:              ClusterIP
IP:                None
Port:              client  2181/TCP
TargetPort:        client/TCP
Endpoints:         10.234.0.157:2181,10.234.0.241:2181,10.234.1.43:2181
Port:              election  3888/TCP
TargetPort:        election/TCP
Endpoints:         10.234.0.157:3888,10.234.0.241:3888,10.234.1.43:3888
Port:              server  2888/TCP
TargetPort:        server/TCP
Endpoints:         10.234.0.157:2888,10.234.0.241:2888,10.234.1.43:2888
Port:              tlsclient  2281/TCP
TargetPort:        tlsclient/TCP
Endpoints:         10.234.0.157:2281,10.234.0.241:2281,10.234.1.43:2281
Session Affinity:  None
Events:            <none>

Can somebody please advise me on what is wrong, and why the init container of "fusion-classic-rest-service-0" tries to reach fusion-zookeeper-headless at such a strange IP (10.237.48.10, from the log above), which differs from the IPs defined in the fusion-zookeeper-headless service?
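To compare the address from the log with the addresses listed above, here is a minimal sketch using Python's `ipaddress` module. The `common_network` helper is hypothetical, and the networks it computes are only the smallest ranges covering a few sampled addresses from this post, not the cluster's configured service or pod CIDRs.

```python
import ipaddress

# Address the init container tried to query on port 53 (from the log).
resolver = ipaddress.ip_address("10.237.48.10")

# A few ClusterIPs from the service list (fusion-zookeeper, rules-ui, ml-model-grpc).
cluster_ips = [ipaddress.ip_address(a)
               for a in ("10.237.55.91", "10.237.48.49", "10.237.63.47")]

# ZooKeeper pod IPs from the fusion-zookeeper-headless endpoints.
zk_pod_ips = [ipaddress.ip_address(a)
              for a in ("10.234.0.157", "10.234.0.241", "10.234.1.43")]


def common_network(addrs):
    """Return the smallest single network that contains every address."""
    net = ipaddress.ip_network(f"{addrs[0]}/32")
    while not all(a in net for a in addrs):
        net = net.supernet()  # widen the prefix by one bit
    return net


service_net = common_network(cluster_ips)
pod_net = common_network(zk_pod_ips)

print("sampled service range:", service_net, "contains resolver:", resolver in service_net)
print("sampled ZooKeeper pod range:", pod_net, "contains resolver:", resolver in pod_net)
```

Under these assumptions the address falls inside the range used by the ClusterIPs and outside the ZooKeeper pod range. That fits it being a service address, possibly the cluster DNS service, although nothing in this post confirms which service owns it.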
