dbsmigration server monitoring errors #644
Comments
Yuyi, I doubt this is an issue with our monitoring tool. We apply the same tool/command to all DBS servers, and it seems to me that only the DBS migration server has errors in its log. If you log into your pod and call this command:
you'll see that it returns 200 OK, which is what the monitoring relies on. But the log is generated by the DBS CherryPy server, so I think the issue is with the DBS migration server rather than with the cmsweb-ping command. It may be an issue with the WMCore REST server, which generates the log entry. For that you need to check the WMCore/src/python/WMCore/WebTools/Root.py codebase.
Yuyi, the actual error seems to come from this line:
You can confirm this by running pylint over that piece of code, e.g.
I suggest that you ping @amaltaro to resolve this generic error in the WMCore code.
It looks like this issue only got exposed because this Go client does not provide a Content-Length header (or it's 0), thus falling into the broken try/except code. The issue seems to have been there for almost 2 years now, introduced in dmwm/WMCore#9197. Yuyi, can you please open a WMCore GH issue and clarify which branch this fix needs to be backported to? I assume you will need a new tag as well, right? Before we actually implement this fix, I'd suggest you test the fix Valentin suggests in one of the k8s pods (or in your VM), then restart DBSMigrate and see whether the problem goes away.
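To illustrate the failure mode described here, below is a minimal sketch of reading a Content-Length header defensively, so that a missing, zero, or malformed value never reaches a failing code path. This is not the WMCore code and not the fix proposed in the ticket: the helper name `content_length` is hypothetical, and `headers` is assumed to be a plain dict keyed by the exact header name.

```python
def content_length(headers):
    """Return the declared body size in bytes.

    Hypothetical helper, not part of WMCore. Falls back to 0 when the
    header is absent, empty, negative, or not an integer.
    """
    raw = headers.get("Content-Length")
    if raw is None:
        # Client sent no Content-Length at all (the case discussed above).
        return 0
    try:
        size = int(raw)
    except (TypeError, ValueError):
        # Malformed value: treat it as "no body" instead of raising.
        return 0
    return max(size, 0)
```

The point of the sketch is only that the fallback branch must itself be valid code; a handler that is reached rarely, as in this thread, can stay broken for a long time without anyone noticing.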
I opened ticket dmwm/WMCore#10207. The code can be fixed as simply as below:
@vkuznet
Yuyi, the error messages come from the DBS Python server, so they are not related to anything in monitoring. But to clarify the subject: the monitoring is applied via the livenessProbe declaration in the service yaml file. For instance, in dbs-migrate.yaml you'll find it here:
This tool simply makes an HTTP GET request to the provided URL. Since all WMCore services (including DBS) require authorization, cmsweb-ping takes the deployed hmac file and uses it to set up fake CMS HTTP headers (which are required by the authorization function in WMCore). That's it! We could just as easily use another tool, e.g. curl, to query your service. Since such a tool only makes an HTTP call, the call is processed by your service, and it is your service that writes the log entries. The error has nothing to do with monitoring. Since the issue is clearly in the Python code, you had better inspect WMCore and how it was deployed. There are many explanations I can come up with for why it shows up in one log and not in another, but all of them are Python related and have nothing to do with the HTTP requests we make for monitoring purposes.
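As a rough illustration of the kind of check described above, here is a minimal sketch of a probe that sends one HTTP GET and reports the status code. It is not cmsweb-ping: the function names `ping` and `is_alive` are hypothetical, and the `headers` argument is only a placeholder for whatever authorization headers the real tool derives from the hmac file, which are not shown in this thread.

```python
import urllib.error
import urllib.request


def ping(url, headers=None, timeout=5.0):
    """Send a single HTTP GET and return the response status code.

    `headers` is a placeholder for any authorization headers the
    service expects; none are assumed here.
    """
    request = urllib.request.Request(url, headers=headers or {}, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers still carry a status code worth reporting.
        return err.code
    # Connection failures (urllib.error.URLError) propagate to the caller.


def is_alive(url, headers=None):
    """Treat the service as healthy only when it answers 200."""
    return ping(url, headers) == 200
```

A probe like this only observes the status code. Anything written to the service log while handling the request comes from the service itself, which is why a 200 response and an error in the log can coexist.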
I saw a lot of errors in the k8s dbsmigration log files, like below:
In the VM log files, the entries look like this:
127.0.0.1 - - [01/Jan/2021:02:20:49] "GET / HTTP/1.1" 200 22 "" "ServerMonitor/2.0"
@vkuznet can you take a look at the monitoring?