Initial version of daemon mode #103
Conversation
Thanks! I've looked at roughly half the PR and it looks really great. Obviously, with this amount of code I won't be able to scrutinize all the details and find obscure goroutine races etc.
Okay. Are you thinking
That sounds like a good location.
Added an inline comment about this.
That's fine. It'd be good to have them once the dust settles but we don't have to keep them up to date during development.
Okay!
Thanks for your review, this is very much appreciated. I agree, there are quite a few goroutines, channels, waitgroups, etc. involved, and finding races just by reading the code is very hard. I have some ideas to make this part easier to understand and maintain. I will give it a try.
Personally, I would stick to the
Force-pushed from 452b89f to b4209bf.
I fixed some more bugs and added functionality to support … I have the latest changes in a separate branch (forked from the branch this PR is created from): https://github.com/breml/logstash-filter-verifier/tree/add-daemon-mode-continue. @magnusbaeck Do you prefer that I update this PR with my latest changes, or do you prefer a separate PR? I will now try to compile a list of the missing features from our initial issues to figure out where to continue.
Very cool! Our Logstash configuration is similar in size. I'll try it with your latest code.
Since GitHub doesn't support dependencies between PRs I'm not sure there's a point in creating a separate PR. I'll do my best to look at the rest of this PR in the next few days.
@magnusbaeck Just some quick remarks regarding my test:
for f in *.json ; do
  jq '. += {"exportMetadata": true}' "$f" > "$f.migrated" && mv "$f.migrated" "$f"
done

I hope these remarks are of help if it does not work out of the box with your Logstash config.
One of the above comments was not completely correct: I only used the filter part of the configuration, but I added the following dummy input/output:
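The exact snippet is not reproduced here, but based on the dummy stdin/stdout plugins mentioned in the review below (and present in the basic_pipeline example), it presumably looked roughly like this:

input {
  # dummy input so the pipeline can start; not exercised by the tests
  stdin {}
}

output {
  # dummy output so Logstash accepts the pipeline configuration
  stdout {}
}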
The code looks really great! I just had a few comments and questions.
I tried running it locally (from commit 2b78b2b), but I'm unsure what to expect from daemon start. Is that command supposed to block? I can dig out the Logstash log from the tempdir and Logstash starts up fine, but the LFV command hangs. Backgrounding the process and running daemon run eventually results in a panic:
waitForState: ready
state change: "ready" -> "setting_up_test" by "setup-test"
waitForState: ready_for_test
state change: "unknown" because of waitForState timeout
waitForState: ready_for_test
shutdown log reader
broadcast shutdown for waitForState
state change: "created" -> "started" by "start"
panic: instance not found in assigned controllers
goroutine 67 [running]:
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/daemon/pool.(*Pool).Return(0xc0000b3980, 0xbb3d40, 0xc00018f110, 0x0)
/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/daemon/pool/pool.go:156 +0x2c6
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/daemon/session.(*Controller).DestroyByID(0xc0000b39e0, 0xc00042ac90, 0x8, 0x0, 0x0)
/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/daemon/session/controller.go:103 +0x178
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/app/daemon.(*Daemon).ExecuteTest.func1(0xc00047db68, 0xc000150b40, 0xc000473570)
/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/app/daemon/daemon.go:345 +0x5b
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/app/daemon.(*Daemon).ExecuteTest(0xc000150b40, 0xbb1420, 0xc000514ae0, 0xc000473570, 0x0, 0xba53a0, 0xc0003a00e0)
/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/app/daemon/daemon.go:357 +0x395
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/daemon/api/grpc._Control_ExecuteTest_Handler(0xa9c2e0, 0xc000150b40, 0xbb1420, 0xc000514ae0, 0xc00046e840, 0x0, 0xbb1420, 0xc000514ae0, 0xc00042c6e0, 0x4f)
/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/daemon/api/grpc/api_grpc.pb.go:153 +0x214
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00011b500, 0xbb6800, 0xc000001800, 0xc0004f3a00, 0xc00019b080, 0xf31010, 0x0, 0x0, 0x0)
/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:1210 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc00011b500, 0xbb6800, 0xc000001800, 0xc0004f3a00, 0x0)
/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:1533 +0xd05
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc00002a200, 0xc00011b500, 0xbb6800, 0xc000001800, 0xc0004f3a00)
/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:871 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:869 +0x1fd
logstash-filter-verifier: error: rpc error: code = Unavailable desc = transport is closing
I'm not sure what to expect so I haven't debugged this any further.
@@ -0,0 +1,626 @@
// Code generated by protoc-gen-go. DO NOT EDIT.
Should we really commit these autogenerated files?
Personally, I prefer to commit these autogenerated files, because it allows others to compile the code without needing to install all the necessary tools (if the repo were a library, we would not even discuss this topic, because not committing these files would prevent go get from working). It also makes the build pipeline on Travis (or GitHub Actions) simpler, because otherwise you have to make all these tools available in the build pipeline as well in order to build and execute the tests.
I think it does not do any harm to commit them.
ID      string `yaml:"pipeline.id"`
Config  string `yaml:"path.config"`
Ordered string `yaml:"pipeline.ordered"`
Workers int    `yaml:"pipeline.workers"`
Is the dot notation a YAML feature or something Elastic has made up, i.e. will the yaml package understand that
pipeline.ordered: true
pipeline.workers: 3
and
pipeline:
  ordered: true
  workers: 3
are equivalent to Logstash? (Because it is, right? Or am I conflating this with Elasticsearch index settings?)
I was not able to find a reference that these "flat keys" are a YAML feature. But the Logstash documentation (https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) mentions that both nested and flat keys are supported.
From a Go code point of view, this allows me to keep the struct flat and omit the otherwise necessary nesting.
Yeah, internally we should use the flat keys for convenience but we should support users writing files with nested keys. Since Beats supports both forms, maybe we can grab code from there? Or some other project. We wouldn't be the first ones wanting to flatten Go maps.
This is interesting. I am not sure how to do this, because as mentioned before, this is not a YAML feature but special handling by Logstash. So what we might need to do is mirror both possibilities in the pipeline struct and then post-process the struct, such that the value is copied over. This raises the question: what if a user uses both forms (and does not set the same value), like this:
- pipeline.ordered: true
  pipeline:
    ordered: false
Which value should have precedence in this case?
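One possible approach, just as a rough sketch (not code from this PR), assuming the settings have been unmarshalled into a map[string]interface{}: flatten the nested keys into dotted keys and let explicitly set flat keys win on conflicts.

package main

import "fmt"

// flatten recursively copies nested map values into out using dotted keys,
// e.g. {"pipeline": {"ordered": false}} becomes {"pipeline.ordered": false}.
func flatten(prefix string, in, out map[string]interface{}) {
	for k, v := range in {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		if nested, ok := v.(map[string]interface{}); ok {
			flatten(key, nested, out)
			continue
		}
		out[key] = v
	}
}

func main() {
	// Hypothetical result of unmarshalling the pipelines.yml entry above.
	settings := map[string]interface{}{
		"pipeline.ordered": true,
		"pipeline": map[string]interface{}{
			"ordered": false,
		},
	}

	flat := map[string]interface{}{}
	// First flatten the nested keys ...
	for k, v := range settings {
		if nested, ok := v.(map[string]interface{}); ok {
			flatten(k, nested, flat)
		}
	}
	// ... then copy the flat keys, so they take precedence on conflicts.
	for k, v := range settings {
		if _, ok := v.(map[string]interface{}); !ok {
			flat[k] = v
		}
	}

	fmt.Println(flat) // map[pipeline.ordered:true]
}

With the example above this yields map[pipeline.ordered:true], i.e. the flat key takes precedence; the opposite precedence would simply swap the two loops.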
Yes, My
So currently this is only tested with Logstash 7.10.0. With other versions, there might be unknown issues. After successful startup, you should see the following output (I have the log level set to debug in my config):
Especially the state should change to … I recommend you start with the examples I provide in the repo, e.g. basic_pipeline. If these tests are successful, you may move on to your own config. If the daemon times out or even crashes, then there might be a problem with the loading of the Logstash config. The Logstash log file might give you some hints.
Thank you so much for your work on this PR! There are things to work on but the PR is big enough and the code seems to work under at least a narrow set of circumstances so I'll merge it now.
I was able to run your test pipeline without hiccups and after some debugging of my own files I realized that I'd missed your comment about adding the dummy stdin/stdout plugins (present in your basic_pipeline example). I haven't retried my own configuration with such plugins added but I presume it's going to work. Commenting out the stdin input in the basic_pipeline example causes it to exhibit the exact same problem.
What's the long-term plan here? Automatically add dummy inputs and outputs if none are present or detect the problem and alert the user?
@magnusbaeck That is a valid question. If I remember correctly, we had a similar question some weeks ago. Raising an error to the user if there are no inputs or outputs should be a pretty straightforward thing to add. Automatically adding inputs and outputs only works if the user has a single pipeline (or multiple pipelines which are connected serially, which is then no different from a single pipeline). If this is not the case, it is not possible to figure out where to add the inputs and outputs.
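To illustrate the first option, a minimal sketch (with a made-up PipelineConfig type, not the actual check added later in 77de4a6) could look like this:

package daemon

import "errors"

// PipelineConfig is a hypothetical, simplified view of one parsed pipeline.
type PipelineConfig struct {
	Inputs  int // number of input plugins found in the pipeline
	Outputs int // number of output plugins found in the pipeline
}

// checkInputsOutputs returns an error if the combined configuration contains
// no inputs or no outputs, instead of letting the daemon wait forever for a
// pipeline that can never become ready.
func checkInputsOutputs(pipelines []PipelineConfig) error {
	var inputs, outputs int
	for _, p := range pipelines {
		inputs += p.Inputs
		outputs += p.Outputs
	}
	if inputs == 0 {
		return errors.New("no input plugins found in the given configuration")
	}
	if outputs == 0 {
		return errors.New("no output plugins found in the given configuration")
	}
	return nil
}

Such a check could run right after parsing the configuration, so the daemon fails fast with a clear error message instead of hanging in waitForState.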
I added the check for inputs and outputs in breml@77de4a6. I will try to finish my work on the codec support in the next few days and I will then create a new PR for you to review.
@magnusbaeck I reached a state where I felt I should create a first PR for my work on the daemon mode in order to start the discussion about the code and package structure. I also added some unit tests as well as some integration tests (some packages already have pretty decent coverage).
Be aware, this PR does include a lot of changes, ~5000 new lines of code.
There are still a lot of open topics (some listed as TODO: or FIXME: comments, some in my personal notes). Some specific open issues:
- The daemon related commands are grouped under daemon, e.g. logstash-filter-verifier daemon start. To run/execute the test cases in daemon mode, the command is named logstash-filter-verifier daemon run.
- The new packages are located in internal/daemon in order to clearly separate them from the existing standalone code base.
- protobuf-compiler has been added to the Makefile. The protobuf-compiler is required for go generate to work in order to generate the protobuf/grpc related code (a sketch of such a go:generate directive follows at the end of this description).
- README.md is currently completely untouched (also the logstash-filter-verifier standalone part is missing).
I am looking forward to your feedback.
Related issues: #93, #94, #96
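For reference, the go generate step for the protobuf/grpc code is typically driven by a directive like the one below; the file name, package name and options here are illustrative and may differ from what the PR actually uses.

// Illustrative example; placed in the package owning the generated code,
// e.g. internal/daemon/api/grpc. Requires protoc plus the protoc-gen-go and
// protoc-gen-go-grpc plugins to be available on the PATH.
//go:generate protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative api.proto
package grpc

Running go generate ./... then regenerates the *.pb.go files, which is why the protobuf compiler has to be available in the build pipeline unless the generated files are committed, as discussed above.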