
Initial version of daemon mode #103

Merged (17 commits) on Mar 28, 2021

Conversation

@breml (Collaborator) commented Feb 16, 2021

@magnusbaeck I have reached a state where I felt I should create a first PR for my work on the daemon mode in order to start a discussion about the code and package structure. I have also added some unit tests as well as some integration tests (some packages already have pretty decent coverage).

Be aware that this PR includes a lot of changes, roughly 5000 new lines of code.

There are still a lot of open topics (some listed as TODO: or FIXME: comments, some in my personal notes).

Some specific open issues:

  • I decided to put all the sub-commands related to the daemon mode below daemon, e.g. logstash-filter-verifier daemon start. To run/execute the test cases in daemon mode, the command is named logstash-filter-verifier daemon run (see the sketch after this list).
  • Most of the new Go packages that contain the "business" logic of the daemon are located in internal/daemon in order to clearly separate them from the existing standalone code base.
  • I am not sure how to add the dependency check for protobuf-compiler to the Makefile. The protobuf-compiler is required for go generate to work in order to generate the protobuf/gRPC related code.
  • Early on, I created some state and sequence diagrams to visualize how the daemon mode is supposed to work. There have been quite a few changes since then, so the diagrams are outdated. I might add them later, once things have stabilized further.
  • The README.md is currently completely untouched (the logstash-filter-verifier standalone part is missing as well).
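
To make the intended command layout concrete, here is a minimal sketch of the sub-command tree, assuming a cobra-based CLI; the wiring is purely illustrative and the handler bodies are placeholders, not the actual implementation:

package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	rootCmd := &cobra.Command{Use: "logstash-filter-verifier"}

	daemonCmd := &cobra.Command{Use: "daemon", Short: "Control the daemon mode"}

	startCmd := &cobra.Command{
		Use:   "start",
		Short: "Start the daemon in the foreground",
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("daemon started") // placeholder for the real daemon startup
			return nil
		},
	}

	runCmd := &cobra.Command{
		Use:   "run",
		Short: "Run test cases against a running daemon",
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("test cases executed") // placeholder for the real test runner
			return nil
		},
	}

	// Results in: logstash-filter-verifier daemon start / logstash-filter-verifier daemon run
	daemonCmd.AddCommand(startCmd, runCmd)
	rootCmd.AddCommand(daemonCmd)

	if err := rootCmd.Execute(); err != nil {
		os.Exit(1)
	}
}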

I am looking forward to your feedback.

Related issues: #93, #94, #96

@magnusbaeck (Owner) commented

Thanks! I've looked at roughly half the PR and it looks really great. Obviously, with this amount of code I won't be able to scrutinize all the details and find obscure goroutine races etc.

> I decided to put all the sub-commands related to the daemon mode below daemon, e.g. logstash-filter-verifier daemon start. To run/execute the test cases in daemon mode, the command is named logstash-filter-verifier daemon run.

Okay. Are you thinking standalone run for symmetry or just standalone, since there probably won't be any other standalone-related verbs?

> Most of the new Go packages that contain the "business" logic of the daemon are located in internal/daemon in order to clearly separate them from the existing standalone code base.

That sounds like a good location.

> I am not sure how to add the dependency check for protobuf-compiler to the Makefile. The protobuf-compiler is required for go generate to work in order to generate the protobuf/gRPC related code.

Added an inline comment about this.

> Early on, I created some state and sequence diagrams to visualize how the daemon mode is supposed to work. There have been quite a few changes since then, so the diagrams are outdated. I might add them later, once things have stabilized further.

That's fine. It'd be good to have them once the dust settles but we don't have to keep them up to date during development.

> The README.md is currently completely untouched (the logstash-filter-verifier standalone part is missing as well).

Okay!

@breml (Collaborator, Author) commented Feb 22, 2021

> Thanks! I've looked at roughly half the PR and it looks really great. Obviously, with this amount of code I won't be able to scrutinize all the details and find obscure goroutine races etc.

Thanks for your review, it is very much appreciated. I agree, there are quite a few goroutines, channels, wait groups, etc. involved, and finding races just by reading the code is very hard. I have some ideas to make this part easier to understand and maintain; I will give it a try.

> I decided to put all the sub-commands related to the daemon mode below daemon, e.g. logstash-filter-verifier daemon start. To run/execute the test cases in daemon mode, the command is named logstash-filter-verifier daemon run.

> Okay. Are you thinking standalone run for symmetry or just standalone, since there probably won't be any other standalone-related verbs?

Personally, I would stick to standalone without an additional run if you want to keep the rest of the CLI API as close to the existing one as possible. It would be a different story if we refactored the existing CLI API as well, for example by introducing a sub-command for the socket mode.

@breml (Collaborator, Author) commented Mar 12, 2021

I fixed some more bugs and added functionality to support keep-envs in daemon mode as well. With the fixes done today, I was able to run our main test suite in daemon mode (> 50 test cases, Logstash config with > 1400 lines). This is a nice milestone.

I have the latest changes in a separate branch, forked from the branch this PR is created from: https://github.com/breml/logstash-filter-verifier/tree/add-daemon-mode-continue

@magnusbaeck Do you prefer that I update this PR with my latest changes, or would you prefer a separate PR?

I will now try to compile a list of the missing features from our initial issues to figure out where to continue.

@magnusbaeck (Owner) commented

> I fixed some more bugs and added functionality to support keep-envs in daemon mode as well. With the fixes done today, I was able to run our main test suite in daemon mode (> 50 test cases, Logstash config with > 1400 lines). This is a nice milestone.

Very cool! Our Logstash configuration is similar in size. I'll try it with your latest code.

> Do you prefer that I update this PR with my latest changes, or would you prefer a separate PR?

Since GitHub doesn't support dependencies between PRs, I'm not sure there's a point in creating a separate one. I'll do my best to look at the rest of this PR in the next few days.

@breml (Collaborator, Author) commented Mar 15, 2021

@magnusbaeck Just some quick remarks regarding my test:

  • I used the most recent code as of then, which (if I recall correctly) was the following commit: breml@69c9c74
  • In the test suite mentioned above, we do not use the codec setting. This setting is currently ignored by version 2.0 (the idea is to inherit the codec from the input that is replaced, so there will be a replacement).
  • I only passed the filter part of our configuration to LFV 2.0, because this is the way it works with v1.0 anyway.
  • It only works if every plugin in the Logstash config has an ID (I had to add one in some cases where it was missing).
  • I had to add exportMetadata: true to every test case suite file. This can be done with the following snippet:
for f in *.json ; do
    jq '. += {"exportMetadata": true}' "$f" > "$f.migrated" && mv "$f.migrated" "$f"
done

I hope these remarks help if it does not work out of the box with your Logstash config.

@breml (Collaborator, Author) commented Mar 15, 2021

One of the above remarks was not completely correct: I only used the filter part of the configuration, but I added the following dummy input/output:

input {
  stdin {
    id => "stdin"
  }
}

output {
  stdout {
    id => "stdout"
  }
}

@magnusbaeck (Owner) left a review comment

The code looks really great! I just had a few comments and questions.

I tried running it locally (from commit 2b78b2b), but I'm unsure what to expect from daemon start. Is that command supposed to block? I can dig out the Logstash log from the tempdir and Logstash starts up fine, but the LFV command hangs. Backgrounding the process and running daemon run eventually results in a panic:

waitForState: ready
state change: "ready" -> "setting_up_test" by "setup-test"
waitForState: ready_for_test
state change: "unknown" because of waitForState timeout
waitForState: ready_for_test
shutdown log reader
broadcast shutdown for waitForState
state change: "created" -> "started" by "start"
panic: instance not found in assigned controllers

goroutine 67 [running]:
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/daemon/pool.(*Pool).Return(0xc0000b3980, 0xbb3d40, 0xc00018f110, 0x0)
	/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/daemon/pool/pool.go:156 +0x2c6
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/daemon/session.(*Controller).DestroyByID(0xc0000b39e0, 0xc00042ac90, 0x8, 0x0, 0x0)
	/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/daemon/session/controller.go:103 +0x178
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/app/daemon.(*Daemon).ExecuteTest.func1(0xc00047db68, 0xc000150b40, 0xc000473570)
	/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/app/daemon/daemon.go:345 +0x5b
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/app/daemon.(*Daemon).ExecuteTest(0xc000150b40, 0xbb1420, 0xc000514ae0, 0xc000473570, 0x0, 0xba53a0, 0xc0003a00e0)
	/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/app/daemon/daemon.go:357 +0x395
github.com/magnusbaeck/logstash-filter-verifier/v2/internal/daemon/api/grpc._Control_ExecuteTest_Handler(0xa9c2e0, 0xc000150b40, 0xbb1420, 0xc000514ae0, 0xc00046e840, 0x0, 0xbb1420, 0xc000514ae0, 0xc00042c6e0, 0x4f)
	/home/magnus/src/logstash-filter-verifier/src/github.com/magnusbaeck/logstash-filter-verifier/internal/daemon/api/grpc/api_grpc.pb.go:153 +0x214
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00011b500, 0xbb6800, 0xc000001800, 0xc0004f3a00, 0xc00019b080, 0xf31010, 0x0, 0x0, 0x0)
	/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:1210 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc00011b500, 0xbb6800, 0xc000001800, 0xc0004f3a00, 0x0)
	/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:1533 +0xd05
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc00002a200, 0xc00011b500, 0xbb6800, 0xc000001800, 0xc0004f3a00)
	/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:871 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/home/magnus/go/pkg/mod/google.golang.org/[email protected]/server.go:869 +0x1fd
logstash-filter-verifier: error: rpc error: code = Unavailable desc = transport is closing

I'm not sure what to expect so I haven't debugged this any further.

@@ -0,0 +1,626 @@
// Code generated by protoc-gen-go. DO NOT EDIT.
@magnusbaeck (Owner) commented

Should we really commit these autogenerated files?

@breml (Collaborator, Author) commented

Personally, I prefer to commit these autogenerated files, because it allows others to compile the code without having to install all the necessary tools (if the repo were a library, we would not even discuss this topic, because not committing these files would prevent go get from working). It also makes the build pipeline on Travis (or GitHub Actions) simpler, because otherwise you have to make all these tools available in the build pipeline as well in order to build and execute the tests.

I think it does not do any harm to commit them.
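
For reference, gRPC/protobuf code like this is typically produced by a go:generate directive along these lines; the file name and flags shown here are illustrative and may differ from what this repo actually uses:

// In the package containing api.proto (illustrative):
//go:generate protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative api.proto

Running go generate ./... therefore needs protoc plus the protoc-gen-go and protoc-gen-go-grpc plugins on the PATH, which is exactly the dependency that committing the generated files spares contributors and CI from installing.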

Comment on lines +26 to +29
ID string `yaml:"pipeline.id"`
Config string `yaml:"path.config"`
Ordered string `yaml:"pipeline.ordered"`
Workers int `yaml:"pipeline.workers"`
@magnusbaeck (Owner) commented

Is the dot notation a YAML feature or something Elastic has made up, i.e. will the yaml package understand that

pipeline.ordered: true
pipeline.workers: 3

and

pipeline:
  ordered: true
  workers: 3

are equivalent to Logstash? (Because it is, right? Or am I conflating this with Elasticsearch index settings?)

@breml (Collaborator, Author) commented

I was not able to find a reference saying that these "flat keys" are a YAML feature. But the Logstash documentation (https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) mentions that both nested and flat keys are supported.

From a Go code point of view, this allows me to keep the struct flat and omit the otherwise necessary nesting.
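
As a small illustration (not the actual project code), gopkg.in/yaml.v2 matches a tag like yaml:"pipeline.id" against the literal flat key, while the nested form is a different structure and is not picked up without extra handling:

package main

import (
	"fmt"

	"gopkg.in/yaml.v2"
)

// pipeline mirrors the flat Logstash settings keys directly in the tags.
type pipeline struct {
	ID      string `yaml:"pipeline.id"`
	Workers int    `yaml:"pipeline.workers"`
}

func main() {
	flat := []byte("pipeline.id: main\npipeline.workers: 3\n")
	var p pipeline
	if err := yaml.Unmarshal(flat, &p); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p) // {ID:main Workers:3}

	// The nested form does not populate these tags without post-processing.
	nested := []byte("pipeline:\n  id: other\n  workers: 1\n")
	var q pipeline
	if err := yaml.Unmarshal(nested, &q); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", q) // {ID: Workers:0}
}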

@magnusbaeck (Owner) commented

Yeah, internally we should use the flat keys for convenience but we should support users writing files with nested keys. Since Beats supports both forms, maybe we can grab code from there? Or some other project. We wouldn't be the first ones wanting to flatten Go maps.

@breml (Collaborator, Author) commented

This is interesting. I am not sure how to do this, because, as mentioned before, this is not a YAML feature but special handling by Logstash. What we might need to do is mirror both possibilities in the pipeline struct and then post-process the struct so that the value is copied over. This raises the question of what happens if a user uses both forms (and does not set the same value), like this:

- pipeline.ordered: true
  pipeline:
    ordered: false

Which value should have precedence in this case?
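
Purely as a sketch of the post-processing idea above (the names are hypothetical, and letting the flat key win over the nested one is only an assumption for illustration, not a decision):

package main

import (
	"fmt"

	"gopkg.in/yaml.v2"
)

// pipelineSettings mirrors both the flat and the nested form of pipeline.ordered.
type pipelineSettings struct {
	OrderedFlat *bool `yaml:"pipeline.ordered"`
	Pipeline    struct {
		Ordered *bool `yaml:"ordered"`
	} `yaml:"pipeline"`
}

// effectiveOrdered merges the two forms; here the flat key takes precedence.
func (p pipelineSettings) effectiveOrdered() *bool {
	if p.OrderedFlat != nil {
		return p.OrderedFlat
	}
	return p.Pipeline.Ordered
}

func main() {
	doc := []byte("pipeline.ordered: true\npipeline:\n  ordered: false\n")
	var p pipelineSettings
	if err := yaml.Unmarshal(doc, &p); err != nil {
		panic(err)
	}
	fmt.Println(*p.effectiveOrdered()) // true
}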

@breml (Collaborator, Author) commented Mar 22, 2021

> The code looks really great! I just had a few comments and questions.

> I tried running it locally (from commit 2b78b2b), but I'm unsure what to expect from daemon start. Is that command supposed to block? I can dig out the Logstash log from the tempdir and Logstash starts up fine, but the LFV command hangs. Backgrounding the process and running daemon run eventually results in a panic:

> I'm not sure what to expect so I haven't debugged this any further.

Yes, daemon start starts the daemon in the foreground, so this command does sort of "block". You need to open a second window/terminal to execute daemon run ....

My logstash-filter-verifier.yml:

---

loglevel: debug
logstash:
  path: ./3rdparty/logstash-7.10.0/bin/logstash
keep-envs:
  - TESTMODE
metadata-key: __metadata

So currently I have only tested with Logstash 7.10.0. With other versions, there might be unknown issues.

After successful startup, you should see the following output (I have log level on debug in my config):

config: socket: /tmp/logstash-filter-verifier.sock
config: logstash-path: ./3rdparty/logstash-7.10.0/bin/logstash
Temporary directory for daemon created in "/tmp/lfv-407717077"
state change: "created" -> "started" by "start"
start stdout scanner
start stderr scanner
Daemon listening on /tmp/logstash-filter-verifier.sock
Waiting for /tmp/lfv-407717077/logstash-instance/FWm0J0Bl/logstash.log to appear...
stdout:  Using bundled JDK: /home/lubr/projects/mustache/logstash-filter-verifier/3rdparty/logstash-7.10.0/jdk
stderr:  OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
stderr:  WARNING: An illegal reflective access operation has occurred
stderr:  WARNING: Illegal reflective access by org.jruby.ext.openssl.SecurityHelper (file:/tmp/jruby-355527/jruby11114804341739218028jopenssl.jar) to field java.security.MessageDigest.provider
stderr:  WARNING: Please consider reporting this to the maintainers of org.jruby.ext.openssl.SecurityHelper
stderr:  WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
stderr:  WARNING: All illegal access operations will be denied in a future release
stdout:  Sending Logstash logs to /tmp/lfv-407717077/logstash-instance/FWm0J0Bl which is now configured via log4j2.properties
taillog: -> pipeline started: stdin
taillog: -> pipeline started: output
Ready to process tests
state change: "started" -> "ready" by "pipeline-ready"
taillog: -> pipeline running: [output stdin]
state change: "ready" -> "ready" by "pipeline-ready"

In particular, the state should change to ready. Then the daemon is ready to execute some tests.

I recommend starting with the examples I provide in the repo, e.g. logstash-filter-verifier daemon run -p testdata/basic_pipeline.yml --pipeline-base testdata/basic_pipeline --testcase-dir testdata/testcases/basic_pipeline.

If these tests are successful, you may move on to your own config.

If the daemon times out or even crashes, there might be a problem with loading the Logstash config. The Logstash log file might give you some hints.

@magnusbaeck (Owner) left a review comment

Thank you so much for your work on this PR! There are things to work on but the PR is big enough and the code seems to work under at least a narrow set of circumstances so I'll merge it now.

I was able to run your test pipeline without hiccups and after some debugging of my own files I realized that I'd missed your comment about adding the dummy stdin/stdout plugins (present in your basic_pipeline example). I haven't retried my own configuration with such plugins added but I presume it's going to work. Commenting out the stdin input in the basic_pipeline example causes it to exhibit the exact same problem.

What's the long-term plan here? Automatically add dummy inputs and outputs if none are present or detect the problem and alert the user?

@magnusbaeck magnusbaeck merged commit 6891566 into magnusbaeck:master Mar 28, 2021
@breml (Collaborator, Author) commented Mar 30, 2021

> What's the long-term plan here? Automatically add dummy inputs and outputs if none are present or detect the problem and alert the user?

@magnusbaeck That is a valid question. If I remember correctly, we had a similar question some weeks ago. Raising an error to the user if there are no inputs or outputs should be pretty straightforward to add. Automatically adding inputs and outputs only works if the user has a single pipeline (or multiple pipelines connected serially, which is then no different from a single pipeline). If this is not the case, it is not possible to figure out where to add the inputs and outputs.
Therefore I prefer to just warn the user if there are no inputs or outputs.
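
A minimal sketch of what such a check could look like (the function and its inputs are hypothetical; the plugin counts would come from the project's own config parser, not from this snippet):

package daemon

import "errors"

// validateInputsOutputs is a hypothetical guard that turns a missing
// input or output section into a clear error for the user.
func validateInputsOutputs(inputCount, outputCount int) error {
	if inputCount == 0 {
		return errors.New("no input plugins found; add e.g. a dummy stdin input")
	}
	if outputCount == 0 {
		return errors.New("no output plugins found; add e.g. a dummy stdout output")
	}
	return nil
}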

@breml (Collaborator, Author) commented Mar 30, 2021

> Raising an error to the user if there are no inputs or outputs should be pretty straightforward to add.

I added the check for inputs and outputs in breml@77de4a6.

I will try to finish my work on the codec support in the next few days and I will then create a new PR for you to review.

@breml breml deleted the add-daemon-mode branch March 30, 2021 19:39
@breml breml mentioned this pull request Apr 30, 2021