Multi pipeline and pipeline-to-pipeline communication support #93

Open · breml opened this issue Nov 27, 2020 · 12 comments
@breml (Collaborator) commented Nov 27, 2020

Logstash has supported running multiple pipelines since version 6.0. Running multiple pipelines allows users to isolate data flows and helps simplify complex configurations. At the moment there is no good way to test Logstash configurations with multiple pipelines in Logstash Filter Verifier (LFV), especially if pipeline-to-pipeline communication (available since Logstash version 6.3) is used.

The idea of this issue is to discuss how LFV could be extended so that Logstash configurations with multiple pipelines, as well as pipeline-to-pipeline communication, can be tested.

Adding support for pipeline-to-pipeline communication would also allow LFV to wrap more tightly around the Logstash configuration under test. The flow could look like this (a rough configuration sketch follows the list):

  • LFV input pipeline with the necessary input to send events.
  • Optional filters to mutate the events according to the test specification, e.g. add fields or a deterministic fake time (normally Logstash sets the current time in the @timestamp field; this is not suitable for test scenarios, so it would be beneficial if a deterministic fake time could be provided).
  • Replace the input of the Logstash configuration under test with a pipeline input.
  • Replace the output of the Logstash configuration under test with a pipeline output to the LFV output pipeline.
  • Send events via pipeline-to-pipeline communication to the Logstash configuration under test.
  • Post-process the events in the LFV output pipeline (e.g. sanitize the @metadata field, built in support for metadata fields #5).
  • Output the events in the LFV output pipeline to the destination where they are picked up by LFV.
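
To make this a bit more concrete, here is a rough sketch of how the wrapping could look. The pipeline ids, the file names, the stdin/stdout plugins at the outer ends and the [@metadata][lfv_fake_time] field are only placeholders for illustration, not a proposal for the final implementation:

# pipelines.yml (sketch)
- pipeline.id: lfv_input
  path.config: "lfv_input.conf"
- pipeline.id: sut
  path.config: "sut.conf"
- pipeline.id: lfv_output
  path.config: "lfv_output.conf"

# lfv_input.conf -- owned by LFV
input {
  stdin { }                          # placeholder for whatever input LFV uses to inject events
}
filter {
  # mutate the events according to the test specification, e.g. set a deterministic fake time
  date {
    match  => ["[@metadata][lfv_fake_time]", "ISO8601"]
    target => "@timestamp"
  }
}
output {
  pipeline { send_to => ["sut"] }
}

# sut.conf -- the configuration under test; only its input and output are replaced
input {
  pipeline { address => "sut" }
}
filter {
  # ... the filters under test stay untouched ...
}
output {
  pipeline { send_to => ["lfv_output"] }
}

# lfv_output.conf -- owned by LFV
input {
  pipeline { address => "lfv_output" }
}
filter {
  # post-process the events, e.g. sanitize the @metadata field (#5)
}
output {
  stdout { codec => json_lines }     # placeholder for the destination where LFV picks up the events
}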

The advantage of this would be that, without any more modifications to the Logstash configuration under test than today (replacing inputs and outputs), LFV would be able to adapt to changes in the test suite by changing only the filter part of the Logstash configuration (this matters because some of the input plugins prevent Logstash from reloading its configuration).
Additionally, the wrapping with pipeline-to-pipeline communication would allow LFV to send its own control messages through Logstash and receive them again on the other side. These could signal the start or the end of a test suite run and thereby make the whole process more reliable.
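
As a sketch of the control message idea (the tag name and the file paths are made up for illustration): the LFV input pipeline could inject a specially tagged event at the start or end of a test suite run, and the LFV output pipeline could then route such events separately from the actual test results:

# in lfv_output.conf (sketch)
output {
  if "lfv_control" in [tags] {
    # control event: tells LFV that e.g. a test suite run has started or ended
    file { path => "/tmp/lfv-control.log" codec => json_lines }
  } else {
    # regular event: an actual test result to be compared against the expectations
    file { path => "/tmp/lfv-results.log" codec => json_lines }
  }
}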

@magnusbaeck (Owner) commented

Interesting idea! So I guess there are three reasons for doing the pipeline wrapping:

  • When implementing daemon mode (Daemon mode - shorten the time for a test cycle (modify/execute tests) #94) we can prevent reloading of the non-filter plugins (assuming Logstash only tears down the pipeline whose configuration files were actually changed).
  • We can add LFV-specific filters that are guaranteed to hit the messages first or last.
  • We can pass control messages without patching the configuration under test to introduce skip-if-control-message conditionals.

I hope we can do this in a backwards-compatible manner.

@magnusbaeck added this to the 2.0 milestone on Nov 28, 2020
@breml (Collaborator, Author) commented Dec 4, 2020

Interesting idea! So I guess there are three reasons for doing the pipeline wrapping:

  • When implementing daemon mode (Daemon mode - shorten the time for a test cycle (modify/execute tests) #94) we can prevent reloading of the non-filter plugins (assuming Logstash only tears down the pipeline whose configuration files were actually changed).
  • We can add LFV-specific filters that are guaranteed to hit the messages first or last.
  • We can pass control messages without patching the configuration under test to introduce skip-if-control-message conditionals.

I hope we can do this in a backwards-compatible manner.

@magnusbaeck you got this exactly right.

About the backwards compatibility: I guess that as part of the cleanup, and with the introduction of this and the other features (daemon mode #94 and coverage #95), we will raise the bar in regard to the supported version of Logstash anyway. If we head for a version 2.0, the question would be what the minimal version of Logstash is that we need to support.
I would argue that users of older versions of Logstash need to update at some point anyway, and until then they are free to continue using LFV v1.0.

Already now we have some sections in the code base of LFV that take care of different versions of Logstash (e.g. at some point the possibility to pass filter configs to Logstash via flags was dropped and a workaround had to be found and implemented). Overall I guess both the readability and the maintainability of the code will benefit from dropping support for (very) old versions of Logstash.

@magnusbaeck (Owner) commented

You're probably right about the backwards compatibility. It's totally reasonable to only support, say, Logstash 6+ for LFV 2.x. If there are important bugfixes or small features that motivate backporting to 1.x, that could always be done on a separate branch. Or perhaps the 2.x work should live on a separate branch instead; I'll have to read up on how Go modules work in this regard. Another LFV 2.x change could be to restructure the packages so that not all of them (maybe none?) are public.

@breml (Collaborator, Author) commented Dec 14, 2020

If we decide to move most (or even all) of the packages into internal packages, and the only non-internal package left is the main package, we are in fact not really affected by the version implications of Go modules. Go modules only come into play when others use parts of LFV in their code. Are you aware of such cases? From what I see on https://pkg.go.dev/ and https://godoc.org/, this does not seem to be the case. So yes, I think moving the packages into internal is something we should do.

@magnusbaeck (Owner) commented

This is somewhat confusing. Go Modules: v2 and Beyond says

Starting with v2, the major version must appear at the end of the module path (declared in the module statement in the go.mod file).

and the go command documentation says

To preserve import compatibility, the go command requires that modules with major version v2 or later use a module path with that major version as the final element. For example, version v2.0.0 of example.com/m must instead use module path example.com/m/v2, ...

so it seems we have no choice but to introduce a v2 subdirectory and move almost everything there, i.e. we'd end up with something like this:

README.md
LICENSE
v2/
    go.mod
    go.sum
    cmd/
        lfvclient/
        lfvdaemon/
    internal/
        logstash/
        testcase/
        ...

But after reading https://therootcompany.com/blog/bump-go-package-to-v2/ I realize that it's only the module path, i.e. the module line in go.mod, that needs to end with /v2. That go.mod could still be placed in the repository root. Making such a change (i.e. not introducing a v2 subdirectory) would definitely break anyone who's been importing the packages, but as you say there are no obvious importers.
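
For example, assuming the module path stays at github.com/magnusbaeck/logstash-filter-verifier, the module line in the go.mod at the repository root would simply become

module github.com/magnusbaeck/logstash-filter-verifier/v2

and imports inside the repository would then use the /v2 prefix, e.g. github.com/magnusbaeck/logstash-filter-verifier/v2/internal/logstash (following the layout sketched above).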

@breml (Collaborator, Author) commented Jan 25, 2021

@magnusbaeck What is your conclusion on this topic? Do you still plan to move all the existing packages into internal package directories? If so, this will break all existing importers anyway.

@magnusbaeck (Owner) commented

Yes, I plan to make the changes outlined above, except that I'll skip the v2 directory and just update the module path. I haven't tested it.

I got stuck reworking the tests for the testcase package, and that took longer than expected. I'll pause that work and migrate the package structure instead. I should have time for that tomorrow night.

@jgough commented Sep 6, 2021

This looks excellent! I'm trying to get multiple pipelines to work with the beta, but I'm not seeing any logging that can be used for debugging here. Maybe this is a case for adding some more logging?

The first issue I think I had is that my pipelines.yml has

- pipeline.id: prod
  path.config: "pipeline/prod/"

which exited with:
logstash-filter-verifier: error: expect the Logstash config to have at least 1 input and 1 output, got 0 inputs and 0 outputs

Playing around a bit I think I need this format:

- pipeline.id: prod
  path.config: "pipeline/prod/*.conf"

This seems like a minor bug in LFV if it doesn't accept pipeline paths without wildcards.

I now get:
Summary: ☑ All tests: 0/0

So now it is not picking up any of my tests, and I'm still trying to debug this at the moment. More logging here would be extremely useful! I'm going to try simplifying my test cases and see where I get with this.

@jgough commented Sep 6, 2021

Looks like if there is a dash in the pipeline.id then it does not work:

If I set the pipeline.id as follows then it works fine:

- pipeline.id: testprod
  path.config: "pipeline/**/*.conf"

But if I change this to

- pipeline.id: test-prod
  path.config: "pipeline/**/*.conf"

Then the daemon run command returns:

☐ Compare actual event with expected event from main.json:
Expected 2 event(s), got 0 instead.
logstash-filter-verifier: error: rpc error: code = Unknown desc = destroy of session failed: session teardown failed: state unknown, failed to wait for state: ready_for_test, <nil>

and the daemon start command logs the following:

failed to wait for Logstash results: timed out while waiting for state: ready_for_test

breml added a commit to breml/logstash-filter-verifier that referenced this issue Sep 6, 2021
breml added a commit to breml/logstash-filter-verifier that referenced this issue Sep 6, 2021
@breml (Collaborator, Author) commented Sep 6, 2021

Hi @jgough

Thanks for your feedback. I have fixed the issues "dash in pipeline.id" and "path.config pointing to dir in pipelines.yml" in #125.

In order to get more debug information, consider the following two options when you start the daemon (a combined example follows the list):

  • Add the flag --loglevel DEBUG to get more verbose debug logging from the daemon itself.

  • Add the flag --no-cleanup to keep the temporary directories created by LFV during the execution of Logstash. The directory is logged during startup of the daemon in a log line like this:

    Temporary directory for daemon created in "/tmp/lfv-203350830"

    In this directory you will find a folder logstash-instance with a subfolder for each Logstash instance started. In these folders you will find the file logstash.log with all the log output generated by Logstash.
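
A combined invocation could look something like this (just a sketch; the name of the temporary directory will be different for every run):

    logstash-filter-verifier daemon start --loglevel DEBUG --no-cleanup
    # afterwards, inspect the Logstash log of a particular instance, e.g.:
    less /tmp/lfv-<random>/logstash-instance/<instance>/logstash.log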

I hope this helps with your future debugging.

@jgough commented Sep 7, 2021

Thanks for the advice and for the fixes, that is extremely helpful. I am getting much closer to getting this working now.

Looks like dashes in plugin ids are also not working?
input { stdin { id => "prod-input" } } does not work

but changing the id to prod_input seems to work

breml added a commit that referenced this issue Sep 8, 2021
* Fix pipeline.id with dash
* Fix path.config pointing to dir in pipelines.yml

Solves issue mentioned in: #93 (comment)
@breml (Collaborator, Author) commented Sep 8, 2021

@jgough I just tested an input plugin with a dash (-) in the name and this worked for me with #125 applied. Can you confirm that this works for you as well?

Sorry, I executed the wrong test. It does not work; I will investigate the issue.

breml added a commit to breml/logstash-filter-verifier that referenced this issue Sep 8, 2021
@breml breml mentioned this issue Sep 8, 2021
breml added a commit to breml/logstash-filter-verifier that referenced this issue Sep 8, 2021