-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nondeterministic results with multiple outputs causing unreliable tests #150
Comments
I am encountering the same issue but I wasn't really able to create a proper reproduction scenario, thanks @jgough for the amazing reproducing steps! And I'd like to add that it is not linked to the use of the ...
- expected:
- EventID: "4765"
input:
- >
{
"SourceUserName": null,
"SourceSid": null,
"TargetUserName": null,
"TargetDomainName": null,
"TargetSid": null,
"SubjectUserSid": null,
"SubjectUserName": null,
"SubjectDomainName": null,
"SubjectLogonId": null,
"PrivilegeList": null,
"SidList": null,
"log_name": "security",
"event_id": "4765"
}
- expected:
- timestamp: "toto"
message: "titi sudo"
log_category: "srv-unix"
input:
- >
{
"timestamp": "toto",
"message": "titi sudo",
"log_category": "srv-unix",
"drop_this_field": "gigi"
} and a filter containing multiple outputs, I get this LFV output: ☐ Comparing message 1 of 2 from splunk_output.test.yaml:
--- /tmp/880506628/splunk_output.test.yaml/1/expected 2021-10-13 14:32:13.643438839 +0200
+++ /tmp/880506628/splunk_output.test.yaml/1/actual 2021-10-13 14:32:13.643438839 +0200
@@ -1,3 +1,5 @@
{
- "EventID": "4765"
+ "log_category": "srv-unix",
+ "message": "titi sudo",
+ "timestamp": "toto"
}
☐ Comparing message 2 of 2 from splunk_output.test.yaml:
--- /tmp/880506628/splunk_output.test.yaml/2/expected 2021-10-13 14:32:13.643438839 +0200
+++ /tmp/880506628/splunk_output.test.yaml/2/actual 2021-10-13 14:32:13.643438839 +0200
@@ -1,5 +1,3 @@
{
- "log_category": "srv-unix",
- "message": "titi sudo",
- "timestamp": "toto"
+ "EventID": "4765"
}
Summary: ☐ All tests: 0/2
☐ splunk_output.test.yaml: 0/2
logstash-filter-verifier: error: failed test cases |
@nicolasreich Great to hear that I'm not the only one encountering this issue! Your issue sounds more like #89, but I think there are some major interlinked issues where multiple outputs affect the order. I tried to fix this in #90 but my patch had some major issues I discovered. I don't know the code well enough to propose a working solution to this. |
Just to clarify, is this issue the same as #153, because in all failing scenarios there has been a filter that modifies some of the |
I haven't encountered situations where it fails without |
I'm guessing that after a I believe #153 is to do when the __lfv fields are completely removed by a plugin (such as |
@jgough I think, this problem might be solved with #157 as well, because #157 ensures, that I will add your testcase to the integration tests. |
I'm still seeing this issue in trunk. I've modified the top of the Dockerfile to build LFV from trunk and you can use this to reproduce:
The output is the same, sometimes passing and sometimes failing:
|
Do I need to do anything to get this reopened or should I file this as another issue? |
From what I can tell, we already did everything possible to ensure a persistent order of the events, but this is still does not yet guarantee the correct order. The problem is, that both events (
I assume, this the above order would also be the order in regards to the necessary effort for the implementation (from low to higher). @jgough What do you think? |
Thanks for reopening. The issue here is the two output plugins. With one output plugin this issue does not occur so I'm guessing this is something deep in the internals of logstash with both outputs running in parallel. I can think of two solutions:
Can you think of logstash use-cases where the output order is significant with one single input and multiple outputs? In my mind, order of events outputted should not really matter. The order that events are outputted is just a side-effect of the logstash pipeline, and not something to be tested. What matters more is the content of the output. If two events are generated, and they match the two events expected (in any order) then that is a success. I don't really view Is that at all helpful? |
I agree, the two outputs are executed concurrently and those the order of the events is not guaranteed nor can it be controlled.
Agree.
Now that I think again about your last test case, I think I know the reason, why it is not deterministic. Can you please run the tests again with
Yes, thanks. |
I can verify that explicitly setting ids on the outputs does indeed fix this issue and the order becomes deterministic. Thanks! |
@jgough Do you agree, that in this case, this is more of a documentation issue than a bug in LFV. |
Not entirely sure here. Is this happening because IDs added from Easy fix would be to document it and remind people to always add IDs to outputs if they are using multiple. What do you feel is best here? |
@jgough Your last comment finally gave me the necessary hint to check again the code, where I discovered an other bug. It was not the intention, that the ID of the output plugins change randomly, but this was what actually happened. The reason is, that in https://github.com/magnusbaeck/logstash-filter-verifier/blob/master/internal/daemon/logstashconfig/file.go#L138 I generate a random ID, as a last resort, if an output does not have an ID. But one should never end up there, because with This should now be fixed with PR #173. |
Verified with your branch that the test above is now deterministic. Thanks for the explanation and the fix! Will this ordering also likely be stable under config changes? i.e. if someone adds another filter to their config (which may change the numbering for the ID counter), then might the ordering change then? Is it sorted alphabetically? If so then two outputs ( This could potentially result in someone having to swap the ordering in the entire test suite if the ordering might suddenly change. If this is the case is it worth zero padding these ids? |
So first of all, omitting the ID is not encouraged with LFV v2 and the flag |
Don't we want %04d for leading zeros?
|
Fixed it in #174 |
If multiple outputs are used in the pipeline config then outputs from Logstash can rearrange themselves. This means tests can either pass or fail randomly as the output from Logstash is nondeterministic.
For example, the following pipeline and test can pass on one execution of LFV, but then fail on the next:
Pipeline:
Test:
What happens here is that sometimes Logstash outputs original_event before cloned_event, and sometimes cloned_event before original_event
This may be related to #89, but that ticket is mostly about outputs being rearranged after config changes. This is simply about nondeterministic outputs without config changes.
Below is a Dockerfile that illustrates this issue. It sets up a pipeline with the config and test as above, then runs the same test 10 times.
Build and run with
Sample output below. Note that sometimes the tests fail and sometimes they pass.
The text was updated successfully, but these errors were encountered: