Fix delayed message null activity #7264

Open
wants to merge 5 commits into
base: master

jpalac
Contributor

@jpalac jpalac commented Jan 20, 2025

Addresses #7209

@jpalac jpalac added the Bug label Jan 20, 2025
@jpalac jpalac self-assigned this Jan 20, 2025
@jpalac
Contributor Author

jpalac commented Jan 21, 2025

There is a sample provided to test this scenario: https://github.com/jonesr-out/nservicebus-otel-bug

It seems that when StartActivity runs with default values, there is no active listener, so the call returns a null activity.

This PR passes the traceFlags of the spanContext into the new root activity context, which seems to solve the issue reported in the sample.
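
For readers following along, the null result can be reproduced with System.Diagnostics alone. This is a minimal sketch, not the NServiceBus code, and the source name Sketch.NullActivity is made up for illustration. It shows two situations in which StartActivity returns null: no listener is registered for the source, or a listener is registered but its sampling callback returns None.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name, used only for this sketch.
var source = new ActivitySource("Sketch.NullActivity");

// Case 1: nothing listens to the source, so no activity is created.
var withoutListener = source.StartActivity("receive");
Console.WriteLine(withoutListener is null); // prints True

// Case 2: a listener exists, but its sampling callback samples everything out.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Sketch.NullActivity",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.None
};
ActivitySource.AddActivityListener(listener);

var sampledOut = source.StartActivity("receive");
Console.WriteLine(sampledOut is null); // prints True
```

Either case would look the same from the calling code, so a null activity on its own does not say whether a listener is missing or a sampler dropped the activity.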

I have created a test to try to reproduce the issue, but I was unable to get the StartActivity code to return null the way it does in the demo.
I believe that may be due to how our tests set up the activity listener.
I'm not sure whether there is another way to set this up for testing.

@lailabougria @SzymonPobiega can you see anything wrong with the proposed solution here? Any possible side effects?

@lailabougria
Contributor

lailabougria commented Jan 27, 2025

The trace flags represent whether the original span was recorded or not (i.e., whether it was sampled in or out). When the user instructs us to start a new trace, I don't think we should respect the previous decision: this is a completely new trace, so a separate sampling decision should be made. By passing the trace flags from the original send span, we're now accepting that previous decision when I believe we shouldn't.
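
To illustrate the point about trace flags, here is a small sketch of a listener whose sampling callback follows the parent's Recorded flag. It is a hand-written imitation of parent-based sampling, not the OpenTelemetry SDK sampler, and the source name and the ContextWith helper are hypothetical. With such a callback, whatever flags are placed on the parent context decide whether the new activity exists at all.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name, used only for this sketch.
var source = new ActivitySource("Sketch.TraceFlags");

using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Sketch.TraceFlags",
    // Assumption: a simplified parent-based rule.
    // No parent -> sample in; otherwise follow the parent's Recorded flag.
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
    {
        if (options.Parent == default)
        {
            return ActivitySamplingResult.AllDataAndRecorded;
        }

        return options.Parent.TraceFlags.HasFlag(ActivityTraceFlags.Recorded)
            ? ActivitySamplingResult.AllDataAndRecorded
            : ActivitySamplingResult.None;
    }
};
ActivitySource.AddActivityListener(listener);

// Hypothetical helper: builds a remote parent context with the given flags.
ActivityContext ContextWith(ActivityTraceFlags flags) =>
    new(ActivityTraceId.CreateRandom(), ActivitySpanId.CreateRandom(), flags, isRemote: true);

var notRecorded = source.StartActivity("receive", ActivityKind.Consumer, ContextWith(ActivityTraceFlags.None));
var recorded = source.StartActivity("receive", ActivityKind.Consumer, ContextWith(ActivityTraceFlags.Recorded));

Console.WriteLine(notRecorded is null); // prints True: the parent's decision was inherited
Console.WriteLine(recorded is null);    // prints False
recorded?.Stop();
```

Under this rule, copying the sender's flags into a new root context means the sender's sampling decision carries over, which is the behaviour questioned above.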

However, what I find odd is that the previous code ended up with a null activity. Did that code contain any sampling settings? Did it have OpenTelemetry enabled?

If a call would be more helpful, I'm happy to jump on one.

@jpalac
Contributor Author

jpalac commented Jan 30, 2025

@lailabougria - further to our discussion the other day, I can get the activity to not be null by either:

  • changing the CreateNewRootActivityContext method to static ActivityContext CreateNewRootActivityContext() => new(default, default, default, default); so that it uses a default trace ID instead of creating one.

OR

  • setting Activity.Current to null before calling activity = ActivitySources.Main.CreateActivity(name: ActivityNames.IncomingMessageActivityName, ActivityKind.Consumer, remoteParentActivityContext); so that, as in the example we were looking at, default is used for the parent context instead of trying to create one as above.

The second method is similar to what has been suggested in that original post and verified as "safe to use" - this is the code they were referring to: https://github.com/mu88/Repro_OpenTelemetry_Baggage/blob/main/src/Web/ActivityExtensions.cs#L13-L29
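
A rough sketch of the second approach, using only System.Diagnostics rather than the NServiceBus types (the source name Sketch.NewRoot and the activity names are made up). It uses StartActivity instead of CreateActivity so that the activity is started while Activity.Current is still null. It contrasts passing a default parent context with and without clearing Activity.Current first.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name, used only for this sketch.
var source = new ActivitySource("Sketch.NewRoot");

using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Sketch.NewRoot",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllDataAndRecorded
};
ActivitySource.AddActivityListener(listener);

// Stand-in for whatever activity is ambient when the message is processed.
using var ambient = source.StartActivity("ambient");

// Without clearing Activity.Current, a default parent context falls back to the ambient activity.
var child = source.StartActivity("receive", ActivityKind.Consumer, parentContext: default);
var childSharesTrace = child!.TraceId.Equals(ambient!.TraceId);
child.Stop(); // Activity.Current goes back to the ambient activity

// Clearing Activity.Current first makes the same call start a new root.
var previous = Activity.Current;
Activity.Current = null;
var root = source.StartActivity("receive", ActivityKind.Consumer, parentContext: default);
var rootSharesTrace = root!.TraceId.Equals(ambient.TraceId);
var rootHasParent = root.Parent is not null;
root.Stop();
Activity.Current = previous; // restore the ambient activity by hand

Console.WriteLine(childSharesTrace); // prints True
Console.WriteLine(rootSharesTrace);  // prints False
Console.WriteLine(rootHasParent);    // prints False
```

Note that stopping the root activity does not bring the earlier ambient activity back, which is why the sketch restores Activity.Current explicitly.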

I've updated the PR to use the second method - what do you think? Any undesired side effects you can think of?

@lailabougria
Contributor

@jpalac If we can validate that:

  • The sampling decision is made again and is independent of the parent context
    • This could be double-checked by splitting the sample into two endpoints, where the receiving endpoint uses the AlwaysOnSampler and the sending endpoint uses a sampler that samples the trace out
  • The new activity is part of a new trace (different trace id than the one passed in the incoming message)
  • And the new activity links back to the original trace

Then I believe we're good!
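
The three checks above could be approximated in-process along these lines. This is only a sketch under assumptions: it uses a plain ActivityListener rather than OpenTelemetry samplers, the source name Sketch.Validation is made up, and the incoming context is a stand-in for whatever the message headers carry. It does not replace the two-endpoint check suggested above.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical source name, used only for this sketch.
var source = new ActivitySource("Sketch.Validation");
ActivityContext parentSeenBySampler = default;
var samplerCalls = 0;

using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Sketch.Validation",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
    {
        // Record what the sampling callback was given, then sample in.
        samplerCalls++;
        parentSeenBySampler = options.Parent;
        return ActivitySamplingResult.AllDataAndRecorded;
    }
};
ActivitySource.AddActivityListener(listener);

// Stand-in for the context of the original send span, sampled out by the sender.
var incoming = new ActivityContext(
    ActivityTraceId.CreateRandom(), ActivitySpanId.CreateRandom(), ActivityTraceFlags.None, isRemote: true);

var previous = Activity.Current;
Activity.Current = null;
var activity = source.StartActivity(
    "receive", ActivityKind.Consumer, parentContext: default, links: new[] { new ActivityLink(incoming) });

// 1. The sampling decision was made again, without seeing the incoming context.
var samplingIndependent = samplerCalls == 1 && parentSeenBySampler == default && activity!.Recorded;
// 2. The new activity belongs to a different trace than the incoming message.
var newTrace = !activity!.TraceId.Equals(incoming.TraceId);
// 3. The new activity links back to the original trace.
var linksBack = activity.Links.Any(l => l.Context.TraceId.Equals(incoming.TraceId));

activity.Stop();
Activity.Current = previous;

Console.WriteLine($"{samplingIndependent} {newTrace} {linksBack}");
```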
