Ensure "received invalid w3c traceparent" has resolved #665

Open
1 of 4 tasks
robrap opened this issue May 29, 2024 · 5 comments

Comments

@robrap
Contributor

robrap commented May 29, 2024

Our hypothesis is that the "received invalid w3c traceparent" log messages will stop once New Relic is no longer enabled. We think New Relic may be setting a trace header that Datadog misinterprets.

We need to do the following in order:

  • Wait until New Relic has been disabled.
  • Ensure "received invalid w3c traceparent" is not appearing in the Live Tail view of logs.
  • If still appearing, work with DD support to resolve.
  • Remove the temporary log filter from edx-logs once fixed.
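
To make the hypothesis above easier to check by hand, here is a minimal sketch of what a well-formed `traceparent` value looks like under the W3C Trace Context version 00 layout (`version-traceid-parentid-flags`, lowercase hex). The function name `is_valid_traceparent` is hypothetical, and this is not Datadog's actual validation logic; it only illustrates the header shape one might compare against when inspecting requests.

```python
import re

# Version 00 layout: 2 hex (version), 32 hex (trace-id),
# 16 hex (parent-id), 2 hex (flags), all lowercase, dash-separated.
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def is_valid_traceparent(value: str) -> bool:
    """Return True if value looks like a well-formed four-field traceparent.

    Hypothetical helper for illustration only; not taken from any tracer.
    """
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return False
    version, trace_id, parent_id, _flags = match.groups()
    # Version "ff" is reserved as invalid by the spec.
    if version == "ff":
        return False
    # All-zero trace-id or parent-id is explicitly invalid.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return False
    return True


if __name__ == "__main__":
    # Example value in the format shown in the W3C Trace Context spec.
    print(is_valid_traceparent(
        "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    ))
```

If a header captured from a failing request does not pass a check like this, that would be consistent with another agent writing a non-W3C value into the same header, though it would not prove which agent did so.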
@robrap robrap added this to Arch-BOM May 29, 2024
@robrap robrap converted this from a draft issue May 29, 2024
@robrap
Contributor Author

robrap commented Jun 21, 2024

Update: I raised the exclusion rule from 99% to 99.99%, because 99% was still letting too many logs through.

@robrap
Contributor Author

robrap commented Jul 1, 2024

Update: I moved the retention rule up so it happens before the EKS rules.

Additionally, we may ultimately need a DD support ticket for this if the DD trace support ticket doesn't resolve it.

@jristau1984

I do not see any instances of this log message in the last month. However, the exclusion rule is currently set to 100%, so no records would have been kept. I also have not seen it appear in Live Tail after watching for 15 minutes.

I think we can update the retention rule to something like 0% and then check back in 24 hours. If no logs, we can remove the retention rule and call this ticket complete.

@jristau1984 jristau1984 self-assigned this Oct 21, 2024
@jristau1984 jristau1984 moved this from Backlog to In Progress in Arch-BOM Oct 21, 2024
@jristau1984

I have updated the retention rule to exclude 0%. Now we wait...

@jristau1984

jristau1984 commented Oct 22, 2024

We have logged almost 36k of these since the exclusion was turned off yesterday.

So, it appears this is not yet fully resolved. Over 95% of the messages are in the edx-analytics-api repo, and a small percentage come from LMS.
https://app.datadoghq.com/logs?query=%22received%20invalid%20w3c%20traceparent%22&agg_m=count&agg_m_source=base&agg_q=service&agg_q_source=base&agg_t=count&cols=host%2Cservice&fromUser=true&messageDisplay=inline&refresh_mode=sliding&storage=hot&stream_sort=desc&top_n=10&top_o=top&viz=toplist&x_missing=true&from_ts=1729431977933&to_ts=1729604777933&live=true

I will move the exclusion rule back to 99% so that it is still apparent, but not costing us much $.
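
For a rough sense of what the different exclusion percentages mean in practice, here is a small arithmetic sketch. It assumes the "almost 36k" figure above is representative of the volume over the observed period and rounds it to 36,000; the function name `expected_kept` is hypothetical, and real sampling will vary around these expectations.

```python
def expected_kept(total_logs: int, exclusion_percent: float) -> float:
    """Expected number of logs retained when exclusion_percent are dropped."""
    return total_logs * (100.0 - exclusion_percent) / 100.0


# Assumption: ~36k matching logs over the observed period (rounded).
OBSERVED_VOLUME = 36_000

if __name__ == "__main__":
    for pct in (0.0, 99.0, 99.99, 100.0):
        kept = expected_kept(OBSERVED_VOLUME, pct)
        print(f"exclude {pct}% -> about {kept:.1f} logs kept")
```

Under that assumption, a 99% exclusion keeps on the order of a few hundred of these messages for the same period, which is enough to stay visible, while 99.99% keeps only a handful.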

@jristau1984 jristau1984 moved this from In Review to Backlog in Arch-BOM Oct 28, 2024
@jristau1984 jristau1984 removed their assignment Nov 8, 2024
Projects
Status: Backlog

2 participants