Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New loki.process stage structured_metadata_regex, to move all labels matching a regex to structured metadata #2578

Closed
wants to merge 1 commit into from

Conversation

akevdmeer
Copy link

@akevdmeer akevdmeer commented Jan 30, 2025

PR Description

This adds a stage to loki.process structured_metadata_regex, that takes a string attr regex, e.g.

        stage.structured_metadata_regex { 
                regex = "kubernetes_pod_.*"
        }

All labels that match the regular expression are moved to structured metadata (so removed as labels).

Which issue(s) this PR fixes

Fixes #926

Notes to the Reviewer

Please advise:

Is it reasonable to introduce a new loki.process stage for this?

I considered extending loki.relabel but that is a Loki wrapper around prometheus relabel so I decided against that. Then I considered extending the existing structured_metadata stage of loki.process, but also decided against that to not modify/embed the existing LabelsConfig.

Happy to re-work this per your guidance to get this feature merged.

PR Checklist

  • Tests updated

@CLAassistant
Copy link

CLAassistant commented Jan 30, 2025

CLA assistant check
All committers have signed the CLA.

@akevdmeer akevdmeer marked this pull request as ready for review January 30, 2025 13:50
@akevdmeer akevdmeer requested a review from a team as a code owner January 30, 2025 13:50
@ptodev
Copy link
Collaborator

ptodev commented Jan 31, 2025

Thank you for the PR! I personally would prefer if there is a new optional argument in the structured_metadata block called regex.Any labels which match the regex can then be added. This wouldn't conflict with the existing values argument. All the labels in values will always be added if they exist; then regex can add additional ones. I'll let other maintainers open too, in case there are other ideas :)

@wildum
Copy link
Contributor

wildum commented Jan 31, 2025

Hey, to add to @ptodev's comment, I would recommend adding a stage "label_extract" to move labels to the extracted values map. This way you can combine it with the structured_metadata stage; and any other processing stages since most of them can interact with the extracted values map.

@ptodev
Copy link
Collaborator

ptodev commented Jan 31, 2025

Yes, just like the values argument, a potential regex argument needs to use the extracted map and not the labels. I like the idea of a label_extract stage too. I'd keep the structured_metadata stage simple.

@akevdmeer
Copy link
Author

akevdmeer commented Feb 3, 2025

To add a regex argument to structured_metadata, instead of introducing a new structured_metadata_regex stage is clear.

But the current structured_metadata processes both the extracted map, from which it copies to structured metadata, and the labels from which it moves to structured metadata.

Should the regex argument not work in that same way? If you wish structured_metadata to act only on the extracted map, and introduce a then required label_extract, could we separate that from introducing regex?

Unlike adding regex, changing structured_metadata to no longer move from labels is a breaking change?

@wildum
Copy link
Contributor

wildum commented Feb 3, 2025

@akevdmeer the structured_metadata should only process data from the extracted map.

To move values from labels to the extracted map you should create a new stage "label_extract"

@akevdmeer
Copy link
Author

Clear. But that means that in addition to supporting regex, you want to change the current behaviour of structured_metadata and get rid of moving from labels to structured metadata, and in a single PR?

@wildum
Copy link
Contributor

wildum commented Feb 3, 2025

Aaah sorry I missed this behavior. This is undocumented, I thought that the structured_metadata step was only processing the data from the extracted map and not the labels... Thanks for clarifying. That leaves us with two options:

  • change the behavior to only process data from the extracted values map, add the regex arg to it, and create the new label_extract stage
  • add the regex arg to it, make it behave like the values block, and update the doc (process from extracted values and labels)

I think that the first option offers more modularity + simpler logic. But it's a breaking change (of undocumented behavior but still annoying if a lot of users rely on it) + the new "label_extract" stage would need to be of stability GA right away to avoid forcing users to lower their stability level to achieve the same behavior.

Which options do you prefer? @akevdmeer @ptodev
I will bring it up at an internal meeting that we have later today

@wildum
Copy link
Contributor

wildum commented Feb 3, 2025

Discussed in our proposal review meeting today, we think that the option to not do breaking changes to the structured_metadata stage is the best.

We're ok to have an argument "regex" which will match with the extracted values map and the labels. Ideally it would be nice to have the possibility to modify the label name (like the valuesattribute does). This could be achieved with a similar approach as the loki.relabel component by using capture groups. This can be done separately (one PR brings the regex attribute and the relabeling can come later, especially because as a workaround you can put a loki.relabel component before the loki.process

@akevdmeer
Copy link
Author

Shall we close this PR, and I'll open a new one with the implementation as discussed above?

@wildum
Copy link
Contributor

wildum commented Feb 4, 2025

Shall we close this PR, and I'll open a new one with the implementation as discussed above?

yeah sounds good, you can link this PR in the new PR to keep the context. Thanks for your work, let us know if you need help!

@akevdmeer akevdmeer closed this Feb 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Regex-based Label \ Metadata matching
4 participants