Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document a new "join" block in discovery.relabel #1541

Closed
wants to merge 1 commit into from

Conversation

ptodev
Copy link
Contributor

@ptodev ptodev commented Aug 23, 2024

PR Description

The main use case driving this is adding labels from discovery.kubernetes to targets coming from prometheus.exporter components. If you can also think of other use cases, please drop a comment in the PR.

Instead of writing a proposal for the join block, I decided to just write the documentation for it. If you agree that the proposed functionality is reasonable, I will update this PR with the code implementation of the new block. I could then take the PR out of the draft stage.

Which issue(s) this PR fixes

Fixes ##1443

Notes to the Reviewer

Initially I meant to solve this problem using dynamic pipelines. Unfortunately, that proved to be very difficult. The approach in this PR would be easier to implement.

PR Checklist

  • CHANGELOG.md updated
  • Documentation added
  • Tests updated
  • Config converters updated

@thampiotr
Copy link
Contributor

Since our current approach seems to have standard library functions to manipulate arrays and maps which we use for prometheus targets (such as array.concat), I was wondering if we should instead make this a stdlib concept?

For example an inner_join(<what key>, <merge strategy>, <array>) function that would, for example do something like this:
image

Or any other option similar form in stdlib. It would make it work with potential other telemetry ecosystems.

@ptodev
Copy link
Contributor Author

ptodev commented Aug 28, 2024

@thampiotr The reason I didn't opt for a stdlib function is that the ergonomics might not be as good. This is because not all labels from the SD should be taken into account. Suppose you have these labels from k8s SD:

{"pod_ip" = "1.2.3.4", "namespace" = "a", "container" = "a"}
{"pod_ip" = "1.2.3.4", "namespace" = "a", "container" = "b"}
{"pod_ip" = "1.2.3.4", "namespace" = "a", "container" = "c"}

Let's say you want to join on the "pod_ip" label. How would you join the "container" labels?

The solution I decided to go with is to remove labels which are not in labels_to_join and condition prior to the join, and then to remove any duplicates. This is quite a specific operation... we could add stdlibs for each one of those steps, but it'd be awkward to have to setup all those stdlib functions manually in a config file.

@thampiotr
Copy link
Contributor

thampiotr commented Sep 2, 2024

@thampiotr The reason I didn't opt for a stdlib function is that the ergonomics might not be as good. This is because not all labels from the SD should be taken into account. Suppose you have these labels from k8s SD:

{"pod_ip" = "1.2.3.4", "namespace" = "a", "container" = "a"}
{"pod_ip" = "1.2.3.4", "namespace" = "a", "container" = "b"}
{"pod_ip" = "1.2.3.4", "namespace" = "a", "container" = "c"}

Let's say you want to join on the "pod_ip" label. How would you join the "container" labels?

In this case I would expect to have a stdlib function that would allow me to get the necessary unique tuples of (pod_ip, namespace)... or of any subset of keys that I want to use.

You can also have a different merge strategy for the inner_join function I sketched above, which would discard the keys that vary. Something like inner_join(["pod_ip", "namespace"], "intersection", targets) maybe? not sure about naming and API, but that's just to illustrate the idea.

Copy link
Contributor

github-actions bot commented Oct 3, 2024

This PR has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If you do not have enough time to follow up on this PR or you think it's no longer relevant, consider closing it.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your PR will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!

@ptodev
Copy link
Contributor Author

ptodev commented Oct 4, 2024

@thampiotr I think you're right - probably an inner join is a better way to go. I opened #1826 for it, and I'm going to close this PR because it's unlikely to be the solution we go with. Thank you for the feedback!

@ptodev ptodev closed this Oct 4, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants