Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom SDK implementation #954

Open
4 tasks
MrAlias opened this issue Jul 16, 2024 · 15 comments
Open
4 tasks

Custom SDK implementation #954

MrAlias opened this issue Jul 16, 2024 · 15 comments
Assignees

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Jul 16, 2024

Initial support for interoperability with manual instrumentation was added in #523. This approach has issues when trying to implement methods like IsRecording, RecordError, and AddEvent, as well as the inability to capture the SpanStartOptions which contain important definitions about the span.

Provide our own SDK

An alternate solution for this interoperability was proposed in #352 (comment):

Provide a TracerProvider implementation from auto-instrumentation that manual instrumentation can explicitly use to say they want their data to go to the same pipeline. This means the auto-instrumentation will control the type that manual instrumentation uses to create, edit, and end spans if they want the data to be sent to the auto-instrumentation pipeline

Providing an SDK implementation like this would allow us to design minimal overhead in the Go implementation to resolve all the issues found with the current approach. Namely:

  • For IsRecording, the span implementation could read its value from the span struct field that could be set in the eBPF space
  • RecordError could call a method like `func (s span) recordError(errMsg, errName string, cfg trace.EventConfig). This method would have all the needed information for the probe already resolved.
  • Similar to RecordError, AddEvent could call a addEvent method designed to have all the information for the probe already resolved in the call arguments.

Interoperability

The remaining issues with providing a custom SDK, is how this would be integrated into users code?

The original idea was to offer this SDK as a package under go.opentelemetry.io/auto. This could still be provided, but if this is the only way this solution is integrated it means user code needs to be updated to inter-op with auto-instrumentation.

Requiring user code to be updated to work with auto-instrumentation is not desired. The whole point of auto-instrumentation is this kind of action would not be required.

Instead, we would like to have the default global otel implementation inter-op with this new SDK:

  • A flag of some sort (i.e. a package level boolean var) would be added go.opentelemetry.io/otel/internal/global.
  • This flag would be false by default
  • When the Go auto-instrumentation loads, it will set this flag to true
  • When the global Tracer implementation creates a new span it will check this added flag (i.e. here), and if true will return our custom SDK implementation of the Span instead of the nonRecordingSpan

Action Items

  • Prototype interoperability proposal to test its viability
  • Present proposal to Go SIG
    • Re-evaluate and adjust after feedback from Go SIG
  • Build custom SDK
  • Publish custom SDK
@RonFed
Copy link
Contributor

RonFed commented Jul 28, 2024

@MrAlias What do you think should be included in the prototype?

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 6, 2024

Ideally, something like this:

package main

import (
	"fmt"
	"time"
)

var Flag bool

func main() {
	for {
		fmt.Printf("Flag: %t", Flag)
		time.Sleep(time.Second)
	}
	// Output: Flag: false
	// Auto-instrumentation added
	// Output: Flag: true
}

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 6, 2024

We may also need to look into having the global package call something like:

func (t *tracer) start(eBPFFlag *bool, ctx context.Context, name string, opts ...trace.SpanStartOption) (context.Context, trace.Span) {
/* ... */
}

and instrument that instead of Start so we can find the flag location.

@RonFed
Copy link
Contributor

RonFed commented Aug 9, 2024

@MrAlias I did a POC for the second option we talked about. It is in main...RonFed:opentelemetry-go-instrumentation_fork:flag_POC

Here are some logs from the app being instrumented and the agent:

rolldice-1 | 2024-08-09T15:42:14.291Z INFO app/main.go:64 starting http server {"port": ":8080"}
rolldice-1 | 2024-08-09T15:42:14.291Z INFO app/main.go:52 probed function called with flag unset
go-auto-1 | {"level":"info","ts":1723218134.4202611,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:86","msg":"building OpenTelemetry Go instrumentation ...","globalImpl":false}
rolldice-1 | 2024-08-09T15:42:15.295Z INFO app/main.go:52 probed function called with flag unset
rolldice-1 | 2024-08-09T15:42:16.297Z INFO app/main.go:52 probed function called with flag unset
go-auto-1 | {"level":"info","ts":1723218136.4233193,"logger":"Instrumentation.Analyzer","caller":"process/discover.go:67","msg":"found process","pid":12030}
go-auto-1 | {"level":"info","ts":1723218136.4287593,"logger":"Instrumentation.Allocate","caller":"process/allocate.go:73","msg":"Detaching from process","pid":12030}
go-auto-1 | {"level":"info","ts":1723218136.4289303,"logger":"Instrumentation","caller":"app/instrumentation.go:144","msg":"target process analysis completed","pid":12030,"go_version":"1.22.5","dependencies":{"go.uber.org/multierr":"1.10.0","go.uber.org/zap":"1.27.0","std":"1.22.5"},"total_functions_found":3}
go-auto-1 | {"level":"info","ts":1723218136.428993,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:119","msg":"starting instrumentation..."}
go-auto-1 | {"level":"info","ts":1723218136.4319375,"logger":"Instrumentation.Manager","caller":"instrumentation/manager.go:198","msg":"loading probe","name":"net/http/server"}
go-auto-1 | {"level":"info","ts":1723218136.4832294,"logger":"Instrumentation.Manager","caller":"instrumentation/manager.go:198","msg":"loading probe","name":"main/client"}
go-auto-1 | {"level":"info","ts":1723218136.4842162,"logger":"go.opentelemetry.io/auto","caller":"cli/main.go:115","msg":"instrumentation loaded successfully"}
rolldice-1 | 2024-08-09T15:42:17.299Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:18.300Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:19.301Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:20.302Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:21.303Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:22.304Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:23.305Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:24.307Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:25.308Z INFO app/main.go:50 probed function called with flag set
rolldice-1 | 2024-08-09T15:42:26.309Z INFO app/main.go:50 probed function called with flag set

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 9, 2024

Awesome! Let me see about getting an issue created in https://github.com/open-telemetry/opentelemetry-go to make the proposal based on these findings.

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 9, 2024

opentelemetry-go proposal: open-telemetry/opentelemetry-go#5702

@RonFed PTAL

@RonFed
Copy link
Contributor

RonFed commented Aug 10, 2024

@MrAlias LGTM,
In the suggested change, do you think we should probe the tracer.newSpan (which has the flag pointer) or the interop.newSpan (which has the configuration ready)?
Ideally, we should build it in a way that we can handle both in a single eBPF program if it is possible.

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 10, 2024

@MrAlias LGTM, In the suggested change, do you think we should probe the tracer.newSpan (which has the flag pointer) or the interop.newSpan (which has the configuration ready)? Ideally, we should build it in a way that we can handle both in a single eBPF program if it is possible.

I was thinking tracer.newSpan as that would allow us to set the bool and trace in one probe. Right?

@RonFed
Copy link
Contributor

RonFed commented Aug 10, 2024

I was thinking tracer.newSpan as that would allow us to set the bool and trace in one probe. Right?

Yea, that sounds good, we'll need to have a return probe for that function as well simillar to what we have today - in order to save the returned span pointer in a map.

@MrAlias
Copy link
Contributor Author

MrAlias commented Aug 19, 2024

I was thinking tracer.newSpan as that would allow us to set the bool and trace in one probe. Right?

Yea, that sounds good, we'll need to have a return probe for that function as well simillar to what we have today - in order to save the returned span pointer in a map.

Correct.

@MrAlias MrAlias moved this from Todo to In Progress in Go Auto Instrumentation: Beta Aug 20, 2024
@MrAlias MrAlias mentioned this issue Aug 27, 2024
6 tasks
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 4, 2024
Adds a new go.opentelemetry.io/auto/sdk module that holds the
OpenTelemetry SDK implementation used by auto-instrumentation.

Part of open-telemetry#954
Split from open-telemetry#1045
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 4, 2024
Adds a new go.opentelemetry.io/auto/sdk module that holds the
OpenTelemetry SDK implementation used by auto-instrumentation.

Part of open-telemetry#954
Split from open-telemetry#1045
@damemi
Copy link
Contributor

damemi commented Sep 9, 2024

With the custom sdk we're providing, do we still need the probes for the global instrumentation that we already have? Or could we just implement those functions in our sdk (Start, End, SetAttributes, etc)?

In that case we would only need to auto-instrument the custom sdk's ended function so that the object gets passed to the auto-agent's export pipeline.

@MrAlias
Copy link
Contributor Author

MrAlias commented Sep 9, 2024

With the custom sdk we're providing, do we still need the probes for the global instrumentation that we already have? Or could we just implement those functions in our sdk (Start, End, SetAttributes, etc)?

In that case we would only need to auto-instrument the custom sdk's ended function so that the object gets passed to the auto-agent's export pipeline.

Correct. After we have opentelemetry-go use our SDK we won't need to instrument the nonRecordingSpan or tracer from go.openetelemetry.io/otel/internal/global.

@MrAlias
Copy link
Contributor Author

MrAlias commented Sep 9, 2024

We will still need to instrument the initial call to tracer.Start to set the bool there stating the auto-instrumentation is registered.

@MrAlias
Copy link
Contributor Author

MrAlias commented Sep 9, 2024

We will still need to instrument the initial call to tracer.Start to set the bool there stating the auto-instrumentation is registered.

Hmm, we could also look into just instrumenting a bool in our SDK that the global SDK would look up via our API. Might be something we should look into.

MrAlias added a commit that referenced this issue Sep 12, 2024
* Add the auto-instrumentation SDK wireframe

Adds a new go.opentelemetry.io/auto/sdk module that holds the
OpenTelemetry SDK implementation used by auto-instrumentation.

Part of #954
Split from #1045

* Add dependabot entry

* Ignore the sdk module until it is finished

* Set Go mod go directive to 1.21.0

* Fix lint
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 18, 2024
* Add the auto-instrumentation SDK wireframe

Adds a new go.opentelemetry.io/auto/sdk module that holds the
OpenTelemetry SDK implementation used by auto-instrumentation.

Part of open-telemetry#954
Split from open-telemetry#1045

* Add dependabot entry

* Ignore the sdk module until it is finished

* Set Go mod go directive to 1.21.0

* Fix lint
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 19, 2024
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 19, 2024
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 19, 2024
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 19, 2024
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 19, 2024
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Sep 19, 2024
MrAlias added a commit that referenced this issue Sep 20, 2024
* Test SDK's Span.IsRecording

Part of #954

* Fix merge
MrAlias added a commit that referenced this issue Sep 25, 2024
MrAlias added a commit that referenced this issue Sep 25, 2024
* Test the SDK's Span.SpanContext method

Part of #954

* Fix merge
MrAlias added a commit that referenced this issue Sep 27, 2024
* Implement the Span.SetName method

Part of #954

* Update sdk/trace_test.go
@MrAlias
Copy link
Contributor Author

MrAlias commented Oct 22, 2024

With #1195 merged, the remaining tasks for this are to publish the sdk module and integrate it with the global API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

3 participants