Support the capturing of environment variables #327

Open
wants to merge 4 commits into
base: main
1 change: 1 addition & 0 deletions host/host.go
@@ -60,4 +60,5 @@ type Trace struct {
APMTraceID libpf.APMTraceID
APMTransactionID libpf.APMTransactionID
CPU int
EnvVars map[string]string
}
38 changes: 38 additions & 0 deletions processmanager/processinfo.go
@@ -16,6 +16,9 @@ import (
"fmt"
"os"
"path"
"slices"
"strconv"
"strings"
"syscall"
"time"

@@ -36,6 +39,15 @@ import (
"go.opentelemetry.io/ebpf-profiler/util"
)

type ProcessManagerConfig struct {
extractEnvVars []string
}

var pm_cfg = ProcessManagerConfig{extractEnvVars: []string{
"PIPELINE_PPOID",
"PIPELINE_SOFTWARENAME",
"PIPELINE_JOBID"}}
Contributor

I'd suggest using a map for faster (O(1)) lookup.
Also, this needs to be configurable, as you already said.
This map of env vars could be part of tracer/Config and be passed to processmanager.New().
Then it needs a configuration option exposed to the user, see cli_flags.go.

The scanning of the environ file should only happen if that map of env variables is not empty.
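
A minimal sketch of what this suggestion could look like, assuming the configured names are held in a set-style map. The name `filterEnviron` and its parameters are hypothetical and not part of this PR. It takes the raw contents of /proc/PID/environ and returns early when nothing is configured; in real code the same emptiness check would sit before the `os.ReadFile` call so the file is never read.

```go
package main

import (
	"bytes"
	"fmt"
)

// filterEnviron parses the NUL-separated contents of /proc/PID/environ and
// returns only the variables whose names appear in wanted.
// The function name and the set-typed parameter are hypothetical.
func filterEnviron(environ []byte, wanted map[string]struct{}) map[string]string {
	if len(wanted) == 0 {
		// Nothing is configured, so skip the scan entirely.
		return nil
	}
	result := make(map[string]string, len(wanted))
	for _, entry := range bytes.Split(environ, []byte{0}) {
		key, value, found := bytes.Cut(entry, []byte("="))
		if !found {
			continue // empty trailing entry or an entry without "="
		}
		// Map lookup is O(1), unlike slices.Contains on a slice.
		if _, ok := wanted[string(key)]; ok {
			result[string(key)] = string(value)
		}
	}
	return result
}

func main() {
	wanted := map[string]struct{}{"PIPELINE_JOBID": {}}
	environ := []byte("HOME=/root\x00PIPELINE_JOBID=42\x00")
	fmt.Println(filterEnviron(environ, wanted)) // map[PIPELINE_JOBID:42]
}
```

The set would be built once from the user-facing configuration option and passed down, rather than living in a package-level variable.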


// assignTSDInfo updates the TSDInfo for the Interpreters on given PID.
// Caller must hold pm.mu write lock.
func (pm *ProcessManager) assignTSDInfo(pid libpf.PID, tsdInfo *tpbase.TSDInfo) {
@@ -88,11 +100,26 @@ func (pm *ProcessManager) updatePidInformation(pid libpf.PID, m *Mapping) (bool,
if name, err := os.ReadFile(fmt.Sprintf("/proc/%d/comm", pid)); err == nil {
processName = string(name)
}

envVarMap := make(map[string]string)
if envVars, err := os.ReadFile(fmt.Sprintf("/proc/%d/environ", pid)); err == nil {
splittedVars := strings.Split(string(envVars), "\000")
fmt.Println("EnvVars for PID" + strconv.Itoa(int(pid)))
for _, envVar := range splittedVars {
keyValuePair := strings.Split(envVar, "=")
Member

This is not correct as values can themselves contain "=".

Suggested change
- keyValuePair := strings.Split(envVar, "=")
+ keyValuePair := strings.SplitN(envVar, "=", 2)
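
To illustrate the difference, here is a small sketch comparing the two calls on an entry whose value contains "=". The entry `JAVA_OPTS=-Dfoo=bar` and the helper `splitEnvVar` are hypothetical examples, not code from this PR. The helper also checks the length of the result, since an entry without any "=" yields a single element and indexing `[1]` would panic.

```go
package main

import (
	"fmt"
	"strings"
)

// splitEnvVar splits a single KEY=VALUE entry at the first "=" only.
// It reports ok=false when the entry contains no "=" at all.
// The helper name is hypothetical.
func splitEnvVar(entry string) (key, value string, ok bool) {
	parts := strings.SplitN(entry, "=", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	entry := "JAVA_OPTS=-Dfoo=bar" // hypothetical entry with "=" in the value

	// strings.Split cuts at every "=", so the value is truncated.
	naive := strings.Split(entry, "=")
	fmt.Println(len(naive), naive[1]) // 3 -Dfoo

	// strings.SplitN with n=2 cuts only at the first "=".
	key, value, ok := splitEnvVar(entry)
	fmt.Println(key, value, ok) // JAVA_OPTS -Dfoo=bar true
}
```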

if slices.Contains(pm_cfg.extractEnvVars, keyValuePair[0]) {
envVarMap[keyValuePair[0]] = keyValuePair[1]
fmt.Println(envVar)
}
}
}

info = &processInfo{
name: processName,
executable: exePath,
mappings: make(map[libpf.Address]*Mapping),
mappingsByFileID: make(map[host.FileID]map[libpf.Address]*Mapping),
envVariables: envVarMap,
tsdInfo: nil,
}
pm.pidToProcessInfo[pid] = info
@@ -730,3 +757,14 @@ func (pm *ProcessManager) ProcessedUntil(traceCaptureKTime times.KTime) {

// Compile time check to make sure we satisfy the interface.
var _ tracehandler.TraceProcessor = (*ProcessManager)(nil)

func (pm *ProcessManager) EnvVarsForPID(pid libpf.PID) map[string]string {
var envVars map[string]string

pm.mu.RLock()
Contributor

How critical is the information about environment variables that fetching it should take a global lock on the ProcessManager?

Contributor

@rockdaboot, Jan 30, 2025

When a user requires this information, it has the same criticality as data from other *ForPID functions.

In loadBpfTrace() we now call three of these functions when initializing the host.Trace{} instance, which means we take the RLock three times.

I'd suggest fetching the data with a single function call under a single lock in a follow-up PR.
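
A rough sketch of how such a combined accessor might look, assuming simplified stand-ins for the real types. `ProcessMeta` and `MetaForPID` are hypothetical names, the PID is a plain `int` here instead of `libpf.PID`, and only the `name`, `executable` and `envVariables` fields of `processInfo` are modelled.

```go
package main

import (
	"fmt"
	"sync"
)

// processInfo is a reduced stand-in for the struct in processmanager/types.go.
type processInfo struct {
	name         string
	executable   string
	envVariables map[string]string
}

// ProcessMeta bundles everything a trace needs about a PID (hypothetical type).
type ProcessMeta struct {
	Name       string
	Executable string
	EnvVars    map[string]string
}

// ProcessManager is a reduced stand-in holding only the fields used here.
type ProcessManager struct {
	mu               sync.RWMutex
	pidToProcessInfo map[int]*processInfo
}

// MetaForPID returns all per-process metadata under a single read lock,
// instead of one RLock per *ForPID call. The method name is hypothetical.
func (pm *ProcessManager) MetaForPID(pid int) (ProcessMeta, bool) {
	pm.mu.RLock()
	defer pm.mu.RUnlock()

	info, ok := pm.pidToProcessInfo[pid]
	if !ok {
		return ProcessMeta{}, false
	}
	return ProcessMeta{
		Name:       info.name,
		Executable: info.executable,
		EnvVars:    info.envVariables,
	}, true
}

func main() {
	pm := &ProcessManager{pidToProcessInfo: map[int]*processInfo{
		7: {name: "worker", executable: "/usr/bin/worker",
			envVariables: map[string]string{"PIPELINE_JOBID": "42"}},
	}}
	meta, ok := pm.MetaForPID(7)
	fmt.Println(meta.Name, meta.Executable, meta.EnvVars["PIPELINE_JOBID"], ok)
}
```

Note that the returned map is shared with the stored `processInfo` in this sketch, so callers would have to treat it as read-only.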

defer pm.mu.RUnlock()
if procInfo, ok := pm.pidToProcessInfo[pid]; ok {
envVars = procInfo.envVariables
}
return envVars
}
2 changes: 2 additions & 0 deletions processmanager/types.go
@@ -148,6 +148,8 @@ type processInfo struct {
mappingsByFileID map[host.FileID]map[libpf.Address]*Mapping
// C-library Thread Specific Data information
tsdInfo *tpbase.TSDInfo
// process env vars from /proc/PID/environ
envVariables map[string]string
}

// addMapping adds a mapping to the internal indices.
1 change: 1 addition & 0 deletions reporter/base_reporter.go
@@ -145,6 +145,7 @@ func (b *baseReporter) ReportTraceEvent(trace *libpf.Trace, meta *samples.TraceE
MappingFileOffsets: trace.MappingFileOffsets,
Timestamps: []uint64{uint64(meta.Timestamp)},
OffTimes: []int64{meta.OffTime},
EnvVars: meta.EnvVars,
}
}

5 changes: 5 additions & 0 deletions reporter/internal/pdata/generate.go
@@ -11,6 +11,7 @@ import (
log "github.com/sirupsen/logrus"
"go.opentelemetry.io/collector/pdata/pcommon"
"go.opentelemetry.io/collector/pdata/pprofile"
"go.opentelemetry.io/otel/attribute"
semconv "go.opentelemetry.io/otel/semconv/v1.4.0"

"go.opentelemetry.io/ebpf-profiler/libpf"
@@ -206,6 +207,10 @@ func (p *Pdata) setProfile(
attrMgr.AppendInt(sample.AttributeIndices(),
semconv.ProcessPIDKey, traceKey.Pid)

for key, value := range traceInfo.EnvVars {
attrMgr.AppendOptionalString(sample.AttributeIndices(), attribute.Key("env."+key), value)
}

if p.ExtraSampleAttrProd != nil {
extra := p.ExtraSampleAttrProd.ExtraSampleAttrs(attrMgr, traceKey.ExtraMeta)
sample.AttributeIndices().Append(extra...)
3 changes: 3 additions & 0 deletions reporter/samples/samples.go
@@ -15,6 +15,7 @@ type TraceEventMeta struct {
CPU int
Origin libpf.Origin
OffTime int64
EnvVars map[string]string
}

// TraceEvents holds known information about a trace.
@@ -27,6 +28,7 @@ type TraceEvents struct {
MappingFileOffsets []uint64
Timestamps []uint64 // in nanoseconds
OffTimes []int64 // in nanoseconds
EnvVars map[string]string
}

// TraceAndMetaKey is the deduplication key for samples. This **must always**
@@ -44,6 +46,7 @@ type TraceAndMetaKey struct {
ProcessName string
// Executable path is retrieved from /proc/PID/exe
ExecutablePath string

// ExtraMeta stores extra meta info that may have been produced by a
// `SampleAttrProducer` instance. May be nil.
ExtraMeta any
3 changes: 1 addition & 2 deletions support/ebpf/errors.h
@@ -121,8 +121,7 @@ typedef enum ErrorCode {
// Native: Unable to read the IRQ stack link
ERR_NATIVE_CHASE_IRQ_STACK_LINK = 4010,

// Native: Unexpectedly encountered a kernel mode pointer while attempting to unwind user-mode
// stack
// Native: Unexpectedly encountered a kernel mode pointer while attempting to unwind user-mode stack
ERR_NATIVE_UNEXPECTED_KERNEL_ADDRESS = 4011,

// Native: Unable to locate the PID page mapping for the current instruction pointer
1 change: 1 addition & 0 deletions tracehandler/tracehandler.go
@@ -129,6 +129,7 @@ func (m *traceHandler) HandleTrace(bpfTrace *host.Trace) {
ExecutablePath: bpfTrace.ExecutablePath,
Origin: bpfTrace.Origin,
OffTime: bpfTrace.OffTime,
EnvVars: bpfTrace.EnvVars,
}

if !m.reporter.SupportsReportTraceEvent() {
1 change: 1 addition & 0 deletions tracer/tracer.go
@@ -990,6 +990,7 @@ func (t *Tracer) loadBpfTrace(raw []byte, cpu int) *host.Trace {
OffTime: int64(ptr.offtime),
KTime: times.KTime(ptr.ktime),
CPU: cpu,
EnvVars: t.processManager.EnvVarsForPID(pid),
}

if trace.Origin != support.TraceOriginSampling && trace.Origin != support.TraceOriginOffCPU {