Logs Pipeline is not working for OTLP logs #5356

Open · atul-r opened this issue Jan 14, 2025 · 3 comments
Labels: C-bug (Category Bugs)

atul-r commented Jan 14, 2025

What type of bug is this?

Unexpected error

What subsystems are affected?

Distributed Cluster

Minimal reproduce step

My OpenTelemetry Collector config looks like this:

receivers:
  filelog:
    ...
  otlp:
    ...   
processors:
  k8sattributes:
    ...        
  batch:
    ...   
  filter/body:
    logs:
      exclude:
        match_type: regexp
        bodies:
        - ".*/health"
        - ".*/healthz"     
  memory_limiter:
    ...         
  resource/logs:
    attributes:
      - key: k8s_pod_name
        from_attribute: k8s.pod.name
        action: insert
      - key: k8s_container_name
        from_attribute: k8s.container.name
        action: insert
      - key: k8s_container_restart_count
        from_attribute: k8s.container.restart_count
        action: insert
      - key: k8s_pod_uid
        from_attribute: k8s.pod.uid
        action: insert
      - key: k8s_namespace_name
        from_attribute: k8s.namespace.name
        action: insert
      - key: podclusterid
        value: mycluster
        action: insert
      - key: type
        value: resource_logs
        action: insert
      - key: BUILD_VERSION
        value: Observability-v1.0.0-96176
        action: insert
exporters:
  debug:
    verbosity: detailed 
  otlphttp/glogs:
    logs_endpoint:
      http://greptimedb-cluster-frontend.greptimedemo.svc.cluster.local:4000/v1/otlp/v1/logs?db=public_logs
    timeout: 30s
    headers:
      Authorization: "Basic base64encodedCreds"
      X-Greptime-DB-Name: public_logs
      X-Greptime-Log-Table-Name: logs
      X-Greptime-Log-Extract-Keys: BUILD_VERSION,k8s_container_name,k8s_namespace_name,k8s_pod_name,podclusterid,type,container.image.name,container.image.tag
    tls:
      insecure: true
service:
  telemetry:
    logs:
      level: debug 
    metrics:
      address: 0.0.0.0:8888
  pipelines:
    logs:
      receivers:
        - filelog
        - otlp
      processors:
        - memory_limiter
        - k8sattributes
        - resource/logs
        - filter/body
        - batch
      exporters:
        - debug
        - otlphttp/glogs

It has an exporter (otlphttp/glogs) for exporting logs to GreptimeDB. When I extract keys using the X-Greptime-Log-Extract-Keys header, the logs table created in GreptimeDB has those keys as columns, which is correct.

But I am not able to achieve the same with a pipeline:

otlphttp/glogs:
    logs_endpoint:
      http://greptimedb-cluster-frontend.greptimedemo.svc.cluster.local:4000/v1/otlp/v1/logs?db=public_logs
    timeout: 30s
    headers:
      Authorization: "Basic base64encodedCreds"
      X-Greptime-DB-Name: public_logs
      X-Greptime-Log-Table-Name: logs
      X-Greptime-Log-Pipeline-Name: otel_pipeline
    tls:
      insecure: true

My sample pipeline YAML, otelpipeline.yaml, is as follows:

processors:
  - dissect:
      fields:
        - resource_attributes
      patterns:
        - '{"k8s_pod_name":"%{k8s_pod_name}","k8s_namespace_name":"%{k8s_namespace_name}","k8s_container_name":"%{k8s_container_name}","podclusterid":"%{podclusterid}","type":"%{type}",*}'
      ignore_missing: true
  - date:
      fields:
        - timestamp
      formats:
        - "%d/%b/%Y:%H:%M:%S %z"
transform:
  - fields:
      - k8s_pod_name
      - k8s_namespace_name
      - k8s_container_name
      - podclusterid
      - type
    type: string
    index: tag
  - fields:
      - timestamp
    type: time
    index: timestamp

I created a pipeline using

curl -X "POST" "http://localhost:4000/v1/events/pipelines/otel_pipeline" \
     -H "Authorization: Basic base64encodedCreds" \
     -F "[email protected]"

The logs of my DaemonSet show:

Resource attributes:
-> k8s.container.restart_count: Str(0)
-> k8s.pod.uid: Str(acce252c-4bf3-4c25-bb21-be5fe13afd56)
-> k8s.container.name: Str(otel-collector)
-> k8s.namespace.name: Str(observability-client)
-> k8s.pod.name: Str(otelcollector-client-daemonset-q2rzk)
-> k8s.pod.start_time: Str(2025-01-14T10:02:33Z)
-> k8s.node.name: Str(aks-workers-15253729-vmss00001m)
-> k8s.label.app: Str(opentelemetry)
-> container.image.name: Str(docker.repo.abc.com/otel/opentelemetry-collector-contrib)
-> container.image.tag: Str(0.110.0)
-> k8s_pod_name: Str(otelcollector-client-daemonset-q2rzk)
-> k8s_container_name: Str(otel-collector)
-> k8s_container_restart_count: Str(0)
-> k8s_pod_uid: Str(acce252c-4bf3-4c25-bb21-be5fe13afd56)
-> k8s_namespace_name: Str(observability-client)
-> podclusterid: Str(mycluster)
-> type: Str(resource_logs)
-> BUILD_VERSION: Str(Observability-v1.0.0-96176)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope
LogRecord #0
ObservedTimestamp: 2025-01-14 10:05:01.988398411 +0000 UTC
Timestamp: 2025-01-14 10:04:09.806059621 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Timestamp: 2025-01-14 10:03:15.973538925 +0000 UTC)
Attributes:
-> time: Str(2025-01-14T10:04:09.806059621Z)
-> logtag: Str(F)
-> log.iostream: Str(stderr)
-> log.file.path: Str(/var/log/pods/observability-client_otelcollector-client-daemonset-q2rzk_acce252c-4bf3-4c25-bb21-be5fe13afd56/otel-collector/0.log)
Trace ID:
Span ID:
Flags: 0
LogRecord #1
ObservedTimestamp: 2025-01-14 10:05:01.988410644 +0000 UTC
Timestamp: 2025-01-14 10:04:09.806062637 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(SeverityText:)
Attributes:
-> log.iostream: Str(stderr)
-> log.file.path: Str(/var/log/pods/observability-client_otelcollector-client-daemonset-q2rzk_acce252c-4bf3-4c25-bb21-be5fe13afd56/otel-collector/0.log)
-> time: Str(2025-01-14T10:04:09.806062637Z)
-> logtag: Str(F)

So the keys Resource attributes, Attributes, Body, etc. should be available.

However, I see the following errors in the greptimedb-cluster-frontend pod:

2025-01-14T10:09:07.712063Z WARN servers::error: Failed to handle HTTP request err=0: OpenTelemetry log error, at src/servers/src/otlp/logs.rs:75:22
1: Processor dissect: missing field: resource_attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26

2025-01-14T10:09:07.763310Z WARN servers::error: Failed to handle HTTP request err=0: OpenTelemetry log error, at src/servers/src/otlp/logs.rs:75:22
1: Processor dissect: missing field: Resource attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26

I tried changing processors.dissect[0].fields to Attributes and Body, but I still get the errors:

Processor dissect: missing field: Attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26

Processor dissect: missing field: Body, at src/pipeline/src/etl/processor/dissect.rs:838:26

I even tried using the events endpoint in my exporter to push logs, but I still get the same errors:

logs_endpoint:
    http://greptimedb-cluster-frontend.greptimedemo.svc.cluster.local:4000/v1/events/logs?db=public_logs&table=logs&pipeline_name=otel_pipeline

So I am not able to push logs from the OpenTelemetry Collector DaemonSet to GreptimeDB using pipelines.

What did you expect to see?

With a proper pipeline YAML, I expected to see a table named logs created in the public_logs database of GreptimeDB, with the following description:

Column                Type                 Key  Null  Default  Semantic Type
timestamp             TimestampNanosecond  PRI  NO             TIMESTAMP
trace_id              String                    YES            FIELD
span_id               String                    YES            FIELD
severity_text         String                    YES            FIELD
severity_number       Int32                     YES            FIELD
body                  String                    YES            FIELD
log_attributes        Json                      YES            FIELD
trace_flags           UInt32                    YES            FIELD
scope_name            String               PRI  YES            TAG
scope_version         String                    YES            FIELD
scope_attributes      Json                      YES            FIELD
scope_schema_url      String                    YES            FIELD
resource_attributes   Json                      YES            FIELD
resource_schema_url   String                    YES            FIELD
BUILD_VERSION         String               PRI  YES            TAG
container.image.name  String               PRI  YES            TAG
container.image.tag   String               PRI  YES            TAG
k8s_container_name    String               PRI  YES            TAG
k8s_namespace_name    String               PRI  YES            TAG
k8s_pod_name          String               PRI  YES            TAG
podclusterid          String               PRI  YES            TAG
type                  String               PRI  YES            TAG

What did you see instead?

Errors in the greptimedb-cluster-frontend pod:

2025-01-14T10:09:07.712063Z WARN servers::error: Failed to handle HTTP request err=0: OpenTelemetry log error, at src/servers/src/otlp/logs.rs:75:22
1: Processor dissect: missing field: resource_attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26

2025-01-14T10:09:07.763310Z WARN servers::error: Failed to handle HTTP request err=0: OpenTelemetry log error, at src/servers/src/otlp/logs.rs:75:22
1: Processor dissect: missing field: Resource attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26
Processor dissect: missing field: Attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26

Processor dissect: missing field: Body, at src/pipeline/src/etl/processor/dissect.rs:838:26

What operating system did you use?

AKS cluster (OS: Ubuntu 22.04.4 LTS)

What version of GreptimeDB did you use?

0.11.2

Relevant log output and stack trace

No response

atul-r added the C-bug (Category Bugs) label on Jan 14, 2025.
waynexia (Member) commented:

Can you check if the resource_attributes field exists in the input log string from the otelcollector?

atul-r (Author) commented Jan 14, 2025

> Can you check if the resource_attributes field exists in the input log string from the otelcollector?

No, but I also tried Attributes and Body in the pipeline dissect fields. Those attributes exist, but I still get similar errors:

Processor dissect: missing field: Attributes, at src/pipeline/src/etl/processor/dissect.rs:838:26

Processor dissect: missing field: Body, at src/pipeline/src/etl/processor/dissect.rs:838:26

paomian (Contributor) commented Jan 15, 2025

I've made some modifications to this demo. I extracted only the Body field and wrote it to GreptimeDB, and it works fine.

Pipeline file:

processors:
transform:
  - fields:
      - Body,body
    type: string

Alloy config:

otelcol.exporter.otlphttp "greptimedb_logs" {
  client {
    endpoint = "${GREPTIME_SCHEME:=http}://${GREPTIME_HOST:=greptimedb}:${GREPTIME_PORT:=4000}/v1/otlp/"
    headers  = {
      "X-Greptime-DB-Name" = "${GREPTIME_DB:=public}",
      "x-greptime-log-table-name" = "alloy_meta_logs",
      // "x-greptime-log-extract-keys" = "hostname",
      "X-Greptime-Log-Pipeline-Name" = "test",
    }
    auth     = otelcol.auth.basic.credentials.handler
  }
}

I get this table in GreptimeDB:

mysql> select * from alloy_meta_logs\G
*************************** 1. row ***************************
              body: {"ts":"2025-01-15T04:18:04.605693944Z","level":"error","msg":"Exporting failed. Dropping data.","component_path":"/","component_id":"otelcol.exporter.otlphttp.greptimedb_logs","error":"not retryable error: Permanent error: rpc error: code = InvalidArgument desc = error exporting items, request to http://greptimedb:4000/v1/otlp/v1/logs responded with HTTP Status Code 400","dropped_items":1}

greptime_timestamp: 2025-01-15 04:18:04.605572
*************************** 2. row ***************************
              body: {"ts":"2025-01-15T04:18:19.321469329Z","level":"info","msg":"Done replaying WAL","component_path":"/","component_id":"prometheus.remote_write.metrics_service","subcomponent":"rw","remote_name":"d2df62","url":"http://greptimedb:4000/v1/prometheus/write?db=public","duration":17008218529}

greptime_timestamp: 2025-01-15 04:18:19.323454
2 rows in set (0.01 sec)

But one thing is not mentioned in the documentation: the key names used for OpenTelemetry logs are as follows (a sketch using these names follows the list):

  • Timestamp
  • ObservedTimestamp
  • TraceId
  • SpanId
  • TraceFlags
  • SeverityText
  • SeverityNumber
  • Body
  • ResourceSchemaUrl
  • ResourceAttributes
  • ScopeSchemaUrl
  • ScopeName
  • ScopeVersion
  • ScopeAttributes
  • LogAttributes
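
For example, the reporter's otelpipeline.yaml could be rewritten with these key names. This is a minimal, untested sketch: it assumes ResourceAttributes reaches the dissect processor as a JSON-encoded string (which the original pattern expects) and that Timestamp arrives as a Unix epoch in nanoseconds, as in OTLP; neither is confirmed in this thread.

# Sketch, not a verified fix: the reporter's pipeline with its field
# names replaced by the OTLP key names listed above.
processors:
  - dissect:
      # Assumption: ResourceAttributes is a JSON-encoded string here.
      fields:
        - ResourceAttributes        # was: resource_attributes
      patterns:
        - '{"k8s_pod_name":"%{k8s_pod_name}","k8s_namespace_name":"%{k8s_namespace_name}","k8s_container_name":"%{k8s_container_name}","podclusterid":"%{podclusterid}","type":"%{type}",*}'
      ignore_missing: true
transform:
  - fields:
      - k8s_pod_name
      - k8s_namespace_name
      - k8s_container_name
      - podclusterid
      - type
    type: string
    index: tag
  - fields:
      - Timestamp, timestamp        # was: timestamp
    # Assumption: OTLP Timestamp is epoch nanoseconds, so an epoch
    # transform rather than the original date-format parsing.
    type: epoch, ns
    index: timestamp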

Can you provide the binary data sent by the otelcollector? I'm not sure whether the problem lies in the log binary data being sent.
