[pkg/stanza] Adopt semantic convention for the log file path attribute (#37210)

#### Description

This PR adopts [the semantic convention for the log file path
attribute](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/logs.md#log-file),
which should be `attributes["log.file.path"]`.

It fixes the default value for the `recombine` operator's
`source_identifier`.
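
For illustration, a minimal sketch of a `filelog` receiver that recombines correctly with the fixed default; the include path and first-line pattern are invented for this example, and before this change the same config needed an explicit `source_identifier: attributes["log.file.path"]` override:

```yaml
receivers:
  filelog:
    include: [/var/log/app/*.log]  # illustrative path
    include_file_path: true        # emits attributes["log.file.path"]
    operators:
      - type: recombine
        combine_field: body
        is_first_entry: body matches "^Exception"  # illustrative pattern
        # source_identifier now defaults to attributes["log.file.path"],
        # matching the semantic convention, so no override is needed.
```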
douglascamata authored Feb 3, 2025
1 parent 68b24ea commit 9b520cc
Showing 8 changed files with 169 additions and 135 deletions.
28 changes: 28 additions & 0 deletions .chloggen/semantic_source_identifier.yaml
@@ -0,0 +1,28 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg/stanza

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Fix default source identifier in recombine operator

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [37210]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
  Its default value is now aligned with the semantic conventions: `attributes["log.file.path"]`
# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
32 changes: 16 additions & 16 deletions pkg/stanza/docs/operators/recombine.md
@@ -4,22 +4,22 @@

### Configuration Fields

| Field | Default | Description |
| --- | --- | --- |
| `id` | `recombine` | A unique identifier for the operator. |
| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries. |
| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](../types/on_error.md). |
| `is_first_entry` | | An [expression](../types/expression.md) that returns true if the entry being processed is the first entry in a multiline series. |
| `is_last_entry` | | An [expression](../types/expression.md) that returns true if the entry being processed is the last entry in a multiline series. |
| `combine_field` | required | The [field](../types/field.md) from all the entries that will be recombined. |
| `combine_with` | `"\n"` | The string that is put between the combined entries. This can be an empty string as well. When using special characters like `\n`, be sure to enclose the value in double quotes: `"\n"`. |
| `max_batch_size` | 1000 | The maximum number of consecutive entries that will be combined into a single entry. |
| `max_unmatched_batch_size` | 100 | The maximum number of consecutive entries that will be combined into a single entry before the match occurs (with `is_first_entry` or `is_last_entry`), e.g. `max_unmatched_batch_size=0` - all entries combined, `max_unmatched_batch_size=1` - all entries uncombined until the match occurs, `max_unmatched_batch_size=100` - entries combined into 100-entry-packages until the match occurs |
| `overwrite_with` | `newest` | Whether to use the fields from the `oldest` or the `newest` entry for all the fields that are not combined. |
| `force_flush_period` | `5s` | Flush timeout after which entries will be flushed aborting the wait for their sub parts to be merged with. |
| `source_identifier` | `$attributes["file.path"]` | The [field](../types/field.md) to separate one source of logs from others when combining them. |
| `source_identifier` | `attributes["log.file.path"]` | The [field](../types/field.md) to separate one source of logs from others when combining them. |
| `max_sources` | 1000 | The maximum number of unique sources allowed concurrently to be tracked for combining separately. |
| `max_log_size` | 0 | The maximum bytes size of the combined field. Once the size exceeds the limit, all received entries of the source will be combined and flushed. "0" of max_log_size means no limit. |

Exactly one of `is_first_entry` and `is_last_entry` must be specified.

3 changes: 2 additions & 1 deletion pkg/stanza/operator/input/file/input_test.go
@@ -13,6 +13,7 @@ import (
"github.com/stretchr/testify/require"

"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/entry"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/attrs"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/testutil"
)

@@ -62,7 +63,7 @@ func TestAddFileResolvedFields(t *testing.T) {

e := waitForOne(t, logReceived)
require.Equal(t, filepath.Base(symLinkPath), e.Attributes["log.file.name"])
require.Equal(t, symLinkPath, e.Attributes["log.file.path"])
require.Equal(t, symLinkPath, e.Attributes[attrs.LogFilePath])
require.Equal(t, filepath.Base(resolved), e.Attributes["log.file.name_resolved"])
require.Equal(t, resolved, e.Attributes["log.file.path_resolved"])
if runtime.GOOS != "windows" {
3 changes: 2 additions & 1 deletion pkg/stanza/operator/parser/container/config.go
@@ -12,14 +12,15 @@ import (

"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/entry"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/errors"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/attrs"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/transformer/recombine"
)

const (
operatorType = "container"
recombineSourceIdentifier = "log.file.path"
recombineSourceIdentifier = attrs.LogFilePath
recombineIsLastEntry = "attributes.logtag == 'F'"
removeOriginalTimeFieldFeatureFlag = "filelog.container.removeOriginalTimeField"
)
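
The `container` parser constructs this recombine operator internally, so the constant above controls how partial container log lines are grouped per file. A minimal, assumption-laden sketch of its use (the pod log glob is illustrative):

```yaml
receivers:
  filelog:
    include: [/var/log/pods/*/*/*.log]  # illustrative glob
    include_file_path: true  # required so attrs.LogFilePath is populated
    operators:
      - type: container  # internally recombines partial lines keyed on log.file.path
```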
3 changes: 2 additions & 1 deletion pkg/stanza/operator/parser/container/parser.go
@@ -17,6 +17,7 @@ import (

"github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal/timeutils"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/entry"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/attrs"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper"
)
@@ -30,7 +31,7 @@ const (
crioPattern = "^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
containerdPattern = "^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
logpathPattern = "^.*(\\/|\\\\)(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\\-]+)(\\/|\\\\)(?P<container_name>[^\\._]+)(\\/|\\\\)(?P<restart_count>\\d+)\\.log$"
logPathField = "log.file.path"
logPathField = attrs.LogFilePath
crioTimeLayout = "2006-01-02T15:04:05.999999999Z07:00"
goTimeLayout = "2006-01-02T15:04:05.999Z"
)
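
For illustration, a hypothetical containerd log path that `logpathPattern` would match, together with the named capture groups it would extract (all values invented):

```yaml
# path: /var/log/pods/kube-system_kube-proxy-abc12_0a1b2c3d-4e5f-6789-abcd-ef0123456789/kube-proxy/3.log
namespace: kube-system
pod_name: kube-proxy-abc12
uid: 0a1b2c3d-4e5f-6789-abcd-ef0123456789
container_name: kube-proxy
restart_count: "3"
```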