[apm] Update Applications nav subsection from "Services" to "Service inventory" (#4517)

* update references to services item in serverless side bar

* update references to services item in stateful side bar

* scale back serverless updates

(cherry picked from commit a24b240)

# Conflicts:
#	docs/en/serverless/apm/apm-find-transaction-latency-and-failure-correlations.asciidoc
#	docs/en/serverless/apm/apm-get-started.asciidoc
#	docs/en/serverless/apm/apm-ui-services.asciidoc
colleenmcginnis authored and mergify[bot] committed Nov 8, 2024
1 parent 8eb38f6 commit 0d50603
Showing 5 changed files with 352 additions and 2 deletions.
@@ -36,7 +36,7 @@ and enable the **Service groups feature**.

To create a service group:

-. Navigate to **Observability** → **Applications** → **Services**.
+. Navigate to **Observability** → **Applications** → **Service inventory**.
. Switch to **Service groups**.
. Click **Create group**.
. Specify a name, color, and description.
@@ -210,7 +210,7 @@ kubectl apply -f demo.yml
[discrete]
=== View your traces in {kib}

-To view your application's trace data, open {kib} and go to *{observability} → Services*.
+To view your application's trace data, open {kib} and go to *{observability} → Service inventory*.

The Applications UI allows you to monitor your software services and applications in real-time:
visualize detailed performance information on your services, identify and analyze errors,
@@ -0,0 +1,99 @@
[[observability-apm-find-transaction-latency-and-failure-correlations]]
= Find transaction latency and failure correlations

// :keywords: serverless, observability, how-to

preview:[]

Correlations surface attributes of your data that are potentially correlated
with high-latency or erroneous transactions. For example, if you are a site
reliability engineer who is responsible for keeping production systems up and
running, you want to understand what is causing slow transactions. Identifying
attributes that are responsible for higher latency transactions can potentially
point you toward the root cause. You may find a correlation with a particular
piece of hardware, like a host or pod. Or, perhaps a set of users, based on IP
address or region, is facing increased latency due to local data center issues.

To find correlations:

. In your {obs-serverless} project, go to **Applications** → **Service Inventory**.
. Select a service.
. Select the **Transactions** tab.
. Select a transaction group in the **Transactions** table.

[NOTE]
====
Active queries _are_ applied to correlations.
====

[discrete]
[[observability-apm-find-transaction-latency-and-failure-correlations-find-high-transaction-latency-correlations]]
== Find high transaction latency correlations

The correlations on the **Latency correlations** tab help you discover which
attributes are contributing to increased transaction latency.

[role="screenshot"]
image::images/transactions/correlations-hover.png[APM latency correlations]

The progress bar indicates the status of the asynchronous analysis, which
performs statistical searches across a large number of attributes. For large
time ranges and services with high transaction throughput, this might take some
time. To improve performance, reduce the time range.

The latency distribution chart visualizes the overall latency of the
transactions in the transaction group. If there are attributes that have a
statistically significant correlation with slow response times, they are listed
in a table below the chart. The table is sorted by correlation coefficients that
range from 0 to 1. Attributes with higher correlation values are more likely to
contribute to high latency transactions. By default, the attribute with the
highest correlation value is added to the chart. To see the latency distribution
for other attributes, select their row in the table.
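
As a rough intuition for how an attribute value can be scored against latency, the following Python sketch ranks attribute values by the Pearson correlation between a 0/1 "has this value" indicator and transaction latency, clamped to the 0 to 1 range. This is an illustration only: the statistic that the Applications UI actually computes is not described here, and the `latency_ms` field and the `pearson` and `rank_attributes` helpers are hypothetical names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_attributes(transactions, fields):
    """Score each (field, value) pair against latency and sort from highest to lowest."""
    latencies = [t["latency_ms"] for t in transactions]
    scores = {}
    for field in fields:
        for value in {t.get(field) for t in transactions}:
            # 1.0 where the transaction carries this value, 0.0 otherwise.
            indicator = [1.0 if t.get(field) == value else 0.0 for t in transactions]
            # Negative correlations (faster than average) are clamped to 0.
            scores[(field, value)] = max(0.0, pearson(indicator, latencies))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: transactions on host "b" are much slower.
transactions = [
    {"host": "a", "region": "eu", "latency_ms": 100},
    {"host": "a", "region": "us", "latency_ms": 110},
    {"host": "b", "region": "eu", "latency_ms": 900},
    {"host": "b", "region": "us", "latency_ms": 950},
]

ranked = rank_attributes(transactions, ["host", "region"])
for (field, value), score in ranked:
    print(f"{field}={value}: {score:.3f}")
```

In this sample, `host=b` ranks first with a score close to 1, which mirrors how the table places the most strongly correlated attribute at the top.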

If a correlated attribute seems noteworthy, use the **Filter** quick links:

* `+` creates a new query in the Applications UI for filtering transactions containing
the selected value.
* `-` creates a new query in the Applications UI to filter out transactions containing
the selected value.

You can also click the icon beside the field name to view and filter its most
popular values.

In this example screenshot, there are transactions that are skewed to the right
with slower response times than the overall latency distribution. If you select
the `+` filter in the appropriate row of the table, it creates a new query in
the Applications UI for transactions with this attribute. With the "noise" now
filtered out, you can begin viewing sample traces to continue your investigation.

[discrete]
[[correlations-error-rate]]
== Find failed transaction correlations

The correlations on the **Failed transaction correlations** tab help you discover
which attributes are most influential in distinguishing between transaction
failures and successes. In this context, the success or failure of a transaction
is determined by its {ecs-ref}/ecs-event.html#field-event-outcome[event.outcome]
value. For example, APM agents set the `event.outcome` to `failure` when an HTTP
transaction returns a `5xx` status code.
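
The `5xx` rule above can be sketched in a few lines of Python. This is a simplified illustration, not agent code: the `event_outcome` and `split_by_outcome` helpers and the `status` field are hypothetical, and treating every status outside the `5xx` range as `success` is an assumption made for this example.

```python
def event_outcome(http_status: int) -> str:
    """Return "failure" for 5xx responses and "success" otherwise (assumed rule)."""
    return "failure" if 500 <= http_status <= 599 else "success"


def split_by_outcome(transactions):
    """Partition transactions into failures and successes by their HTTP status."""
    groups = {"failure": [], "success": []}
    for transaction in transactions:
        groups[event_outcome(transaction["status"])].append(transaction)
    return groups


# Hypothetical sample transactions.
sample = [
    {"name": "GET /cart", "status": 200},
    {"name": "POST /checkout", "status": 503},
    {"name": "GET /missing", "status": 404},
]

groups = split_by_outcome(sample)
print(len(groups["failure"]), "failed,", len(groups["success"]), "succeeded")
```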

The chart highlights the failed transactions in the overall latency distribution
for the transaction group. If there are attributes that have a statistically
significant correlation with failed transactions, they are listed in a table.
The table is sorted by scores, which are mapped to high, medium, or low impact
levels. Attributes with high impact levels are more likely to contribute to
failed transactions. By default, the attribute with the highest score is added
to the chart. To see a different attribute in the chart, select its row in the
table.

For example, in the screenshot below, there are attributes such as a specific
node and pod name that have medium impact on the failed transactions.

[role="screenshot"]
image::images/correlations/correlations-failed-transactions.png[Failed transaction correlations]

Select the `+` filter to create a new query in the Applications UI for transactions
with one or more of these attributes. If you are unfamiliar with a field, click
the icon beside its name to view its most popular values and optionally filter
on those values too. Each attribute you add filters out more noise and brings
you closer to a diagnosis.
186 changes: 186 additions & 0 deletions docs/en/serverless/apm/apm-get-started.asciidoc
@@ -0,0 +1,186 @@
[[observability-apm-get-started]]
= Get started with traces and APM

// :description: Learn how to collect Application Performance Monitoring (APM) data and visualize it in real time.
// :keywords: serverless, observability, how-to

preview:[]

:role: Admin
:goal: send APM data to Elastic
include::../partials/roles.asciidoc[]
:role!:

:goal!:

In this guide you'll learn how to collect and send Application Performance Monitoring (APM) data
to Elastic, then explore and visualize the data in real time.

[discrete]
[[add-apm-integration-agents]]
== Step 1: Add data

You'll use APM agents to send APM data from your application to Elastic. Elastic offers APM agents
written in several languages and supports OpenTelemetry. Which agent you'll use depends on the language used in your service.

To send APM data to Elastic, you must install an APM agent and configure it to send data to
your project:

. <<observability-create-an-observability-project,Create a new {obs-serverless} project>>, or open an existing one.
. To install and configure one or more APM agents, do one of the following:
+
** In your Observability project, go to **Add data** → **Monitor my application performance** → **Elastic APM** and follow the prompts.
** Use the following instructions:
+
--
++++
<div class="tabs" data-tab-group="apm-apm-get-started">
<div role="tablist" aria-label="apm-apm-get-started">
<button role="tab" aria-selected="true" aria-controls="apm-apm-get-started-go-panel" id="apm-apm-get-started-go-button">
Go
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-java-panel" id="apm-apm-get-started-java-button" tabindex="-1">
Java
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-net-panel" id="apm-apm-get-started-net-button" tabindex="-1">
.NET
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-nodejs-panel" id="apm-apm-get-started-nodejs-button" tabindex="-1">
Node.js
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-php-panel" id="apm-apm-get-started-php-button" tabindex="-1">
PHP
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-python-panel" id="apm-apm-get-started-python-button" tabindex="-1">
Python
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-ruby-panel" id="apm-apm-get-started-ruby-button" tabindex="-1">
Ruby
</button>
<button role="tab" aria-selected="false" aria-controls="apm-apm-get-started-opentelemetry-panel" id="apm-apm-get-started-opentelemetry-button" tabindex="-1">
OpenTelemetry
</button>
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-go-panel" aria-labelledby="apm-apm-get-started-go-button">
++++
include::../transclusion/apm/guide/install-agents/go.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-java-panel" aria-labelledby="apm-apm-get-started-java-button" hidden="">
++++
include::../transclusion/apm/guide/install-agents/java.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-net-panel" aria-labelledby="apm-apm-get-started-net-button" hidden="">
++++
include::../transclusion/apm/guide/install-agents/net.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-nodejs-panel" aria-labelledby="apm-apm-get-started-nodejs-button" hidden="">
++++
include::../transclusion/apm/guide/install-agents/node.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-php-panel" aria-labelledby="apm-apm-get-started-php-button" hidden="">
++++
include::../transclusion/apm/guide/install-agents/php.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-python-panel" aria-labelledby="apm-apm-get-started-python-button" hidden="">
++++
include::../transclusion/apm/guide/install-agents/python.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-ruby-panel" aria-labelledby="apm-apm-get-started-ruby-button" hidden="">
++++
include::../transclusion/apm/guide/install-agents/ruby.asciidoc[]

++++
</div>
<div tabindex="0" role="tabpanel" id="apm-apm-get-started-opentelemetry-panel" aria-labelledby="apm-apm-get-started-opentelemetry-button" hidden="">
++++
include::../transclusion/apm/guide/open-telemetry/otel-get-started.asciidoc[]

++++
</div>
</div>
++++
--
+
While there are many configuration options, all APM agents require:
+
|===
| Option | Description

| **Service name**
a| The APM integration maps an instrumented service's name — defined in
each {apm-agent}'s configuration — to the index where its data is stored.
Service names are case-insensitive and must be unique.

For example, you cannot have a service named `Foo` and another named `foo`.
Special characters will be removed from service names and replaced with underscores (`_`).

| **Server URL**
a| The host and port that the managed intake service listens for events on.

To find the URL for your project:

. Go to the https://cloud.elastic.co/[Cloud console].
. Next to your project, select **Manage**.
. Next to _Endpoints_, select **View**.
. Copy the _APM endpoint_.

| **API key**
a| Authentication method for communication between {apm-agent} and the managed intake service.

You can create and delete API keys in Applications Settings:

. Go to any page in the _Applications_ section of the main menu.
. Click **Settings** in the top bar.
. Go to the **Agent keys** tab.

| **Environment**
a| The name of the environment this service is deployed in, for example "production" or "staging".

Environments allow you to easily filter data on a global level in the UI.
It's important to be consistent when naming environments across agents.
|===
. If you're using the step-by-step instructions in the UI, after you've installed and configured an agent,
you can click **Check Agent Status** to verify that the agent is sending data.

To learn more about APM agents, including how to fine-tune how agents send traces to Elastic,
refer to <<observability-apm-send-data-to-elastic>>.
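
Before starting an instrumented service, it can help to confirm that the four required options from the table in Step 1 are set. The following Python sketch reads them from environment variables and fails fast if any are missing. The variable names follow the `ELASTIC_APM_*` convention used by Elastic APM agents, but confirm the exact names in your agent's configuration reference. The `load_required_options` helper is hypothetical and not part of any agent.

```python
import os

# Assumed mapping from the required options to ELASTIC_APM_* environment variables.
REQUIRED_ENV_VARS = {
    "service_name": "ELASTIC_APM_SERVICE_NAME",
    "server_url": "ELASTIC_APM_SERVER_URL",
    "api_key": "ELASTIC_APM_API_KEY",
    "environment": "ELASTIC_APM_ENVIRONMENT",
}


def load_required_options(env=None):
    """Return the required options as a dict, or raise ValueError listing what is missing."""
    env = os.environ if env is None else env
    options = {}
    missing = []
    for option, variable in REQUIRED_ENV_VARS.items():
        value = env.get(variable, "").strip()
        if value:
            options[option] = value
        else:
            missing.append(variable)
    if missing:
        raise ValueError("Missing APM settings: " + ", ".join(missing))
    return options


# Placeholder values for illustration only.
example_env = {
    "ELASTIC_APM_SERVICE_NAME": "checkout",
    "ELASTIC_APM_SERVER_URL": "https://example.invalid:443",
    "ELASTIC_APM_API_KEY": "placeholder-key",
    "ELASTIC_APM_ENVIRONMENT": "staging",
}
options = load_required_options(example_env)
print(sorted(options))
```

Passing a plain dictionary, as in the example, keeps the check easy to test. Calling `load_required_options()` with no arguments reads from the process environment instead.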

[discrete]
[[view-apm-integration-data]]
== Step 2: View your data

After one or more APM agents are installed and successfully sending data, you can view
application performance monitoring data in the UI.

In the _Applications_ section of the main menu, select **Service Inventory**.
This will show a high-level overview of the health and general performance of all your services.

Learn more about visualizing APM data in <<observability-apm-view-and-analyze-traces>>.

// TO DO: ADD SCREENSHOT

[TIP]
====
Not seeing any data? Find helpful tips in <<observability-apm-troubleshooting,Troubleshooting>>.
====

[discrete]
[[observability-apm-get-started-next-steps]]
== Next steps

Now that data is streaming into your project, take your investigation to a
deeper level. Learn how to use <<observability-apm-view-and-analyze-traces,Elastic's built-in visualizations for APM data>>,
<<observability-alerting,alert on APM data>>,
or <<observability-apm-send-data-to-elastic,fine-tune how agents send traces to Elastic>>.
65 changes: 65 additions & 0 deletions docs/en/serverless/apm/apm-ui-services.asciidoc
@@ -0,0 +1,65 @@
[[observability-apm-services]]
= Services

// :keywords: serverless, observability, reference

preview:[]

The **Services** inventory provides a quick, high-level overview of the health and general
performance of all instrumented services.

To help surface potential issues, services are sorted by their health status:
**critical** → **warning** → **healthy** → **unknown**.
Health status is powered by <<observability-apm-integrate-with-machine-learning,machine learning>>
and requires anomaly detection to be enabled.

In addition to health status, active alerts for each service are prominently displayed in the service inventory table. Selecting an active alert badge brings you to the **Alerts** tab where you can learn more about the active alert and take action.

[role="screenshot"]
image::images/services/apm-services-overview.png[Example view of the services table in the Applications UI]

[discrete]
[[observability-apm-services-service-groups]]
== Service groups

:role: Editor
:goal: create and manage service groups
include::../partials/roles.asciidoc[]
:role!:

:goal!:

:feature: Service grouping
include::../partials/feature-beta.asciidoc[]
:feature!:

Group services together to build meaningful views that remove noise, simplify investigations across services,
and combine related alerts.

// This screenshot is reused in the alerts docs

// Ensure it has an active alert showing

[role="screenshot"]
image::images/services/apm-service-group.png[Example view of service group in the Applications UI]

To create a service group:

. In your {obs-serverless} project, go to **Applications** → **Service Inventory**.
. Switch to **Service groups**.
. Click **Create group**.
. Specify a name, color, and description.
. Click **Select services**.
. Specify a {kibana-ref}/kuery-query.html[Kibana Query Language (KQL)] query to filter services
by one or more of the following dimensions: `agent.name`, `service.name`, `service.language.name`,
`service.environment`, `labels.<xyz>`. Services that match the query within the last 24 hours will
be assigned to the group.

[discrete]
[[observability-apm-services-examples]]
=== Examples

Not sure where to get started? Here are some sample queries you can build from:

* **Group services by environment**: To group "production" services, use `service.environment : "production"`.
* **Group services by name**: To group all services that end in "beat", use `service.name : *beat`. This will match services named "Auditbeat", "Heartbeat", "Filebeat", and so on.
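
To see which services the sample queries above would select, the following Python sketch emulates a single `field : pattern` clause with shell-style wildcard matching. It is not a KQL parser and does not call Kibana. The `matches` and `group_services` helpers and the sample service list are hypothetical.

```python
from fnmatch import fnmatchcase


def matches(service: dict, field: str, pattern: str) -> bool:
    """Return True if the service has the field and its value matches the wildcard pattern."""
    value = service.get(field)
    return value is not None and fnmatchcase(value, pattern)


def group_services(services, field, pattern):
    """Return the names of all services that match one field/pattern clause."""
    return [s["service.name"] for s in services if matches(s, field, pattern)]


# Hypothetical services, keyed by the same field names the queries use.
services = [
    {"service.name": "Auditbeat", "service.environment": "production"},
    {"service.name": "Heartbeat", "service.environment": "staging"},
    {"service.name": "Filebeat", "service.environment": "production"},
    {"service.name": "checkout", "service.environment": "production"},
]

by_name = group_services(services, "service.name", "*beat")
by_environment = group_services(services, "service.environment", "production")
print(by_name)
print(by_environment)
```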
