
Do research on telemetry collectors and how we can use them, and if we are missing to send some events #2649

Open
Tracked by #2546
dborovcanin opened this issue Jan 17, 2025 · 1 comment
@dborovcanin (Collaborator)
No description provided.

@github-project-automation github-project-automation bot moved this to ⛏ Backlog in SuperMQ Jan 17, 2025
@felixgateru felixgateru moved this from ⛏ Backlog to 🚧 In Progress in SuperMQ Jan 20, 2025
@felixgateru felixgateru moved this from 🚧 In Progress to 🩺 Review and testing in SuperMQ Jan 27, 2025
@dborovcanin dborovcanin moved this from 🩺 Review and testing to ⛏ Backlog in SuperMQ Jan 29, 2025
@felixgateru felixgateru moved this from ⛏ Backlog to 🚧 In Progress in SuperMQ Jan 30, 2025
@felixgateru (Contributor)

felixgateru commented Feb 6, 2025

Client Telemetry Processing and Aggregation

Current Telemetry Fields

Client telemetry includes the following fields:

  • First Seen: Updated at client creation.
  • Last Seen: Updated when the client publishes or subscribes to a topic.
  • Inbound Messages: Updated when a client publishes a message.
    • Increases the count of inbound messages for the publishing client.
  • Outbound Messages: Updated when a client publishes a message.
    • Increases the count of outbound messages for each client subscribed to the publisher’s topic.
  • Subscriptions: Managed via subscribe and unsubscribe events from messaging.

Storage and Querying

Telemetry storage and querying are currently handled by Postgres.
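As a rough sketch, the Postgres handling presumably amounts to an upsert per messaging event; the table and column names below are assumptions, not the actual SuperMQ schema:

```sql
-- Hypothetical telemetry table; actual schema may differ.
INSERT INTO clients_telemetry (client_id, domain_id, first_seen, last_seen, inbound_messages)
VALUES ($1, $2, now(), now(), 1)
ON CONFLICT (client_id) DO UPDATE
SET last_seen        = now(),
    inbound_messages = clients_telemetry.inbound_messages + 1;
```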


Investigated Alternatives

1. Prometheus

Prometheus stores all data as time series, a stream of timestamped values for the same metric.

Integration Process

  1. Expose Metrics Endpoint: Metrics must be exposed via an HTTP endpoint for Prometheus to scrape.
  2. Define Metrics in Prometheus: Each telemetry field is defined as a Prometheus metric type.
  3. Scrape Configuration: Prometheus is configured to scrape the service for the metrics.

Prometheus Metrics Mapping

Client Telemetry Field   Prometheus Metric Type
inbound_messages         Counter
outbound_messages        Counter
first_seen               Gauge
last_seen                Gauge
subscriptions            Gauge

Go Integration

Prometheus provides a Go client library for defining metrics.

Querying

Prometheus queries are written in PromQL. It supports selecting and aggregating time series data.
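For example, aggregating per-client message throughput might look like the following (metric names assumed from the mapping above):

```promql
# Per-client inbound message rate over 5 minutes, summed across instances
sum by (client_id) (rate(inbound_messages_total[5m]))

# Number of clients seen in the last hour
count(time() - last_seen_timestamp_seconds < 3600)
```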

Advantages

  1. Supports real-time monitoring and alerting.
  2. Supports high-cardinality metrics.

Disadvantages

  1. Lacks long-term data storage, so a Postgres instance would still need to be maintained.
  2. Eventually consistent.
  3. Lacks relational queries, so aggregation may be difficult.

2. Elasticsearch

Elasticsearch stores data in the form of JSON documents, making schema design straightforward.

Suggested Schema

```json
{
  "mappings": {
    "properties": {
      "client_id": { "type": "keyword" },
      "domain_id": { "type": "keyword" },
      "inbound_messages": { "type": "long" },
      "outbound_messages": { "type": "long" },
      "first_seen": { "type": "date" },
      "last_seen": { "type": "date" }
    }
  }
}
```

Integration Process

  1. Defining the data structure
  2. Setting up Elasticsearch
  3. Data migration
  4. Setting up API calls to Elasticsearch
  5. Setting up queries for aggregation
  6. Exporting data for analytics

Querying

Elasticsearch uses Query DSL (based on JSON) for defining queries. These queries can be exposed as endpoints.
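As a sketch, an aggregation summing inbound messages per domain could be expressed in Query DSL as follows (the index name is assumed; field names follow the suggested schema above):

```json
POST /client_telemetry/_search
{
  "size": 0,
  "aggs": {
    "per_domain": {
      "terms": { "field": "domain_id" },
      "aggs": {
        "total_inbound": { "sum": { "field": "inbound_messages" } }
      }
    }
  }
}
```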

Advantages

  1. Powerful search and aggregations
  2. Time series data support
  3. Near real time data availability

Disadvantages

  1. Commonly flagged as resource-intensive.
  2. Eventually consistent, unlike the strongly consistent Postgres.

3. Apache Druid

Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics on large datasets. Its list of common application areas includes the kind of telemetry aggregation we are trying to achieve.

Suggested Schema

Similar to Elasticsearch

Integration Process

  1. Defining the data structure for telemetry fields in Druid
  2. Setting up Apache Druid
  3. Setting up an ingestion method, supports both real time streaming and batch ingestion
  4. Setting up update and insert queries
  5. Exporting data for analytics
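Druid also exposes a SQL layer, so the aggregation queries in step 4 could be sketched as follows (datasource and column names are assumed to mirror the schema above):

```sql
SELECT domain_id,
       SUM(inbound_messages) AS total_inbound,
       MAX(last_seen)        AS last_activity
FROM client_telemetry
GROUP BY domain_id;
```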

Advantages

  1. Flexible ingestion, supporting both real-time streaming and batch modes.
  2. Flexible queries, with targeted optimization for aggregation and reporting.
  3. Support for high-cardinality data columns.

Disadvantages

  1. Not suitable for low-latency updates of existing records using a primary key.

Reference Links

  1. https://medium.com/@reshra3893/a-beginners-guide-to-elasticsearch-queries-d0205512de2d
  2. https://prometheus.io/docs/prometheus/latest/querying/basics/
  3. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
  4. https://druid.apache.org/docs/latest/design/

@felixgateru felixgateru moved this from 🚧 In Progress to 🩺 Review and testing in SuperMQ Feb 6, 2025
@felixgateru felixgateru moved this from 🩺 Review and testing to 🚧 In Progress in SuperMQ Feb 6, 2025
@felixgateru felixgateru moved this from 🚧 In Progress to ⛏ Backlog in SuperMQ Feb 7, 2025
@dborovcanin dborovcanin self-assigned this Feb 12, 2025