Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add readiness checks #652

Merged
merged 52 commits into from
Sep 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
52 commits
Select commit Hold shift + click to select a range
52b78ef
add readiness checks
ekneg54 Aug 21, 2024
ddbede8
add health endpoint to metrics asgi app
ekneg54 Aug 29, 2024
999aa5b
add health_check function
ekneg54 Aug 30, 2024
81ad55d
add ability to inject health_check_functions
ekneg54 Aug 30, 2024
2f1510d
add health method to all components
ekneg54 Aug 30, 2024
4993fd4
remove unused import
ekneg54 Aug 30, 2024
ada1b10
fix spelling
ekneg54 Aug 30, 2024
6d06d32
make health check the default for helm charts
ekneg54 Aug 30, 2024
284ed05
WIP: start adding health checks
ekneg54 Aug 30, 2024
ad982b4
fix tests for labeler and pseudonymizer
ekneg54 Sep 10, 2024
54b2ede
fix tests for pipeline_manager
ekneg54 Sep 10, 2024
f1415f8
add healthcheck for http_input
ekneg54 Sep 10, 2024
5aa9d0e
finish healthcheck for http_input
ekneg54 Sep 10, 2024
0b25b17
spelling
ekneg54 Sep 10, 2024
d8bb76e
add health_timeout
ekneg54 Sep 11, 2024
21fee87
add health checks
ekneg54 Sep 11, 2024
181e897
optimize asgi health app
ekneg54 Sep 11, 2024
0bf7b93
make kafka_config immutable
ekneg54 Sep 11, 2024
39c7fbf
deactivate testcontroller tests
ekneg54 Sep 11, 2024
46d24bf
skip controller test becaus it blocks
ekneg54 Sep 11, 2024
5a1ce82
add healthcheck for opensearch
ekneg54 Sep 11, 2024
4ec83af
add healthcheck on setup
ekneg54 Sep 11, 2024
57bb56b
test if setup calls waitforhealth
ekneg54 Sep 11, 2024
b781920
fix processor tests
ekneg54 Sep 11, 2024
7afeeca
fix tests
ekneg54 Sep 11, 2024
11df379
fix s3 tests
ekneg54 Sep 11, 2024
1f213cc
fix kafka tests
ekneg54 Sep 11, 2024
524a538
add test
ekneg54 Sep 12, 2024
3e1e61c
use exitcodes enum
ekneg54 Sep 12, 2024
21a9727
add is_running tests to exporter and add threadingserver tests
ekneg54 Sep 12, 2024
5263b24
fix tests for injecting health functions
ekneg54 Sep 12, 2024
ff876d7
revise inject behavior
ekneg54 Sep 12, 2024
0963479
implement restart behavior
ekneg54 Sep 12, 2024
d45f5c7
Refactor PrometheusExporter test to use uvicorn_config for server host
ekneg54 Sep 12, 2024
75f7478
Refactor ThreadingHTTPServer shut_down method to handle case when ser…
ekneg54 Sep 12, 2024
61eadd1
Remove setup method in ConfluentKafkaInput and ConfluentKafkaOutput
ekneg54 Sep 12, 2024
e95b920
Remove wait for health from setup method in Component class
ekneg54 Sep 12, 2024
1ba9f1a
Refactor ThreadingHTTPServer shut_down method to ensure graceful serv…
ekneg54 Sep 12, 2024
c8cceb0
implement update_healthchecks to prometheus_exporter
ekneg54 Sep 13, 2024
8b20a5e
fix pipeline_manager tests
ekneg54 Sep 13, 2024
cece1a3
fix pipeline_manager call order
ekneg54 Sep 13, 2024
32ab246
fix thread does not terminate sometimes
ekneg54 Sep 13, 2024
9c69e39
fix kafka tests
ekneg54 Sep 13, 2024
a181ea5
add comment for component setup method
ekneg54 Sep 13, 2024
ec78e39
add development example setup
ekneg54 Sep 13, 2024
8937904
move configuration to the top level of documentation
ekneg54 Sep 13, 2024
99165fd
add healthcheck documentation
ekneg54 Sep 13, 2024
36c98e4
update changelog
ekneg54 Sep 13, 2024
8e7a0e8
rewrite documentation intro
ekneg54 Sep 13, 2024
6c41bd1
rewrite documentation intro
ekneg54 Sep 13, 2024
db128f5
fix amides test
ekneg54 Sep 16, 2024
75cbedc
reactivate test and add suggestions from code review
ekneg54 Sep 19, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,11 +6,22 @@
* remove AutoRuleCorpusTester

### Features

* adds health check endpoint to metrics on path `/health`
* changes helm chart to use new readiness check
* adds `healthcheck_timeout` option to all components to tweak the timeout of healthchecks
* adds `desired_cluster_status` option to opensearch output to signal healthy cluster status
* initially run health checks on setup for every configured component
* make `imagePullPolicy` configurable for helm chart deployments


### Improvements

* remove AutoRuleCorpusTester
* adds support for rust extension development
* adds prebuild wheels for architectures `x86_64` and `aarch64` on `manylinux` and `musllinux` based linux platforms to releases
* add manual how to use local images with minikube example setup to documentation
* move `Configuration` to top level of documentation

### Bugfix

Expand Down
2 changes: 1 addition & 1 deletion charts/logprep/Chart.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: "13.2.3"
version: "13.3.0"

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
Expand Down
17 changes: 2 additions & 15 deletions charts/logprep/templates/deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ spec:
resources:
{{- toYaml .Values.resources | nindent 12 }}
image: {{ .Values.image.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag }}
imagePullPolicy: Always
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
{{- if .Values.exporter.enabled }}
- name: exporter
Expand Down Expand Up @@ -106,28 +106,15 @@ spec:
{{- if .Values.extraMounts }}
{{- toYaml .Values.extraMounts | nindent 12 }}
{{- end }}
{{- if or .Values.exporter.enabled (eq .Values.input.type "http_input") }}
{{- if eq .Values.input.type "http_input" }}
{{- if .Values.exporter.enabled }}
readinessProbe:
httpGet:
path: /health
port: {{ .Values.input.uvicorn_config.port }}
initialDelaySeconds: 5
timeoutSeconds: 10
periodSeconds: 5
failureThreshold: 3
{{- else }}
readinessProbe:
httpGet:
path: /metrics
port: {{ .Values.exporter.port }}
initialDelaySeconds: 5
timeoutSeconds: 10
periodSeconds: 5
failureThreshold: 3
{{- end }}
{{- end }}
{{- if .Values.exporter.enabled }}
startupProbe:
httpGet:
path: /metrics
Expand Down
1 change: 1 addition & 0 deletions charts/logprep/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ image:
registry: ghcr.io
repository: fkie-cad/logprep
tag: py3.11-stable
pullPolicy: Always

# The pod resources
resources:
Expand Down
48 changes: 37 additions & 11 deletions doc/source/examples/minikube.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,25 +15,19 @@ with the following commands:
.. code-block:: bash
:caption: Install package prerequisites

sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

.. code-block:: bash
:caption: Install minikube

sudo curl -Lo /usr/local/bin/minikube \
https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo curl -Lo /usr/local/bin/minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64

sudo chmod +x /usr/local/bin/minikube

.. code-block:: bash
:caption: Install kubectl

sudo curl -Lo /usr/local/bin/kubectl \
"https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo curl -Lo /usr/local/bin/kubectl "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"

sudo chmod +x /usr/local/bin/kubectl

Expand All @@ -56,8 +50,8 @@ with the following commands:
minikube config set driver docker
minikube config set cpus 16
minikube config set memory 16GB
minikube addons enable ingress
minikube start
minikube addons enable ingress

Deploy the example
------------------
Expand Down Expand Up @@ -125,4 +119,36 @@ Test the opensiem connector:
}
2024-07-17 11:15:35 301643 Generator INFO : Execution time: 0.067013 seconds

open your browser and go to `http://dashboards.opensiem`_ to see the generated data in the opensearch dashboards.
open your browser and go to `opensearch dashboard <http://dashboards.opensiem>`_ to see the generated data in the opensearch dashboards.


Use local container images
--------------------------

If you want to use local logprep container images, you can build the images with the following commands:

.. code-block:: bash
:caption: switch docker context to minikube in bash

eval $(minikube docker-env)

for powershell:

.. code-block:: powershell
:caption: switch docker context to minikube in powershell

(minikube docker-env).replace("export ", '$env:') | out-string | Invoke-Expression

Then build the logprep image with the following command:

.. code-block:: bash
:caption: build this image using the Dockerfile in the root of the repository

docker buildx build -t local/logprep:latest --build-arg PYTHON_VERSION=3.11 --build-arg LOGPREP_VERSION=dev .

Then install the opensiem example using the local logprep image:

.. code-block:: bash
:caption: use the local values file to deploy the opensiem example

helm install opensiem examples/k8s --values examples/k8s/values-dev.yaml
24 changes: 16 additions & 8 deletions doc/source/index.rst
Original file line number Diff line number Diff line change
@@ -1,19 +1,27 @@
========================================
Welcome to the Documentation of Logprep!
========================================
======================================
Logprep: The swiss army knife for logs
======================================

This is the documentation for Logprep. The swiss army knife for logs.
It provides tools for:

* **collection** of logs from various sources
* **normalization** via different processors
* **shipping** to different datalake targets
* **generation** of events for load testing
* **pseudonymization** and **depseudonymization** of fields in log data to comply with GDPR

and it is written in **Python**!

.. toctree::
:maxdepth: 2
:caption: Content:
:maxdepth: 3

installation
user_manual/index
configuration/index
development/index
examples/index

==================
Indices and Tables
==================

* :ref:`genindex`
* :ref:`modindex`
21 changes: 21 additions & 0 deletions doc/source/user_manual/execution.rst
Original file line number Diff line number Diff line change
Expand Up @@ -195,3 +195,24 @@ Exit Codes
:undoc-members:
:inherited-members:
:noindex:


Healthchecks
------------

Logprep provides a health endpoint which can be used to check the health of all components.
The asgi app for the healthcheck endpoint is implemented in :code:`logprep.metrics.exporter.make_patched_asgi_app` and
will be recreated on every restart of logprep (e.g. after a configuration change) or on creation of the first pipeline process.
The healthcheck endpoint is available at :code:`/health` if metrics are enabled and can be accessed via HTTP GET.

* On success, the healthcheck endpoint will return a :code:`200` status code and a payload :code:`OK`.
* On failure, the healthcheck endpoint will return a :code:`503` status code and a payload :code:`FAIL`.

Healthchecks are implemented in components via the :code:`health()` method. You have to ensure to call
the :code:`super.health()` method in new implemented health checks.
The health is checked for the first time after the first pipeline process is started and then every 5 seconds.
You can configure the healthcheck timeout on component level with the parameter :code:`health_timeout`.
The default value is 1 second.

Healthchecks are used in the provided helm charts as default for readiness probes.

1 change: 0 additions & 1 deletion doc/source/user_manual/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,5 +9,4 @@ User Manual
execution
verification
testing_rules
configuration/index
security_best_practices
4 changes: 2 additions & 2 deletions examples/exampledata/config/pipeline.yml
Original file line number Diff line number Diff line change
Expand Up @@ -114,13 +114,13 @@ output:
- 127.0.0.1:9200
default_index: processed
error_index: errors
message_backlog_size: 10000
message_backlog_size: 2500
timeout: 10000
flush_timeout: 60
max_retries: 3
parallel_bulk: false
user: admin
secret: admin
desired_cluster_status: ["green", "yellow"]
kafka:
type: confluentkafka_output
default: false
Expand Down
126 changes: 126 additions & 0 deletions examples/k8s/values-dev.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
connector:
image:
# point your docker context to minikube and build the image
# `eval $(minikube docker-env)` or `(minikube docker-env).replace("export ", '$env:') | out-string | Invoke-Expression`
# build this image using the Dockerfile in the root of the repository
# `docker buildx build -t local/logprep:latest --build-arg PYTHON_VERSION=3.11 --build-arg LOGPREP_VERSION=dev .`
registry: local
repository: logprep
tag: latest
pullPolicy: IfNotPresent
replicas: 1
secrets: {}
logger:
level: INFO
input:
type: http_input
message_backlog_size: 150000
collect_meta: True
metafield_name: "@metadata"
uvicorn_config:
host: 0.0.0.0
port: 9000
workers: 1
access_log: true
server_header: false
date_header: false
ws: none
interface: asgi3
backlog: 16384
timeout_keep_alive: 65
endpoints:
/auth-json: json
/json: json
/lab/123/(ABC|DEF)/pl.*: plaintext
/lab/123/ABC/auditlog: jsonl
/health: plaintext
output:
type: confluentkafka_output
topic: consumer
error_topic: errors
flush_timeout: 300
send_timeout: 0
kafka_config:
bootstrap.servers: opensiem-kafka:9092
compression.type: none
statistics.interval.ms: "60000"
queue.buffering.max.messages: "100000000"
queue.buffering.max.kbytes: "1048576"
queue.buffering.max.ms: "10000"
batch.size: "1000000"
request.required.acks: "-1"
ingress:
enabled: true

## for additional configurations see: `https://github.com/bitnami/charts/blob/main/bitnami/kafka/values.yaml`
kafka:
listeners:
client:
protocol: PLAINTEXT
controller:
replicaCount: 3
metrics:
jmx:
enabled: true
provisioning:
enabled: true
replicationFactor: 1
numPartitions: 10
topics:
- name: consumer

logprep:
image:
registry: local
repository: logprep
tag: latest
pullPolicy: IfNotPresent
logger:
level: INFO
input:
type: confluentkafka_input
topic: consumer
kafka_config:
bootstrap.servers: opensiem-kafka:9092
group.id: cgroup3
enable.auto.commit: "true"
auto.commit.interval.ms: "10000"
enable.auto.offset.store: "false"
queued.min.messages: "100000"
queued.max.messages.kbytes: "65536"
statistics.interval.ms: "60000"
preprocessing:
version_info_target_field: Logprep_version_info
log_arrival_time_target_field: event.ingested
hmac:
target: <RAW_MSG>
key: "thisisasecureandrandomkey"
output_field: Full_event
output:
type: opensearch_output
hosts:
- opensiem-opensearch:9200
default_index: processed
error_index: errors
timeout: 10000
message_backlog_size: 2500
parallel_bulk: true
flush_timeout: 60
max_retries: 3
chunk_size: 500
thread_count: 5
user: admin
secret: admin
desired_cluster_status: ["green", "yellow"]
## for additional configurations see: `https://github.com/bitnami/charts/blob/main/bitnami/opensearch/values.yaml`
opensearch:
dashboards:
enabled: true
ingest:
replicaCount: 1
master:
replicaCount: 1
data:
replicaCount: 1
coordinating:
replicaCount: 1
Loading
Loading