From 8c8a6672afccb327840488aca12ba46f8b1e56bd Mon Sep 17 00:00:00 2001
From: Marco Pracucci
Date: Thu, 30 Dec 2021 13:05:11 +0100
Subject: [PATCH 1/3] Release 1.10.0

Signed-off-by: Marco Pracucci
---
 CHANGELOG.md            | 3 +++
 cortex/images.libsonnet | 6 +++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3032d94b..e17abf50 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,9 @@
   `-blocks-storage.bucket-store.chunks-cache.memcached.max-idle-connections`,
   `-blocks-storage.bucket-store.metadata-cache.memcached.max-idle-connections` to 100 #414
 * [CHANGE] Update grafana-builder dependency: use $__rate_interval in qpsPanel and latencyPanel. #372
+
+## 1.10.0 / 2021-12-30
+
 * [CHANGE] `namespace` template variable in dashboards now only selects namespaces for selected clusters. #311
 * [CHANGE] Alertmanager: mounted overrides configmap to alertmanager too. #315
 * [CHANGE] Memcached: upgraded memcached from `1.5.17` to `1.6.9`. #316
diff --git a/cortex/images.libsonnet b/cortex/images.libsonnet
index 87a9dc61..ff039997 100644
--- a/cortex/images.libsonnet
+++ b/cortex/images.libsonnet
@@ -5,7 +5,7 @@
     memcachedExporter: 'prom/memcached-exporter:v0.6.0',

     // Our services.
-    cortex: 'cortexproject/cortex:v1.9.0',
+    cortex: 'cortexproject/cortex:v1.10.0',

     alertmanager: self.cortex,
     distributor: self.cortex,
@@ -20,7 +20,7 @@
     query_scheduler: self.cortex,

     cortex_tools: 'grafana/cortex-tools:v0.4.0',
-    query_tee: 'quay.io/cortexproject/query-tee:v1.9.0',
-    testExporter: 'cortexproject/test-exporter:v1.9.0',
+    query_tee: 'quay.io/cortexproject/query-tee:v1.10.0',
+    testExporter: 'cortexproject/test-exporter:v1.10.0',
   },
 }

From e3f5fe54e26628859a185dbe6e9e8509abd7cf1c Mon Sep 17 00:00:00 2001
From: Marco Pracucci
Date: Thu, 30 Dec 2021 13:18:51 +0100
Subject: [PATCH 2/3] Release 1.11.0

Signed-off-by: Marco Pracucci
---
 CHANGELOG.md            | 52 +++++++++++++++++++++++++++++++++++++++++
 cortex/images.libsonnet |  6 ++---
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e17abf50..930fc0c2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,8 @@

 ## master / unreleased

+## 1.11.0 / 2021-12-30
+
 * [CHANGE] Store gateway: set `-blocks-storage.bucket-store.index-cache.memcached.max-get-multi-concurrency`,
   `-blocks-storage.bucket-store.chunks-cache.memcached.max-get-multi-concurrency`,
   `-blocks-storage.bucket-store.metadata-cache.memcached.max-get-multi-concurrency`,
@@ -9,6 +11,56 @@
   `-blocks-storage.bucket-store.chunks-cache.memcached.max-idle-connections`,
   `-blocks-storage.bucket-store.metadata-cache.memcached.max-idle-connections` to 100 #414
 * [CHANGE] Update grafana-builder dependency: use $__rate_interval in qpsPanel and latencyPanel. #372
+* [CHANGE] Decreased `-server.grpc-max-concurrent-streams` from 100k to 10k. #369
+* [CHANGE] Decreased blocks storage ingesters graceful termination period from 80m to 20m. #369
+* [CHANGE] Changed default `job_names` for query-frontend, query-scheduler and querier to match custom deployments too. #376
+* [CHANGE] Increase the rules per group and rule groups limits on different tiers. #396
+* [CHANGE] Removed `max_samples_per_query` limit, since it only works with chunks and only when using `-distributor.shard-by-all-labels=false`. #397
+* [CHANGE] Removed chunks storage query sharding config support. The following config options have been removed: #398
+  * `_config` > `queryFrontend` > `shard_factor`
+  * `_config` > `queryFrontend` > `sharded_queries_enabled`
+  * `_config` > `queryFrontend` > `query_split_factor`
+* [CHANGE] Split `cortex_api` recording rule group into three groups. This is a workaround for large clusters where this group can become slow to evaluate. #401
+* [CHANGE] Increased `CortexIngesterReachingSeriesLimit` warning threshold from 70% to 80% and critical threshold from 85% to 90%. #404
+* [CHANGE] Rename ruler_s3_bucket_name and ruler_gcs_bucket_name to ruler_storage_bucket_name: #415
+* [CHANGE] Fine-tuned rolling update policy for distributor, querier, query-frontend, query-scheduler. #420
+* [CHANGE] Increased memcached metadata/chunks/index-queries max connections from 4k to 16k. #420
+* [CHANGE] Disabled step alignment in query-frontend to be compliant with PromQL. #420
+* [CHANGE] Do not limit compactor CPU and request a number of cores equal to the configured concurrency. #420
+* [ENHANCEMENT] Add overrides config to compactor. This allows setting retention configs per user. #386
+* [ENHANCEMENT] Added 256MB memory ballast to querier. #369
+* [ENHANCEMENT] Update gsutil command for `not healthy index found` playbook #370
+* [ENHANCEMENT] Update `etcd-operator` to latest version (see https://github.com/grafana/jsonnet-libs/pull/480). #263
+* [ENHANCEMENT] Added Alertmanager alerts and playbooks covering configuration syncs and sharding operation: #377 #378
+  * `CortexAlertmanagerSyncConfigsFailing`
+  * `CortexAlertmanagerRingCheckFailing`
+  * `CortexAlertmanagerPartialStateMergeFailing`
+  * `CortexAlertmanagerReplicationFailing`
+  * `CortexAlertmanagerPersistStateFailing`
+  * `CortexAlertmanagerInitialSyncFailed`
+* [ENHANCEMENT] Add support for Azure storage in Alertmanager configuration. #381
+* [ENHANCEMENT] Add support for running Alertmanager in sharding mode. #394
+* [ENHANCEMENT] Allow to customize PromQL engine settings via `queryEngineConfig`. #399
+* [ENHANCEMENT] Add recording rules to improve responsiveness of Alertmanager dashboard. #387
+* [ENHANCEMENT] Add `CortexRolloutStuck` alert. #405
+* [ENHANCEMENT] Added `CortexKVStoreFailure` alert. #406
+* [ENHANCEMENT] Use configured `ruler` jobname for ruler dashboard panels. #409
+* [ENHANCEMENT] Add ability to override `datasource` for generated dashboards. #407
+* [ENHANCEMENT] Use alertmanager jobname for alertmanager dashboard panels #411
+* [ENHANCEMENT] Added `CortexDistributorReachingInflightPushRequestLimit` alert. #408
+* [ENHANCEMENT] Define Azure object storage ruler args. #416
+* [ENHANCEMENT] Added the following config options to allow to schedule multiple replicas of the same service on the same node: #418
+  * `cortex_distributor_allow_multiple_replicas_on_same_node`
+  * `cortex_ruler_allow_multiple_replicas_on_same_node`
+  * `cortex_querier_allow_multiple_replicas_on_same_node`
+  * `cortex_query_frontend_allow_multiple_replicas_on_same_node`
+* [BUGFIX] Fixed rollout progress dashboard to include query-scheduler too. #376
+* [BUGFIX] Fixed `-distributor.extend-writes` setting on ruler when `unregister_ingesters_on_shutdown` is disabled. #369
+* [BUGFIX] Upstream recording rule `node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate` renamed. #379
+* [BUGFIX] Treat `compactor_blocks_retention_period` type as string rather than int.#395
+* [BUGFIX] Fixed writes/reads/alertmanager resources dashboards to use `$._config.job_names.gateway`. #403
+* [BUGFIX] Span the annotation.message in alerts as YAML multiline strings. #412
+* [BUGFIX] Pass `-ruler-storage.s3.endpoint` to ruler when using S3. #421

 ## 1.10.0 / 2021-12-30

diff --git a/cortex/images.libsonnet b/cortex/images.libsonnet
index ff039997..1eb891c4 100644
--- a/cortex/images.libsonnet
+++ b/cortex/images.libsonnet
@@ -5,7 +5,7 @@
     memcachedExporter: 'prom/memcached-exporter:v0.6.0',

     // Our services.
-    cortex: 'cortexproject/cortex:v1.10.0',
+    cortex: 'cortexproject/cortex:v1.11.0',

     alertmanager: self.cortex,
     distributor: self.cortex,
@@ -20,7 +20,7 @@
     query_scheduler: self.cortex,

     cortex_tools: 'grafana/cortex-tools:v0.4.0',
-    query_tee: 'quay.io/cortexproject/query-tee:v1.10.0',
-    testExporter: 'cortexproject/test-exporter:v1.10.0',
+    query_tee: 'quay.io/cortexproject/query-tee:v1.11.0',
+    testExporter: 'cortexproject/test-exporter:v1.11.0',
   },
 }

From 27b0c6f250b7d47dfc8c83040af9699f4935e0c8 Mon Sep 17 00:00:00 2001
From: Marco Pracucci
Date: Thu, 30 Dec 2021 13:20:42 +0100
Subject: [PATCH 3/3] Fixed CHANGELOG

Signed-off-by: Marco Pracucci
---
 CHANGELOG.md | 50 --------------------------------------------------
 1 file changed, 50 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 930fc0c2..3135d97c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -83,23 +83,6 @@
 * [CHANGE] Removed `CortexQuerierCapacityFull` alert. #342
 * [CHANGE] Changes blocks storage alerts to group metrics by the configured `cluster_labels` (supporting the deprecated `alert_aggregation_labels`). #351
 * [CHANGE] Increased `CortexIngesterReachingSeriesLimit` critical alert threshold from 80% to 85%. #363
-* [CHANGE] Decreased `-server.grpc-max-concurrent-streams` from 100k to 10k. #369
-* [CHANGE] Decreased blocks storage ingesters graceful termination period from 80m to 20m. #369
-* [CHANGE] Changed default `job_names` for query-frontend, query-scheduler and querier to match custom deployments too. #376
-* [CHANGE] Increase the rules per group and rule groups limits on different tiers. #396
-* [CHANGE] Removed `max_samples_per_query` limit, since it only works with chunks and only when using `-distributor.shard-by-all-labels=false`. #397
-* [CHANGE] Removed chunks storage query sharding config support. The following config options have been removed: #398
-  * `_config` > `queryFrontend` > `shard_factor`
-  * `_config` > `queryFrontend` > `sharded_queries_enabled`
-  * `_config` > `queryFrontend` > `query_split_factor`
-* [CHANGE] Split `cortex_api` recording rule group into three groups. This is a workaround for large clusters where this group can become slow to evaluate. #401
-* [CHANGE] Increased `CortexIngesterReachingSeriesLimit` warning threshold from 70% to 80% and critical threshold from 85% to 90%. #404
-* [CHANGE] Rename ruler_s3_bucket_name and ruler_gcs_bucket_name to ruler_storage_bucket_name: #415
-* [CHANGE] Fine-tuned rolling update policy for distributor, querier, query-frontend, query-scheduler. #420
-* [CHANGE] Increased memcached metadata/chunks/index-queries max connections from 4k to 16k. #420
-* [CHANGE] Disabled step alignment in query-frontend to be compliant with PromQL. #420
-* [CHANGE] Do not limit compactor CPU and request a number of cores equal to the configured concurrency. #420
-* [ENHANCEMENT] Add overrides config to compactor. This allows setting retention configs per user. #386
 * [ENHANCEMENT] cortex-mixin: Make `cluster_namespace_deployment:kube_pod_container_resource_requests_{cpu_cores,memory_bytes}:sum` backwards compatible with `kube-state-metrics` v2.0.0. #317
 * [ENHANCEMENT] Cortex-mixin: Include `cortex-gw-internal` naming variation in default `gateway` job names. #328
 * [ENHANCEMENT] Ruler dashboard: added object storage metrics. #354
@@ -117,44 +100,11 @@
   * "Tenant Configuration Sync" row - information about the configuration sync procedure.
   * "Sharding Initial State Sync" row - information about the initial state sync procedure when sharding is enabled.
   * "Sharding Runtime State Sync" row - information about various state operations which occur when sharding is enabled (replication, fetch, marge, persist).
-* [ENHANCEMENT] Added 256MB memory ballast to querier. #369
-* [ENHANCEMENT] Update gsutil command for `not healthy index found` playbook #370
-* [ENHANCEMENT] Update `etcd-operator` to latest version (see https://github.com/grafana/jsonnet-libs/pull/480). #263
-* [ENHANCEMENT] Added Alertmanager alerts and playbooks covering configuration syncs and sharding operation: #377 #378
-  * `CortexAlertmanagerSyncConfigsFailing`
-  * `CortexAlertmanagerRingCheckFailing`
-  * `CortexAlertmanagerPartialStateMergeFailing`
-  * `CortexAlertmanagerReplicationFailing`
-  * `CortexAlertmanagerPersistStateFailing`
-  * `CortexAlertmanagerInitialSyncFailed`
-* [ENHANCEMENT] Add support for Azure storage in Alertmanager configuration. #381
-* [ENHANCEMENT] Add support for running Alertmanager in sharding mode. #394
-* [ENHANCEMENT] Allow to customize PromQL engine settings via `queryEngineConfig`. #399
-* [ENHANCEMENT] Add recording rules to improve responsiveness of Alertmanager dashboard. #387
-* [ENHANCEMENT] Add `CortexRolloutStuck` alert. #405
-* [ENHANCEMENT] Added `CortexKVStoreFailure` alert. #406
-* [ENHANCEMENT] Use configured `ruler` jobname for ruler dashboard panels. #409
-* [ENHANCEMENT] Add ability to override `datasource` for generated dashboards. #407
-* [ENHANCEMENT] Use alertmanager jobname for alertmanager dashboard panels #411
-* [ENHANCEMENT] Added `CortexDistributorReachingInflightPushRequestLimit` alert. #408
-* [ENHANCEMENT] Define Azure object storage ruler args. #416
-* [ENHANCEMENT] Added the following config options to allow to schedule multiple replicas of the same service on the same node: #418
-  * `cortex_distributor_allow_multiple_replicas_on_same_node`
-  * `cortex_ruler_allow_multiple_replicas_on_same_node`
-  * `cortex_querier_allow_multiple_replicas_on_same_node`
-  * `cortex_query_frontend_allow_multiple_replicas_on_same_node`
 * [BUGFIX] Fixed `CortexIngesterHasNotShippedBlocks` alert false positive in case an ingester instance had ingested samples in the past, then no traffic was received for a long period and then it started receiving samples again. #308
 * [BUGFIX] Alertmanager: fixed `--alertmanager.cluster.peers` CLI flag passed to alertmanager when HA is enabled. #329
 * [BUGFIX] Fixed `CortexInconsistentRuntimeConfig` metric. #335
 * [BUGFIX] Fixed scaling dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #365
 * [BUGFIX] Fixed rollout progress dashboard to correctly work when a Cortex service deployment spans across multiple zones (a zone is expected to have the `zone-[a-z]` suffix). #366
-* [BUGFIX] Fixed rollout progress dashboard to include query-scheduler too. #376
-* [BUGFIX] Fixed `-distributor.extend-writes` setting on ruler when `unregister_ingesters_on_shutdown` is disabled. #369
-* [BUGFIX] Upstream recording rule `node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate` renamed. #379
-* [BUGFIX] Treat `compactor_blocks_retention_period` type as string rather than int.#395
-* [BUGFIX] Fixed writes/reads/alertmanager resources dashboards to use `$._config.job_names.gateway`. #403
-* [BUGFIX] Span the annotation.message in alerts as YAML multiline strings. #412
-* [BUGFIX] Pass `-ruler-storage.s3.endpoint` to ruler when using S3. #421

 ## 1.9.0 / 2021-05-18
