Releases: m-lab/k8s-support

Bump pusher version, update fluentd configs

28 Jan 18:57
13dcb6d
  • Should have a more reliable (in a statistical sense) pusher #362
  • fluentd configs that put less of a load on etcd #363
  • removes some proxy metrics to avoid double-counting #361

Wehe, now with data getting pushed

23 Jan 22:19
e76c835

This release properly mounts the Wehe data directories and removes traceroute from the Wehe deployment, where it never should have been.

Add WeHe and update Pusher version and configurations

22 Jan 18:13
20cda44

This release includes numerous improvements:

  • Add support for netblock anonymization to tcpinfo (#350)
  • Bump flannel-cloud memory limit from 50Mi to 128Mi to avoid OOM kills (#349)
  • Restrict experiments to accessing only their own datatypes (#353)
  • Add Wehe (#351)
  • Set sigterm_wait_time so pusher uploads before SIGKILL (#355)
  • Use canonical DNS for Wehe (#357)
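
As an illustration of what netblock anonymization means in general, here is a minimal Python sketch that zeroes the host bits of an address and keeps only its network prefix. The /24 (IPv4) and /64 (IPv6) prefix lengths and the function name are assumptions made for this example; the release notes do not state which values tcpinfo uses, and this is not tcpinfo's own code.

```python
import ipaddress

# Assumed prefix lengths for illustration; not taken from the release notes.
IPV4_PREFIX = 24
IPV6_PREFIX = 64


def anonymize_netblock(addr: str) -> str:
    """Return addr with its host bits zeroed, keeping only the netblock."""
    ip = ipaddress.ip_address(addr)
    prefix = IPV4_PREFIX if ip.version == 4 else IPV6_PREFIX
    # strict=False lets ip_network accept an address that has host bits set.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_netblock("192.0.2.77"))
print(anonymize_netblock("2001:db8:1:2:3:4:5:6"))
```

Every client inside the same netblock maps to the same anonymized address, so coarse network-level analysis remains possible while individual hosts are no longer distinguishable.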

traceroute caller release with tcpinfo flag and segfault fix

07 Jan 14:41
76b7171
Merge pull request #347 from m-lab/sandbox-yachang-pidleak

Bring back tcpinfo for traceroutecaller and fix the pid leak

clean deployment from master branch

06 Jan 18:30
6f00e55
Rollback traceroute caller for staging (#346)

Remediate the scamper daemon crashes in traceroute-caller

18 Dec 17:43
a77335e

Traceroute-caller now has a backup traceroute system in case the scamper daemon crashes. Otherwise, this release is identical to v2.3.1.

pboothe@:~/k8s-support/$ git checkout current-deployment
Already on 'current-deployment'
Your branch is up to date with 'origin/current-deployment'.
pboothe@:~/k8s-support/$ git diff v2.3.1
diff --git a/k8s/daemonsets/templates.jsonnet b/k8s/daemonsets/templates.jsonnet
index 8625796..d049641 100644
--- a/k8s/daemonsets/templates.jsonnet
+++ b/k8s/daemonsets/templates.jsonnet
@@ -123,7 +123,7 @@ local Tcpinfo(expName, tcpPort, hostNetwork) = [
 local Traceroute(expName, tcpPort, hostNetwork) = [
   {
     name: 'traceroute',
-    image: 'measurementlab/traceroute-caller:v0.3.2',
+    image: 'measurementlab/traceroute-caller:v0.5.1',
     args: [
       if hostNetwork then
         '-prometheusx.listen-address=127.0.0.1:' + tcpPort
@@ -131,6 +131,9 @@ local Traceroute(expName, tcpPort, hostNetwork) = [
         '-prometheusx.listen-address=$(PRIVATE_IP):' + tcpPort,
       '-outputPath=' + VolumeMount(expName).mountPath + '/traceroute',
       '-uuid-prefix-file=' + uuid.prefixfile,
+      '-poll=false',
+      '-tcpinfo.eventsocket=' + tcpinfoServiceVolume.eventsocketFilename,
+      '-tracetool=scamper-daemon-with-scamper-backup',
     ],
     env: if hostNetwork then [] else [
       {

These changes have been running stably in staging overnight, along with many feature additions that we do not want to deploy at this time. This is a cherry-pick release to address the scamper daemon crashes we have been seeing.
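
The primary-with-backup idea behind this release can be sketched in a few lines of Python. This is only an illustration of the fallback pattern, not traceroute-caller's actual implementation; the function names and the two stand-in tracers are hypothetical.

```python
from typing import Callable

# A tracer takes a destination address and returns the trace output as text.
Tracer = Callable[[str], str]


def trace_with_backup(dest: str, primary: Tracer, backup: Tracer) -> str:
    """Run the primary tracer and fall back to the backup if it fails."""
    try:
        return primary(dest)
    except OSError:
        # The primary is unavailable (for example, its daemon has crashed),
        # so run the backup rather than losing the measurement.
        return backup(dest)


def crashed_daemon(dest: str) -> str:
    """Hypothetical stand-in for a daemon whose socket has gone away."""
    raise ConnectionError("daemon socket closed")


def standalone(dest: str) -> str:
    """Hypothetical stand-in for a one-shot tracer process."""
    return f"trace to {dest} via standalone"


print(trace_with_backup("192.0.2.1", crashed_daemon, standalone))
```

Catching only OSError (which includes ConnectionError) keeps unrelated bugs visible instead of silently routing them to the backup.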

A rollback release

20 Dec 20:51

v2.3.2 has a process leak

We can either debug production over the holidays, or we can respect the prod freeze and issue a rollback release. This is that rollback release. The final state should be the same as v2.3.1.

Rollback traceroute-caller to v0.3.2

10 Dec 05:25
v2.3.1

Rollback traceroute-caller to v0.3.2

Enable PCAP collection

09 Dec 17:49
578e623
  • Whitelist systemd units we're interested in monitoring.
  • Add docker and kubelet to the list of monitored services. #325
  • Add node-exporter process metrics and faster sandbox deploys (#328)
  • Return traceroute to polling ss to reduce segfault frequency #330
  • Deploy packet-headers unconditionally (#332)
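
To illustrate how a unit whitelist narrows what gets monitored, here is a small Python sketch that filters unit names with a regular expression matched against the whole name. The pattern, the sample unit list, and the function name are assumptions for this example; the release notes do not show the configuration that was actually deployed.

```python
import re
from typing import Iterable, List


def filter_units(units: Iterable[str], whitelist_pattern: str) -> List[str]:
    """Keep only the unit names that fully match the whitelist pattern."""
    # Anchor the pattern so "docker.service" does not also admit
    # names that merely contain it.
    anchored = re.compile(f"^(?:{whitelist_pattern})$")
    return [unit for unit in units if anchored.match(unit)]


# Hypothetical unit names; docker and kubelet are the services named above.
units = ["docker.service", "kubelet.service", "cron.service", "docker.socket"]
monitored = filter_units(units, r"(docker|kubelet)\.service")
print(monitored)
```

Restricting collection to a short list of units keeps the number of exported time series small compared with exporting metrics for every unit on the node.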

Weekly release

02 Dec 19:55
41a1776

This release mostly changes scripts and their configurations and does not affect pods, with the exception of configuring the node-exporter DaemonSet to export systemd metrics.

  • Adds a new script for upgrading all cloud nodes in the platform cluster.
  • Increases the Prometheus VM's persistent disk from 200GB to 500GB.
  • Upgrades the machine type for production master VMs from n1-standard-4 to n1-standard-8.
  • Updates k8s/component versions in k8s_deploy.conf.
  • Removes "legacy" --config flag from the kubeadm upgrade command.
  • Enables node-exporter's systemd collector.