
uWSGI HTTP Process Continuously Respawning #1649

Open
0xPierre opened this issue Feb 2, 2025 · 6 comments

Comments


0xPierre commented Feb 2, 2025

Hello,

Description

I am trying to deploy an instance of internet.nl, but the internetnl-prod-app container running uWSGI keeps getting killed due to an out-of-memory (OOM) condition.

I am running a fresh VPS:

  • OS: Ubuntu 24.10
  • vCPU: 4
  • Memory: 8 GB

Journalctl logs

Feb 02 23:26:57 vps-7de75884 kernel: Memory cgroup out of memory: Killed process 833379 (uwsgi) total-vm:8497512kB, anon-rss:5230280kB, file-rss:1216kB, shmem-rss:0kB, UID:65534 pgtables:10340kB oom_score_adj:0
Feb 02 23:26:57 vps-7de75884 systemd[1]: docker-8fd32c8348bf3c6e9ea7a58da0c93f468d6c2952586d82769b5529792801867f.scope: A process of this unit has been killed by the OOM killer.
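The kernel line above already tells you how much memory the process held when it was killed. A minimal sketch for extracting that figure; in practice you would pipe `journalctl -k` in, but here the sample line from this report is embedded so the snippet is self-contained:

```shell
# Pull the anonymous RSS of an OOM-killed process out of a kernel log line.
sample='Feb 02 23:26:57 vps-7de75884 kernel: Memory cgroup out of memory: Killed process 833379 (uwsgi) total-vm:8497512kB, anon-rss:5230280kB, file-rss:1216kB, shmem-rss:0kB, UID:65534 pgtables:10340kB oom_score_adj:0'
echo "$sample" | grep -o 'anon-rss:[0-9]*kB'
# prints: anon-rss:5230280kB  (~5 GiB resident at the moment of the kill)
```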

Logs of internetnl-prod-app

*** Starting uWSGI 2.0.22 (64bit) on [Sun Feb  2 23:12:37 2025] ***
compiled with version: 10.2.1 20210110 on 30 January 2025 09:42:25
os: Linux-6.11.0-12-generic #13-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 21 20:03:13 UTC 2024
nodename: app
machine: x86_64
clock source: unix
detected number of CPU cores: 4
current working directory: /app
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your memory page size is 4096 bytes
detected max file descriptor number: 1073741816
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on 0.0.0.0:8080 fd 4
uwsgi socket 0 bound to TCP address 127.0.0.1:40435 (port auto-assigned) fd 3
Python version: 3.9.2 (default, Dec  1 2024, 12:12:57)  [GCC 10.2.1 20210110]
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x5d3998d4e220
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 145840 bytes (142 KB) for 1 cores
*** Operational MODE: single process ***
Single domain scan enabled, batch scanning and API not available.
WSGI app 0 (mountpoint='') ready in 3 seconds on interpreter 0x5d3998d4e220 pid: 1 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 1)
spawned uWSGI worker 1 (pid: 7, cores: 1)
*** Stats server enabled on 127.0.0.1:1717 fd: 17 ***
spawned uWSGI http 1 (pid: 8)
respawned uWSGI http 1 (pid: 15)
respawned uWSGI http 1 (pid: 16)
respawned uWSGI http 1 (pid: 17)
respawned uWSGI http 1 (pid: 18)
respawned uWSGI http 1 (pid: 19)
respawned uWSGI http 1 (pid: 20)
respawned uWSGI http 1 (pid: 21)
respawned uWSGI http 1 (pid: 22)
respawned uWSGI http 1 (pid: 23)
respawned uWSGI http 1 (pid: 24)
[etc...]

I already tried running only one process of uWSGI.

Any idea how to investigate it or fix it?

@0xPierre
Author

Hi,
I figured it out: after letting it run for several days, it finally worked.
Does anyone have an idea why it took so long?

I kept getting this type of error over and over in the routinator instance:

[WARN] rsync://rpki.miralium.net/repo/: rsync: getaddrinfo: rpki.miralium.net 873: Name has no usable address
[WARN] rsync://rpki.miralium.net/repo/: rsync error: error in socket IO (code 10) at clientserver.c(139) [Receiver=3.4.0]
[WARN] rsync://krill.accuristechnologies.ca/repo/Accuris-Technologies/0/8A15107195E63966ABA1997AD31382979C75F736.cer: no valid manifest rsync://rpki.miralium.net/repo/Miralium-Research-RPKI-CA-A1/0/8A15107195E63966ABA1997AD31382979C75F736.mft found.
[WARN] rsync://subrepo.wildtky.com/repo/: rsync: [Receiver] failed to connect to subrepo.wildtky.com (23.180.200.177): Host is unreachable (113)
[WARN] rsync://subrepo.wildtky.com/repo/: rsync error: error in socket IO (code 10) at clientserver.c(139) [Receiver=3.4.0]
[WARN] rsync://repodepot.wildtky.com/repo/WTAFarms-Jan2025/0/A0D77953B7619F183E3407CF1E95764F0A641BDD.cer: no valid manifest rsync://subrepo.wildtky.com/repo/SubCAJan2025/0/A0D77953B7619F183E3407CF1E95764F0A641BDD.mft found.
[WARN] rsync://rpki.cc/repo/MythicalKitten/12/7BBD0E669176F6F2E8BB8FC3104A8D23435175AE.cer: no valid manifest rsync://krill.ca-bc-01.ssmidge.xyz/repo/SsmidgeLLC/1/7BBD0E669176F6F2E8BB8FC3104A8D23435175AE.mft found.
[WARN] rsync://cloudie-repo.rpki.app/repo/CLOUDIE-RPKI/0/73236D2CCA0EE5A74A9C40FFF721835444703ABE.cer: no valid manifest rsync://rpki.uz/repo/pedjoeang-digital-networks/4/73236D2CCA0EE5A74A9C40FFF721835444703ABE.mft found.
[WARN] rsync://rsync.paas.rpki.ripe.net/repository/0c70401c-7f41-4a6b-9434-cc80dca093e6/2/3B7184989F76A03708039261134F384B50D011BB.cer: no valid manifest rsync://krill.immarket.space/repo/imh/0/3B7184989F76A03708039261134F384B50D011BB.mft found.
[WARN] rsync://rpki.cc/repo/MythicalKitten/1/4173C015E8E1FED254D4938B7E69CB256CCF6936.cer: no valid manifest rsync://krill.ca-bc-01.ssmidge.xyz/repo/AS199177/0/4173C015E8E1FED254D4938B7E69CB256CCF6936.mft found.
[WARN] rsync://rpki-repository.haruue.net/repo/YC3254-RPKI/2/3F0AC25D352C83DA8307594B98ED061BE8489682.mft: certificate has expired.
[WARN] rsync://cloudie-repo.rpki.app/repo/CLOUDIE-RPKI/0/3F0AC25D352C83DA8307594B98ED061BE8489682.cer: no valid manifest rsync://rpki-repository.haruue.net/repo/YC3254-RPKI/2/3F0AC25D352C83DA8307594B98ED061BE8489682.mft found.
[WARN] RRDP https://krill.stonham.uk/rrdp/notification.xml: HTTP status server error (522 <unknown status code>) for url (https://krill.stonham.uk/rrdp/notification.xml)
[WARN] rsync://krill.stonham.info/repo/: rsync error: timeout waiting for daemon connection (code 35) at socket.c(278) [Receiver=3.4.0]
[WARN] rsync://cloudie-repo.rpki.app/repo/CLOUDIE-RPKI/0/635C29FF238CC286AC1625A68EFCC04E2E460171.cer: no valid manifest rsync://krill.stonham.info/repo/Stonham/1/635C29FF238CC286AC1625A68EFCC04E2E460171.mft found.

@bwbroersma
Collaborator

bwbroersma commented Feb 14, 2025

Note we are currently also investigating an OOM issue, which is indeed an app-container RAM spike (for us daily, just after 03:00, which seems related to activity in the cron container):


8GB is a bit tight, but should work, Routinator can be a bit resource demanding, see:

Which internet.nl version are you running, and what is the load?

If your load is light, you could use a public Routinator instance by overriding ROUTINATOR_URL in local.env, and removing routinator from COMPOSE_PROFILES in your local.env. Note that the profiles changed between v1.9.0 and main; the following is only valid for main:

The Routinator instance is an RPKI Relying Party implementation that downloads
and verifies RPKI data. The check connects to the HTTP API to find ROAs.
This is configured in the `ROUTINATOR_URL` setting or environment variable.
There are some publicly available instances that can be used for local
testing, like `https://rpki-validator.ripe.net/api/v1/validity`. For large
scale or production setups, you should run your own instance.

# use public routinator for development so we don't have to let routinator fetch all data
ROUTINATOR_URL=https://rpki-validator.ripe.net/api/v1/validity

# Disable (do not enable) the `routinator` profile which is enabled by default in `defaults.env`.
# Routinator is slow to start initially and requires a lot of resources which is not ideal for
# development environments.
COMPOSE_PROFILES=
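If you do point at a public instance, it is worth sanity-checking the endpoint first. A minimal sketch that builds the URL the RPKI check would query; the AS number and prefix below (AS3333 / 193.0.0.0/21, RIPE NCC's own resources) are sample inputs chosen for illustration, not taken from this issue:

```shell
# Build a validity-API request URL from the overridden base URL.
ROUTINATOR_URL=https://rpki-validator.ripe.net/api/v1/validity
asn="AS3333"; prefix="193.0.0.0/21"   # sample inputs
url="${ROUTINATOR_URL}/${asn}/${prefix}"
echo "$url"
```

Fetching that URL with curl should return a JSON validity verdict; a timeout or non-200 response means the override will not help.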

v1.9.0 also needs cron (and connectiontest if you have that set up).

@0xPierre
Author

0xPierre commented Feb 14, 2025

8GB is a bit tight, but should work, Routinator can be a bit resource demanding, see:

I tried on an instance with 32 GB of RAM. It does the same; uwsgi gets killed at around 7-8 GB.
I also removed routinator from COMPOSE_PROFILES.

Which internet.nl version are you running, and what is the load?

I am using v1.9.0

@bwbroersma
Collaborator

How often are you seeing an Out Of Memory (OOM) kill?
Currently we're having the issue once a day at internet.nl, which is also running 1.9.0: the app container uses about 1.4 GiB of memory, then spikes to 5 GiB at 03:00 and gets OOM killed. The testing instance running the main branch experiences the same problem.

@0xPierre
Author

Hi,
I am experiencing the OOM kill every second; my instance is still not accessible after 3 days.

root@vps-7de75884:/opt/Internet.nl# docker logs internetnl-prod-app-1 -f --tail 10
respawned uWSGI http 1 (pid: 66948)
respawned uWSGI http 1 (pid: 66949)
respawned uWSGI http 1 (pid: 66950)
respawned uWSGI http 1 (pid: 66951)
respawned uWSGI http 1 (pid: 66952)
respawned uWSGI http 1 (pid: 66953)
respawned uWSGI http 1 (pid: 66954)
respawned uWSGI http 1 (pid: 66955)
respawned uWSGI http 1 (pid: 66956)
respawned uWSGI http 1 (pid: 66957)
respawned uWSGI http 1 (pid: 66958)
respawned uWSGI http 1 (pid: 66959)
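The respawn storm can be quantified straight from the container log: a healthy instance logs the http router being spawned once, so any sustained count of `respawned` lines indicates the crash loop. A small self-contained sketch, with sample lines copied from the output above (in practice you would pipe `docker logs internetnl-prod-app-1` in):

```shell
# Count http-router respawns in a captured log excerpt.
logs='respawned uWSGI http 1 (pid: 66948)
respawned uWSGI http 1 (pid: 66949)
respawned uWSGI http 1 (pid: 66950)'
printf '%s\n' "$logs" | grep -c '^respawned uWSGI http'
# prints: 3
```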

@0xPierre
Author

Finally, I downgraded to Ubuntu 22.04 and it works perfectly, so there appears to be a problem when running on Ubuntu 24.10.
Thanks
