[Support]: Frigate leaking GBs of memory with Proxmox #10921
-
Describe the problem you are having:

Frigate has been leaking GBs of memory ever since upgrading from 0.12 to 0.13.2. I used to run on 4GB of RAM, which Frigate fully consumed in less than 1h; I then bumped the memory of my instance to 8GB of RAM and Frigate still managed to eat it all overnight.

The above shows resident memory usage of the main Frigate process at over 4.9GB, which, combined with all the other support processes, was taking all 8GB of RAM. Doing a quick test over a 1h time period (a sketch of one way to sample this is included at the end of this post):

So Frigate seems to be leaking memory at a rate of just under 2MB/minute here.

Version: 0.13.2-6476f8a

Frigate config file:

mqtt:
host: PRIVATE
user: PRIVATE
password: PRIVATE
ffmpeg:
hwaccel_args: preset-vaapi
cameras:
shf-camera01:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera01
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 1920
height: 1080
objects:
track:
- cat
- person
record:
events:
objects: []
shf-camera02:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera02
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 1920
height: 1080
objects:
track:
- cat
- person
record:
events:
objects: []
shf-camera03:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera03
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 1920
height: 1080
objects:
track:
- cat
- person
record:
events:
objects: []
shf-camera04:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera04
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 1920
height: 1080
objects:
track:
- cat
- person
record:
events:
objects: []
shf-camera05:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera05
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2560
height: 1920
objects:
track:
- bird
record:
events:
objects: []
shf-camera06:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera06
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2560
height: 1920
objects:
track:
- cat
- person
snapshots:
enabled: True
shf-camera07:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera07
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2304
height: 1296
objects:
track:
- cat
- person
snapshots:
enabled: True
shf-camera08:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera08
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2560
height: 1920
objects:
track:
- car
- cat
- person
record:
events:
objects: []
shf-camera09:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera09
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2560
height: 1920
objects:
track:
- cat
- person
record:
events:
objects: []
shf-camera10:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera10
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2304
height: 1296
objects:
track:
- cat
- car
- person
snapshots:
enabled: True
shf-camera11:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera11
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2304
height: 1296
objects:
track:
- cat
- car
- person
snapshots:
enabled: True
required_zones:
- driveway
record:
events:
required_zones:
- driveway
zones:
driveway:
coordinates: 815,1019,2304,1088,2304,1296,0,1296,0,1168,483,943
shf-camera12:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera12
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2304
height: 1296
objects:
track:
- bird
- cat
- person
snapshots:
enabled: True
shf-camera13:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera13
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2304
height: 1296
objects:
track:
- cat
- car
- person
snapshots:
enabled: True
shf-camera14:
ffmpeg:
inputs:
- path: rtmp://PRIVATE:1935/shf-camera14
input_args: preset-rtmp-generic
roles:
- record
- detect
detect:
width: 2304
height: 1296
objects:
track:
- bird
- cat
- person
snapshots:
enabled: True
record:
enabled: True
retain:
days: 7
mode: all
events:
retain:
default: 10
mode: active_objects
post_capture: 0
pre_capture: 1
detectors:
coral:
type: edgetpu
device: usb
objects:
track: []
snapshots:
enabled: False
bounding_box: True
crop: true
retain:
default: 10
birdseye:
enabled: True
mode: continuous
live:
height: 720
quality: 5

Relevant log output:

The process being completely out of memory resulted in no log entries at all for the affected time period (10h):
2024-04-09 05:38:48.358042876 2602:fc62:b:8005:216:3eff:fee4:cc60 - - [09/Apr/2024:01:27:36 -0400] "GET /api/shf-camera11/latest.jpg?h=333 HTTP/1.1" 499 0 "-" "HomeAssistant/2024.3.3 aiohttp/3.9.3 Python/3.11" "-"
2024-04-09 05:41:52.667999486 2602:fc62:b:8005:216:3eff:fee4:cc60 - - [09/Apr/2024:01:27:53 -0400] "GET /api/shf-camera06/latest.jpg?h=444 HTTP/1.1" 499 0 "-" "HomeAssistant/2024.3.3 aiohttp/3.9.3 Python/3.11" "-"
2024-04-09 05:41:52.668297252 2602:fc62:b:8005:216:3eff:fee4:cc60 - - [09/Apr/2024:01:27:53 -0400] "GET /api/shf-camera07/latest.jpg?h=333 HTTP/1.1" 499 0 "-" "HomeAssistant/2024.3.3 aiohttp/3.9.3 Python/3.11" "-"
2024-04-09 15:52:02.554579245 [INFO] Preparing Frigate...
2024-04-09 15:52:02.566412169 [INFO] Starting NGINX...
2024-04-09 15:52:02.663614853 [INFO] Preparing new go2rtc config...
2024-04-09 15:52:02.697756151 [INFO] Starting Frigate...
2024-04-09 15:52:03.570434492 [INFO] Starting go2rtc...
2024-04-09 15:52:03.682457180 11:52:03.682 INF go2rtc version 1.8.4 linux/amd64
2024-04-09 15:52:03.683346518 11:52:03.683 INF [api] listen addr=:1984
2024-04-09 15:52:03.683645144 11:52:03.683 INF [rtsp] listen addr=:8554
2024-04-09 15:52:03.684545352 11:52:03.684 INF [webrtc] listen addr=:8555

FFprobe output from your camera: Not relevant as not camera specific.

Frigate stats: No response

Operating system: Debian

Install method: Docker CLI

Coral version: USB

Network connection: Wired

Camera make and model: Mix of Reolink and a bunch of other vendors (not relevant)

Any other information that may be helpful: Seems similar to #10649, but in my case the system was just preventing memory allocations rather than triggering the kernel out-of-memory killer.
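For reference, the kind of quick measurement described at the top of this post (watching the resident memory of the main Frigate process over an hour) can be done with a small loop like the one below. This is a hedged sketch: the process match (`pgrep -f frigate`) and the one-minute interval are assumptions, not what was actually used here.

```sh
#!/bin/sh
# Sample the RSS of the first process whose command line matches "frigate",
# once a minute; the growth between samples gives the approximate leak rate.
while true; do
    pid=$(pgrep -f frigate | head -n1)   # assumes the first match is the main Frigate process
    rss_kb=$(ps -o rss= -p "$pid" | tr -d ' ')
    echo "$(date -Is) pid=$pid rss_kb=$rss_kb"
    sleep 60
done
```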
-
A lot of time has been spent looking into this. https://github.com/bloomberg/memray is a great tool for understanding where memory allocations are going. In my usage it appears there is no memory leak; the heap size does not increase at all, Python just does not release the memory after it has been garbage collected. I don't think there is any simple / immediate solution. https://www.reddit.com/r/learnpython/comments/17j1aff/rss_memory_inflates_leaks_while_heap_memory_size/?rdt=33549 Perhaps in your case it is something different; a memray output would show what is using memory.
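For anyone who wants to reproduce this kind of analysis, a minimal sketch of typical memray usage (the script name is a placeholder, not the actual Frigate entry point):

```sh
pip install memray

# Record every allocation while the program runs, then render a flame graph
# showing which call sites are holding memory.
python -m memray run -o allocations.bin your_script.py
python -m memray flamegraph allocations.bin   # produces an HTML report
```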
-
Okay, the fact that there are reports of similar behavior from other python3 applications that are also running inside of resource-constrained containers makes me want to look at a couple of options here. I'll report back in a few minutes (or hours, if I need to do a full leak test).
-
Basically, the other reporter mentioned running this on a Proxmox server, so with Frigate running (with or without Docker, I'm not sure) inside of a resource-constrained LXC container. I usually give the container 4GB of RAM out of the 500GB+ of the host system. Now, if python3 decided to be clever with caching and allowed memory to grow up to some arbitrary value derived from the total memory it thinks it can use, that would explain the problem. My first attempt at fixing this is simply to manually tell Docker to map LXCFS onto /proc/meminfo, /proc/swaps, ... inside of the Frigate container. If python3 is looking at those files to figure out how much memory is available on the system, then that will work fine and it shouldn't get to the point where it runs the whole thing out of RAM anymore. If that doesn't work, then it's possible that python3 makes use of the sysinfo syscall directly, which those files would not help with.
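A rough sketch of what that mapping can look like when starting the container. This is a hedged example, not the exact command used here: the config path is a placeholder, the usual Frigate flags (device passthrough, --shm-size, ports) are omitted, and it assumes the LXC's own /proc/meminfo and /proc/swaps are already LXCFS-backed so they reflect the container limits.

```sh
docker run -d --name frigate \
  -v /proc/meminfo:/proc/meminfo:ro \
  -v /proc/swaps:/proc/swaps:ro \
  -v /opt/frigate/config:/config \
  ghcr.io/blakeblackshear/frigate:stable
```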
-
Still appears to be consuming memory at an alarming rate. I'm also tracing it for good measure and see that something within the python3 process does call the sysinfo syscall. I can't tell what the information is used for, but it certainly seems likely that this is the culprit and that I'll have to turn on interception to fix this issue.
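For anyone wanting to check the same thing on their own setup, a sketch of tracing just that syscall on a running process. This is not necessarily the exact tracing method used above, and the pgrep expression is an assumption about which process is Frigate's main python3 process.

```sh
# Attach to the oldest python3 process and log only sysinfo() calls,
# following any children/threads it spawns.
strace -f -e trace=sysinfo -p "$(pgrep -o python3)"
```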
-
And running with sysinfo interception, so far so good: process usage hovers around 485MB and it seems to actually be releasing memory back to the OS following spikes. I'm still waiting for the results after keeping this going for an hour or so, to make sure things are actually stable. If they are, then we'll know what python3 is doing now and how to fix it, at least for anyone using Incus; those not running on a container platform capable of system call emulation will unfortunately be out of luck.
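For context, on Incus (or LXD) that interception is a per-instance config key; a sketch assuming an instance named frigate (double-check the key name against the documentation for your Incus version):

```sh
incus config set frigate security.syscalls.intercept.sysinfo true
incus restart frigate
```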
-
Right, so a couple of hours later memory usage is stable, so the issue is confirmed to be related to python3 now calling sysinfo, which, if it returns an amount of total memory higher than what's allowed for the container, will cause python3 to keep growing the process memory until it crashes.
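A quick way to see the mismatch being described, as a sketch: compare the total memory a Python process reports with the cgroup limit the container is actually allowed. Note that different code paths obtain this figure differently (sysinfo(), /proc/meminfo), and the cgroup path below assumes cgroup v2 (on v1 the equivalent file is /sys/fs/cgroup/memory/memory.limit_in_bytes).

```sh
# Total physical memory as seen from inside the container (on the affected
# setups this matches the Proxmox host, not the LXC/Docker limit).
python3 -c "import os; print(os.sysconf('SC_PAGE_SIZE') * os.sysconf('SC_PHYS_PAGES'))"

# What the container is actually allowed to use (cgroup v2).
cat /sys/fs/cgroup/memory.max
```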
-
Hello. OK, so it's a python3 problem and not a Frigate one, but is there really nothing we can do? The RAM usage is a real problem.
-
I reboot the LXC every 3 days to free up RAM and swap :'(
-
Weird, limiting the memory does not work for me.
Total: 8 GiB // container limit: 4 GiB. It sits at around 1 to 1.5 GiB, but memory usage keeps increasing.
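For context, a memory cap like this is usually set either on the Proxmox LXC itself or on the Docker container; a sketch of the latter, with example values rather than this poster's actual settings. Keep in mind that, per the discussion above, a cgroup cap does not change what sysinfo() reports inside the container, so it bounds the damage rather than stopping the growth.

```sh
docker run -d --name frigate \
  --memory=4g --memory-swap=4g \
  ghcr.io/blakeblackshear/frigate:stable
```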
-
Has anyone tested with the latest kernel, 6.8.8? Or is it the same problem?
-
The latest beta version (v0.14.0-beta3) with the 6.5.13-3-pve kernel seems to be fine. No memory increase.
-
According to @Redsandro in #5773 (comment),
-
Frigate version: 0.13.2-6476f8a. I will test this version if you want.
Test in progress.
-
Rolling back to kernel 6.5 to test. Edit: 6.5, not 8.5 🙃
-
Sorry, 6.5 instead of 6.8.
-
Unfortunately, I am also affected. Since my cameras throw a lot of errors, I first thought it was related to the logging. The following settings brought some improvement (maybe just a coincidence?):
Until recently, my Proxmox was still running with kernel 6.5.13-6-pve.
[EDIT]
-
Does this problem only occur when using Frigate in Docker inside an LXC on Proxmox? Is it also a problem in a virtual machine (a Debian LTS VM with Docker, inside Proxmox)? What about a Debian VM inside VMware, is the problem there as well? I currently run Frigate inside an LXC on Proxmox and my installation becomes unresponsive after some time; I can see maximum memory used in the Proxmox container overview. After a container restart it works for some time, until it happens again. Sometimes it happens every hour. So, as it stands, Frigate is unusable for me at the moment...
-
By the way... I used the Proxmox helper script from tteck to install Frigate on my Proxmox. But I can't find anything Docker-related inside this LXC. Is it installed directly, or using Docker? How can I access Docker to change the config (memory resources)?
-
I have the same problem with Unraid, every day.
-
Same problem here: Frigate in Docker in an LXC on Proxmox. It slowly eats up all the RAM available, then when RAM is full the CPU maxes out as it tries to shuffle memory around, and then the LXC becomes totally unresponsive. I can confirm that using psutil within the Frigate container to check memory stats shows the host's available memory, not the LXC's. My LXC config has these lines to define memory:

Further info here on that second line: lxc/lxc#4049

This is in the LXC:

root@frigate:~# cat /proc/meminfo
MemTotal: 8388608 kB
...

Then, attaching to the Frigate container:

root@frigate:~# docker exec -it f2 /bin/bash
root@f2e39a70f6a9:/opt/frigate# cat /proc/meminfo
MemTotal: 32722532 kB
...

So the issue is that Frigate's Docker container is reading the host's memory information, not the LXC's. The simple solution would be to add an env var that allows us to specify a hard limit for memory consumption.
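The psutil check mentioned above can be reproduced with a one-liner from inside the Frigate container (the container name f2 is taken from the example above; psutil should already be available in the image since Frigate uses it for its stats):

```sh
# Memory stats as psutil sees them from inside the container; on the affected
# setups the "total" field matches the Proxmox host, not the LXC limit.
docker exec f2 python3 -c "import psutil; print(psutil.virtual_memory())"
```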
-
Same issue here. I'm running Frigate under Docker inside an LXC. After a few hours, the container memory is maxed out and it freezes. The drops in these charts are a result of me having to restart the LXC container. As mentioned by @lost-RD, I can also confirm that the Docker container believes it has more memory than what has been allocated to the LXC.

Inside the LXC:

Inside the Frigate Docker container:
-
Is this exclusive to Proxmox? I'm seeing this on a number of different systems all running Frigate 0.14 on HA OS, with different brands of cameras.
-
The same thing is happening on my side. After some random amount of time, the Proxmox server becomes, literally, a bean. Here is the journalctl dump from right when it started acting up:
-
I just hit this pretty badly on Proxmox, though it seems to happen when running the tteck Frigate LXC script, no Docker necessary. I've been working on removing Docker due to other memory/crashing issues related to USB, shm, and tmp/cache mounts. The quick fix was a full system reboot, which brought back stability and stopped the memory leaks, for now. Kernel is 6.8.12-2.
-
Unfortunately, I've been having the same issue with Frigate in Docker in an LXC on Proxmox. I've migrated it back to a bare-metal i3 with lower specs for the sake of stability. I do like the advantage of having it on Proxmox for the ease of backups and the migration features, but that is negated if it can't stay stable within the LXC. I'll wait until this python3 memory issue is resolved in future kernel updates!