-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory leak on context switches #65
Comments
With Fibers 1.0.0 (replacing |
Heap size becomes stable if we change the second fiber above to: (spawn-fiber
(lambda ()
(let loop ()
;; (yield-current-task)
(suspend-current-task schedule-task)
(loop)))) The difference between (spawn-fiber
(lambda ()
(let loop ()
;; (yield-current-task)
(suspend-current-task
(lambda (sched k)
;; When the 'schedule-task' call is not in tail position, memory is leaked.
(->bool (schedule-task sched k))))
(loop)))) |
The program below has a stable heap size when built with (use-modules (fibers)
(fibers channels)
(fibers scheduler)
(statprof))
(run-fibers
(lambda ()
(define channel
(make-channel))
(define leave-channel
(make-channel))
(spawn-fiber
(lambda ()
(sleep 20)
(put-message leave-channel 'leave)))
(spawn-fiber
(lambda ()
(let loop ()
(let ((tag ((@ (fibers scheduler) scheduler-prompt-tag) (current-scheduler))))
(abort-to-prompt tag schedule-task))
(loop))))
(spawn-fiber
(lambda ()
(let loop ()
(pk 'heap-size (assoc-ref (gc-stats) 'heap-size))
(sleep 2)
(loop))))
(get-message leave-channel))
#:parallelism 1 ;don't create POSIX threads
#:hz 0) |
Even more than that, it's the very same commit I found was a problem when I bisected #58 ! That commit caused bugs in both space and time it appears. :) |
Instead of reverting it, a possible fix is: diff --git a/fibers.scm b/fibers.scm
index 6e81f98..11c5389 100644
--- a/fibers.scm
+++ b/fibers.scm
@@ -150,7 +150,9 @@ parameter mutations to the fiber."
(with-dynamic-state dynamic-state thunk))))
(define (create-fiber sched thunk)
(schedule-task sched
- (capture-dynamic-state thunk)))
+ (capture-dynamic-state
+ (lambda ()
+ (%start-stack #t thunk)))))
(cond
(scheduler
;; When a scheduler is passed explicitly, it could be there is no (That Surprising because diff --git a/fibers.scm b/fibers.scm
index 6e81f98..23d07fd 100644
--- a/fibers.scm
+++ b/fibers.scm
@@ -150,7 +150,11 @@ parameter mutations to the fiber."
(with-dynamic-state dynamic-state thunk))))
(define (create-fiber sched thunk)
(schedule-task sched
- (capture-dynamic-state thunk)))
+ (capture-dynamic-state
+ (let ((f (make-fluid)))
+ (lambda ()
+ (with-fluids ((f 42)) ;WTF?
+ (thunk)))))))
(cond
(scheduler
;; When a scheduler is passed explicitly, it could be there is no So I suspect this whole thing exhibits a bug in dynamic stack capture or something. |
Escalating! 👉 https://issues.guix.gnu.org/59021 |
The change in (use-modules (fibers)
(fibers channels)
(fibers scheduler)
(statprof)
(ice-9 match))
(include "heap-profiler.scm")
(run-fibers
(lambda ()
(define leave-channel
(make-channel))
(spawn-fiber
(lambda ()
(sleep 60)
(put-message leave-channel 'leave)))
(let* ((sock (socket AF_INET
(logior SOCK_STREAM SOCK_NONBLOCK SOCK_CLOEXEC)
IPPROTO_IP))
(port 4567))
(bind sock (make-socket-address AF_INET INADDR_LOOPBACK port))
(listen sock 4)
(format #t "listening on port ~a~%" port)
(format #t "In Bash, run:
while : ; do echo foo> /dev/tcp/localhost/~a ; done~%~%" port)
(spawn-fiber
(lambda ()
(let loop ()
(match (accept sock (logior SOCK_NONBLOCK SOCK_CLOEXEC))
((connection . peer)
(close-port connection))
(_ #f))
(loop)))))
(spawn-fiber
(lambda ()
(define delay 2)
(let loop ((previous (assoc-ref (gc-stats) 'heap-total-allocated)))
(let* ((stats (gc-stats))
(allocated (assoc-ref stats 'heap-total-allocated)))
(pk 'heap-size (assoc-ref stats 'heap-size)
'MiB/s (/ (- allocated previous) (expt 2. 20) delay))
;; (profile-heap)
(sleep delay)
(loop allocated)))))
(get-message leave-channel)
(profile-heap))
#:parallelism 1 ;don't create POSIX threads
#:hz 0) It's a relatively slow leak because on my laptop the allocation rate is only ~1 MiB/s, but it is leaky. With the heap profiler called at the end, I get:
... suggesting we've allocated and retained many closures ( |
Turns out the leak was introduced in c25dcb9. Fixed now! |
As discussed in the context of the Shepherd, Fibers 1.1.1 leaks memory on each context switch (!). The simplest reproducer (with Guile 3.0.8 or even 2.2.7) is this:
This leak is proportional to the number of context switches: replace
(yield-current-task)
with(sleep 1)
or similar, and everything's fine.The text was updated successfully, but these errors were encountered: