Reduce memory & redundant work for concurrent TimeZones construction
This commit improves the thread-local caching scheme introduced in JuliaTime#344 by sharing TimeZones across _all_ thread-local caches, so only one TimeZone object instance is ever created per process. It also reduces the redundant work caused by multiple concurrent Tasks starting to construct the same TimeZone at the same time.

Before this commit, multiple Tasks that tried to construct the same TimeZone on different threads would each construct the same object. Now they share it, via the "Future".

------------------------------------------------------------

It's difficult to show this effect, but one way is to log each time a TimeZone is constructed, via this diff:

```julia
diff --git a/src/types/timezone.jl b/src/types/timezone.jl
index 25d36c3..1cea69e 100644
--- a/src/types/timezone.jl
+++ b/src/types/timezone.jl
@@ -68,6 +68,7 @@ function TimeZone(str::AbstractString, mask::Class=Class(:DEFAULT))
     # Note: If the class `mask` does not match the time zone we'll still load the
     # information into the cache to ensure the result is consistent.
     tz, class = get!(_tz_cache(), str) do
+        Core.println("CONSTRUCTING $str")
         tz_path = joinpath(TZData.COMPILED_DIR, split(str, "/")...)
 
         if isfile(tz_path)
```

Before this commit, every thread constructs the object twice, once before the clear and once after (with 4 threads, a total of 8 times):

```julia
julia> Threads.nthreads()
4

julia> TimeZones.TZData.compile();

julia> foo() = begin
           @sync for i in 1:20
               if (i == 10)
                   @info "---------clear-------"
                   TimeZones.TZData.compile()
               end
               Threads.@spawn begin
                   TimeZone("US/Eastern", TimeZones.Class(:LEGACY))
               end
           end
           @info "done"
       end
foo (generic function with 1 method)

julia> @time foo()
[ Info: ---------clear-------
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
CONSTRUCTING US/Eastern
[ Info: done
  0.391298 seconds (1.51 M allocations: 64.544 MiB, 2.46% gc time, 0.00% compilation time)
```

After this commit, it's constructed only twice, once before the clear and once after:

```julia
julia> @time foo()
[ Info: ---------clear-------
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: ---------clear-------
[ Info: done
  0.414059 seconds (1.46 M allocations: 61.972 MiB, 4.55% gc time)
```

------------------------------------------------------------------

Finally, this commit avoids another problem: accidentally introducing a Task yield inside the constructor, which could happen if we used `@info` instead of `Core.println()`. Without this PR, the old code could do _redundant work_ and construct the TimeZone multiple times, even on the same thread. Each Task's constructor would see that there's no TZ in the cache, start the work, and then yield to the next Task, which would do the same, until finally they all report their work into the cache, overwriting each other.

This is what happens if we use `@info` in the above diff instead.

Before this commit, the results are nondeterministic, but in this run you can see it redundantly constructed the value all 20 times:

```julia
julia> @time foo()
[ Info: ---------clear-------
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: done
  0.494492 seconds (1.55 M allocations: 66.754 MiB, 16.67% gc time)
```

After this commit, just the two we expect. 😊

```julia
julia> @time foo()
[ Info: ---------clear-------
[ Info: CONSTRUCTING US/Eastern
[ Info: CONSTRUCTING US/Eastern
[ Info: done
  0.422677 seconds (1.47 M allocations: 62.228 MiB, 4.66% gc time)
```
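
The sharing idea described in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this commit: `CACHE_LOCK`, `FUTURES`, `BUILD_COUNT`, `expensive_build`, and `get_shared` are hypothetical names, a plain `Task` stands in for the "Future", and a single lock-protected `Dict` replaces the thread-local caches for brevity.

```julia
# Hypothetical sketch: none of these names come from TimeZones.jl.
const CACHE_LOCK = ReentrantLock()
const FUTURES = Dict{String,Task}()          # one in-flight or finished build per key
const BUILD_COUNT = Threads.Atomic{Int}(0)   # counts how often the build really runs

# Stand-in for the expensive TimeZone construction.
function expensive_build(name::String)
    Threads.atomic_add!(BUILD_COUNT, 1)
    sleep(0.01)  # yields to other Tasks, much like `@info` would
    return uppercase(name)
end

function get_shared(name::String)
    # Only the first caller for a given key creates the Task; the lock is held
    # just long enough to look up or register it, not while the build runs.
    task = lock(CACHE_LOCK) do
        get!(FUTURES, name) do
            Threads.@spawn expensive_build(name)
        end
    end
    # Every caller, first or not, waits on the same Task and shares its result.
    return fetch(task)::String
end

# Twenty concurrent requests for the same key.
tasks = map(1:20) do _
    Threads.@spawn get_shared("US/Eastern")
end
results = fetch.(tasks)
println("builds: ", BUILD_COUNT[])
```

Because the build is registered before it starts, a yield inside `expensive_build` cannot lead a second Task to begin the same work; it finds the existing entry and waits on it, so the twenty requests here should trigger a single build.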