
Insert eager calls to `finalize` for otherwise-dead finalizeable objects #44056


Closed

jpsamaroo wants to merge 7 commits into JuliaLang:avi/EASROA from jpsamaroo:jps/finalizer-elision


Member

jpsamaroo commented Feb 6, 2022

Finalizers are a great way to defer the freeing side of memory management until some later point; however, they can have unpredictable behavior when the data that they free is not fully known to the GC (e.g. GPU allocations, or distributed references). This can result in behavior like out-of-memory situations, excessive memory usage, and sometimes more costly freeing behavior (in the event that locks need to be taken).

This seems like a bad situation, but there is a silver lining: some code patterns which allocate such objects don't actually need the allocations to stick around very long, and the lifetime of the object could (in theory) be statically determined by the compiler. Thankfully, with the ongoing work of integrating EscapeAnalysis.jl into the optimizer, we can use the generated escape information to improve this situation.

This PR uses escape info from EA to determine when an object has an attached finalizer, and when its lifetime is provably finite (i.e. the object does not escape the analyzed scope). For such objects, we can insert an early call to `finalize(obj)` at the end of `obj`'s lifetime, which will allow the object's finalizer to be enqueued for execution immediately, minimizing how long finalizeable objects stay live in the GC.
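To make the intended transformation concrete, here is a minimal hand-written sketch of what the pass would aim to produce. The names `Buffer`, `FREED`, `tracked_buffer`, and `scaled_id` are hypothetical and only for illustration; the explicit `finalize(buf)` stands in for the call that the optimizer would insert automatically once it has proven that `buf` does not escape.

```julia
# Hypothetical resource wrapper; none of these names come from the PR.
mutable struct Buffer
    id::Int
end

# Records the ids of buffers whose finalizer has run.
const FREED = Int[]

# Allocate a buffer and attach a finalizer that "frees" it.
function tracked_buffer(id)
    buf = Buffer(id)
    finalizer(b -> push!(FREED, b.id), buf)
    return buf
end

function scaled_id(id)
    buf = tracked_buffer(id)   # `buf` never leaves this function
    result = buf.id * 2        # last use of `buf`
    finalize(buf)              # the kind of call the pass would insert here
    return result
end

scaled_id(21)  # returns 42; the finalizer has already run by this point
```

Without the explicit `finalize` call, the finalizer would only run at some later garbage collection, which is exactly the delay the PR is trying to avoid for resources the GC cannot see.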

jpsamaroo added the GC (Garbage collector) and compiler:optimizer (Optimization passes, mostly in base/compiler/ssair/) labels

Feb 6, 2022

jpsamaroo requested a review from aviatesk

February 6, 2022 22:23

jpsamaroo force-pushed the jps/finalizer-elision branch from 17791bb to 1d03b6d

February 7, 2022 19:45

jpsamaroo changed the base branch from avi/EscapeAnalysis to avi/EASROA

February 7, 2022 19:46

aviatesk force-pushed the avi/EASROA branch from 4c2ed4e to 961366f

February 8, 2022 08:52

aviatesk reviewed

Feb 8, 2022

Member

aviatesk left a comment

I rebased `avi/EASROA`. Maybe rebasing this branch against it would fix the build error?

base/compiler/optimize.jl (outdated)

aviatesk force-pushed the avi/EASROA branch from 961366f to e503797

February 8, 2022 08:58

jpsamaroo force-pushed the jps/finalizer-elision branch from 1d03b6d to 9dff04d

February 9, 2022 14:34

Member, Author

jpsamaroo commented Feb 9, 2022

Latest push post-rebase still spams `UndefRefError()` in `adce_pass!`

Member, Author

jpsamaroo commented Feb 9, 2022

Error:

    Internal error: encountered unexpected error in runtime:
    UndefRefError()
    getindex at ./array.jl:921 [inlined]
    getindex at ./compiler/ssair/ir.jl:238 [inlined]
    is_union_phi at ./compiler/ssair/passes.jl:1186 [inlined]
    adce_pass! at ./compiler/ssair/passes.jl:1240
    run_passes at ./compiler/optimize.jl:606
    optimize at ./compiler/optimize.jl:585 [inlined]
    _typeinf at ./compiler/typeinfer.jl:253
    typeinf at ./compiler/typeinfer.jl:209
    typeinf_edge at ./compiler/typeinfer.jl:831
    abstract_call_method at ./compiler/abstractinterpretation.jl:561
    abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:114
    abstract_call_known at ./compiler/abstractinterpretation.jl:1475
    unknown function (ip: 0x7f13e64d468d)
    _jl_invoke at /home/jpsamaroo/julia-fin-el/src/gf.c:2311
    ijl_invoke at /home/jpsamaroo/julia-fin-el/src/gf.c:2337
    unknown function (ip: 0x7f13e6aceb4c)
    unknown function (ip: 0x7f13e6aceaad)

Member, Author

jpsamaroo commented Feb 9, 2022

Per discussion: the issue here is probably that we don't check that the allocation passed to `finalize` dominates the `return`, so we'll need to query the domtree as well.

This change will also potentially cause extended lifetimes for some allocations, but that's apparently a general issue that needs resolving.
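As a rough illustration of the dominance check mentioned above, the following sketch computes dominator sets for a toy control-flow graph with the classic iterative algorithm. It does not use the compiler's own domtree; the graph `SUCCS` and the helpers `dominators` and `dominates` are hypothetical names chosen for this example. The point is that a `finalize` call may only be placed before a `return` when the block holding the allocation dominates the block holding that `return`.

```julia
# Toy CFG (hypothetical): block => successor blocks.
#   1 -> 2 -> 4
#   1 -> 3 -> 4
const SUCCS = Dict(1 => [2, 3], 2 => [4], 3 => [4], 4 => Int[])

# Iterative dominator computation: dom[b] is the set of blocks that
# appear on every path from `entry` to `b`.
function dominators(succs::Dict{Int,Vector{Int}}, entry::Int)
    blocks = sort!(collect(keys(succs)))
    preds = Dict(b => Int[] for b in blocks)
    for (b, ss) in succs, s in ss
        push!(preds[s], b)
    end
    dom = Dict(b => Set(blocks) for b in blocks)
    dom[entry] = Set([entry])
    changed = true
    while changed
        changed = false
        for b in blocks
            (b == entry || isempty(preds[b])) && continue
            ps = preds[b]
            updated = copy(dom[first(ps)])
            for p in ps
                intersect!(updated, dom[p])
            end
            push!(updated, b)
            if updated != dom[b]
                dom[b] = updated
                changed = true
            end
        end
    end
    return dom
end

# `a` dominates `b` when `a` is in the dominator set of `b`.
dominates(dom, a, b) = a in dom[b]

const DOM = dominators(SUCCS, 1)
dominates(DOM, 1, 4)  # true: an allocation in block 1 is defined at the return in block 4
dominates(DOM, 2, 4)  # false: block 4 can be reached via block 3, skipping block 2
```

In the second case, inserting `finalize(obj)` in block 4 for an object allocated in block 2 would reference a value that is undefined on the path through block 3, which is consistent with the `UndefRefError` reported earlier.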

aviatesk force-pushed the avi/EASROA branch 4 times, most recently from ef8f7be to 04f2d48

February 10, 2022 15:44

jpsamaroo force-pushed the jps/finalizer-elision branch from 9dff04d to 9714afc

February 11, 2022 15:16

jpsamaroo marked this pull request as ready for review

February 11, 2022 15:25

jpsamaroo added the needs tests (Unit tests are required for this change) label

Feb 11, 2022

Member, Author

jpsamaroo commented Feb 11, 2022

We're now factoring in dominance information (hopefully correctly), which appears to have fixed the errors!

jpsamaroo requested a review from aviatesk

February 11, 2022 15:27

aviatesk force-pushed the avi/EASROA branch from 04f2d48 to 974891c

February 13, 2022 09:58

optimizer: Julia-level escape analysis

25635ea

This commit ports [EscapeAnalysis.jl](https://github.com/aviatesk/EscapeAnalysis.jl) into Julia base. You can find the documentation of this escape analysis at [this GitHub page](https://aviatesk.github.io/EscapeAnalysis.jl/dev/)[^1].

[^1]: The same documentation will be included into Julia's developer documentation by this commit.

This escape analysis will hopefully be an enabling technology for various memory-related optimizations at Julia's high level compilation pipeline. Possible target optimizations include alias aware SROA (JuliaLang#43888), array SROA (JuliaLang#43909), `mutating_arrayfreeze` optimization (JuliaLang#42465), stack allocation of mutables, finalizer elision and so on[^2].

[^2]: It would be also interesting if LLVM-level optimizations can consume IPO information derived by this escape analysis to broaden optimization possibilities.

The primary motivation for porting EA in this PR is to check its impact on latency as well as to get feedback from a broader range of developers. The plan is that we first introduce EA in this commit, and then merge the depending PRs built on top of this commit like JuliaLang#43888, JuliaLang#43909 and JuliaLang#42465.

This commit simply defines and runs EA inside Julia base compiler and enables the existing test suite with it. In this commit, we just run EA before inlining to generate IPO cache. In the depending PRs, EA will be invoked again after inlining to be used for various local optimizations.

aviatesk force-pushed the avi/EASROA branch from 974891c to 17c84ff

February 14, 2022 13:07

optimizer: alias-aware SROA

325f414

Enhances SROA of mutables using the novel Julia-level escape analysis (on top of JuliaLang#43800):

1. alias-aware SROA, mutable ϕ-node elimination
2. `isdefined` check elimination
3. load-forwarding for non-eliminable but analyzable mutables

---

1. alias-aware SROA, mutable ϕ-node elimination

EA's alias analysis allows this new SROA to handle nested mutable allocations pretty well. Now we can eliminate the heap allocations completely from this insanely nested example by the single analysis/optimization pass:

```julia
julia> function refs(x)
           (Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref((x))))))))))))[][][][][][][][][][]
       end
refs (generic function with 1 method)

julia> refs("julia"); @allocated refs("julia")
0
```

EA can also analyze escape of ϕ-node as well as its aliasing. Mutable ϕ-nodes would be eliminated even for a very tricky case like:

```julia
julia> code_typed((Bool,String,)) do cond, x
           # these allocations form multiple ϕ-nodes
           if cond
               ϕ2 = ϕ1 = Ref{Any}("foo")
           else
               ϕ2 = ϕ1 = Ref{Any}("bar")
           end
           ϕ2[] = x
           y = ϕ1[] # => x
           return y
       end
1-element Vector{Any}:
 CodeInfo(
1 ─     goto #3 if not cond
2 ─     goto #4
3 ─     nothing::Nothing
4 ┄     return x
) => Any
```

Combined with the alias analysis and ϕ-node handling above, allocations in the following "realistic" examples will be optimized:

```julia
julia> # demonstrate the power of our field / alias analysis with realistic end to end examples
       # adapted from http://wiki.luajit.org/Allocation-Sinking-Optimization#implementation%5B
       abstract type AbstractPoint{T} end

julia> struct Point{T} <: AbstractPoint{T}
           x::T
           y::T
       end

julia> mutable struct MPoint{T} <: AbstractPoint{T}
           x::T
           y::T
       end

julia> add(a::P, b::P) where P<:AbstractPoint = P(a.x + b.x, a.y + b.y);

julia> function compute_point(T, n, ax, ay, bx, by)
           a = T(ax, ay)
           b = T(bx, by)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end;

julia> function compute_point(n, a, b)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end;

julia> function compute_point!(n, a, b)
           for i in 0:(n-1)
               a′ = add(add(a, b), b)
               a.x = a′.x
               a.y = a′.y
           end
       end;

julia> compute_point(MPoint, 10, 1+.5, 2+.5, 2+.25, 4+.75);

julia> compute_point(MPoint, 10, 1+.5im, 2+.5im, 2+.25im, 4+.75im);

julia> @allocated compute_point(MPoint, 10000, 1+.5, 2+.5, 2+.25, 4+.75)
0

julia> @allocated compute_point(MPoint, 10000, 1+.5im, 2+.5im, 2+.25im, 4+.75im)
0

julia> compute_point(10, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75));

julia> compute_point(10, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im));

julia> @allocated compute_point(10000, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75))
0

julia> @allocated compute_point(10000, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im))
0

julia> af, bf = MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75);

julia> ac, bc = MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im);

julia> compute_point!(10, af, bf);

julia> compute_point!(10, ac, bc);

julia> @allocated compute_point!(10000, af, bf)
0

julia> @allocated compute_point!(10000, ac, bc)
0
```

2. `isdefined` check elimination

This commit also implements a simple optimization to eliminate `isdefined` calls by checking load-forwardability. This optimization may be especially useful to eliminate the extra allocation involved with a capturing closure, e.g.:

```julia
julia> callit(f, args...) = f(args...);

julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;

julia> code_typed(isdefined_elim)
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
└──      goto #3 if not true
2 ─      goto #4
3 ─      $(Expr(:throw_undef_if_not, :arr, false))::Any
4 ┄      return %1
) => Vector{Any}
```

3. load-forwarding for non-eliminable but analyzable mutables

EA also allows us to forward loads even when the mutable allocation can't be eliminated but its fields are still known precisely. The load forwarding might be useful since it may derive new type information that succeeding optimization passes can use (or just because it allows simpler code transformations down the load):

```julia
julia> code_typed((Bool,String,)) do c, s
           r = Ref{Any}(s)
           if c
               return r[]::String # adce_pass! will further eliminate this type assert call also
           else
               return r
           end
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = %new(Base.RefValue{Any}, s)::Base.RefValue{Any}
└──      goto #3 if not c
2 ─      return s
3 ─      return %1
) => Union{Base.RefValue{Any}, String}
```

---

Please refer to the newly added test cases for more examples. Also, EA's alias analysis already succeeds in reasoning about arrays, and so this EA-based SROA will hopefully be generalized for array SROA as well.

aviatesk force-pushed the avi/EASROA branch from 17c84ff to 325f414

February 14, 2022 15:28

Implement FinalizerEscape

6340ff0

Co-authored-by: Shuhei Kadowaki <aviatesk@gmail.com>

jpsamaroo force-pushed the jps/finalizer-elision branch from 9714afc to ea0aaab

March 1, 2022 17:25

Contributor

yuyichao commented Mar 4, 2022

Note that calling `finalize` is expensive since it's not designed to be used this way. This is especially true in code that uses finalizers a lot, which seems to be what this is targeting. In other words, this transformation will make your code run slower if it didn't run out of memory.

If you can prove that the object has only known finalizers, and you know what those are, you should be able to call the finalizer directly without going through the normal GC logic. If you can't prove that an exception won't occur, which you probably won't be able to prove at this level, you can just set a flag in the GC to tell it to not run any currently registered finalizers (`finalize` may need to check this flag as well). If that's too much code to generate, you can simply add a new C API to pass in the finalizer directly so that you can skip the scan of the finalizer list and have that C API set appropriate flags for the GC.

Also, as I've said many times before, it is fairly easy to effectively let the GC know about these objects. In most cases all you need to do is to call `GC.gc()` when your allocation/file opening failed.
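The retry-after-collection pattern suggested in the last paragraph can be sketched as follows. This is only an illustration under stated assumptions: `CAPACITY`, `IN_USE`, `Handle`, `try_alloc`, and `alloc` are hypothetical names that model an external pool (such as GPU memory) the GC knows nothing about; a real package would call into its own allocator instead.

```julia
# Hypothetical external pool with a fixed capacity, standing in for a
# resource that the GC cannot account for.
const CAPACITY = 4
const IN_USE = Ref(0)

mutable struct Handle
    size::Int
end

# Try to reserve `n` units; return `nothing` when the pool is exhausted.
function try_alloc(n)
    IN_USE[] + n > CAPACITY && return nothing
    IN_USE[] += n
    h = Handle(n)
    # The finalizer returns the units to the pool.
    finalizer(x -> (IN_USE[] -= x.size), h)
    return h
end

# On failure, force a collection so that finalizers of unreachable
# handles get a chance to run, then retry once.
function alloc(n)
    h = try_alloc(n)
    if h === nothing
        GC.gc()
        h = try_alloc(n)
    end
    h === nothing && error("external pool exhausted")
    return h
end

h = alloc(3)
```

The trade-off is that the forced collection only happens on the failure path, so code that never exhausts the pool pays nothing extra.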

AriMKatz commented Mar 7, 2022 (edited)

@chflood any thoughts on this? (Particularly wrt GPU)

Member, Author

jpsamaroo commented Mar 8, 2022

> Are packages currently doing such resource handling in finalizers, given we don't have #35689?

Yes, MemPool.jl is using a global non-reentrant lock (taken from CUDAdrv/CUDAnative originally), which is taken during allocation, and during finalization. I wouldn't mind changing it to a regular `ReentrantLock` if this is considered to be an unsupported pattern (I'm not sure if it really needs to be non-reentrant anymore, now that we disable finalizers while taking locks; @krynju).

Member

chflood commented Mar 8, 2022

In principle I love the idea of escape analysis stack allocating objects that go away without garbage collector intervention; however, objects with finalizers pose special issues that I don't fully understand.

I'm still learning Julia so please indulge my questions. Does Julia have a rule about finalizers running exactly once like Java does? Can a finalizer bring an object back to life by stashing it somewhere? My concern is that the GC might accidentally run a finalizer again on a zombie object.

There are also memory model issues in Java as detailed here which may or may not be applicable to Julia. If you run the finalizer without some sort of memory barrier is it possible that instructions may be reordered in incorrect ways? GC provides that memory barrier.

Member

oscardssmith commented Mar 8, 2022

We don't appear to document such a property (or really anything about how finalizers are run). https://docs.julialang.org/en/v1.9-dev/base/base/#Base.finalizer and https://docs.julialang.org/en/v1.9-dev/manual/multi-threading/#Safe-use-of-Finalizers are the only places where they are documented at all. We should probably figure out what properties we want to guarantee and document them.

Member, Author

jpsamaroo commented Mar 8, 2022

It appears to me that the current implementation of `finalize` ends up calling all finalizers for the object directly (instead of queuing them for later), meaning that finalizers must be safe to execute immediately in allocation scope if this pass calls `finalize`. Is this something that we want to assume for finalizers? Or do we want to assume that finalizers must be executed outside of allocation scope, and thus change the approach in this PR to using a delayed approach?
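To illustrate the difference between the two options raised here, the following is a small, purely hypothetical sketch of a delayed approach: cleanup work is queued inside the allocating scope and only executed later, at a point the caller chooses. `PENDING`, `LOG`, `defer_cleanup!`, and `drain_cleanups!` are made-up names and do not correspond to anything in the runtime or in this PR.

```julia
# Hypothetical deferred-cleanup queue; names are illustrative only.
const PENDING = Function[]
const LOG = String[]

# Record the cleanup instead of running it in the allocating scope.
defer_cleanup!(f::Function) = (push!(PENDING, f); nothing)

# Run queued cleanups in FIFO order at a point chosen by the caller.
function drain_cleanups!()
    while !isempty(PENDING)
        cleanup = popfirst!(PENDING)
        cleanup()
    end
    return nothing
end

function allocating_scope()
    push!(LOG, "alloc")
    defer_cleanup!(() -> push!(LOG, "free"))  # queued, not executed yet
    push!(LOG, "scope exit")
    return nothing
end

allocating_scope()
drain_cleanups!()
# LOG is now ["alloc", "scope exit", "free"]
```

With an immediate `finalize` call the "free" entry would instead be logged before "scope exit", which is why the finalizer would have to be safe to run inside the allocating scope.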

jpsamaroo marked this pull request as draft

March 8, 2022 19:08

jpsamaroo removed the needs tests (Unit tests are required for this change) label

Mar 8, 2022

jpsamaroo added 2 commits

March 8, 2022 14:57

optimizer: Add early finalize calls

75f83ed

tests: Add finalizer escape tests

e708e23

jpsamaroo force-pushed the jps/finalizer-elision branch from 25e4b37 to e708e23

March 8, 2022 20:57

fixup! optimizer: Add early finalize calls

1a334a8

jpsamaroo mentioned this pull request

Mar 9, 2022

Use refcounting for memory management JuliaGPU/AMDGPU.jl#207

Closed

Contributor

yuyichao commented Mar 9, 2022

> I agree with adding this fast-path; would it be reasonable to punt that to a future PR, or do you want to see that done here before this is considered for merge?

The C API should be fairly straightforward as well. The issue with the PR as it stands is that it will introduce a regression.

> Can a finalizer bring an object back to life by stashing it somewhere?

Yes.

> My concern is that the GC might accidentally run a finalizer again on a zombie object.

No, that is not supposed to happen; each finalizer will run only once. It is removed from the list before being called.

> Is this something that we want to assume for finalizers?

Not just that: a finalizer can run at any time that a GC can run. There were proposals about running them on a separate thread, but that has not been done and still has issues regarding GC triggered on the finalizer thread.
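The run-once and resurrection behavior described in these answers can be observed with a short experiment. This is a sketch using only the public `finalizer` and `finalize` functions; `SURVIVORS`, `RUN_COUNT`, and `Resource` are hypothetical names introduced for the example.

```julia
# Hypothetical names used only for this demonstration.
const SURVIVORS = Any[]
const RUN_COUNT = Ref(0)

mutable struct Resource end

r = Resource()
finalizer(r) do obj
    RUN_COUNT[] += 1
    push!(SURVIVORS, obj)   # stash the object so it is reachable again
end

finalize(r)   # runs the registered finalizer and unregisters it
finalize(r)   # no finalizer is registered any more, so nothing runs

RUN_COUNT[]   # 1: the finalizer ran exactly once
```

The object stays alive through `SURVIVORS` after its finalizer has run, so any eager-finalization pass has to assume that a finalizer may make the object reachable again.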

Use IPO EA and query argescapes of invokes

7491ade

aviatesk force-pushed the avi/EASROA branch 2 times, most recently from 9c84ddc to cdef102

March 23, 2022 07:11

Keno added a commit that referenced this pull request

May 11, 2022

Very WIP: Eager finalizer insertion

74334f1

This is a variant of the eager-finalization idea (e.g. as seen in #44056), but with a focus on the mechanism of finalizer insertion, since I need a similar pass downstream. Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular, I want to be able to insert a `finalize` call such that I can subsequently SROA the mutable object. This requires a couple of design points that are more stringent than the pass from #44056, so I decided to prototype them as an independent PR. The primary things I need here that are not seen in #44056 are:

- The ability to forgo finalizer registration with the runtime entirely (requires additional legality analysis)
- The ability to inline the registered finalizer at the deallocation point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin that is recognized by inference and inlining (such that inference can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted to have work, but does not yet extend to the motivating case I had. Nevertheless, I felt that this was a good checkpoint to synchronize with other efforts along these lines.

Currently working demo:

```
julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core._add_finalizer(this, function(this)
                   global total_deallocations[] += 1
               end)
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_111() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3] within `foo`
  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:4 within `foo`
  ret void
}
```

Keno mentioned this pull request

May 11, 2022

Eager finalizer insertion #45272

Merged

Keno added a commit that referenced this pull request

May 12, 2022

Very WIP: Eager finalizer insertion

ff35e48


Keno added a commit that referenced this pull request

May 12, 2022

Very WIP: Eager finalizer insertion

458c7f6


Keno mentioned this pull request

May 12, 2022

give finalizers their own RNG state #45212

Merged

Keno added a commit that referenced this pull request

May 16, 2022

Eager finalizer insertion

67ec007


Keno added a commit that referenced this pull request

May 16, 2022

Eager finalizer insertion

dc8a715


Keno added a commit that referenced this pull request

May 25, 2022

Eager finalizer insertion

8f4ed29


Keno added a commit that referenced this pull request

May 25, 2022

Eager finalizer insertion

40503d6

This is a variant of the eager-finalization idea
(e.g. as seen in #44056), but with a focus on the mechanism
of finalizer insertion, since I need a similar pass downstream.
Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular,
I want to be able to insert a finalize call such that I can
subsequently SROA the mutable object. This requires a couple of
design points that are more stringent than the pass from #44056,
so I decided to prototype them as an independent PR. The primary
things I need here that are not seen in #44056 are:

- The ability to forgo finalizer registration with the runtime
  entirely (requires additional legality analysis)
- The ability to inline the registered finalizer at the deallocation
  point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin
that is recognized by inference and inlining (such that inference
can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted
to have work, but does not yet extend to the motivating case I had.
Nevertheless, I felt that this was a good checkpoint to synchronize
with other efforts along these lines.

Currently working demo:

```
julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core._add_finalizer(this, function(this)
                   global total_deallocations[] += 1
               end)
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_111() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3] within `foo`
  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:4 within `foo`
  ret void
}
```
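For comparison, here is a rough hand-written analogue of the effect the pass is aiming for. It is a sketch, not the compiler transformation itself. It uses only the public `finalizer` and `finalize` functions from Base instead of the internal `Core._add_finalizer` builtin from the demo, and `foo_manual` is a hypothetical name introduced for illustration. The explicit `finalize(obj)` call stands in for what the optimizer would insert (and then inline) once it has proven that `obj` does not escape.

```julia
# Counter bumped by every finalizer run, as in the demo above.
const total_deallocations = Ref{Int}(0)

mutable struct DoAlloc
    function DoAlloc()
        this = new()
        # Public API: register a finalizer with the runtime.
        finalizer(this) do _
            total_deallocations[] += 1
        end
        return this
    end
end

# Manual stand-in for the eager-finalization idea: `obj` never escapes
# the loop body, so its finalizer can run at the end of each iteration
# instead of waiting for a GC cycle.
function foo_manual()
    for i = 1:1000
        obj = DoAlloc()
        finalize(obj)  # runs the registered finalizer immediately
    end
end

foo_manual()
```

Unlike the optimized `foo` in the demo, this version still allocates each object and registers its finalizer with the runtime. The two points listed in the commit message (skipping registration and inlining the finalizer) are what let the compiler reduce the whole loop to a single addition.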

Keno added a commit that referenced this pull request

May 25, 2022

Eager finalizer insertion

3ca07d6


Keno added a commit that referenced this pull request

May 29, 2022

Eager finalizer insertion

da04e84


Keno added a commit that referenced this pull request

Jun 7, 2022

Eager finalizer insertion (#45272)

c4effda

* Eager finalizer insertion

* rm redundant copy

Co-authored-by: Shuhei Kadowaki <40514306+aviatesk@users.noreply.github.com>

Member, Author