Insert eager calls to `finalize` for otherwise-dead finalizeable objects #44056


Closed

jpsamaroo
Member

Finalizers are a great way to defer the freeing side of memory management until some later point; however, they can have unpredictable behavior when the data that they free is not fully known to the GC (e.g. GPU allocations, or distributed references). This can result in behavior like out-of-memory situations, excessive memory usage, and sometimes more costly freeing behavior (in the event that locks need to be taken).

This seems like a bad situation, but there is a silver lining: some code patterns which allocate such objects don't actually need the allocations to stick around very long, and the lifetime of the object could (in theory) be statically determined by the compiler. Thankfully, with the ongoing work of integrating EscapeAnalysis.jl into the optimizer, we can use the generated escape information to improve this situation.

This PR uses escape info from EA to determine when an object has an attached finalizer, and when its lifetime is provably finite (i.e. the object does not escape the analyzed scope). For such objects, we can insert an early call to `finalize(obj)` at the end of `obj`'s lifetime, which will allow the object's finalizer to be enqueued for execution immediately, minimizing how long finalizeable objects stay live in the GC.
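To make the intended transformation concrete, here is a hand-written sketch of what the inserted call would amount to. It is not output from this PR's pass: the `Buffer` type, the `FREED` log, and `use_buffer` are hypothetical names used only for illustration, and the `finalize` call is written manually where the pass would be expected to place it.

```julia
# Hypothetical log of released resources, standing in for e.g. a GPU free call.
const FREED = Int[]

mutable struct Buffer
    id::Int
    function Buffer(id::Int)
        buf = new(id)
        # Register a finalizer that "releases" the resource.
        finalizer(b -> push!(FREED, b.id), buf)
        return buf
    end
end

function use_buffer(id::Int)
    buf = Buffer(id)      # `buf` never escapes this function
    result = buf.id * 2
    finalize(buf)         # written by hand here; the pass would insert this call
    return result
end

use_buffer(21)
FREED  # the finalizer has already run, without waiting for a GC cycle
```

Because `finalize` runs the registered finalizers right away, the resource is released at the end of `use_buffer` rather than at some later collection.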

jpsamaroo added the `GC` (Garbage collector) and `compiler:optimizer` (Optimization passes, mostly in base/compiler/ssair/) labels on Feb 6, 2022
jpsamaroo changed the base branch from `avi/EscapeAnalysis` to `avi/EASROA` on February 7, 2022 19:46
@aviatesk
Member

I rebased `avi/EASROA`. Maybe rebasing this branch against it would fix the build error?

@jpsamaroo
Member, Author

Latest push post-rebase still spams `UndefRefError()` in `adce_pass!`

@jpsamaroo
Member, Author

Error:

    Internal error: encountered unexpected error in runtime:
    UndefRefError()
    getindex at ./array.jl:921 [inlined]
    getindex at ./compiler/ssair/ir.jl:238 [inlined]
    is_union_phi at ./compiler/ssair/passes.jl:1186 [inlined]
    adce_pass! at ./compiler/ssair/passes.jl:1240
    run_passes at ./compiler/optimize.jl:606
    optimize at ./compiler/optimize.jl:585 [inlined]
    _typeinf at ./compiler/typeinfer.jl:253
    typeinf at ./compiler/typeinfer.jl:209
    typeinf_edge at ./compiler/typeinfer.jl:831
    abstract_call_method at ./compiler/abstractinterpretation.jl:561
    abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:114
    abstract_call_known at ./compiler/abstractinterpretation.jl:1475
    unknown function (ip: 0x7f13e64d468d)
    _jl_invoke at /home/jpsamaroo/julia-fin-el/src/gf.c:2311
    ijl_invoke at /home/jpsamaroo/julia-fin-el/src/gf.c:2337
    unknown function (ip: 0x7f13e6aceb4c)
    unknown function (ip: 0x7f13e6aceaad)

@jpsamaroo
Member, Author

Per discussion: the issue here is probably that we don't check that the `return` dominates the allocation passed to `finalize`, so we'll need to query the domtree as well.
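For readers unfamiliar with dominance queries, the following is a self-contained toy sketch of the concept only. It does not use the compiler's actual domtree API; the `preds` CFG encoding and the `dominators` and `dominates` helpers are hypothetical names, and the algorithm is the textbook iterative data-flow formulation.

```julia
# Toy CFG given as predecessor lists; block 1 is the entry.
# Edges: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4 (a diamond).
preds = [Int[], [1], [1], [2, 3]]

# Compute, for every block, the set of blocks that dominate it.
function dominators(preds::Vector{Vector{Int}})
    n = length(preds)
    all_blocks = BitSet(1:n)
    dom = [b == 1 ? BitSet([1]) : copy(all_blocks) for b in 1:n]
    changed = true
    while changed
        changed = false
        for b in 2:n
            updated = copy(all_blocks)
            for p in preds[b]
                intersect!(updated, dom[p])  # must dominate every predecessor
            end
            push!(updated, b)                # every block dominates itself
            if updated != dom[b]
                dom[b] = updated
                changed = true
            end
        end
    end
    return dom
end

# `a` dominates `b` if every path from the entry to `b` passes through `a`.
dominates(dom, a, b) = a in dom[b]

dom = dominators(preds)
dominates(dom, 1, 4)  # true: the entry dominates the join block
dominates(dom, 2, 4)  # false: block 4 is also reachable through block 3
```

In this toy CFG, an object allocated only in block 2 would not be defined on every path reaching block 4, which is the kind of situation a dominance check is meant to rule out before inserting a use there.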

This change will also potentially cause extended lifetimes for some allocations, but that's apparently a general issue that needs resolving.

aviatesk force-pushed the `avi/EASROA` branch 4 times, most recently from `ef8f7be` to `04f2d48` on February 10, 2022 15:44
jpsamaroo marked this pull request as ready for review on February 11, 2022 15:25
jpsamaroo added the `needs tests` (Unit tests are required for this change) label on Feb 11, 2022

@jpsamaroo
Member, Author

We're now factoring in dominance information (hopefully correctly), which appears to have fixed the errors!

This commit ports [EscapeAnalysis.jl](https://github.com/aviatesk/EscapeAnalysis.jl) into Julia base. You can find the documentation of this escape analysis at [this GitHub page](https://aviatesk.github.io/EscapeAnalysis.jl/dev/)[^1].

[^1]: The same documentation will be included into Julia's developer documentation by this commit.

This escape analysis will hopefully be an enabling technology for various memory-related optimizations at Julia's high level compilation pipeline. Possible target optimizations include alias aware SROA (JuliaLang#43888), array SROA (JuliaLang#43909), `mutating_arrayfreeze` optimization (JuliaLang#42465), stack allocation of mutables, finalizer elision and so on[^2].

[^2]: It would be also interesting if LLVM-level optimizations can consume IPO information derived by this escape analysis to broaden optimization possibilities.

The primary motivation for porting EA in this PR is to check its impact on latency as well as to get feedback from a broader range of developers. The plan is that we first introduce EA in this commit, and then merge the depending PRs built on top of this commit like JuliaLang#43888, JuliaLang#43909 and JuliaLang#42465.

This commit simply defines and runs EA inside Julia base compiler and enables the existing test suite with it. In this commit, we just run EA before inlining to generate IPO cache. In the depending PRs, EA will be invoked again after inlining to be used for various local optimizations.
Enhances SROA of mutables using the novel Julia-level escape analysis (on top of JuliaLang#43800):

1. alias-aware SROA, mutable ϕ-node elimination
2. `isdefined` check elimination
3. load-forwarding for non-eliminable but analyzable mutables

---

1. alias-aware SROA, mutable ϕ-node elimination

EA's alias analysis allows this new SROA to handle nested mutable allocations pretty well. Now we can eliminate the heap allocations completely from this insanely nested example with a single analysis/optimization pass:

```julia
julia> function refs(x)
           (Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref((x))))))))))))[][][][][][][][][][]
       end
refs (generic function with 1 method)

julia> refs("julia"); @allocated refs("julia")
0
```

EA can also analyze the escape of a ϕ-node as well as its aliasing. Mutable ϕ-nodes are eliminated even in a very tricky case like:

```julia
julia> code_typed((Bool,String,)) do cond, x
           # these allocation form multiple ϕ-nodes
           if cond
               ϕ2 = ϕ1 = Ref{Any}("foo")
           else
               ϕ2 = ϕ1 = Ref{Any}("bar")
           end
           ϕ2[] = x
           y = ϕ1[] # => x
           return y
       end
1-element Vector{Any}:
 CodeInfo(
1 ─     goto #3 if not cond
2 ─     goto #4
3 ─     nothing::Nothing
4 ┄     return x
) => Any
```

Combined with the alias analysis and ϕ-node handling above, allocations in the following "realistic" examples will be optimized:

```julia
julia> # demonstrate the power of our field / alias analysis with realistic end to end examples
       # adapted from http://wiki.luajit.org/Allocation-Sinking-Optimization#implementation%5B
       abstract type AbstractPoint{T} end

julia> struct Point{T} <: AbstractPoint{T}
           x::T
           y::T
       end

julia> mutable struct MPoint{T} <: AbstractPoint{T}
           x::T
           y::T
       end

julia> add(a::P, b::P) where P<:AbstractPoint = P(a.x + b.x, a.y + b.y);

julia> function compute_point(T, n, ax, ay, bx, by)
           a = T(ax, ay)
           b = T(bx, by)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end;

julia> function compute_point(n, a, b)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end;

julia> function compute_point!(n, a, b)
           for i in 0:(n-1)
               a′ = add(add(a, b), b)
               a.x = a′.x
               a.y = a′.y
           end
       end;

julia> compute_point(MPoint, 10, 1+.5, 2+.5, 2+.25, 4+.75);

julia> compute_point(MPoint, 10, 1+.5im, 2+.5im, 2+.25im, 4+.75im);

julia> @allocated compute_point(MPoint, 10000, 1+.5, 2+.5, 2+.25, 4+.75)
0

julia> @allocated compute_point(MPoint, 10000, 1+.5im, 2+.5im, 2+.25im, 4+.75im)
0

julia> compute_point(10, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75));

julia> compute_point(10, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im));

julia> @allocated compute_point(10000, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75))
0

julia> @allocated compute_point(10000, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im))
0

julia> af, bf = MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75);

julia> ac, bc = MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im);

julia> compute_point!(10, af, bf);

julia> compute_point!(10, ac, bc);

julia> @allocated compute_point!(10000, af, bf)
0

julia> @allocated compute_point!(10000, ac, bc)
0
```

2. `isdefined` check elimination

This commit also implements a simple optimization to eliminate `isdefined` calls by checking load-forwardability. This optimization may be especially useful to eliminate the extra allocation involved with a capturing closure, e.g.:

```julia
julia> callit(f, args...) = f(args...);

julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;

julia> code_typed(isdefined_elim)
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
└──      goto #3 if not true
2 ─      goto #4
3 ─      $(Expr(:throw_undef_if_not, :arr, false))::Any
4 ┄      return %1
) => Vector{Any}
```

3. load-forwarding for non-eliminable but analyzable mutables

EA also allows us to forward loads even when the mutable allocation can't be eliminated but its fields are still known precisely. The load forwarding might be useful since it may derive new type information that succeeding optimization passes can use (or just because it allows simpler code transformations down the load):

```julia
julia> code_typed((Bool,String,)) do c, s
           r = Ref{Any}(s)
           if c
               return r[]::String # adce_pass! will further eliminate this type assert call also
           else
               return r
           end
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = %new(Base.RefValue{Any}, s)::Base.RefValue{Any}
└──      goto #3 if not c
2 ─      return s
3 ─      return %1
) => Union{Base.RefValue{Any}, String}
```

---

Please refer to the newly added test cases for more examples. Also, EA's alias analysis already succeeds in reasoning about arrays, and so this EA-based SROA will hopefully be generalized for array SROA as well.
Co-authored-by: Shuhei Kadowaki <aviatesk@gmail.com>
@yuyichao
Contributor

Note that calling `finalize` is expensive since it's not designed to be used this way. This is especially true in code that uses finalizers a lot, which seems to be what this is targeting. In other words, this transformation will make your code run slower if it wasn't running out of memory.

If you can prove that the object has only known finalizers, and you know what those are, you should be able to call the finalizer directly without going through the normal GC logic. If you can't prove that an exception won't occur, which you probably won't be able to prove at this level, you can just set a flag in the GC to tell it to not run any currently registered finalizers (`finalize` may need to check this flag as well). If that's too much code to generate, you can simply add a new C API to pass in the finalizer directly so that you can skip the scan of the finalizer list and have that C API set appropriate flags for the GC.

Also, as I've said many times before, it is fairly easy to effectively let the GC know about these objects. In most cases all you need to do is to call `GC.gc()` when your allocation/file opening failed.
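The last suggestion can be sketched roughly as follows. This is only an illustration, not code from any package discussed here: `alloc_with_gc_retry` and the `try_alloc` callback (assumed to return `nothing` on failure) are hypothetical names, and the only real API used is `GC.gc()`.

```julia
# Retry a failed external allocation after forcing a collection, so that
# finalizers of dead wrapper objects get a chance to release their resources.
# `try_alloc` is a hypothetical zero-argument callback returning a handle,
# or `nothing` when the underlying allocator is exhausted.
function alloc_with_gc_retry(try_alloc; retries::Int = 1)
    for attempt in 0:retries
        handle = try_alloc()
        handle !== nothing && return handle
        # Only collect if another attempt will follow.
        attempt < retries && GC.gc()
    end
    error("allocation failed even after running the GC")
end

# Usage with a stand-in allocator that fails once and then succeeds.
calls = Ref(0)
handle = alloc_with_gc_retry() do
    calls[] += 1
    calls[] == 1 ? nothing : :handle
end
```

The trade-off is that a full collection is paid only on the failure path, instead of an eager `finalize` call on every allocation.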


@AriMKatz

AriMKatz commented Mar 7, 2022 (edited)

@chflood any thoughts on this? (Particularly with respect to GPU)

@jpsamaroo
Member, Author

> Are packages currently doing such resource handling in finalizers, given we don't have #35689?

Yes, MemPool.jl is using a global non-reentrant lock (taken from CUDAdrv/CUDAnative originally), which is taken during allocation, and during finalization. I wouldn't mind changing it to a regular `ReentrantLock` if this is considered to be an unsupported pattern (I'm not sure if it really needs to be non-reentrant anymore, now that we disable finalizers while taking locks; @krynju).
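For context, a finalizer that needs a lock can follow the deferral pattern described in the "Safe use of Finalizers" section of the Julia manual: if the lock is busy, re-register the finalizer instead of blocking. The sketch below is not MemPool.jl's code; `POOL_LOCK`, `RELEASED`, `Handle`, and `release` are hypothetical names chosen for illustration.

```julia
const POOL_LOCK = ReentrantLock()
const RELEASED = Int[]          # hypothetical record of freed handles

mutable struct Handle
    id::Int
end

function release(h::Handle)
    if islocked(POOL_LOCK) || !trylock(POOL_LOCK)
        # The lock is busy: never block inside a finalizer, try again later.
        finalizer(release, h)
        return nothing
    end
    try
        push!(RELEASED, h.id)   # stand-in for returning memory to a pool
    finally
        unlock(POOL_LOCK)
    end
    return nothing
end

h = Handle(7)
finalizer(release, h)
finalize(h)    # the lock is free here, so the handle is released immediately
```

With this shape, an eagerly inserted `finalize` call that happens to run while the lock is held would defer the release rather than deadlock.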

@chflood
Member

In principle I love the idea of escape analysis stack-allocating objects that go away without garbage collector intervention. However, objects with finalizers pose special issues that I don't fully understand.

I'm still learning Julia so please indulge my questions. Does Julia have a rule about finalizers running exactly once like Java does? Can a finalizer bring an object back to life by stashing it somewhere? My concern is that the GC might accidentally run a finalizer again on a zombie object.

There are also memory model issues in Java as detailed here which may or may not be applicable to Julia. If you run the finalizer without some sort of memory barrier, is it possible that instructions may be reordered in incorrect ways? GC provides that memory barrier.

@oscardssmith
Member

We don't appear to document such a property (or really anything about how finalizers are run). https://docs.julialang.org/en/v1.9-dev/base/base/#Base.finalizer and https://docs.julialang.org/en/v1.9-dev/manual/multi-threading/#Safe-use-of-Finalizers are the only places where they are documented at all. We should probably figure out what properties we want to guarantee and document them.

@jpsamaroo
Member, Author

It appears to me that the current implementation of `finalize` ends up calling all finalizers for the object directly (instead of queuing them for later), meaning that finalizers must be safe to execute immediately in allocation scope if this pass calls `finalize`. Is this something that we want to assume for finalizers? Or do we want to assume that finalizers must be executed outside of allocation scope, and thus change the approach in this PR to using a delayed approach?

jpsamaroo marked this pull request as draft on March 8, 2022 19:08
jpsamaroo removed the `needs tests` (Unit tests are required for this change) label on Mar 8, 2022
@yuyichao
Contributor

> I agree with adding this fast-path; would it be reasonable to punt that to a future PR, or do you want to see that done here before this is considered for merge?

The C API should be fairly straightforward as well. The issue with the PR as it stands is that it will introduce a regression.

> Can a finalizer bring an object back to life by stashing it somewhere?

Yes.

> My concern is that the GC might accidentally run a finalizer again on a zombie object.

No, that is not supposed to happen; each finalizer will run only once. It is removed from the list before being called.

> Is this something that we want to assume for finalizers?

Not just that: a finalizer can run at any time that a GC can run. There were proposals about running them on a separate thread, but that has not been done and still has issues regarding GC triggered on the finalizer thread.
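The run-once and resurrection behavior described in these answers can be observed with a small experiment. This is an illustrative sketch only; `RUNS`, `GRAVEYARD`, and `Zombie` are hypothetical names.

```julia
const RUNS = Ref(0)         # how many times the finalizer has run
const GRAVEYARD = Any[]     # a global stash that keeps the object alive

mutable struct Zombie
    name::String
end

z = Zombie("z")
finalizer(z) do obj
    RUNS[] += 1
    push!(GRAVEYARD, obj)   # "resurrect" the object by stashing it
end

finalize(z)   # runs the finalizer and removes it from the object's list
finalize(z)   # nothing is registered anymore, so this does nothing

RUNS[]        # 1
```

The object remains reachable through `GRAVEYARD` after finalization, but since its finalizer was unregistered before being called, it will not run a second time unless it is registered again.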

aviatesk force-pushed the `avi/EASROA` branch 2 times, most recently from `9c84ddc` to `cdef102` on March 23, 2022 07:11
Keno added a commit that referenced this pull request on May 11, 2022

This is a variant of the eager-finalization idea (e.g. as seen in #44056), but with a focus on the mechanism of finalizer insertion, since I need a similar pass downstream. Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular, I want to be able to insert finalize calls such that I can subsequently SROA the mutable object. This requires a couple of design points that are more stringent than the pass from #44056, so I decided to prototype them as an independent PR. The primary things I need here that are not seen in #44056 are:

- The ability to forgo finalizer registration with the runtime entirely (requires additional legality analysis)
- The ability to inline the registered finalizer at the deallocation point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin that is recognized by inference and inlining (such that inference can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted to have work, but does not yet extend to the motivating case I had. Nevertheless, I felt that this was a good checkpoint to synchronize with other efforts along these lines.

Currently working demo:

```
julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core._add_finalizer(this, function(this)
                   global total_deallocations[] += 1
               end)
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_111() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3] within `foo`
  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:4 within `foo`
  ret void
}
```
@KenoKeno mentioned this pull requestMay 11, 2022
Keno added a commit that referenced this pull requestMay 12, 2022
This is a variant of the eager-finalization idea(e.g. as seen in#44056), but with a focus on the mechanismof finalizer insertion, since I need a similar pass downstream.Integration of EscapeAnalysis is left to#44056.My motivation for this change is somewhat different. In particular,I want to be able to insert finalize call such that I cansubsequently SROA the mutable object. This requires a coupledesign points that are more stringent than the pass from#44056,so I decided to prototype them as an independent PR. The primarythings I need here that are not seen in#44056 are:- The ability to forgo finalizer registration with the runtime  entirely (requires additional legality analyis)- The ability to inline the registered finalizer at the deallocation  point (to enable subsequent SROA)To this end, adding a finalizer is promoted to a builtinthat is recognized by inference and inlining (such that inferencecan produce an inferred version of the finalizer for inlining).The current status is that this fixes the minimal example I wantedto have work, but does not yet extend to the motivating case I had.Nevertheless, I felt that this was a good checkpoint to synchronizewith other efforts along these lines.Currently working demo:```julia> const total_deallocations = Ref{Int}(0)Base.RefValue{Int64}(0)julia> mutable struct DoAlloc               function DoAlloc()                   this = new()                       Core._add_finalizer(this, function(this)                               global total_deallocations[] += 1                       end)                       return this               end       endjulia> function foo()               for i = 1:1000                       DoAlloc()               end       endfoo (generic function with 1 method)julia> @code_llvm foo();  @ REPL[3]:1 within `foo`define void @julia_foo_111() #0 {top:  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:2 within `foo`  %0 = add i64 %.promoted, 1000;  @ REPL[3] 
within `foo`  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:4 within `foo`  ret void}```
Keno added a commit that referenced this pull requestMay 12, 2022
This is a variant of the eager-finalization idea(e.g. as seen in#44056), but with a focus on the mechanismof finalizer insertion, since I need a similar pass downstream.Integration of EscapeAnalysis is left to#44056.My motivation for this change is somewhat different. In particular,I want to be able to insert finalize call such that I cansubsequently SROA the mutable object. This requires a coupledesign points that are more stringent than the pass from#44056,so I decided to prototype them as an independent PR. The primarythings I need here that are not seen in#44056 are:- The ability to forgo finalizer registration with the runtime  entirely (requires additional legality analyis)- The ability to inline the registered finalizer at the deallocation  point (to enable subsequent SROA)To this end, adding a finalizer is promoted to a builtinthat is recognized by inference and inlining (such that inferencecan produce an inferred version of the finalizer for inlining).The current status is that this fixes the minimal example I wantedto have work, but does not yet extend to the motivating case I had.Nevertheless, I felt that this was a good checkpoint to synchronizewith other efforts along these lines.Currently working demo:```julia> const total_deallocations = Ref{Int}(0)Base.RefValue{Int64}(0)julia> mutable struct DoAlloc               function DoAlloc()                   this = new()                       Core._add_finalizer(this, function(this)                               global total_deallocations[] += 1                       end)                       return this               end       endjulia> function foo()               for i = 1:1000                       DoAlloc()               end       endfoo (generic function with 1 method)julia> @code_llvm foo();  @ REPL[3]:1 within `foo`define void @julia_foo_111() #0 {top:  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:2 within `foo`  %0 = add i64 %.promoted, 1000;  @ REPL[3] 
within `foo`  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:4 within `foo`  ret void}```
Keno added a commit that referenced this pull requestMay 16, 2022
This is a variant of the eager-finalization idea(e.g. as seen in#44056), but with a focus on the mechanismof finalizer insertion, since I need a similar pass downstream.Integration of EscapeAnalysis is left to#44056.My motivation for this change is somewhat different. In particular,I want to be able to insert finalize call such that I cansubsequently SROA the mutable object. This requires a coupledesign points that are more stringent than the pass from#44056,so I decided to prototype them as an independent PR. The primarythings I need here that are not seen in#44056 are:- The ability to forgo finalizer registration with the runtime  entirely (requires additional legality analyis)- The ability to inline the registered finalizer at the deallocation  point (to enable subsequent SROA)To this end, adding a finalizer is promoted to a builtinthat is recognized by inference and inlining (such that inferencecan produce an inferred version of the finalizer for inlining).The current status is that this fixes the minimal example I wantedto have work, but does not yet extend to the motivating case I had.Nevertheless, I felt that this was a good checkpoint to synchronizewith other efforts along these lines.Currently working demo:```julia> const total_deallocations = Ref{Int}(0)Base.RefValue{Int64}(0)julia> mutable struct DoAlloc               function DoAlloc()                   this = new()                       Core._add_finalizer(this, function(this)                               global total_deallocations[] += 1                       end)                       return this               end       endjulia> function foo()               for i = 1:1000                       DoAlloc()               end       endfoo (generic function with 1 method)julia> @code_llvm foo();  @ REPL[3]:1 within `foo`define void @julia_foo_111() #0 {top:  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:2 within `foo`  %0 = add i64 %.promoted, 1000;  @ REPL[3] 
within `foo`  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:4 within `foo`  ret void}```
Keno added a commit that referenced this pull requestMay 16, 2022
This is a variant of the eager-finalization idea(e.g. as seen in#44056), but with a focus on the mechanismof finalizer insertion, since I need a similar pass downstream.Integration of EscapeAnalysis is left to#44056.My motivation for this change is somewhat different. In particular,I want to be able to insert finalize call such that I cansubsequently SROA the mutable object. This requires a coupledesign points that are more stringent than the pass from#44056,so I decided to prototype them as an independent PR. The primarythings I need here that are not seen in#44056 are:- The ability to forgo finalizer registration with the runtime  entirely (requires additional legality analyis)- The ability to inline the registered finalizer at the deallocation  point (to enable subsequent SROA)To this end, adding a finalizer is promoted to a builtinthat is recognized by inference and inlining (such that inferencecan produce an inferred version of the finalizer for inlining).The current status is that this fixes the minimal example I wantedto have work, but does not yet extend to the motivating case I had.Nevertheless, I felt that this was a good checkpoint to synchronizewith other efforts along these lines.Currently working demo:```julia> const total_deallocations = Ref{Int}(0)Base.RefValue{Int64}(0)julia> mutable struct DoAlloc               function DoAlloc()                   this = new()                       Core._add_finalizer(this, function(this)                               global total_deallocations[] += 1                       end)                       return this               end       endjulia> function foo()               for i = 1:1000                       DoAlloc()               end       endfoo (generic function with 1 method)julia> @code_llvm foo();  @ REPL[3]:1 within `foo`define void @julia_foo_111() #0 {top:  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:2 within `foo`  %0 = add i64 %.promoted, 1000;  @ REPL[3] 
within `foo`  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:4 within `foo`  ret void}```
Keno added a commit that referenced this pull requestMay 25, 2022
This is a variant of the eager-finalization idea(e.g. as seen in#44056), but with a focus on the mechanismof finalizer insertion, since I need a similar pass downstream.Integration of EscapeAnalysis is left to#44056.My motivation for this change is somewhat different. In particular,I want to be able to insert finalize call such that I cansubsequently SROA the mutable object. This requires a coupledesign points that are more stringent than the pass from#44056,so I decided to prototype them as an independent PR. The primarythings I need here that are not seen in#44056 are:- The ability to forgo finalizer registration with the runtime  entirely (requires additional legality analyis)- The ability to inline the registered finalizer at the deallocation  point (to enable subsequent SROA)To this end, adding a finalizer is promoted to a builtinthat is recognized by inference and inlining (such that inferencecan produce an inferred version of the finalizer for inlining).The current status is that this fixes the minimal example I wantedto have work, but does not yet extend to the motivating case I had.Nevertheless, I felt that this was a good checkpoint to synchronizewith other efforts along these lines.Currently working demo:```julia> const total_deallocations = Ref{Int}(0)Base.RefValue{Int64}(0)julia> mutable struct DoAlloc               function DoAlloc()                   this = new()                       Core._add_finalizer(this, function(this)                               global total_deallocations[] += 1                       end)                       return this               end       endjulia> function foo()               for i = 1:1000                       DoAlloc()               end       endfoo (generic function with 1 method)julia> @code_llvm foo();  @ REPL[3]:1 within `foo`define void @julia_foo_111() #0 {top:  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:2 within `foo`  %0 = add i64 %.promoted, 1000;  @ REPL[3] 
within `foo`  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:4 within `foo`  ret void}```
Keno added a commit that referenced this pull requestMay 25, 2022
This is a variant of the eager-finalization idea(e.g. as seen in#44056), but with a focus on the mechanismof finalizer insertion, since I need a similar pass downstream.Integration of EscapeAnalysis is left to#44056.My motivation for this change is somewhat different. In particular,I want to be able to insert finalize call such that I cansubsequently SROA the mutable object. This requires a coupledesign points that are more stringent than the pass from#44056,so I decided to prototype them as an independent PR. The primarythings I need here that are not seen in#44056 are:- The ability to forgo finalizer registration with the runtime  entirely (requires additional legality analyis)- The ability to inline the registered finalizer at the deallocation  point (to enable subsequent SROA)To this end, adding a finalizer is promoted to a builtinthat is recognized by inference and inlining (such that inferencecan produce an inferred version of the finalizer for inlining).The current status is that this fixes the minimal example I wantedto have work, but does not yet extend to the motivating case I had.Nevertheless, I felt that this was a good checkpoint to synchronizewith other efforts along these lines.Currently working demo:```julia> const total_deallocations = Ref{Int}(0)Base.RefValue{Int64}(0)julia> mutable struct DoAlloc               function DoAlloc()                   this = new()                       Core._add_finalizer(this, function(this)                               global total_deallocations[] += 1                       end)                       return this               end       endjulia> function foo()               for i = 1:1000                       DoAlloc()               end       endfoo (generic function with 1 method)julia> @code_llvm foo();  @ REPL[3]:1 within `foo`define void @julia_foo_111() #0 {top:  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:2 within `foo`  %0 = add i64 %.promoted, 1000;  @ REPL[3] 
within `foo`  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16;  @ REPL[3]:4 within `foo`  ret void}```
Keno added a commit that referenced this pull request May 25, 2022

This is a variant of the eager-finalization idea (e.g. as seen in #44056), but with a focus on the mechanism of finalizer insertion, since I need a similar pass downstream. Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular, I want to be able to insert a finalize call such that I can subsequently SROA the mutable object. This requires a couple of design points that are more stringent than the pass from #44056, so I decided to prototype them as an independent PR. The primary things I need here that are not seen in #44056 are:

- The ability to forgo finalizer registration with the runtime entirely (requires additional legality analysis)
- The ability to inline the registered finalizer at the deallocation point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin that is recognized by inference and inlining (such that inference can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted to have work, but does not yet extend to the motivating case I had. Nevertheless, I felt that this was a good checkpoint to synchronize with other efforts along these lines.

Currently working demo:

```
julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core._add_finalizer(this, function(this)
                   global total_deallocations[] += 1
               end)
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_111() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3] within `foo`
  store i64 %0, i64* inttoptr (i64 140370001753968 to i64*), align 16
;  @ REPL[3]:4 within `foo`
  ret void
}
```
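For readers unfamiliar with the idea, the effect of eager finalization can be approximated by hand with the public `finalizer` and `finalize` functions. The sketch below is only an illustration under stated assumptions: `foo_eager` is a hypothetical name, the explicit `finalize(obj)` call stands in for what the optimizer pass would insert on its own, and the public `finalizer` is used in place of the `Core._add_finalizer` builtin from the demo. It does not reproduce the inlining or SROA that the commit describes.

```julia
# Hand-written approximation of eager finalization; the real pass operates
# on compiler IR, not on source code like this.
const total_deallocations = Ref{Int}(0)

mutable struct DoAlloc
    function DoAlloc()
        this = new()
        # Public API counterpart of the `Core._add_finalizer` call in the demo.
        finalizer(this) do _
            total_deallocations[] += 1
        end
        return this
    end
end

# Hypothetical helper: `foo` with a `finalize` call written out at the point
# where each object's lifetime ends.
function foo_eager()
    for i = 1:1000
        obj = DoAlloc()
        finalize(obj)  # run the registered finalizer now, not at a later GC
    end
end

foo_eager()
println(total_deallocations[])
```

Because `finalize` runs an object's registered finalizers immediately, the counter is fully updated as soon as `foo_eager` returns, without waiting for a garbage collection.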
Keno added a commit that referenced this pull request May 29, 2022
Keno added a commit that referenced this pull request Jun 7, 2022
* Eager finalizer insertion
* rm redundant copy

Co-authored-by: Shuhei Kadowaki <40514306+aviatesk@users.noreply.github.com>
@jpsamaroo
Member, Author

Abandoned in favor of #45272

Reviewers: @aviatesk (awaiting requested review)
Assignees: No one assigned
Labels: compiler:optimizer (Optimization passes, mostly in base/compiler/ssair/), GC (Garbage collector)
Participants (7): @jpsamaroo, @yuyichao, @AriMKatz, @maleadt, @chflood, @oscardssmith, @aviatesk
