Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikipediaThe Free Encyclopedia
Search

Garbage collection (computer science)

From Wikipedia, the free encyclopedia
Form of automatic memory management
This article is about garbage collection in memory management. For garbage collection in a solid-state drive, seeGarbage collection (SSD).

Stop-and-copy garbage collection in aLisp architecture:[1] Memory is divided intoworking andfree memory; new objects are allocated in the former. When it is full (depicted), garbage collection is performed: All data structures still in use are located by pointer tracing and copied into consecutive locations in free memory.
After that, the working memory contents is discarded in favor of the compressed copy, and the role ofworking andfree memory are exchanged (depicted).

Incomputer science,garbage collection (GC) is a form of automaticmemory management.[2] Thegarbage collector attempts to reclaim memory that was allocated by the program, but is no longer referenced; such memory is calledgarbage. Garbage collection was invented by American computer scientistJohn McCarthy around 1959 to simplify manual memory management inLisp.[3]

Garbage collection relieves the programmer from doingmanual memory management, where the programmer specifies what objects to de-allocate and return to the memory system and when to do so.[2] Other, similar techniques includestack allocation,region inference, and memory ownership, and combinations thereof. Garbage collection may take a significant proportion of a program's total processing time, and affectperformance as a result.

Resources other than memory, such asnetwork sockets, databasehandles,windows,file descriptors, and device descriptors, are not typically handled by garbage collection, but rather by othermethods (e.g.destructors). Some such methods de-allocate memory also.

Overview

[edit]
icon
This sectionneeds additional citations forverification. Please helpimprove this article byadding citations to reliable sources in this section. Unsourced material may be challenged and removed.(July 2014) (Learn how and when to remove this message)

Manyprogramming languages require garbage collection, either as part of thelanguage specification (e.g.,RPL,Java,C#,D,[4]Go, and mostscripting languages) or effectively for practical implementation (e.g., formal languages likelambda calculus).[5] These are said to begarbage-collected languages. Other languages, such asC andC++, were designed for use with manual memory management, but have garbage-collected implementations available. Some languages, likeAda,Modula-3, andC++/CLI, allow both garbage collection andmanual memory management to co-exist in the same application by using separateheaps for collected and manually managed objects. Still others, likeD, are garbage-collected but allow the user to manually delete objects or even disable garbage collection entirely when speed is required.[6]

Although many languages integrate GC into theircompiler andruntime system,post-hoc GC systems also exist, such asAutomatic Reference Counting (ARC). Some of thesepost-hoc GC systems do not require recompilation.[7]

Advantages

[edit]

GC frees the programmer from manually de-allocating memory. This helps avoid some kinds oferrors:[8]

  • Dangling pointers, which occur when a piece of memory is freed while there are stillpointers to it, and one of those pointers isdereferenced. By then the memory may have been reassigned to another use, with unpredictable results.[9]
  • Double free bugs, which occur when the program tries to free a region of memory that has already been freed, and perhaps already been allocated again.
  • Certain kinds ofmemory leaks, in which a program fails to free memory occupied by objects that have becomeunreachable, which can lead to memory exhaustion.[10]

Disadvantages

[edit]

GC uses computing resources to decide which memory to free. Therefore, the penalty for the convenience of not annotating object lifetime manually in the source code isoverhead, which can impair program performance.[11] A peer-reviewed paper from 2005 concluded that GC needs five times the memory to compensate for this overhead and to perform as fast as the same program using idealized explicit memory management. The comparison however is made to a program generated by inserting deallocation calls using anoracle, implemented by collecting traces from programs run under aprofiler, and the program is only correct for one particular execution of the program.[12] Interaction withmemory hierarchy effects can make this overhead intolerable in circumstances that are hard to predict or to detect in routine testing. The impact on performance was given by Apple as a reason for not adopting garbage collection iniOS, despite it being the most desired feature.[13]

The moment when the garbage is actually collected can be unpredictable, resulting in stalls (pauses to shift/free memory) scattered throughout asession. Unpredictable stalls can be unacceptable inreal-time environments, intransaction processing, or in interactive programs. Incremental, concurrent, and real-time garbage collectors address these problems, with varying trade-offs.

Strategies

[edit]

Tracing

[edit]
Main article:Tracing garbage collection

Tracing garbage collection is the most common type of garbage collection, so much so that "garbage collection" often refers to tracing garbage collection, rather than other methods such asreference counting. The overall strategy consists of determining which objects should be garbage collected by tracing which objects arereachable by a chain of references from certain root objects, and considering the rest as garbage and collecting them. However, there are a large number of algorithms used in implementation, with widely varying complexity and performance characteristics.

Reference counting

[edit]
Main article:Reference counting

Reference counting garbage collection is where each object has a count of the number of references to it. Garbage is identified by having a reference count of zero. An object's reference count is incremented when a reference to it is created and decremented when a reference is destroyed. When the count reaches zero, the object's memory is reclaimed.[14]

As with manual memory management, and unlike tracing garbage collection, reference counting guarantees that objects are destroyed as soon as their last reference is destroyed, and usually only accesses memory which is either inCPU caches, in objects to be freed, or directly pointed to by those, and thus tends to not have significant negative side effects on CPU cache andvirtual memory operation.

There are a number of disadvantages to reference counting; this can generally be solved or mitigated by more sophisticated algorithms:

Cycles
If two or more objects refer to each other, they can create a cycle whereby neither will be collected as their mutual references never let their reference counts become zero. Some garbage collection systems using reference counting (like the one inCPython) use specific cycle-detecting algorithms to deal with this issue.[15] Another strategy is to useweak references for the "backpointers" which create cycles. Under reference counting, a weak reference is similar to a weak reference under a tracing garbage collector. It is a special reference object whose existence does not increment the reference count of the referent object. Furthermore, a weak reference is safe in that when the referent object becomes garbage, any weak reference to itlapses, rather than being permitted to remain dangling, meaning that it turns into a predictable value, such as a null reference.
Space overhead (reference count)
Reference counting requires space to be allocated for each object to store its reference count. The count may be stored adjacent to the object's memory or in a side table somewhere else, but in either case, every single reference-counted object requires additional storage for its reference count. Memory space with the size of an unsigned pointer is commonly used for this task, meaning that 32 or 64 bits of reference count storage must be allocated for each object. On some systems, it may be possible to mitigate this overhead by using atagged pointer to store the reference count in unused areas of the object's memory. Often, an architecture does not actually allow programs to access the full range of memory addresses that could be stored in its native pointer size; a certain number of high bits in the address is either ignored or required to be zero. If an object reliably has a pointer at a certain location, the reference count can be stored in the unused bits of the pointer. For example, each object inObjective-C has a pointer to itsclass at the beginning of its memory; on theARM64 architecture usingiOS 7, 19 unused bits of this class pointer are used to store the object's reference count.[16][17]
Speed overhead (increment/decrement)
In naive implementations, each assignment of a reference and each reference falling out of scope often require modifications of one or more reference counters. However, in a common case when a reference is copied from an outer scope variable into an inner scope variable, such that the lifetime of the inner variable is bounded by the lifetime of the outer one, the reference incrementing can be eliminated. The outer variable "owns" the reference. In the programming language C++, this technique is readily implemented and demonstrated with the use ofconst references. Reference counting in C++ is usually implemented using "smart pointers"[18] whose constructors, destructors, and assignment operators manage the references. A smart pointer can be passed by reference to a function, which avoids the need to copy-construct a new smart pointer (which would increase the reference count on entry into the function and decrease it on exit). Instead, the function receives a reference to the smart pointer which is produced inexpensively. The Deutsch-Bobrow method of reference counting capitalizes on the fact that most reference count updates are in fact generated by references stored in local variables. It ignores these references, only counting references in the heap, but before an object with reference count zero can be deleted, the system must verify with a scan of the stack and register that no other reference to it still exists. A further substantial decrease in the overhead on counter updates can be obtained by update coalescing introduced by Levanoni andPetrank.[19][20] Consider a pointer that in a given interval of the execution is updated several times. It first points to an objectO1, then to an objectO2, and so forth until at the end of the interval it points to some objectOn. A reference counting algorithm would typically executerc(O1)--,rc(O2)++,rc(O2)--,rc(O3)++,rc(O3)--, ...,rc(On)++. But most of these updates are redundant. In order to have the reference count properly evaluated at the end of the interval it is enough to performrc(O1)-- andrc(On)++. Levanoni and Petrank measured an elimination of more than 99% of the counter updates in typical Java benchmarks.
Requires atomicity
When used in amultithreaded environment, these modifications (increment and decrement) may need to beatomic operations such ascompare-and-swap, at least for any objects which are shared, or potentially shared among multiple threads. Atomic operations are expensive on a multiprocessor, and even more expensive if they have to be emulated with software algorithms. It is possible to avoid this issue by adding per-thread or per-CPU reference counts and only accessing the global reference count when the local reference counts become or are no longer zero (or, alternatively, using a binary tree of reference counts, or even giving up deterministic destruction in exchange for not having a global reference count at all), but this adds significant memory overhead and thus tends to be only useful in special cases (it is used, for example, in the reference counting of Linux kernel modules). Update coalescing by Levanoni and Petrank[19][20] can be used to eliminate all atomic operations from the write-barrier. Counters are never updated by the program threads in the course of program execution. They are only modified by the collector which executes as a single additional thread with no synchronization. This method can be used as a stop-the-world mechanism for parallel programs, and also with a concurrent reference counting collector.
Not real-time
Naive implementations of reference counting do not generally provide real-time behavior, because any pointer assignment can potentially cause a number of objects bounded only by total allocated memory size to be recursively freed while the thread is unable to perform other work. It is possible to avoid this issue by delegating the freeing of unreferenced objects to other threads, at the cost of extra overhead.

Escape analysis

[edit]
Main article:Escape analysis

Escape analysis is a compile-time technique that can convertheap allocations tostack allocations, thereby reducing the amount of garbage collection to be done. This analysis determines whether an object allocated inside a function is accessible outside of it. If a function-local allocation is found to be accessible to another function or thread, the allocation is said to "escape" and cannot be done on the stack. Otherwise, the object may be allocated directly on the stack and released when the function returns, bypassing the heap and associated memory management costs.[21]

Availability

[edit]

Generally speaking,higher-level programming languages are more likely to have garbage collection as a standard feature. In some languages lacking built-in garbage collection, it can be added through a library, as with theBoehm garbage collector for C and C++.

Mostfunctional programming languages, such asML,Haskell, andAPL, have garbage collection built in.Lisp is especially notable as both the firstfunctional programming language and the first language to introduce garbage collection.[22]

Other dynamic languages, such asRuby andJulia (but notPerl 5 orPHP before version 5.3,[23] which both use reference counting),JavaScript andECMAScript also tend to use GC.Object-oriented programming languages such asSmalltalk,ooRexx,RPL andJava usually provide integrated garbage collection. Notable exceptions areC++ andDelphi, which havedestructors.

BASIC

[edit]

BASIC andLogo have often used garbage collection for variable-length data types, such as strings and lists, so as not to burden programmers with memory management details. On theAltair 8800, programs with many string variables and little string space could cause long pauses due to garbage collection.[24] Similarly theApplesoft BASIC interpreter's garbage collection algorithm repeatedly scans the string descriptors for the string having the highest address in order to compact it toward high memory, resulting inO(n2){\displaystyle O(n^{2})} performance[25] and pauses anywhere from a few seconds to a few minutes.[26] A replacement garbage collector for Applesoft BASIC byRandy Wigginton identifies a group of strings in every pass over the heap, reducing collection time dramatically.[27] BASIC.SYSTEM, released withProDOS in 1983, provides a windowing garbage collector for BASIC that is many times faster.[28]

C and C++

[edit]

C has never offered official support for garbage collection.C++ added garbage collection support inC++11 to the standard library, however this was removed inC++23 due to no compilers implementing support for the feature.[29] The features that were part of this were related to pointer safety.[30]

Although garbage collection support in the standard library was removed, some garbage collectors such asBoehm garbage collector (for C and C++) can still be used. Boehm GC usesmark-and-sweep garbage collection. It can also be used in leak detection mode, where memory management is still manual however leaks and double-free errors can be detected and reported. Its use can be called from the header<gc.h>.

One can still abstract away manual object destructions in C++ by using the "resource acquisition is initialization" (RAII) idiom andsmart pointers.std::unique_ptr ties lifetimes to ownership, whilestd::shared_ptr uses reference counting to determine lifetime.std::weak_ptr can be used to obtain a pointer without increasing the reference count. Unlike garbage collection, RAII is deterministic.

Objective-C

[edit]

While theObjective-C traditionally had no garbage collection, with the release ofOS X 10.5 in 2007Apple introduced garbage collection forObjective-C 2.0, using an in-house developed runtime collector.[31]However, with the 2012 release ofOS X 10.8, garbage collection was deprecated in favor ofLLVM'sautomatic reference counter (ARC) that was introduced withOS X 10.7.[32] Furthermore, since May 2015 Apple even forbade the usage of garbage collection for new OS X applications in theApp Store.[33][34] ForiOS, garbage collection has never been introduced due to problems in application responsivity and performance;[13][35] instead, iOS uses ARC.[36][37]

Limited environments

[edit]

Garbage collection is rarely used onembedded or real-time systems because of the usual need for very tight control over the use of limited resources. However, garbage collectors compatible with many limited environments have been developed.[38] The Microsoft.NET Micro Framework, .NET nanoFramework[39] andJava Platform, Micro Edition are embedded software platforms that, like their larger cousins, include garbage collection.

Java

[edit]
Main article:Java virtual machine § Garbage collectors

Garbage collectors available inJavaOpenJDKs virtual machine (JVM) include:

  • Serial
  • Parallel
  • CMS (Concurrent Mark Sweep)
  • G1 (Garbage-First)
  • ZGC (Z Garbage Collector)
  • Epsilon
  • Shenandoah
  • GenZGC (Generational ZGC)
  • GenShen (Generational Shenandoah)
  • IBM Metronome (only inIBM OpenJDK)
  • SAP (only inSAP OpenJDK)
  • Azul C4 (Continuously Concurrent Compacting Collector)[40] (only inAzul Systems OpenJDK)

Compile-time use

[edit]

Compile-time garbage collection is a form ofstatic analysis allowing memory to be reused and reclaimed based on invariants known during compilation.

This form of garbage collection has been studied in theMercury programming language,[41] and it saw greater usage with the introduction ofLLVM'sautomatic reference counter (ARC) into Apple's ecosystem (iOS and OS X) in 2011.[36][37][33]

Real-time systems

[edit]

Incremental, concurrent, and real-time garbage collectors have been developed, for example byHenry Baker and byHenry Lieberman.[42][43][44]

In Baker's algorithm, the allocation is done in either half of a single region of memory. When it becomes half full, a garbage collection is performed which moves the live objects into the other half and the remaining objects are implicitly deallocated. The running program (the 'mutator') has to check that any object it references is in the correct half, and if not move it across, while a background task is finding all of the objects.[45]

Generational garbage collection schemes are based on the empirical observation that most objects die young. In generational garbage collection, two or more allocation regions (generations) are kept, which are kept separate based on the object's age. New objects are created in the "young" generation that is regularly collected, and when a generation is full, the objects that are still referenced from older regions are copied into the next oldest generation. Occasionally a full scan is performed.

Somehigh-level language computer architectures include hardware support for real-time garbage collection.

Most implementations of real-time garbage collectors usetracing.[citation needed] Such real-time garbage collectors meethard real-time constraints when used with a real-time operating system.[46]

See also

[edit]

References

[edit]
  1. ^Abelson, Harold; Sussman, Gerald Jay; Sussman, Julie (2016).Structure and Interpretation of Computer Programs(PDF) (2nd ed.). Cambridge, Massachusetts, US:MIT Press. pp. 734–736.
  2. ^ab"What is garbage collection (GC) in programming?".Storage. Retrieved2024-06-21.
  3. ^McCarthy, John (1960)."Recursive functions of symbolic expressions and their computation by machine, Part I".Communications of the ACM.3 (4):184–195.doi:10.1145/367177.367199.S2CID 1489409. Retrieved2009-05-29.
  4. ^"Overview – D Programming Language".dlang.org. Digital Mars. Retrieved2014-07-29.
  5. ^Heller, Martin (2023-02-03)."What is garbage collection? Automated memory management for your programs".InfoWorld. Retrieved2024-06-21.
  6. ^"A Guide to Garbage Collection in Programming".freeCodeCamp.org. 2020-01-16. Retrieved2024-06-21.
  7. ^"Garbage Collection - D Programming Language".dlang.org. Retrieved2022-10-17.
  8. ^"Garbage Collection".rebelsky.cs.grinnell.edu. Retrieved2024-01-13.
  9. ^Heller, Martin (2023-02-03)."What is garbage collection? Automated memory management for your programs".InfoWorld. Retrieved2024-06-21.
  10. ^Microsoft (2023-02-28)."Fundamentals of garbage collection | Microsoft Learn". Retrieved2023-03-29.
  11. ^Zorn, Benjamin (1993-01-22). "The Measured Cost of Conservative Garbage Collection".Software: Practice and Experience.23 (7). Department of Computer Science,University of Colorado Boulder:733–756.CiteSeerX 10.1.1.14.1816.doi:10.1002/spe.4380230704.S2CID 16182444.
  12. ^Hertz, Matthew; Berger, Emery D. (2005)."Quantifying the Performance of Garbage Collection vs. Explicit Memory Management"(PDF).Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '05. pp. 313–326.doi:10.1145/1094811.1094836.ISBN 1-59593031-0.S2CID 6570650.Archived(PDF) from the original on 2012-04-02. Retrieved2015-03-15.
  13. ^ab"Developer Tools Kickoff – session 300"(PDF).WWDC 2011.Apple, Inc. 2011-06-24. Archived fromthe original(PDF) on 2023-09-04. Retrieved2015-03-27.
  14. ^Microsoft (2009-01-27)."Reference Counting Garbage Collection". Retrieved2023-03-29.
  15. ^"Reference Counts".Extending and Embedding the Python Interpreter. 2008-02-21. Retrieved2014-05-22.
  16. ^Ash, Mike."Friday Q&A 2013-09-27: ARM64 and You". mikeash.com. Retrieved2014-04-27.
  17. ^"Hamster Emporium: [objc explain]: Non-pointer isa". Sealiesoftware.com. 2013-09-24. Retrieved2014-04-27.
  18. ^Pibinger, Roland (2005-05-03) [2005-04-17]."RAII, Dynamic Objects, and Factories in C++".
  19. ^abLevanoni, Yossi;Petrank, Erez (2001)."An on-the-fly reference-counting garbage collector for java".Proceedings of the 16th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications.OOPSLA 2001. pp. 367–380.doi:10.1145/504282.504309.
  20. ^abLevanoni, Yossi;Petrank, Erez (2006)."An on-the-fly reference-counting garbage collector for java".ACM Trans. Program. Lang. Syst.28:31–69.CiteSeerX 10.1.1.15.9106.doi:10.1145/1111596.1111597.S2CID 14777709.
  21. ^Salagnac, Guillaume; Yovine, Sergio; Garbervetsky, Diego (2005-05-24)."Fast Escape Analysis for Region-based Memory Management".Electronic Notes in Theoretical Computer Science.131:99–110.doi:10.1016/j.entcs.2005.01.026.
  22. ^Chisnall, David (2011-01-12).Influential Programming Languages, Part 4: Lisp.
  23. ^"PHP: Performance Considerations".php.net. Retrieved2015-01-14.
  24. ^"Altair 8800 Basic 4.1 Reference Manual"(PDF).The Vintage Technology Digital Archive. April 1977. p. 108.Archived(PDF) from the original on 2021-06-29. Retrieved2021-06-29.
  25. ^"I did some work to speed up string garbage collection under Applesoft..."Hacker News. Retrieved2021-06-29.
  26. ^Little, Gary B. (1985).Inside the Apple IIc. Bowie, Md.: Brady Communications Co. p. 82.ISBN 0-89303-564-5. Retrieved2021-06-29.
  27. ^"Fast Garbage Collection".Call-A.P.P.L.E.:40–45. January 1981.
  28. ^Worth, Don (1984).Beneath Apple Pro DOS(PDF) (March 1985 printing ed.). Chatsworth, California, US: Quality Software. pp. 2–6.ISBN 0-912985-05-4.Archived(PDF) from the original on 2008-12-03. Retrieved2021-06-29.
  29. ^JF Bastien; Alisdair Meredith (2021-04-16)."Removing Garbage Collection Support".
  30. ^"std::pointer_safety - cppreference.com".en.cppreference.com. Retrieved2024-12-09.
  31. ^"Objective-C 2.0 Overview". Archived fromthe original on 2010-07-24.
  32. ^Siracusa, John (2011-07-20)."Mac OS X 10.7 Lion: the Ars Technica review".
  33. ^ab"Apple says Mac app makers must transition to ARC memory management by May".AppleInsider. 2015-02-20.
  34. ^Cichon, Waldemar (2015-02-21)."App Store: Apple entfernt Programme mit Garbage Collection".Heise.de. Retrieved2015-03-30.
  35. ^Silva, Precious (2014-11-18)."iOS 8 vs Android 5.0 Lollipop: Apple Kills Google with Memory Efficiency".International Business Times. Archived fromthe original on 2015-04-03. Retrieved2015-04-07.
  36. ^abNapier, Rob; Kumar, Mugunth (2012-11-20).iOS 6 Programming Pushing the Limit.John Wiley & Sons.ISBN 978-1-11844997-4. Retrieved2015-03-30.
  37. ^abCruz, José R. C. (2012-05-22)."Automatic Reference Counting on iOS".Dr. Dobbs. Archived fromthe original on 2020-05-16. Retrieved2015-03-30.
  38. ^Fu, Wei; Hauser, Carl (2005). "A real-time garbage collection framework for embedded systems".Proceedings of the 2005 Workshop on Software and Compilers for Embedded Systems - SCOPES '05. pp. 20–26.doi:10.1145/1140389.1140392.ISBN 1-59593207-0.S2CID 8635481.
  39. ^".NET nanoFramework".
  40. ^Tene, Gil; Iyengar, Balaji; Wolf, Michael (2011)."C4: the continuously concurrent compacting collector"(PDF).ISMM '11: Proceedings of the international symposium on Memory management.doi:10.1145/1993478.ISBN 978-1-45030263-0.Archived(PDF) from the original on 2017-08-09.
  41. ^Mazur, Nancy (May 2004).Compile-time garbage collection for the declarative language Mercury(PDF) (Thesis).Katholieke Universiteit Leuven.Archived(PDF) from the original on 2014-04-27.
  42. ^Huelsbergen, Lorenz; Winterbottom, Phil (1998)."Very concurrent mark-&-sweep garbage collection without fine-grain synchronization"(PDF).Proceedings of the First International Symposium on Memory Management - ISMM '98. pp. 166–175.doi:10.1145/286860.286878.ISBN 1-58113114-3.S2CID 14399427.Archived(PDF) from the original on 2008-05-13.
  43. ^"GC FAQ".
  44. ^Lieberman, Henry; Hewitt, Carl (1983)."A real-time garbage collector based on the lifetimes of objects".Communications of the ACM.26 (6):419–429.doi:10.1145/358141.358147.hdl:1721.1/6335.S2CID 14161480.
  45. ^Baker, Henry G. (1978). "List processing in real time on a serial computer".Communications of the ACM.21 (4):280–294.doi:10.1145/359460.359470.hdl:1721.1/41976.S2CID 17661259. see alsodescription
  46. ^McCloskey; Bacon; Cheng; Grove (2008),Staccato: A Parallel and Concurrent Real-time Compacting Garbage Collector for Multiprocessors(PDF),archived(PDF) from the original on 2014-03-11

Further reading

[edit]

External links

[edit]
The WikibookMemory Management has a page on the topic of:Garbage Collection
Hardware
Virtual memory
Segmentation
Allocator
Manual means
Garbage
collection
Safety
Issues
Other
Authority control databasesEdit this at Wikidata
Retrieved from "https://en.wikipedia.org/w/index.php?title=Garbage_collection_(computer_science)&oldid=1318980607"
Categories:
Hidden categories:

[8]ページ先頭

©2009-2025 Movatter.jp