In languages like C or C++, the programmer is responsible fordynamic allocation and deallocation of memory on the heap. In C,this is done using the functionsmalloc() andfree(). In C++, the operatorsnew anddelete are used with essentially the same meaning andwe'll restrict the following discussion to the C case.
Every block of memory allocated withmalloc() shouldeventually be returned to the pool of available memory by exactly onecall tofree(). It is important to callfree() at the right time. If a block's address isforgotten butfree() is not called for it, the memory itoccupies cannot be reused until the program terminates. This iscalled amemory leak. On the other hand, if a program callsfree() for a block and then continues to use the block, itcreates a conflict with re-use of the block through anothermalloc() call. This is calledusing freed memory.It has the same bad consequences as referencing uninitialized data --core dumps, wrong results, mysterious crashes.
Common causes of memory leaks are unusual paths through the code. Forinstance, a function may allocate a block of memory, do somecalculation, and then free the block again. Now a change in therequirements for the function may add a test to the calculation thatdetects an error condition and can return prematurely from thefunction. It's easy to forget to free the allocated memory block whentaking this premature exit, especially when it is added later to thecode. Such leaks, once introduced, often go undetected for a longtime: the error exit is taken only in a small fraction of all calls,and most modern machines have plenty of virtual memory, so the leakonly becomes apparent in a long-running process that uses the leakingfunction frequently. Therefore, it's important to prevent leaks fromhappening by having a coding convention or strategy that minimizesthis kind of errors.
Since Python makes heavy use ofmalloc() andfree(), it needs a strategy to avoid memory leaks as wellas the use of freed memory. The chosen method is calledreference counting. The principle is simple: every objectcontains a counter, which is incremented when a reference to theobject is stored somewhere, and which is decremented when a referenceto it is deleted. When the counter reaches zero, the last referenceto the object has been deleted and the object is freed.
An alternative strategy is calledautomatic garbage collection.(Sometimes, reference counting is also referred to as a garbagecollection strategy, hence my use of ``automatic'' to distinguish thetwo.) The big advantage of automatic garbage collection is that theuser doesn't need to callfree() explicitly. (Another claimedadvantage is an improvement in speed or memory usage -- this is nohard fact however.) The disadvantage is that for C, there is notruly portable automatic garbage collector, while reference countingcan be implemented portably (as long as the functionsmalloc()andfree() are available -- which the C Standard guarantees).Maybe some day a sufficiently portable automatic garbage collectorwill be available for C. Until then, we'll have to live withreference counts.
While Python uses the traditional reference counting implementation,it also offers a cycle detector that works to detect referencecycles. This allows applications to not worry about creating director indirect circular references; these are the weakness of garbagecollection implemented using only reference counting. Referencecycles consist of objects which contain (possibly indirect) referencesto themselves, so that each object in the cycle has a reference countwhich is non-zero. Typical reference counting implementations are notable to reclaim the memory belonging to any objects in a referencecycle, or referenced from the objects in the cycle, even though thereare no further references to the cycle itself.
The cycle detector is able to detect garbage cycles and can reclaimthem so long as there are no finalizers implemented in Python(__del__() methods). When there are such finalizers, thedetector exposes the cycles through thegcmodule (specifically, thegarbagevariable in that module). Thegc module also exposes a wayto run the detector (thecollect() function), as well asconfiguration interfaces and the ability to disable the detector atruntime. The cycle detector is considered an optional component;though it is included by default, it can be disabled at build timeusing the--without-cycle-gc option to theconfigure script onUnix platforms (including Mac OS X)or by removing the definition ofWITH_CYCLE_GC in thepyconfig.h header on other platforms. If the cycle detector isdisabled in this way, thegc module will not be available.