Memory Handling and Garbage Collection in Python
Memory Handling
&
Garbage Collection in Python
Prepared by:
Ms. Pooja Mehta
ITSNS Branch,
GTU-CDAC-BISAG ME Program,
Gandhinagar
Memory Management
3 March 2016 By: Pooja Mehta
• The process of binding values to memory locations.
• Whether done automatically (as in Python) or partially by the programmer (as in C/C++), dynamic memory management is an important part of programming language design.
Three Categories of Memory
(for Data Storage)
• Static: storage requirements are known prior to run time; lifetime is the entire program execution.
• Run-time stack: memory associated with active functions.
    • Structured as stack frames.
• Heap: dynamically allocated storage; the least organized and most dynamic storage area.
Memory Leaks and Garbage
Collection
• Three types of storage:
    • Static
    • Stack
    • Heap
• Problems with heap storage:
    • Memory leaks (garbage): failure to free storage when pointers (references) are reassigned.
    • Dangling pointers: storage is freed, but references to the storage still exist.
Garbage
• Any block of heap memory that cannot be accessed by the program (i.e., no reference to the block is reachable from the stack or other live storage) but which the runtime system still treats as in use.
• Garbage is created in several ways:
    • A function ends without returning the space allocated to a local array or other dynamic variable.
    • A node is deleted from a linked data structure, but isn’t freed.
Garbage Collection
• All inaccessible blocks of storage are identified and returned to the free list.
• The heap may also be compacted at this time: allocated space is compressed into one end of the heap, leaving all free space in a large block at the other end.
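The compaction step described above can be sketched on a toy heap. This is only an illustration under stated assumptions: the heap is modeled as a Python list in which `None` marks a free slot, and the function name `compact` and the relocation table are hypothetical, not part of any real collector.

```python
def compact(heap):
    """Slide live blocks to one end of a toy heap.

    Returns the compacted heap and a table mapping each live block's
    old index to its new index (needed to fix up references).
    """
    live = [block for block in heap if block is not None]
    relocation = {}
    new_index = 0
    for old_index, block in enumerate(heap):
        if block is not None:
            relocation[old_index] = new_index
            new_index += 1
    free = [None] * (len(heap) - len(live))   # one large free block at the end
    return live + free, relocation


heap = ["A", None, "B", None, None, "C"]
compacted, moved = compact(heap)
print(compacted)   # ['A', 'B', 'C', None, None, None]
print(moved)       # {0: 0, 2: 1, 5: 2}
```

The relocation table hints at the cost of compaction: every reference to a moved block has to be updated to the block's new position.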
Implementing Automated
Garbage Collection
• If programmers were perfect, garbage collection wouldn’t be needed.
• There are three major approaches to automating the process:
    • Reference counting
    • Mark-sweep
    • Copy collection
Reference Counting
• Prior to Python version 2.0, the Python interpreter used only reference counting for memory management.
• Reference counting works by counting the number of times an object is referenced by other objects in the system.
• When a reference to an object is removed, the object's reference count is decremented.
• When the reference count becomes zero, the object is deallocated.
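A minimal sketch of these rules in CPython, assuming the standard `sys.getrefcount` and `weakref.ref` functions. The absolute counts reported by `sys.getrefcount` include temporary references and vary between versions, so only the differences are shown. The names `data`, `alias`, `Node`, and `probe` are illustrative.

```python
import sys
import weakref

data = []
base = sys.getrefcount(data)         # baseline count (includes temporaries)

alias = data                         # a second reference to the same list
after_bind = sys.getrefcount(data)

del alias                            # removing a reference decrements the count
after_del = sys.getrefcount(data)

print(after_bind - base)             # 1
print(after_del - base)              # 0


class Node:
    """Empty class used only because plain lists cannot be weakly referenced."""


obj = Node()
probe = weakref.ref(obj)             # a weak reference does not raise the count
del obj                              # last strong reference removed
print(probe() is None)               # True in CPython: the object was deallocated at once
```

The immediate deallocation in the last step is a property of CPython's reference counting; other Python implementations may reclaim the object later.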
Mark and Sweep
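• Reference counting tries to find unreachable objects by finding objects with no incoming references.
• It is imprecise because it records only how many references point to an object, not where they come from, so objects that only reference each other are never found.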
Cont..
• The root set is the set of memory locations in the program that are known to be reachable.
• Any objects reachable from the root set are reachable.
• Any objects not reachable from the root set are not reachable.
Cont..
Marking phase: Find reachable objects.
• Add the root set to a worklist.
• While the worklist isn't empty:
    • Remove an object from the worklist.
    • If it is not marked, mark it and add to the worklist all objects reachable from that object.
Sweeping phase: Reclaim free memory.
• For each allocated object:
    • If that object isn't marked, reclaim its memory.
    • If the object is marked, unmark it (so on the next mark-and-sweep iteration we have to mark it again).
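The two phases above can be sketched on a toy object graph. This is an illustration only, not how CPython's collector is implemented: the class `Obj` and the functions `mark` and `sweep` are hypothetical names, and "reclaiming" is modeled by dropping an object from the heap list.

```python
class Obj:
    """Toy heap object: a name, outgoing references, and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Marking phase: mark everything reachable from the root set."""
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if not obj.marked:
            obj.marked = True
            worklist.extend(obj.refs)


def sweep(heap):
    """Sweeping phase: keep marked objects (unmarking them), drop the rest."""
    survivors = []
    for obj in heap:
        if obj.marked:
            obj.marked = False        # reset for the next collection
            survivors.append(obj)
    return survivors


a, b, c, d = (Obj(name) for name in "abcd")
a.refs.append(b)                      # a -> b is reachable from the root
c.refs.append(d)
d.refs.append(c)                      # c <-> d is an unreachable cycle

heap = [a, b, c, d]
mark([a])                             # root set contains only a
heap = sweep(heap)
print([obj.name for obj in heap])     # ['a', 'b']
```

Unlike reference counting, the cycle between `c` and `d` is reclaimed, because neither object is reachable from the root set.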
Copy Collection
• Divide heap into two “semispaces”.
• Allocate from one space (fromspace) till full.
• Copy live data into other space (tospace).
• Switch roles of the spaces.
• Requires fixing pointers to moved data (forwarding).
• Eliminates fragmentation.
• DFS improves locality, while BFS does not require any extra storage.
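A minimal sketch of the breadth-first variant, assuming a toy model in which tospace is a Python list and each object carries a forwarding pointer. The names `Cell`, `copy_collect`, and `forward` are hypothetical; a real collector copies raw memory rather than Python objects.

```python
class Cell:
    """Toy heap cell with outgoing references and a forwarding pointer."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.forward = None           # set once the cell has been copied


def copy_collect(roots):
    """Copy everything reachable from roots into a fresh tospace (BFS)."""
    tospace = []

    def forward(cell):
        # Copy the cell on first visit; afterwards reuse the forwarding pointer.
        if cell.forward is None:
            clone = Cell(cell.name)
            clone.refs = list(cell.refs)   # still pointing into fromspace
            cell.forward = clone
            tospace.append(clone)
        return cell.forward

    new_roots = [forward(root) for root in roots]
    scan = 0
    while scan < len(tospace):        # tospace itself serves as the BFS queue
        cell = tospace[scan]
        cell.refs = [forward(target) for target in cell.refs]
        scan += 1
    return new_roots, tospace


x, y, z, junk = (Cell(name) for name in ("x", "y", "z", "junk"))
x.refs = [y, z]
y.refs = [z]
z.refs = [x]                          # cycle back to the root
junk.refs = [x]                       # junk is not reachable from the root

new_roots, tospace = copy_collect([x])
print([cell.name for cell in tospace])   # ['x', 'y', 'z']
```

Because the scan index walks the same list that copies are appended to, no separate queue or stack is needed, which is the sense in which BFS requires no extra storage. Live cells end up contiguous in tospace, and `junk` is simply never copied.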
References
• https://docs.python.org/2/c-api/memory.html
• http://foobarnbaz.com/2012/07/08/understanding-python-variables/
• https://www.digi.com/wiki/developer/index.php/Python_Garbage_Collection
Thank
you...!


Editor's Notes

  • #11 The user does not have to preallocate or deallocate memory by hand as one has to when using dynamic memory allocation in languages such as C or C++.
  • #12 The easiest way to create a reference cycle is to create an object which refers to itself, as in the example below:

        def make_cycle():
            l = []
            l.append(l)

        make_cycle()

    Reference counting is extremely efficient but it does have some caveats. One such caveat is that it cannot handle reference cycles. A reference cycle occurs when objects refer to each other (or to themselves), so their reference counts stay greater than zero even when the program has no way left to reach them.
  • #14 Problems: memory fragmentation; work proportional to size of heap; potential lack of locality of reference for newly allocated objects (fragmentation). Solution: compaction (contiguous live objects, contiguous free space).
  • #15 Automatic Garbage Collection of Cycles: Because reference cycles take computational work to discover, garbage collection must be a scheduled activity. Python schedules garbage collection based upon a threshold of object allocations and object deallocations. When the number of allocations minus the number of deallocations is greater than the threshold number, the garbage collector is run. One can inspect the threshold for new objects (known in Python as generation 0 objects) by loading the gc module and asking for the garbage collection thresholds. Because make_cycle() creates an object l which refers to itself, the object l will not automatically be freed when the function returns. The memory that l is using is therefore held until the Python garbage collector is invoked.
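A minimal sketch of the behaviour described in this note, using the standard `gc` module (`gc.get_threshold`, `gc.collect`, `gc.disable`, `gc.enable`, `gc.isenabled`). Automatic collection is switched off for a moment only so that the explicit `gc.collect()` call is the one that finds the cycle; the threshold values printed depend on the Python version.

```python
import gc


def make_cycle():
    l = []
    l.append(l)                  # the list refers to itself


print(gc.get_threshold())        # thresholds for generations 0, 1 and 2

was_enabled = gc.isenabled()
gc.disable()                     # keep the scheduled collector out of the way
gc.collect()                     # clear garbage left over from earlier code

make_cycle()                     # the cycle survives the function return
found = gc.collect()             # number of unreachable objects found
print(found >= 1)                # True: the self-referencing list was found

if was_enabled:
    gc.enable()                  # restore scheduled collection
```

Calling `gc.collect()` by hand is rarely needed in ordinary programs; it is used here only to make the collector's effect observable.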