  1.  
  2. diff -r ed765f557481 pypy/doc/gc_info.rst
  3. --- a/pypy/doc/gc_info.rst Thu Feb 15 14:05:41 2018 +0100
  4. +++ b/pypy/doc/gc_info.rst Thu Feb 15 15:52:14 2018 +0100
  5. @@ -15,14 +15,41 @@
  6. processes) and cache sizes you might want to experiment with it via
  7. *PYPY_GC_NURSERY* environment variable. When the nursery is full, there is
  8. performed a minor collection. Freed objects are no longer referencable and
  9. -just die, without any effort, while surviving objects from the nursery
  10. -are copied to the old generation. Either to arenas, which are collections
  11. -of objects of the same size, or directly allocated with malloc if they're big
  12. -enough.
  13. +just die, simply by not being referenced any more; on the other hand, objects
  14. +found to still be alive must survive and are copied from the nursery
  15. +to the old generation: either into arenas, which are collections
  16. +of objects of the same size, or directly allocated with malloc if they're
  17. +larger. (A third category, the very large objects, is initially allocated
  18. +outside the nursery and never moves.)
  19. Since Incminimark is an incremental GC, the major collection is incremental,
  20. meaning there should not be any pauses longer than 1ms.
  21. +
  22. +Fragmentation
  23. +-------------
  24. +
  25. +Before we discuss issues of "fragmentation", we need a bit of precision.
  26. +There are two kinds of related but distinct issues:
  27. +
  28. +* If the program allocates a lot of memory, and then frees it all by
  29. + dropping all references to it, then we might expect the RSS
  30. + to drop. (RSS = Resident Set Size on Linux, as seen by "top"; it is an
  31. + approximation of the actual memory usage from the OS's point of view.)
  32. + This might not occur: the RSS may remain at its highest value. This
  33. + issue is more precisely caused by the process not returning "free"
  34. + memory to the OS. We call this case "unreturned memory".
  35. +
  36. +* After doing the above, if the RSS didn't go down, then at least future
  37. + allocations should not cause the RSS to grow more. That is, the process
  38. + should reuse unreturned memory as long as it has some left. If this
  39. + does not occur, the RSS grows even larger and we have real fragmentation
  40. + issues.
  41. +
  42. +
  43. +gc.get_stats
  44. +------------
  45. +
  46. There is a special function in the ``gc`` module called
  47. ``get_stats(memory_pressure=False)``.
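The two situations described under "Fragmentation" can be observed with a small experiment. The following is a minimal sketch, not part of the patch: the helper names `current_rss_bytes` and `rss_experiment` are made up for illustration, and the RSS reading assumes a Linux `/proc` filesystem (it returns `None` elsewhere). If "after_free" stays close to "after_alloc", that is unreturned memory; if "after_realloc" then grows well beyond "after_alloc", that would point to real fragmentation.

```python
import gc
import os
import platform


def current_rss_bytes():
    """Return the current RSS in bytes (Linux only), or None if unavailable."""
    try:
        with open("/proc/self/statm") as f:
            # Second field of statm is the resident set size, in pages.
            resident_pages = int(f.read().split()[1])
    except (OSError, IndexError, ValueError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def rss_experiment(n_objects=200000):
    """Allocate, free, and allocate again, sampling the RSS at each stage."""
    samples = {"start": current_rss_bytes()}

    data = [[i] for i in range(n_objects)]  # many small objects
    samples["after_alloc"] = current_rss_bytes()

    del data
    gc.collect()  # drop all references, then force a collection
    samples["after_free"] = current_rss_bytes()

    data = [[i] for i in range(n_objects)]  # should reuse unreturned memory
    samples["after_realloc"] = current_rss_bytes()
    del data
    return samples


if __name__ == "__main__":
    for stage, rss in rss_experiment().items():
        print(stage, rss)
    # On PyPy, print the GC's own view next to the RSS figures.
    if platform.python_implementation() == "PyPy":
        print(gc.get_stats())
```

The numbers are only indicative: RSS is sampled at coarse granularity, and other allocations in the process influence it as well.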
  48. @@ -56,19 +83,30 @@
  49. In this particular case, which is just at startup, GC consumes relatively
  50. little memory and there is even less unused, but allocated memory. In case
  51. -there is a high memory fragmentation, the "allocated" can be much higher
  52. -than "used". Generally speaking, "peak" will more resemble the actual
  53. -memory consumed as reported by RSS, since returning memory to the OS is a hard
  54. -and not solved problem.
  55. +there is a lot of unreturned memory or actual fragmentation, the "allocated"
  56. +can be much higher than "used". Generally speaking, "peak" will more closely
  57. +resemble the actual memory consumed as reported by RSS. Indeed, returning
  58. +memory to the OS is a hard and unsolved problem. In PyPy, it occurs only if
  59. +an arena is entirely free---a contiguous block of 64 pages of 4 or 8 KB each.
  60. +It is also rare for the "rawmalloced" category, at least for common system
  61. +implementations of ``malloc()``.
  62. The details of various fields:
  63. -* GC in arenas - small old objects held in arenas. If the amount of allocated
  64. - is much higher than the amount of used, we have large fragmentation issue
  65. +* GC in arenas - small old objects held in arenas. If the amount "allocated"
  66. + is much higher than the amount "used", we have unreturned memory. It is
  67. + possible but unlikely that we have internal fragmentation here. However,
  68. + this unreturned memory cannot be reused for any ``malloc()``, including the
  69. + memory from the "rawmalloced" section.
  70. -* GC rawmalloced - large objects allocated with malloc. If this does not
  71. - correspond to the amount of RSS very well, consider using jemalloc as opposed
  72. - to system malloc
  73. +* GC rawmalloced - large objects allocated with malloc. This gives the
  74. + current (first block of text) and peak (second block of text) memory
  75. + allocated with ``malloc()``. The amount of unreturned memory or
  76. + fragmentation caused by ``malloc()`` cannot easily be reported. Usually
  77. + you can guess there is some if the RSS is much larger than the total
  78. + memory reported for "GC allocated", but do keep in mind that this total
  79. + does not include malloc'ed memory not known to PyPy's GC at all. If you
  80. + guess there is some, consider using jemalloc as opposed to system malloc.
  81. * nursery - amount of memory allocated for nursery, fixed at startup,
  82. controlled via an environment variable
  83. @@ -91,7 +129,7 @@
  84. ``PYPY_GC_NURSERY``
  85. The nursery size.
  86. - Defaults to 1/2 of your cache or ``4M``.
  87. + Defaults to 1/2 of your last-level cache, or ``4M`` if unknown.
  88. Small values (like 1 or 1KB) are useful for debugging.
  89. ``PYPY_GC_NURSERY_DEBUG``
  90.
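To make the ``PYPY_GC_NURSERY`` default rule concrete, here is a rough sketch of how a size string such as ``4M`` or ``1KB`` could map to a byte count, and how the documented default (half of the last-level cache, or ``4M`` if unknown) would be computed. Both `parse_size` and `default_nursery_size` are hypothetical helpers written for this illustration; they are not PyPy functions, and the use of binary multiples (1K = 1024 bytes) is an assumption.

```python
# Assumed binary multiples; PyPy's own parsing may differ in details.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert strings like '1', '1KB' or '4M' to a number of bytes."""
    s = text.strip().upper()
    if len(s) > 1 and s.endswith("B"):
        s = s[:-1]  # accept 'KB' as well as 'K'
    unit = s[-1] if s and s[-1] in "KMG" else ""
    number = s[:-1] if unit else s
    return int(float(number) * _UNITS[unit])


def default_nursery_size(last_level_cache_bytes=None):
    """Half of the last-level cache, or 4M if the cache size is unknown."""
    if last_level_cache_bytes is None:
        return parse_size("4M")
    return last_level_cache_bytes // 2


if __name__ == "__main__":
    print(parse_size("1KB"))
    print(default_nursery_size(8 * 1024 * 1024))
    print(default_nursery_size(None))
```

To experiment with an explicit value, set the variable in the environment before starting the interpreter, for example ``PYPY_GC_NURSERY=4M``.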