runtime: smarter scavenging #30333

Closed
@mknyszek

Description

Proposal: Smarter Scavenging

Motivation & Purpose

Out-of-memory errors (OOMs) have been a pain point for Go applications. A class of these errors comes from the same underlying cause: a temporary spike in memory usage causes the Go runtime to grow the heap, but it takes a very long time (on the order of minutes) to return that unneeded memory to the system. In the meantime, the system can end up killing the application, for example if the system has no swap space or if a system monitor counts this space against your application. And even when the application survives, if this additional space is counted against it, you end up paying more for memory you don’t really need.

The Go runtime does have internal mechanisms to help deal with this, but they don’t react to changes in the application promptly enough. The way users solve this problem today is with debug.FreeOSMemory, a function in the runtime/debug package that performs a GC and then returns all unallocated memory back to the underlying system. However, this solution is very heavyweight (see the example after this list):

  • Returning all free memory back to the underlying system at once is expensive and can lead to latency spikes, because it holds the heap lock through the whole process.
  • It’s an invasive solution: you need to modify your code to call it when you need it.
  • Reusing scavenged memory becomes more expensive: on UNIX-y systems, the next access incurs an extra page fault, which is surprisingly expensive on some systems.
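
For concreteness, here is a minimal sketch of how applications apply this workaround today. processBatch is a hypothetical function, but debug.FreeOSMemory is the real runtime/debug API:

```go
package main

import "runtime/debug"

// processBatch is a hypothetical function whose large, temporary
// working set spikes the heap.
func processBatch(items [][]byte) {
	_ = items // ... allocate and process the batch ...

	// The workaround: force a full GC and return all unallocated memory
	// to the OS at once. Effective, but heavyweight for the reasons
	// listed above.
	debug.FreeOSMemory()
}

func main() {
	processBatch(make([][]byte, 1024))
}
```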

I believe the Go runtime should do better here by default, so that in most cases the scavenger is prompt enough and debug.FreeOSMemory is unnecessary.

See #16930 and #14045.

Background

Scavenging

Dynamic memory allocators typically obtain memory from the operating system by requesting that it be mapped into their virtual address space. Sometimes this space ends up unused, and modern operating systems provide a way to tell the OS that certain regions of virtual address space will not be needed, without unmapping them. This means the physical memory backing those regions may be taken back by the OS and used elsewhere. We in the Go runtime refer to this technique as “scavenging”.
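
For illustration, here is a minimal, Linux-only sketch of this mechanism using golang.org/x/sys/unix. The runtime does the equivalent internally (its sysUnused routine calls madvise on Linux); this standalone program only demonstrates the idea:

```go
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	const size = 1 << 20 // a 1 MiB anonymous mapping

	// Map memory the way an allocator would obtain it from the OS.
	mem, err := unix.Mmap(-1, 0, size,
		unix.PROT_READ|unix.PROT_WRITE,
		unix.MAP_ANON|unix.MAP_PRIVATE)
	if err != nil {
		panic(err)
	}
	// Touch each page so it is backed by physical memory (counted in RSS).
	for i := 0; i < size; i += 4096 {
		mem[i] = 1
	}

	// "Scavenge": tell the OS the physical pages are no longer needed,
	// without unmapping the virtual range. The next access to a page
	// will fault it back in as a zeroed page.
	if err := unix.Madvise(mem, unix.MADV_DONTNEED); err != nil {
		panic(err)
	}
	fmt.Println("region scavenged; the OS may reclaim its physical pages")
}
```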

Scavenging is especially useful in dealing with page-level external fragmentation, since we can give these fragments back to the OS, reducing the process’s resident set size (RSS), that is, the amount of the application’s address space that is backed by physical memory.

Go 1.11

As of Go 1.11, the only scavenging process in the Go runtime was a periodic scavenger that ran every 2.5 minutes. This scavenger combed over all the free spans in the heap and scavenged any that had been unused for at least 5 minutes. When the runtime coalesced spans, it tracked how much of the new span was scavenged.
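
A sketch of that policy, with hypothetical stand-ins for the runtime's internal span structures:

```go
package main

import "time"

const (
	scavengePeriod  = 150 * time.Second // the scavenger wakes every 2.5 minutes
	unusedThreshold = 5 * time.Minute   // spans idle at least this long are released
)

// span is a hypothetical stand-in for the runtime's free-span metadata.
type span struct {
	lastUsed  time.Time
	scavenged bool
}

func releaseToOS(s *span) { s.scavenged = true } // e.g. via madvise

// periodicScavenger mirrors the Go 1.11 policy: periodically comb the
// free spans and scavenge any unused for at least unusedThreshold.
func periodicScavenger(free []*span) {
	for range time.Tick(scavengePeriod) {
		now := time.Now()
		for _, s := range free {
			if !s.scavenged && now.Sub(s.lastUsed) >= unusedThreshold {
				releaseToOS(s)
			}
		}
	}
}

func main() {
	go periodicScavenger(nil) // in the runtime this runs as background work
	select {}
}
```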

While this simple technique is surprisingly effective for long-running applications, in many circumstances the peak RSS of an application can end up wildly exaggerated relative to its peak in-use memory: the periodic scavenger simply does not react quickly enough to changes in the application’s memory usage.

Go 1.12

As of Go 1.12, in addition to the periodic scavenger, the Go runtime also performs heap-growth scavenging: on each heap growth, up to N bytes of the largest free spans are scavenged, where N is the number of bytes the heap grew by. The idea is to “pay back” the cost of a heap growth. This technique helped reduce the peak RSS of some applications (#14045).
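
A hedged sketch of this policy; the span type and the slice-based ordering are illustrative, since the real runtime tracks free spans in a treap:

```go
package main

import (
	"fmt"
	"sort"
)

type span struct {
	size      uintptr // bytes
	scavenged bool
}

// scavengeOnGrowth releases up to grew bytes, largest spans first,
// "paying back" the heap growth that just occurred.
func scavengeOnGrowth(free []*span, grew uintptr) {
	sort.Slice(free, func(i, j int) bool { return free[i].size > free[j].size })
	var released uintptr
	for _, s := range free {
		if released >= grew {
			break
		}
		if !s.scavenged {
			s.scavenged = true // e.g. madvise(MADV_DONTNEED) in a real allocator
			released += s.size
		}
	}
}

func main() {
	free := []*span{{size: 1 << 20}, {size: 4 << 20}, {size: 2 << 20}}
	scavengeOnGrowth(free, 5<<20) // heap just grew by 5 MiB
	for _, s := range free {
		fmt.Printf("size=%d MiB scavenged=%v\n", s.size>>20, s.scavenged)
	}
}
```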

Goals

The goal in scavenging smarter is two-fold:

  • Reduce the average and peak RSS of Go applications.
  • Minimize the CPU impact of keeping the RSS low.

The two goals are in tension. On the one hand, we want to keep the RSS of the application as close to its in-use memory as possible. On the other hand, doing so is expensive in CPU time, since it means making syscalls and handling page faults. If we’re too aggressive and scavenge every free space we have, then every span allocation effectively incurs a hard page fault (or a syscall), and every span free incurs a syscall.

The ideal scenario, in my view, is that the RSS of the application “tracks” the peak in-use memory over time.

  • We should keep the RSS close to the actual in-use heap, but leave enough of a buffer such that the application has a pool of unscavenged memory to allocate from.
  • We should try to smooth over fast and transient changes in heap size.

The goal of this proposal is to improve the Go runtime’s scavenging mechanisms so that the runtime exhibits the behavior described above. Compared with today’s implementation, this behavior should reduce the average overall RSS of most Go applications with minimal impact on performance.

Proposal

Three questions represent the key policy decisions that describe a memory scavenging system.

  1. At what rate is memory scavenged?
  2. How much memory should we retain (not scavenge)?
  3. Which memory should we scavenge?

I propose that for the Go runtime, we:

  1. Scavenge at a rate proportional to the rate at which the application is allocating memory, in the same vein as proportional sweeping (see the pacing sketch after this list).
  2. Retain some constant times the peak heap goal over the last N GCs (same as #16930, “runtime: make the scavenger more prompt”).
  3. Scavenge the unscavenged spans with the highest base addresses first.
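
A rough sketch of how points 1 and 2 might fit together. Every name and constant here (gcWindow, retainFactor, the exact pacing rule) is a hypothetical illustration, not the proposed design itself:

```go
package main

import "fmt"

const (
	gcWindow     = 16  // N: how many recent GCs we track the heap goal over
	retainFactor = 1.1 // headroom multiplier over the peak heap goal
)

// scavengePacer is a hypothetical pacing structure.
type scavengePacer struct {
	recentGoals [gcWindow]uintptr // heap goal recorded at each of the last N GCs
	retained    uintptr           // bytes of heap currently left unscavenged
}

// retainGoal is how much memory we are allowed to keep unscavenged:
// a constant factor over the peak heap goal in the window.
func (p *scavengePacer) retainGoal() uintptr {
	var peak uintptr
	for _, g := range p.recentGoals {
		if g > peak {
			peak = g
		}
	}
	return uintptr(retainFactor * float64(peak))
}

// scavengeDebt returns how many bytes to scavenge in response to an
// allocation of allocated bytes: work is charged in proportion to the
// allocation's share of the heap goal, in the same vein as proportional
// sweeping, so the excess is paid off smoothly over roughly one cycle.
func (p *scavengePacer) scavengeDebt(allocated uintptr) uintptr {
	excess := int64(p.retained) - int64(p.retainGoal())
	if excess <= 0 {
		return 0
	}
	goal := p.recentGoals[gcWindow-1] // current cycle's heap goal
	if goal == 0 {
		return 0
	}
	return uintptr(float64(excess) * float64(allocated) / float64(goal))
}

func main() {
	p := &scavengePacer{retained: 96 << 20} // 96 MiB unscavenged
	p.recentGoals[gcWindow-1] = 64 << 20    // current heap goal: 64 MiB
	fmt.Println("bytes to scavenge:", p.scavengeDebt(1<<20))
}
```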

Additionally, I propose we change the span allocation policy to prefer unscavenged spans over scavenged spans, and to be first-fit rather than best-fit.
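
A minimal sketch of point 3 together with this allocation-policy change, assuming a simple address-sorted slice where the real runtime would use an efficient search structure:

```go
package main

import "fmt"

type span struct {
	base      uintptr
	size      uintptr
	scavenged bool
}

// alloc is first-fit by base address, preferring unscavenged spans so
// that allocation does not fault scavenged pages back in unnecessarily.
// free must be sorted by ascending base address.
func alloc(free []*span, size uintptr) *span {
	for _, wantScavenged := range []bool{false, true} {
		for _, s := range free {
			if s.scavenged == wantScavenged && s.size >= size {
				return s
			}
		}
	}
	return nil // no fit: the heap would grow here
}

// scavengeOne releases the unscavenged free span with the highest base
// address: under first-fit allocation, the memory least likely to be reused.
func scavengeOne(free []*span) {
	for i := len(free) - 1; i >= 0; i-- {
		if !free[i].scavenged {
			free[i].scavenged = true // e.g. madvise(MADV_DONTNEED)
			return
		}
	}
}

func main() {
	free := []*span{
		{base: 0x1000, size: 8192},
		{base: 0x8000, size: 4096},
	}
	scavengeOne(free) // releases the span at 0x8000
	s := alloc(free, 4096)
	fmt.Printf("allocated span at %#x (scavenged=%v)\n", s.base, s.scavenged)
}
```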

A brief rationale is that:

  1. Pacing scavenging against allocation gives us guarantees about when memory will be scavenged, and spreads the expense of scavenging operations across a GC cycle, avoiding latency spikes.
  2. By retaining the maximum heap goal over the last N GCs, we smooth over transient spikes in heap size. Retaining a constant factor above that gives us the headroom to avoid page faults.
  3. Best-fit allocation is difficult to glean any usage trends out of, whereas first-fit allocation trivially prefers lower addresses. First-fit allocation policies tend to perform just as well as best-fit and can be implemented efficiently. By scavenging higher addresses first under a first-fit allocation policy, we return the memory that is least likely to be reused.

A detailed rationale and design will be available very soon.
