slices: have Delete and others clear the tail #63393
from my engine room: the lack of zeroing behavior is exactly why I still need to keep my own code in many places, when I would really like to migrate to the stdlib here. |
I think this would be kind of weird because it returns a slice instead of taking a pointer, so it would be weird to do stuff out of band. Why not just add a new slices.ClearTail? func ClearTail[E any, S ~[]E](s S) {
clear(s[len(s):cap(s)])
} |
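For illustration, here is a self-contained sketch of the `ClearTail` helper proposed above (a hypothetical function, not part of the `slices` package), showing that it zeroes only the spare capacity between the length and the capacity:

```go
package main

import "fmt"

// ClearTail zeroes the elements between len(s) and cap(s),
// as sketched in the comment above. Requires Go 1.21+ for clear.
func ClearTail[E any, S ~[]E](s S) {
	clear(s[len(s):cap(s)])
}

func main() {
	backing := []int{1, 2, 3, 4}
	s := backing[:2:4] // len 2, cap 4
	ClearTail(s)
	fmt.Println(backing) // the two slots past len(s) are now zero
}
```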
Not really out of band, as Delete is expected to be destructive in-place. IMO the important consideration is that the tail is now "garbage for everyone", other slice headers pointing to the same memory should be regarded as invalidated. ClearTail is nice to have but has 2 drawbacks in the context of Delete:
|
That's persuasive. It's probably fine to change the behavior here since anything relying on the old behavior was broken. |
Could this proposal please also be applied to the exp/slices version, so that code following the N and N-1 support strategy will benefit as early as possible? |
Dropping an element is equivalent to With the current behavior (and suggested by the current documentation, really) problems can be resolved by dropping elements following |
For clarification, This means that, while a |
Same problem for |
@go101 thanks, I'll add Replace to the section "impacted API" |
I think this only really makes sense if you are not expecting to reuse the space cleared out. I think that in the cases where trailing values are a valid concern, something like #60136 or #63393 (comment) is a better solution. s = slices.DeleteFunc(s, filter)
s = append(s, stuff...)
slices.ClearTail(s) Or slices.ClearTail(slices.DeleteFunc(s, filter)) Assuming Go 2, IMO a better solution would be for reslicing a slice beyond its length to zero the "new" elements; then we could avoid zeroing when slicing down, and the GC would instead be able to inspect the |
That should be Following what @Deleplace said above, if you wanted to do the clear yourself, you would need to write, oldLen := len(s)
s = slices.DeleteFunc(s, filter)
clear(s[len(s):oldLen]) This way, you don't accidentally clear out some end capacity that doesn't belong to you. A question I have is whether there is a cheap way to test whether a slice element is a "reference type" (pointer, interface, map, channel, slice). Clearing the tail only makes sense as an optimization to tell the garbage collector that certain items are free, and that only makes sense when the items are associated with memory not in the slice (ie it's a reference type). Using reflect for this test would probably (but maybe not?) be too slow, but maybe there's some runtime/reflectlite system that could be used to make the test cheap enough. |
One tricky thing is that a struct in a slice only needs to be cleared if it contains reference types itself. So struct { int ; int } is fine, but struct { *int ; *int } needs to be cleared. |
I had the same thought about "no need to clear value elements", however introducing any difference in behaviour for
would violate the principle of least astonishment. Unexpected zeroing/not-zeroing behaviours lead to surprises and headaches. Also, half-zeroing a struct (just the pointer fields, not the number fields) would lead to corrupt state; that's dangerous. |
How expensive is a If that were true, I would assert that it seems fine for a relatively "high-level" function like those described here to pay that light cost, and that those who have more demanding performance needs can write their own non-clearing The potential for a hidden memory leak seems more concerning to me than the cost of zeroing some memory. |
The non-clearing behavior of |
That is a fair point, @AndrewHarrisSPU. Do you think it's common for folks to intentionally retain the "hidden" tail in a separate variable so they can perform cleanup afterwards? My intuition would be that if you're using The potential need to actually do something with the discarded portion makes me think about a new function: // (This name is terrible and just a placeholder)
func DeleteWithDiscard[S ~[]E, E any](s S, i, j int) (new S, discarded S) In this case I imagine that the first return value Or to say it another way, func Delete[S ~[]E, E any](s S, i, j int) S {
ret, toClear := DeleteWithDiscard(s, i, j)
clear(toClear)
return ret
} ...but those with more "interesting" cleanup needs could use |
If you are using slices.Delete on a slice of io.Closers that aren't closed, your code is broken, and you can't fix it by fishing around in the tail of a deleted slice for things to close. What remains in the tail will depend on exactly what in the slice gets deleted. In most cases, the tail will contain a duplicate value from the non-deleted portion of the slice. It may also contain a subset of the deleted values. It won't be consistent and shouldn't be relied on for correctness. |
I don't think that the use of slices.Delete is the cause of the broken code. The code is broken if the io.Closers are not properly closed. In fact, the use of slices.Delete makes it more evident during a review that the code is broken. |
I like this proposal. Failing to clear pointers can increase the asymptotic cost of an operation (in live heap) whereas clearing them increases the CPU cost by only a small constant factor over the existing work Delete must do. Users can easily write their own delete logic in the rare cases when the clearing loop is the hot spot, or when they don't want the clearing semantics. (But if you care so much about performance at the micro-level, why are you shuffling slices around with Delete in the first place?). |
Based on the discussion above, this proposal seems like a likely accept. In package slices, Delete changes to clear (zero) the elements s[len(s):oldlen], to prevent confusion and especially accidentally retained memory. |
I would like to point out, to ensure it doesn't go unnoticed, that the proposal intends to have the functions |
Indeed, 5 functions need to be modified. Shall I make a CL? |
No change in consensus, so accepted. 🎉 In package slices, Delete, DeleteFunc, Compact, CompactFunc, and Replace change to clear (zero) the elements s[len(s):oldlen], to prevent confusion and especially accidentally retained memory. |
@Deleplace please go ahead |
Change https://go.dev/cl/541477 mentions this issue: |
The functions' documentation will now be explicit about zeroing the elements between the original length and the new length. When the main CL is merged, I will make another CL to apply the same change to golang.org/x/exp/slices. |
Change https://go.dev/cl/543335 mentions this issue: |
…act, CompactFunc, Replace. Backport from stdlib: to avoid memory leaks in slices that contain pointers, clear the elements between the new length and the original length. Fixes golang/go#63393 Change-Id: I38bf64b27619d8067f2e95ce3c7952ec95ca55b8 Reviewed-on: https://go-review.googlesource.com/c/exp/+/543335 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Eli Bendersky <eliben@google.com>
Backport to golang.org/x/exp/slices : done |
A bit late to the party, but we actually observed that this change breaks some production code. Here is an example:
It seems that some Go code may hold a reference to the old slice. This seems to become worse. In Go 1.21, the following code is fine:
But in Go 1.22, it panics:
|
@changkun: All the changes you mention here are intended, or at least expected. |
@randall77 It would be hard to paste the relevant production code here, but the idea is well demonstrated in the above example code. We have seen more than one case where production services crashed due to this behavior change. The |
@changkun I think we need to see some realistic code to decide whether this is a serious issue or not. Generally it is not a great idea to have two slices of the same backing store and modify through one when the other does not expect modifications. For instance, in the first case you cite, instead of
That's just how slices work. There's no way |
@changkun indeed some tests were passing in Go 1.21 and will now fail in Go 1.22. This is expected and intended. We observed this internally at Google as well. A full blog post about this will be published this week on the Go blog. While it is certainly not "funny" to break user code, I do believe that only incorrect code will be broken and need fixing. I tried to articulate this in the section "List ownership" above: continuing to use old slice values is mostly a mistake in the first place. |
To clarify, nothing was stated to object to this change; the previous examples just put the consequences of this change on the table. Whether the code produced by an engineer is buggy or not is a separate discussion and judgment; at the least, there was nothing warning them not to do this. Users should continuously improve their testing infrastructure to catch such an outage at an early stage, but this also requires the Go release to spend additional effort communicating and informing users of the potential consequences. I believe I read about this proposal months back and failed to anticipate how it would impact user code as well. |
You are right that this is a user-visible change, and strictly speaking a backward-incompatible one. But as @randall77 pointed out, it's difficult to imagine an algorithm that would intentionally rely on the tail of the array being unmodified after a call to Delete, since this implies a very intimate and fragile coupling between the caller of Delete and the logic assuming the tail is unchanged. If I were to see such code during review, I would almost certainly call it out as a likely mistake or at least an opportunity for clarification. Could you post a reduced example of code that actually does this? |
The most likely circumstance I can think of is code that is working with subslices in a transparent manner such that it doesn't know that they're subslices of a larger slice. This is still inherently dangerous, though, as an incorrect I wonder if it would be possible to rig up a |
@DeedleFake could you make a small playground snippet about this? This may not 100% address your suggestion, but 2 positive notes:
|
Oh, so it does. For some reason, I was under the impression that it cleared the remaining capacity, though now that I think about it that really doesn't make sense. Never mind, then, and yeah, anything adversely affected by this behavior is almost definitely a bug anyways. |
Without digging into the exact business/product goal the relevant cases want to achieve, I don't directly see it being buggy, as it met the expectation for a long period, even if the code itself might be an unexpected use. The relevant code holds part of a slice, say |
Meeting the expectation doesn't exclude being buggy. Your vague description looks to me like a disaster waiting to happen all the time. Getting away with shoplifting also meets the business expectation, but still is unlawful. |
The code before must not have cared about the slice being sorted or even having particular elements in it. Maybe it's just doing best effort background processing of some kind. So then in that case, wouldn't a simple |
Proposal
a = slices.Delete(a, i, j)
should zero the (j-i) elements after the length of the returned slice.
TL;DR: zeroing the elements manually is error-prone; it should be taken care of by the stdlib function.
Chronology
2012 Go 1.0 released. It doesn’t have generics.
2022 Go 1.18 released. It has generics.
2022 The golang.org/x/exp/slices package has Delete.
2023 Delete is promoted to the new stdlib package slices.
Current implementation
slices.Delete achieves its job in-place by shifting the M = len(a)-j tail elements to the left, by copy. See its source.
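A simplified sketch of the pre-proposal shift-by-copy approach (not the actual stdlib source; the function name is made up here):

```go
package main

import "fmt"

// deleteNoClear mimics the pre-proposal behavior: shift the tail left
// and shrink the length, leaving stale duplicates past the new length.
func deleteNoClear[S ~[]E, E any](s S, i, j int) S {
	copy(s[i:], s[j:])      // move the len(s)-j tail elements left
	return s[:len(s)-(j-i)] // shrink the length; tail left untouched
}

func main() {
	a := []int{1, 2, 3, 4, 5}
	b := deleteNoClear(a, 1, 3)
	fmt.Println(b) // [1 4 5]
	fmt.Println(a) // [1 4 5 4 5]: stale duplicates remain in the backing array
}
```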
Problem: memory
If the elements have a reference type (pointer, map, slice, chan, string), or contain fields having a reference type, then (j-i) elements at the end of the original slice are left untouched and keep referencing some objects. They will not be garbage collected, and they may enjoy a lifetime far exceeding the intent of the developer. This is a memory leak.
Impact
Slices containing only value types are not impacted. Their memory footprint is the same, regardless of any zeroing.
Some programs will have a memory leak without noticing and without serious consequences, when the objects are few and small.
Some programs will have serious memory leaks, and find it difficult to properly diagnose. The consequences are crashes (OOM), increased GC pressure, overall bad performance, higher costs associated with more memory utilization and more server instances, and higher costs associated with debugging and fixing the leak.
Indirect impact
Some developers do want to address this and write zero values themselves (see documentation recommendation below). But they risk writing incorrect code, turning a mild performance concern into a serious correctness problem.
Partial solution: documentation
The documentation of slices.Delete explicitly states:
Problem: zeroing is difficult
In order to properly prevent a memory leak, the burden is on the developer to:
This is error-prone.
When dealing with pointer zeroing, the Delete API is both hard to use and easy to misuse.
Cognitive burden
Incorrect code may be written in good faith in a context where several non-trivial concepts must be properly understood by the developers.
correct utilization: the developers need to understand what a slice is and how slice headers work, in order to understand the API design “pass a slice as argument, get a slice as return value”
developers are implicitly expecting the stdlib to “do the right thing” (principle of least astonishment)
the perceived “right thing” is not universal: some want performance by default, some want correctness and safety by default
even for skilled developers, the list of conditions (listed in previous section) to properly address the problem will too often not be fully met
the need for such manual workarounds partially defeats the purpose of having a stdlib function.
Suggested implementation
Impacted API
Same changes to these 5 functions: Delete, DeleteFunc, Compact, CompactFunc, Replace.
Backward compatibility
Changing the behavior of slices.Delete is backward compatible. Its current documentation does not promise to zero or not zero the obsolete elements.
Cost
Runtime overhead of having slices.Delete zero the obsolete elements, per this benchmark:
When deleting 1 element: ~1% (this is a common use case)
When deleting a large range of elements: ~30%
The GC work of "collecting" obsolete objects when they become unreachable is expected to be very small, and is not specifically attributable to slices.Delete (just normal GC work).
Pointers vs. Values
The proposal is mostly useful for slices whose elements are pointers, or contain pointers.
Should we try to keep the existing behaviour for slices of value elements (e.g. int, or a struct of values)?
Probably not, for 2 reasons:
List ownership
We could argue that any use of a subslice s[i:j] always leaves elements not set to zero, before i and after j. If slicing doesn't do any zeroing, should Delete have to do it?
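The point that plain slicing never zeroes anything can be shown in a couple of lines:

```go
package main

import "fmt"

func main() {
	a := []int{10, 20, 30, 40, 50}
	sub := a[1:3] // plain slicing performs no zeroing anywhere
	fmt.Println(sub) // [20 30]
	// Elements before i and after j are untouched and still reachable
	// through the original slice header.
	fmt.Println(a[0], a[3], a[4]) // 10 40 50
}
```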
One major difference is that it makes sense to have various parts of a program use various slices (“views”, “windows”) on the same backing array. The parts need to be careful, as changes made by one may affect the other parts. Using append in such a context is subtle, with reallocation happening transparently. The ownership and responsibility are shared. An arbitrary number of agents that only read and slice, never mutating the backing array contents, is always fine.
Delete is different because it is destructive by nature. No part of the program is supposed to make sense of “a slice where some elements were shifted, and some obsolete elements were left in place” by some other code. Only the slice returned by Delete makes sense, and any other slice of the same backing array can be regarded as obsolete. The caller of Delete gets de facto ownership of the backing array.
What other languages do
Java
C#
C++
Edit: precision about a[len(a):cap(a)]
It is out of scope of the current proposal to write anything to the original tail (the elements v17, v18, v19 in the diagram). We must not assume ownership of the elements beyond the original length, and we must not zero them out. They might be used for a legit purpose by other parts of the program, in theory.