-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
all: add opt-in transparent telemetry to Go toolchain #58894
Comments
This proposal has been added to the active column of the proposals project |
@rsc I didn't get a chance to ask this before the discussion was closed:
Why not assign every known string a number, and then use those numbers instead of sending strings? While sending the strings themselves is equivalent to the numbers, there's a "slippery slope" whiff I get from it. Publishing the known string index is another way to assure users that their privacy is being protected. There's an extra feeling of security knowing that only numbers are being sent. |
@willfaught I actually think it would be less transparent, not more. Because instead of seeing the actual human-readable information that is sent, you get a bunch of cryptic numbers that you have to interpret. Even if it was increasing a "feeling of privacy", there is a good argument to be made that giving such a feeling without actually increasing privacy (which this wouldn't do) is a bad thing. This would be privacy theater, if you will. Combine that with the fact that this would be significantly more churn and work - you either have the mapping be automatic, in which case it might completely re-number everything on every Go release, or you have to manually keep it up to date whenever something changes - and this seems an overall pretty bad idea. |
Thank you for: allowing anyone to participate in this this conversation So far I see a thoughtfully and polished proposal with user privacy as a fundamental guideline. I trust the spirit of the proposal that safeguards the balance between privacy and functionality is preserved and never broken. I'm making these comments as simple user who has chosen to use the language for work and personal use for many years now, not having any significant reason to doubt or question this choice. Grateful for the work being done and the direction where the language is going. |
Most of the data sent will be counters/numbers anyway. The format is binary, if I remember correctly, so it can be mmap'ed. This wouldn't affect the human readability of the data either way.
Avoiding a slippery slope isn't theater.
In my opinion, there isn't much widespread sympathy for having to do more work to do the right, best thing in terms of privacy. This change in direction happened because the community insisted that the Go team do the harder thing. Russ acknowledged that this change will require more work:
So I don't think it's necessarily relevant how hard it is, not without a careful tradeoff analysis. And I doubt it would entail as much hassle as you make it out to be, once a process is in place. For instance, the index can be immutable, and added to over time and releases with an AST scan. There may indeed be a good reason not to do this, but I'm not convinced that this is one of them. |
Hi @willfaught, there is a binary file used for performance reasons, but as I understand it, the uploaded data is purposefully in human-readable JSON. From the design post:
If the Go toolchain was to be underhanded about what it is doing, information could in theory be smuggled in a series of words or a series of numbers. To reduce the chance of the Go toolchain doing something underhanded here, including regarding how it handles strings, I suspect someone opting in will need to rely at least in part on the Go toolchain being open source (and either look at the code themselves, or rely on the likelihood of others doing so, the reproducibility of toolchain builds, the code review process, etc.). If someone crosses that threshold of trust and is willing to choose to opt in, I tend to agree with @Merovius that plain text strings are easier to understand and help with transparency more than a series of numbers. |
From https://research.swtch.com/telemetry-design:
I think this is saying the counter files are locally compact, but emitted as JSON. In the original thread someone made the point that plaintext is a good clue to someone who is sniffing traffic what's going on, that made sense to me. In a wonky way, a set of plaintext strings is already a set of unique numbers in this kind of scenario. |
I see. It's hard to judge the readability of I think the bigger concern—bigger than the malicious sending of personal data strings—is the accidental sending of personal data strings. Nobody wants "home_video_of_[embarrassing thing].mov" or "[embarrassing thing]website.com/go-foo/bar" accidentally sent to the Go team. And perhaps no one will file an issue to fix it either, out of embarrassment, when the problem is discovered. Not sending any strings at all rules out that possibility entirely. |
By design only strings that are explicitly requested by the telemetry server will be sent to the telemetry server. Of course we can't stop anyone from modifying the Go tools to send other strings, but it hardly seems likely to happen by accident. |
@willfaught publishing numbers instead of strings is a false sense of security. It accomplishes nothing. Who does it protect from when the strings are public and can be consulted anytime by anyone? Users would be more comfortable reading and understating right there what is being sent instead of cross-referencing that is a burden for the user an no burden for any would-be malicious actors. |
Note that one class of strings are counter names and the counter names might be a full stack trace (function names and line number offsets). The space is large. So, no, I don't think I'm overstating the hassle. And, again, I think this actually worsens the privacy and transparency of the design and makes it easier to abuse, if anything. I agree that doing extra work might be worth it, if it improved privacy - but in this case, it's a lose-lose, you'd have to do extra work to make things worse. |
Ah, I see, then the string space might indeed be very large. Makes sense. |
For those who might be interested, if users will be able to inspect the reports:
(From https://research.swtch.com/telemetry-design#counting.)
|
It's not |
Have all concerns about this proposal been addressed? |
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. 🎉 |
Change https://go.dev/cl/562715 mentions this issue: |
Change https://go.dev/cl/562735 mentions this issue: |
This change was produced with go get golang.org/x/telemetry@latest go mod tidy go mod vendor For #65586,#58894 Change-Id: I631a424ebb726fb0999d4b5d1e6e7a288b475344 Cq-Include-Trybots: luci.golang.try:gotip-windows-386,gotip-windows-amd64-longtest,gotip-solaris-amd64,gotip-openbsd-amd64,gotip-wasip1-wasm_wazero,gotip-js-wasm Reviewed-on: https://go-review.googlesource.com/c/go/+/562715 TryBot-Bypass: Michael Matloob <matloob@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Michael Matloob <matloob@golang.org> Commit-Queue: Michael Matloob <matloob@golang.org>
For #58894,#65586 This is a revert of CL 560655 which was a revert of CL 559519. CL 559519 was reverted because it was broken on windows/386. But now CL 562715 pulls in x/telemetry CL 560462 which disables telemetry on windows/386, fixing that issue. Change-Id: I094e90c28bca02f2303807d3b008f2ef9d59433c Reviewed-on: https://go-review.googlesource.com/c/go/+/562735 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Michael Matloob <matloob@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change https://go.dev/cl/564555 mentions this issue: |
This change vendors golang.org/x/telemetry and calls counter.Open in cmd/go right at the beginning. For golang#58894 Change-Id: Iae56328440614b213c1429972e6f68f22c2112cd Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-windows-amd64-longtest,gotip-linux-amd64-longtest-race Reviewed-on: https://go-review.googlesource.com/c/go/+/559199 Reviewed-by: Bryan Mills <bcmills@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This brings in CL 559505 which adds a stub for counter.CountFlags so it can be depended on and still build on Go 1.18 and earlier. This will allow the go command to use counter.CountFlags and still be able to build as the bootstrap command with an earlier version of Go. For golang#58894 Change-Id: I31d5b96bd47eef2e407ef97e6146adece403f2c0 Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/559795 Reviewed-by: Bryan Mills <bcmills@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
For golang#58894 Change-Id: I6b5d5b14be9858f5855eeac0110aa44e762cee03 Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/559519 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Bryan Mills <bcmills@google.com>
This change was produced with go get golang.org/x/telemetry@latest go mod tidy go mod vendor For golang#65586,golang#58894 Change-Id: I631a424ebb726fb0999d4b5d1e6e7a288b475344 Cq-Include-Trybots: luci.golang.try:gotip-windows-386,gotip-windows-amd64-longtest,gotip-solaris-amd64,gotip-openbsd-amd64,gotip-wasip1-wasm_wazero,gotip-js-wasm Reviewed-on: https://go-review.googlesource.com/c/go/+/562715 TryBot-Bypass: Michael Matloob <matloob@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Michael Matloob <matloob@golang.org> Commit-Queue: Michael Matloob <matloob@golang.org>
For golang#58894,golang#65586 This is a revert of CL 560655 which was a revert of CL 559519. CL 559519 was reverted because it was broken on windows/386. But now CL 562715 pulls in x/telemetry CL 560462 which disables telemetry on windows/386, fixing that issue. Change-Id: I094e90c28bca02f2303807d3b008f2ef9d59433c Reviewed-on: https://go-review.googlesource.com/c/go/+/562735 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Michael Matloob <matloob@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change https://go.dev/cl/568257 mentions this issue: |
Maintain a list of counters we collect and test that it hasn't changed. If it has, fail a test and have the user update the list. The update process will print a reminder to update the list of collected counters. Also run go mod vendor to pull in golang.org/x/telemetry/counter/countertest. For #58894 Change-Id: I661a9c3d67cb33f42a5519f4639af7aa05c3821d Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/564555 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Robert Findley <rfindley@google.com>
Change https://go.dev/cl/570679 mentions this issue: |
Change https://go.dev/cl/570736 mentions this issue: |
Primarily, this change removes the cmd/ prefix on the go command counter names. The 'error' counter is changed to 'errors' reflecting that it's a bucket that contains multiple errors. the switch-exec and select-exec counters are moved into a 'toolchain' grouping. For #58894 Change-Id: Id6e0e7a0b4a5e42a0aef04b1210d2bb5256eb6c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/570736 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change https://go.dev/cl/582695 mentions this issue: |
This CL adds a wrapper for the golang.org/x/telemetry/counter.NewStack function so that it can be used by the compiler. Also add build constraints for compiler_bootstrap to build the stubs when we're bootstrapping the compiler. For #58894 Change-Id: Icdbdd7aa6d2a3f1147112739c6939e14414f5ee9 Cq-Include-Trybots: luci.golang.try:gotip-linux-arm64-longtest,gotip-windows-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/582695 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
Change https://go.dev/cl/584276 mentions this issue: |
Add cmd/internal/telemetry to cmd/dist's bootstrapDirs so it's built when bootstrapping the compiler. cmd/internal/telemetry is a wrapper arount telemetry functions that stubs out the functions when built in bootstrap mode to avoid dependencies on x/telemetry in bootstrap mode. Call telemetry.Start with an empty config to open the counter file, and increment a counter for when the command is invoked. After flags are parsed, increment a counter for each of the names of the flags that were passed in. The counter names will be compile/flag:<name> so for example we'll have compile/flag:e and compile/flag:E. In FatalfAt, increment a stack counter for internal errors. For #58894 Change-Id: Ia5a8a63aa43b2276641181626cbfbea7e4647faa Reviewed-on: https://go-review.googlesource.com/c/go/+/570679 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Add counters for invocations and provided flag names. For #58894 Change-Id: Ibd4eeca905d277879b601d95bab524fbced6a98b Reviewed-on: https://go-review.googlesource.com/c/go/+/584276 Reviewed-by: Than McIntosh <thanm@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Change https://go.dev/cl/585235 mentions this issue: |
This change modifies the commands in cmd to open counter files, increment invocations counters and to increment counters for the names of the flags that were passed in. cmd/pprof and cmd/vet are both wrappers around tools defined in other modules which do their own flag processing so we can't directly increment flag counters right after flags are parsed. For those two commands we wait to increment counters until after the programs have returned. cmd/dist is built with the bootstrap go so it can't depend on telemetry yet. We can add telemetry support to it once 1.23 is the minimum bootstrap version. For #58894 Change-Id: Ic7f6009992465e55c56ad4dc6451bcb1ca51374a Reviewed-on: https://go-review.googlesource.com/c/go/+/585235 Reviewed-by: Sam Thanawalla <samthanawalla@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change https://go.dev/cl/586138 mentions this issue: |
For #58894 Change-Id: Ia30a3a1a9c7b611f55701956c08caa967634cd5a Reviewed-on: https://go-review.googlesource.com/c/go/+/586138 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Sam Thanawalla <samthanawalla@google.com> Reviewed-by: Robert Findley <rfindley@google.com>
@matloob with the 1.23 release, is this done? |
Yes. |
In February I posted a series of blog posts defining Transparent Telemetry, and we had a lively discussion on #58409. In the original posts, the design was opt-out. Based on that discussion as well as private discussions with long-time contributors and users, I revised the design to be opt-in.
I propose that we add opt-in transparent telemetry to the Go toolchain as described in those posts, specifically “The Design of Transparent Telemetry.” Transparent Telemetry has the following key properties:
Please note, as described in the Why Telemetry? section of the intro post, that telemetry addresses a different kind of problem than bug reports and surveys. In particular, relying on bug reports is not sufficient to identify problems that don't obviously impact functionality, including performance problems, and surveys are not sufficient to identify the variety of usage and contexts where Go is used and which would inform prioritization of effort.
There is good reason to believe that with even tens of thousands of users opted in, we should be able to get helpful data. It will not be as complete as the opt-out system, but it should be good enough. As described in the Can We Still Make Good Decisions? section of the opt-in post, there will be certain biases in the data based on who is more likely to opt in. Once we have data, it would make sense to compare “technical demographics” like operating system and editor against the annual Go survey and Stack Overflow surveys. If there is significant skew, we could look into reweighting the data as standard polls do (https://en.wikipedia.org/wiki/Iterative_proportional_fitting).
For examples of the kinds of questions we'd use telemetry to answer and the kinds of decisions those answers would inform (but not decide directly), see “Use Cases for Transparent Telemetry”.
The text was updated successfully, but these errors were encountered: