As an experiment, I built "hello, world" at the release points for Go 1.0, 1.1, and 1.2. Here are the binaries' sizes:

    % ls -l x.1.?
    -rwxr-xr-x  1 r  staff  1191952 Nov 30 10:25 x.1.0
    -rwxr-xr-x  1 r  staff  1525936 Nov 30 10:20 x.1.1
    -rwxr-xr-x  1 r  staff  2188576 Nov 30 10:18 x.1.2
    % size x.1.?
    __TEXT   __DATA    __OBJC  others  dec       hex
    880640   33682096  0       4112    34566848  20f72c0  x.1.0
    1064960  94656     0       75952   1235568   12da70   x.1.1
    1429504  147896    0       177440  1754840   1ac6d8   x.1.2
    %

A near-doubling of the binary size in two releases is a bug of a kind. I will hold on to the files so they can be analyzed more, but am filing this issue to get the topic registered. We need to develop a better understanding of the problem and how to address it.

Marking this 1.3 (not maybe) because I consider it a priority.

A few months ago I exchanged mail with Russ about this topic regarding a different, much larger binary. To avoid him having to redo the analysis, here is what he said at the time:

====

i sent CL 13722046 to make the nm -S output a bit more useful. for the toy binary i now get

    4a2280  1898528 D symtab
    26f3a0  1405936 D type.*
    671aa0  1058432 D pclntab
    3c6790   598056 D go.string.*
    4620c0    49600 D gcbss
    7a7c20    45496 B runtime.mheap
    46e280    21936 D gcdata
    7a29e0    21056 b bufferList
    1ed600    16480 T crypto/tls.(*Conn).clientHandshake
    79eb20    16064 b semtable
    1b3d90    14224 T net/http.init

that seems plausible to me. some notes:

symtab is the plan 9 symbol table. it is in the binary but never referenced at run time. it supports things like nm -S only. it needs to move into an unmapped section of the binary, but it is only costing at most 8k at run time right now due to fragmentation and it just wasn't worth the effort to try to move. the new linker will make this easier. of course, moving it in the file doesn't shrink the file.

the thing named pclntab is a reencoding of the original pclntab and the parts of the plan 9 symbol table that we did need at run time (mostly just a list of functions and their names and addresses). as you can see, it is much smaller than the old form (the symbol table dominates).

type.* is the reflect types and go.string.* is the static go string data. the * indicates that i coalesced many symbols into one, to avoid useless individual names bloating the symbol table. if we tried we could probably cut the reflect types by 2-4x. it would mean packing the data a bit more compactly than an ordinary go data structure would and then using unsafe to get it back out.

gcbss and gcdata are garbage collection bits for the bss and data segments. that's what atom symbol did, and it's not clear whether it will last (probably not) and whether what will replace it will be smaller. time will tell. i have a meeting with dmitriy, carl, and keith next week to figure out what the plan is.

runtime.mheap, bufferList, and semtable are bss.

you're not seeing the gdb dwarf debug information here, because it's not a runtime symbol.

    g% otool -l $(which toy) | egrep '^ segname|filesize'
    segname __PAGEZERO
    filesize 0
    segname __TEXT
    filesize 7811072
    segname __DATA
    filesize 126560
    segname __LINKEDIT
    filesize 921772
    segname __DWARF
    filesize 2886943
    g%

there's another 3 MB. you can build with -ldflags -w to get rid of that at least. if you read the full otool -l output you will find

    Load command 6
    cmd LC_SYMTAB
    cmdsize 24
    symoff 10825728
    nsyms 22559
    stroff 11186924
    strsize 560576

looks like another 1 MB or so (560576+11186924-10825728 or 22559*16+560576) for the mach-o symbol table.

when we do the new linker we can make recording this kind of information in a useful form a priority.