
all: binaries too big and growing #6853

Open
@robpike

As an experiment, I built "hello, world" at the release points for Go 1.0,
1.1, and 1.2. Here are the binary sizes:

% ls -l x.1.?
-rwxr-xr-x  1 r  staff  1191952 Nov 30 10:25 x.1.0
-rwxr-xr-x  1 r  staff  1525936 Nov 30 10:20 x.1.1
-rwxr-xr-x  1 r  staff  2188576 Nov 30 10:18 x.1.2
% size x.1.?
__TEXT   __DATA    __OBJC  others  dec       hex
880640   33682096  0       4112    34566848  20f72c0  x.1.0
1064960  94656     0       75952   1235568   12da70   x.1.1
1429504  147896    0       177440  1754840   1ac6d8   x.1.2
% 

A near-doubling of the binary size in two releases is a bug of a kind. I will hold on to
the files so they can be analyzed more, but am filing this issue to get the topic
registered. We need to develop a better understanding of the problem and how to address
it.
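
For reference, the growth between releases can be worked out from the ls -l sizes above. This is only a small sketch of the arithmetic; the helper name growth is made up for illustration and is not part of any Go tool.

```go
package main

import "fmt"

// growth returns newSize/oldSize as a floating-point ratio.
// (Hypothetical helper, used only for this illustration.)
func growth(oldSize, newSize int64) float64 {
	return float64(newSize) / float64(oldSize)
}

func main() {
	// File sizes in bytes, copied from the ls -l listing above.
	sizes := []struct {
		release string
		bytes   int64
	}{
		{"1.0", 1191952},
		{"1.1", 1525936},
		{"1.2", 2188576},
	}
	base := sizes[0].bytes
	for _, s := range sizes {
		fmt.Printf("go %s: %8d bytes, %.2fx the 1.0 size\n",
			s.release, s.bytes, growth(base, s.bytes))
	}
}
```

The last line of output gives the ratio behind "near-doubling": the 1.2 binary is a bit over 1.8 times the size of the 1.0 binary.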

Marking this 1.3 (not maybe) because I consider it a priority.


A few months ago I exchanged mail with Russ about this topic regarding a different, much
larger binary. To avoid him having to redo the analysis, here is what he said at the
time:

====
i sent CL 13722046 to make the nm -S output a bit more useful.
for the toy binary i now get

  4a2280  1898528 D symtab
  26f3a0  1405936 D type.*
  671aa0  1058432 D pclntab
  3c6790   598056 D go.string.*
  4620c0    49600 D gcbss
  7a7c20    45496 B runtime.mheap
  46e280    21936 D gcdata
  7a29e0    21056 b bufferList
  1ed600    16480 T crypto/tls.(*Conn).clientHandshake
  79eb20    16064 b semtable
  1b3d90    14224 T net/http.init

that seems plausible to me. some notes:

symtab is the plan 9 symbol table. it is in the binary but never referenced at run time. it
supports things like nm -S only. it needs to move into an unmapped section of the
binary, but it is only costing at most 8k at run time right now due to fragmentation and
it just wasn't worth the effort to try to move. the new linker will make this easier. of
course, moving it in the file doesn't shrink the file.

the thing named pclntab is a reencoding of the original pclntab and the parts of the
plan 9 symbol table that we did need at run time (mostly just a list of functions and
their names and addresses). as you can see, it is much smaller than the old form (the
symbol table dominates).

type.* is the reflect types and go.string.* is the static go string data. the *
indicates that i coalesced many symbols into one, to avoid useless individual names
bloating the symbol table. if we tried we could probably cut the reflect types by 2-4x.
it would mean packing the data a bit more compactly than an ordinary go data structure
would and then using unsafe to get it back out.

gcbss and gcdata are garbage collection bits for the bss and data segments. that's what
atom symbol did, and it's not clear whether it will last (probably not) and whether what
will replace it will be smaller. time will tell. i have a meeting with dmitriy, carl,
and keith next week to figure out what the plan is.

runtime.mheap, bufferList, and semtable are bss.

you're not seeing the gdb dwarf debug information here, because it's not a runtime
symbol.
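
One way to summarize a listing like the nm -S output quoted above is to total the size column per symbol type. The following is a rough sketch, not part of the original analysis: it assumes each line has the shape "address size type name" with a decimal size, and the function name sizeByType is invented for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sizeByType sums the size column of "address size type name" lines,
// grouped by the one-letter symbol type. Lines that do not match that
// shape are skipped. (Hypothetical helper for illustration.)
func sizeByType(listing string) map[string]uint64 {
	totals := make(map[string]uint64)
	for _, line := range strings.Split(listing, "\n") {
		f := strings.Fields(line)
		if len(f) < 4 {
			continue
		}
		size, err := strconv.ParseUint(f[1], 10, 64)
		if err != nil {
			continue
		}
		totals[f[2]] += size
	}
	return totals
}

// listing is the nm -S excerpt quoted in the mail above.
const listing = `
  4a2280  1898528 D symtab
  26f3a0  1405936 D type.*
  671aa0  1058432 D pclntab
  3c6790   598056 D go.string.*
  4620c0    49600 D gcbss
  7a7c20    45496 B runtime.mheap
  46e280    21936 D gcdata
  7a29e0    21056 b bufferList
  1ed600    16480 T crypto/tls.(*Conn).clientHandshake
  79eb20    16064 b semtable
  1b3d90    14224 T net/http.init
`

func main() {
	totals := sizeByType(listing)
	// Print in a fixed order so the output does not depend on map iteration.
	for _, t := range []string{"D", "B", "b", "T"} {
		fmt.Printf("%s %8d bytes\n", t, totals[t])
	}
}
```

For this excerpt the D (data) symbols dominate, which matches the notes above: symtab, type.*, pclntab, and go.string.* account for almost all of the listed bytes.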
 
g% otool -l $(which toy) | egrep '^  segname|filesize'
  segname __PAGEZERO
 filesize 0
  segname __TEXT
 filesize 7811072
  segname __DATA
 filesize 126560
  segname __LINKEDIT
 filesize 921772
  segname __DWARF
 filesize 2886943
g% 

there's another 3 MB. you can build with -ldflags -w to get rid of that at least.
if you read the full otool -l output you will find

Load command 6
     cmd LC_SYMTAB
 cmdsize 24
  symoff 10825728
   nsyms 22559
  stroff 11186924
 strsize 560576

looks like another 1 MB or so (560576+11186924-10825728 or 22559*16+560576) for the
mach-o symbol table.
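
The two estimates in the parenthesis can be checked with a few lines of Go. This is just a sketch of the arithmetic using the LC_SYMTAB values quoted above; the type and method names are invented for the example, and the 16 is the per-entry size used in the mail (one 64-bit Mach-O symbol table entry).

```go
package main

import "fmt"

// symtabCmd holds the LC_SYMTAB fields quoted above.
// (Hypothetical type for illustration; not the debug/macho API.)
type symtabCmd struct {
	symoff, nsyms, stroff, strsize uint64
}

// entrySize is the size in bytes of one symbol table entry, as used in the mail.
const entrySize = 16

// spanBytes measures from the first symbol entry to the end of the string table.
func (c symtabCmd) spanBytes() uint64 {
	return c.stroff + c.strsize - c.symoff
}

// payloadBytes counts only the symbol entries plus the string table.
func (c symtabCmd) payloadBytes() uint64 {
	return c.nsyms*entrySize + c.strsize
}

func main() {
	c := symtabCmd{symoff: 10825728, nsyms: 22559, stroff: 11186924, strsize: 560576}
	fmt.Println("span:   ", c.spanBytes())    // 921772
	fmt.Println("payload:", c.payloadBytes()) // 921520
}
```

The first figure equals the __LINKEDIT filesize reported by otool above (921772); the second is 252 bytes smaller because it counts only the entries and the strings.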

when we do the new linker we can make recording this kind of information in a useful
form a priority.


Labels: NeedsFix, binary-size, umbrella
