diff --git a/docs/keep_data_small.txt b/docs/keep_data_small.txt
index fcd8df4a9..72de2d1ef 100644
--- a/docs/keep_data_small.txt
+++ b/docs/keep_data_small.txt
@@ -2,44 +2,48 @@
 
 When many applets are compiled into busybox, all rw data and
 bss for each applet are concatenated. Including those from libc,
-if static bbox is built. When bbox is started, _all_ this data
+if static busybox is built. When busybox is started, _all_ this data
 is allocated, not just that one part for selected applet.
 
 What "allocated" exactly means, depends on arch.
-On nommu it's probably bites the most, actually using real
+On NOMMU it's probably bites the most, actually using real
 RAM for rwdata and bss. On i386, bss is lazily allocated
 by COWed zero pages. Not sure about rwdata - also COW?
 
-In order to keep bbox NOMMU and small-mem systems friendly
+In order to keep busybox NOMMU and small-mem systems friendly
 we should avoid large global data in our applets, and should
 minimize usage of libc functions which implicitly use
-such structures in libc.
+such structures.
 
-Small experiment measures "parasitic" bbox memory consumption.
-Here we start 1000 "busybox sleep 10" in parallel.
-bbox binary is practically allyesconfig static one,
-built against uclibc:
+Small experiment to measure "parasitic" bbox memory consumption:
+here we start 1000 "busybox sleep 10" in parallel.
+busybox binary is practically allyesconfig static one,
+built against uclibc. Run on x86-64 machine with 64-bit kernel:
 
-bash-3.2# nmeter '%t %c %b %m %p %[pn]'
-23:17:28 ..........    0    0 168M    0  147
-23:17:29 ..........    0    0 168M    0  147
-23:17:30 U.........    0    0 168M    1  147
-23:17:31 SU........    0 188k 181M  244  391
-23:17:32 SSSSUUU...    0    0 223M  757 1147
-23:17:33 UUU.......    0    0 223M    0 1147
-23:17:34 U.........    0    0 223M    1 1147
-23:17:35 ..........    0    0 223M    0 1147
-23:17:36 ..........    0    0 223M    0 1147
-23:17:37 S.........    0    0 223M    0 1147
-23:17:38 ..........    0    0 223M    1 1147
-23:17:39 ..........    0    0 223M    0 1147
-23:17:40 ..........    0    0 223M    0 1147
-23:17:41 ..........    0    0 210M    0  906
-23:17:42 ..........    0    0 168M    1  147
-23:17:43 ..........    0    0 168M    0  147
+bash-3.2# nmeter '%t %c %m %p %[pn]'
+23:17:28 .......... 168M    0  147
+23:17:29 .......... 168M    0  147
+23:17:30 U......... 168M    1  147
+23:17:31 SU........ 181M  244  391
+23:17:32 SSSSUUU... 223M  757 1147
+23:17:33 UUU....... 223M    0 1147
+23:17:34 U......... 223M    1 1147
+23:17:35 .......... 223M    0 1147
+23:17:36 .......... 223M    0 1147
+23:17:37 S......... 223M    0 1147
+23:17:38 .......... 223M    1 1147
+23:17:39 .......... 223M    0 1147
+23:17:40 .......... 223M    0 1147
+23:17:41 .......... 210M    0  906
+23:17:42 .......... 168M    1  147
+23:17:43 .......... 168M    0  147
 
 This requires 55M of memory. Thus 1 trivial busybox applet
-takes 55k of memory.
+takes 55k of memory on 64-bit x86 kernel.
+
+On 32-bit kernel we need ~26k per applet.
+
+(Data from NOMMU arches are sought. Provide 'size busybox' output too)
 
 
 		Example 1
@@ -104,8 +108,12 @@ its needs. Library functions are prohibited from using it.
 
 #define G (*(struct globals*)&bb_common_bufsiz1)
 
-Be careful, though, and use it only if
-sizeof(struct globals) <= sizeof(bb_common_bufsiz1).
+Be careful, though, and use it only if globals fit into bb_common_bufsiz1.
+Since bb_common_bufsiz1 is BUFSIZ + 1 bytes long and BUFSIZ can change
+from one libc to another, you have to add compile-time check for it:
+
+if(sizeof(struct globals) > sizeof(bb_common_bufsiz1))
+	BUG__globals_too_big();
 
 
 		Drawbacks
@@ -135,7 +143,7 @@ static int tabstop;
 static struct termios term_orig __attribute__ ((aligned (4)));
 static struct termios term_vi __attribute__ ((aligned (4)));
 
-reduced bss size by 32 bytes, because gcc sometimes aligns structures to
+reduces bss size by 32 bytes, because gcc sometimes aligns structures to
 ridiculously large values. asm output diff for above example:
 
 tabstop:
@@ -154,3 +162,15 @@ ridiculously large values. asm output diff for above example:
 	.size	term_vi, 60
 
 gcc doesn't seem to have options for altering this behaviour.
+
+gcc 3.4.3:
+// gcc aligns to 32 bytes if sizeof(struct) >= 32
+struct st {
+	int c_iflag,c_oflag,c_cflag,c_lflag;
+	int i1,i2,i3; // struct will be aligned to 4 bytes
+//	int i1,i2,i3,i4; // struct will be aligned to 32 bytes
+};
+struct st t = { 1 };
+// same for arrays
+char vc31[31] = { 1 }; // unaligned
+char vc32[32] = { 1 }; // aligned to 32 bytes