libbb: better unicode width support. Hopefully fixes bug 839.

This also opens up the possibility of making other Unicode code smaller
and more correct later, but it costs size for now:

function                                             old     new   delta
static.combining                                       -     516    +516
bb_wcwidth                                             -     328    +328
unicode_cut_nchars                                     -     141    +141
mbstowc_internal                                       -      93     +93
in_table                                               -      78     +78
cal_main                                             899     961     +62
static.combining0x10000                                -      40     +40
unicode_strlen                                         -      31     +31
bb_mbstrlen                                           31       -     -31
bb_mbstowcs                                          173     102     -71
------------------------------------------------------------------------------
(add/remove: 7/1 grow/shrink: 1/1 up/down: 1289/-102)        Total: 1187 bytes

Uses code by Markus Kuhn, which is in the public domain:
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
"Permission to use, copy, modify, and distribute this software
 for any purpose and without fee is hereby granted. The author
 disclaims all warranties with regard to this software."

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Denys Vlasenko
2010-01-24 07:44:03 +01:00
parent 5da9f96ad8
commit 9f93d62192
10 changed files with 410 additions and 98 deletions
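
Note on the change: the hunks below pad a module name to 19 columns, so the length used must be a display width, not a byte count. The following is a minimal sketch of that idea, assuming UTF-8 input and a binary search over sorted interval tables in the spirit of Markus Kuhn's wcwidth.c. All names here (sketch_display_width, in_ranges, decode_utf8, sketch_print_padded) are hypothetical, the tables are a deliberately tiny subset, and this is not the libbb implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not the libbb code. */
struct interval { uint32_t first, last; };

/* Deliberately tiny tables, sorted and non-overlapping (assumption:
 * a real table covers far more ranges). */
static const struct interval zero_width[] = {
    { 0x0300, 0x036F },  /* combining diacritical marks */
    { 0x200B, 0x200F },  /* zero width space, joiners, direction marks */
};
static const struct interval double_width[] = {
    { 0x1100, 0x115F },  /* Hangul Jamo initial consonants */
    { 0x3040, 0x30FF },  /* Hiragana, Katakana */
    { 0x4E00, 0x9FFF },  /* CJK unified ideographs */
    { 0xAC00, 0xD7A3 },  /* Hangul syllables */
    { 0xFF01, 0xFF60 },  /* fullwidth forms */
};

/* Binary search: is c inside any interval of the sorted table t? */
static int in_ranges(uint32_t c, const struct interval *t, size_t n)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c < t[mid].first)
            hi = mid;
        else if (c > t[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* Decode one UTF-8 sequence. Returns bytes consumed, 0 at end of string.
 * A malformed sequence yields U+FFFD and consumes one byte.
 * Simplification: overlong encodings are not rejected. */
static size_t decode_utf8(const char *s, uint32_t *out)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t c = p[0];
    size_t need, i;

    if (c == 0)
        return 0;
    if (c < 0x80) { *out = c; return 1; }
    if ((c & 0xE0) == 0xC0)      { c &= 0x1F; need = 1; }
    else if ((c & 0xF0) == 0xE0) { c &= 0x0F; need = 2; }
    else if ((c & 0xF8) == 0xF0) { c &= 0x07; need = 3; }
    else { *out = 0xFFFD; return 1; }
    for (i = 1; i <= need; i++) {
        /* A NUL terminator fails this check, so we never read past it. */
        if ((p[i] & 0xC0) != 0x80) { *out = 0xFFFD; return 1; }
        c = (c << 6) | (p[i] & 0x3F);
    }
    *out = c;
    return need + 1;
}

/* Display width of a UTF-8 string in terminal columns.
 * Simplification: control characters count as zero columns. */
size_t sketch_display_width(const char *s)
{
    size_t width = 0, n;
    uint32_t c;

    assert(s != NULL);
    while ((n = decode_utf8(s, &c)) != 0) {
        s += n;
        if (c < 0x20 || (c >= 0x7F && c < 0xA0))
            continue;
        if (in_ranges(c, zero_width, sizeof(zero_width) / sizeof(zero_width[0])))
            continue;
        width += in_ranges(c, double_width,
                           sizeof(double_width) / sizeof(double_width[0])) ? 2 : 1;
    }
    return width;
}

/* Pad a name to `cols` columns, mirroring the padding arithmetic
 * used in the lsmod hunks. */
void sketch_print_padded(const char *name, size_t cols)
{
    size_t w = sketch_display_width(name);
    int pad = (w > cols) ? 0 : (int)(cols - w);

    printf("%s%*s|\n", name, pad, "");
}
```

Calling sketch_print_padded(name, 19) with names that contain combining marks or CJK characters should keep the closing bar in the same column, which a byte count or a plain character count would not guarantee.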


@@ -64,7 +64,7 @@ int lsmod_main(int argc UNUSED_PARAM, char **argv UNUSED_PARAM)
} else
token[3] = (char *) "";
# if ENABLE_FEATURE_ASSUME_UNICODE
-name_len = bb_mbstrlen(token[0]);
+name_len = unicode_strlen(token[0]);
name_len = (name_len > 19) ? 0 : 19 - name_len;
printf("%s%*s %8s %2s %s\n", token[0], name_len, "", token[1], token[2], token[3]);
# else
@@ -78,7 +78,7 @@ int lsmod_main(int argc UNUSED_PARAM, char **argv UNUSED_PARAM)
// so trimming the trailing char is just what we need!
token[3][strlen(token[3])-1] = '\0';
# if ENABLE_FEATURE_ASSUME_UNICODE
-name_len = bb_mbstrlen(token[0]);
+name_len = unicode_strlen(token[0]);
name_len = (name_len > 19) ? 0 : 19 - name_len;
printf("%s%*s %8s %2s %s\n", token[0], name_len, "", token[1], token[2], token[3]);
# else