Commit Graph

990 Commits

Author SHA1 Message Date
Ron Yorston
2759201401 vi: allow delimiter in ':s' to be escaped
When regular expressions are allowed in search commands it becomes
possible to escape the delimiter in search/replace commands.  For
example, this command will replace '/abc' with '/abc/':

   :s/\/abc/\/abc\//g

The code to split the command into 'find' and 'replace' strings
should allow for this possibility.

VI_REGEX_SEARCH isn't enabled by default.  When it is:

function                                             old     new   delta
strchr_backslash                                       -      38     +38
colon                                               4378    4373      -5
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 0/1 up/down: 38/-5)              Total: 33 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 14:42:20 +02:00
Denys Vlasenko
95ac4a48f1 vi: allow regular expressions in ':s' commands
BusyBox vi has never supported the use of regular expressions in
search/replace (':s') commands.  Implement this using GNU regex
when VI_REGEX_SEARCH is enabled.

The implementation:

- uses basic regular expressions, to match those used in the search
  command;

- only supports substitution of back references ('\0' - '\9') in the
  replacement string.  Any other character following a backslash is
  treated as that literal character.

VI_REGEX_SEARCH isn't enabled in the default build.  In that case:

function                                             old     new   delta
colon                                               4036    4033      -3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-3)               Total: -3 bytes

When VI_REGEX_SEARCH is enabled:

function                                             old     new   delta
colon                                               4036    4378    +342
.rodata                                           108207  108229     +22
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 364/0)             Total: 364 bytes

v2: Rebase.  Code shrink.  Ensure empty replacement string is null terminated.

Signed-off-by: Andrey Dobrovolsky <andrey.dobrovolsky.odessa@gmail.com>
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 14:38:20 +02:00
Ron Yorston
c76c78740a vi: improve handling of anchored searches
Suppose we search for a git conflict marker '<<<<<<< HEAD' using
the command '/^<<<'.  Using 'n' to go to the next match finds
'<<<' on the current line, apparently ignoring the '^' anchor.

Set a flag in the compiled regular expression to indicate that the
start of the string should not be considered a beginning-of-line
anchor.  An exception has to be made when the search starts from
the beginning of the file.  Make a similar change for end-of-line
anchors.

This doesn't affect a default build with VI_REGEX_SEARCH disabled.
When it's enabled:

function                                             old     new   delta
char_search                                          247     285     +38

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 13:50:59 +02:00
Ron Yorston
2916443ab6 vi: use basic regular expressions for search
Both traditional vi and vim use basic regular expressions for
search.  Also, they don't allow matches to extend across line
endings.  Thus with the file:

   123
   234

the search '/2.*4$' should find the second '2', not the first.

Make BusyBox vi do the same.

Whether or not VI_REGEX_SEARCH is enabled:

function                                             old     new   delta
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0)                 Total: 0 bytes

Signed-off-by: Andrey Dobrovolsky <andrey.dobrovolsky.odessa@gmail.com>
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 13:36:29 +02:00
Ron Yorston
b50ac07cba vi: allow 'gg' to specify a range
Commit 7b93e317c (vi: enable 'dG' command. Closes 11801) allowed
'G' to be used as a range specifier for change/yank/delete
operations.

Add similar support for 'gg'.  This requires setting the 'cmd_error'
flag if 'g' is followed by any character other than another 'g'.

function                                             old     new   delta
do_cmd                                              4852    4860      +8
.rodata                                           108179  108180      +1
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 9/0)                 Total: 9 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 13:32:13 +02:00
Denys Vlasenko
ab755e3717 awk: in parsing, remove superfluous NEWLINE check; optimize builtin arg evaluation
function                                             old     new   delta
exec_builtin                                        1149    1145      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-12 13:30:30 +02:00
Denys Vlasenko
8d269ef859 awk: fix printf "%-10c", 0
function                                             old     new   delta
awk_printf                                           596     626     +30

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-12 11:27:11 +02:00
Denys Vlasenko
caa93ecdd3 awk: fix corner case in awk_printf
Example where it wasn't working:
    awk 'BEGIN { printf "qwe %s rty %c uio\n", "a", 0, "c" }'
- the NUL printing in %c caused premature stop of printing.

function                                             old     new   delta
awk_printf                                           593     596      +3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 18:16:10 +02:00
Denys Vlasenko
39aabfe8f0 awk: unbreak "cmd" | getline
function                                             old     new   delta
evaluate                                            3337    3343      +6

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:51:43 +02:00
Denys Vlasenko
4ef8841b21 awk: unbreak "printf('%c') can output NUL" testcase
function                                             old     new   delta
awk_printf                                           546     593     +47

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:25:33 +02:00
Denys Vlasenko
3d57a84907 awk: undo TI_PRINT, it introduced a bug (print with any redirect acting as printf)
function                                             old     new   delta
evaluate                                            3329    3337      +8

Patch by Ron Yorston <rmy@pobox.com>

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:00:31 +02:00
Denys Vlasenko
49c3ce64f0 awk: rollback_token() + chain_group() == chain_until_rbrace()
function                                             old     new   delta
parse_program                                        336     332      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 11:46:21 +02:00
Denys Vlasenko
e2e3802987 awk: fix printf buffer overflow
function                                             old     new   delta
awk_printf                                           468     546     +78
fmt_num                                              239     247      +8
getvar_s                                             125     111     -14
evaluate                                            3343    3329     -14
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/2 up/down: 86/-28)             Total: 58 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-04 01:25:34 +02:00
Denys Vlasenko
08ca313d7e awk: simplify tests for operation class
Usually, an operation class has only one possible value of "info" word.
In this case, just compare the entire info word, do not bother
to mask OPCLSMASK bits.

(Example where this is not the case: OC_REPLACE for "<op>=")

function                                             old     new   delta
mk_splitter                                          106     100      -6
chain_group                                          616     610      -6
nextarg                                               40      32      -8
exec_builtin                                        1157    1149      -8
as_regex                                             111     103      -8
awk_split                                            553     543     -10
parse_expr                                           948     936     -12
awk_getline                                          656     642     -14
evaluate                                            3387    3343     -44
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-116)           Total: -116 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 14:11:51 +02:00
Denys Vlasenko
cb042b0582 awk: restore strdup elision optimization in assignment
function                                             old     new   delta
evaluate                                            3339    3387     +48

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 13:30:45 +02:00
Denys Vlasenko
90404ed2f6 awk: match(): code shrink
function                                             old     new   delta
do_match                                               -     165    +165
exec_builtin_match                                   202       -    -202
------------------------------------------------------------------------------
(add/remove: 1/1 grow/shrink: 0/0 up/down: 165/-202)          Total: -37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 12:20:36 +02:00
Denys Vlasenko
0e3ef4efb0 awk: rand(): 64-bit constants should be ULL
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 11:57:59 +02:00
Denys Vlasenko
2211fa70cc awk: do not use a copy of g_progname for node->l.new_progname
We never destroy g_progname's, the strings still exist, no need to copy

function                                             old     new   delta
chain_node                                           104      97      -7

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 11:54:01 +02:00
Denys Vlasenko
e1e7ad6b60 awk: support %F %a %A in printf
function                                             old     new   delta
.rodata                                           104111  104120      +9

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:59:36 +02:00
Denys Vlasenko
1f765709ed awk: open-code TS_OPTERM, no logic changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:32:03 +02:00
Denys Vlasenko
2b65e73db3 awk: tighten rules in action parsing
Disallow:
    BEGIN
	{ action }  - must start on the same line
Disallow:
    func f()
	print "hello" - must be in {...}

function                                             old     new   delta
chain_until_rbrace                                     -      41     +41
parse_program                                        307     336     +29
chain_group                                          649     616     -33
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 1/1 up/down: 70/-33)             Total: 37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:16:48 +02:00
Denys Vlasenko
717200eb43 awk: rename GRPSTART/END to L/RBRACE, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 00:39:55 +02:00
Denys Vlasenko
b705bf5539 awk: move match() code out-of-line
function                                             old     new   delta
exec_builtin_match                                     -     202    +202
exec_builtin                                        1434    1157    -277
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 0/1 up/down: 202/-277)          Total: -75 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 23:48:48 +02:00
Denys Vlasenko
646429e05e awk: use smaller regmatch_t arrays, they had 2 elements for no apparent reason
function                                             old     new   delta
exec_builtin                                        1479    1434     -45

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 23:26:09 +02:00
Denys Vlasenko
a5d7b0f4f4 awk: fix detection of VAR=VAL arguments
1NAME=VAL is not it, neither is VA.R=VAL

function                                             old     new   delta
next_input_file                                      216     214      -2
is_assignment                                        115      91     -24
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-26)             Total: -26 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 23:07:21 +02:00
Denys Vlasenko
4d902ea9de awk: fix beavior of "exit" without parameter
function                                             old     new   delta
evaluate                                            3336    3339      +3
awk_exit                                              93      94      +1
awk_main                                             829     827      -2
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/1 up/down: 4/-2)                Total: 2 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 22:28:51 +02:00
Denys Vlasenko
8bb03da906 awk: rand() could return 1.0, fix this - should be in [0,1)
While at it, make it finer-grained (63 bits of randomness)

function                                             old     new   delta
evaluate                                            3303    3336     +33
.rodata                                           104107  104111      +4
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 37/0)               Total: 37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 19:38:03 +02:00
Denys Vlasenko
37ae8cdc6e awk: beautify builtins table, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 18:55:00 +02:00
Denys Vlasenko
47d9133896 awk: enforce simple builtins' argument number
function                                             old     new   delta
evaluate                                            3215    3303     +88
.rodata                                           104036  104107     +71
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 159/0)             Total: 159 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 18:28:12 +02:00
Denys Vlasenko
786ca197ad awk: make builtin definitions more understandable, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 17:32:08 +02:00
Denys Vlasenko
640212ae0e awk: do not special-case "delete"
Rework of the previous fix:
Can use operation attributes to disable arg evaluation instead of special-casing.

function                                             old     new   delta
.rodata                                           104032  104036      +4
evaluate                                            3223    3215      -8
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/1 up/down: 4/-8)               Total: -4 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 15:21:36 +02:00
Denys Vlasenko
ef5463cf16 awk: shuffle globals for smaller offsets
function                                             old     new   delta
awk_main                                             832     829      -3
evaluate                                            3229    3223      -6
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-9)               Total: -9 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:53:52 +02:00
Denys Vlasenko
966cafcc77 awk: use "static" tmpvars in main and exit
function                                             old     new   delta
awk_exit                                             103      93     -10
awk_main                                             850     832     -18
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-28)             Total: -28 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:33:13 +02:00
Denys Vlasenko
1193c68fa7 awk: when parsing length(), simplify eating of LPAREN
function                                             old     new   delta
parse_expr                                           945     948      +3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:29:01 +02:00
Denys Vlasenko
40573556f2 awk: shuffle functions to reduce forward declarations, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:27:40 +02:00
Denys Vlasenko
8b4c429025 awk: use static tmpvars instead of nvalloc(1)ed ones
ptest() was using this idea already.

As far as I can see, this is safe. Ttestsuite passes.

One downside is that a temporary from e.g. printf invocation
won't be freed until the next printf call.

function                                             old     new   delta
awk_printf                                           481     468     -13
as_regex                                             137     111     -26
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-39)             Total: -39 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-01 17:50:26 +02:00
Denys Vlasenko
1573487e21 awk: rename temp variables, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-01 16:17:33 +02:00
Denys Vlasenko
d7354df169 awk: evaluate all, even superfluous function args
function                                             old     new   delta
evaluate                                            3128    3135      +7

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-30 12:52:51 +02:00
Denys Vlasenko
ca9278ee58 awk: rewrite "print" logic a bit to make it clearer
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-30 12:42:39 +02:00
Denys Vlasenko
d150710169 awk: allow empty fuinctions with no arguments, disallow function redefinitions
function                                             old     new   delta
.rodata                                           103681  103700     +19
parse_program                                        303     307      +4
evaluate                                            3145    3141      -4
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/1 up/down: 23/-4)              Total: 19 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-30 12:23:51 +02:00
Denys Vlasenko
86fc2872b3 awk: replace incorrect use of union in undefined function check (no code changes)
...which reveals that it's buggy: it thinks "func f(){}" is an undefined function!

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-30 12:12:20 +02:00
Denys Vlasenko
6cf6f1eaee awk: remove custom pool allocator for temporary awk variables
It seems to be designed to reduce overhead of malloc's auxiliary data,
by allocating at least 64 variables as a block.
With "struct var" being about 20-32 bytes long (32/64 bits),
malloc overhead for one temporary indeed is high, ~33% more memory used
than needed.

function                                             old     new   delta
evaluate                                            3137    3145      +8
modprobe_main                                        798     803      +5
exec_builtin                                        1414    1419      +5
awk_printf                                           476     481      +5
as_regex                                             132     137      +5
EMSG_INTERNAL_ERROR                                   15       -     -15
nvfree                                               169     116     -53
nvalloc                                              145       -    -145
------------------------------------------------------------------------------
(add/remove: 0/2 grow/shrink: 5/1 up/down: 28/-213)          Total: -185 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-30 08:01:29 +02:00
Denys Vlasenko
3aff3b9cb8 awk: assorted optimizations
hash_find(): do not caclculate hash twice. Do not divide - can use
cheap multiply-by-8 shift.

nextword(): do not repeatedly increment in-memory value, do it in register,
then store final result.

hashwalk_init(): do not strlen() twice.

function                                             old     new   delta
hash_search3                                           -      49     +49
hash_find                                            259     281     +22
nextword                                              19      16      -3
evaluate                                            3141    3137      -4
hash_search                                           54      28     -26
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 1/3 up/down: 71/-33)             Total: 38 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 19:07:36 +02:00
Denys Vlasenko
b3c91a127f awk: free unused parsing structures after parse is done
function                                             old     new   delta
hash_clear                                             -      90     +90
awk_main                                             827     849     +22
clear_array                                           90       -     -90
------------------------------------------------------------------------------
(add/remove: 1/1 grow/shrink: 1/0 up/down: 112/-90)            Total: 22 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 19:06:59 +02:00
Denys Vlasenko
21fbee2e87 awk: document which hashes are used at what state (parse/execute)
We can free them after they are no longer needed.
(Currently, being a NOEXEC applet is much larger waste of memory
for the case of long-running awk script).

function                                             old     new   delta
awk_main                                             831     827      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 14:33:04 +02:00
Denys Vlasenko
6872c193a9 awk: fix parsing of expressions such as "v (a)"
function                                             old     new   delta
next_token                                           812     825     +13

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 12:16:36 +02:00
Denys Vlasenko
686287b5da awk: deindent a block, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 03:47:46 +02:00
Denys Vlasenko
216d3d8ad9 awk: code shrink
function                                             old     new   delta
parse_expr                                           948     945      -3
chain_expr                                            65      62      -3
chain_group                                          655     649      -6
parse_program                                        310     303      -7
rollback_token                                        10       -     -10
------------------------------------------------------------------------------
(add/remove: 0/1 grow/shrink: 0/4 up/down: 0/-29)             Total: -29 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 03:44:56 +02:00
Denys Vlasenko
4f27503a1e awk: get rid of "move name one char back" trick in next_token()
function                                             old     new   delta
next_token                                           791     812     +21
awk_main                                             886     831     -55
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/1 up/down: 21/-55)            Total: -34 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 03:27:07 +02:00
Denys Vlasenko
f414fb4411 awk: when parsing TC_FUNCTION token, eat its opening '('
...like we do for array references.

function                                             old     new   delta
parse_expr                                           938     948     +10
next_token                                           788     791      +3
parse_program                                        313     310      -3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/1 up/down: 13/-3)              Total: 10 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-06-29 03:02:21 +02:00