busybox/shell
Christoph Schulz 03ad7ae081 ash: reset tokpushback before prompting while parsing heredoc
The parser reads from an already freed memory location, thereby causing
unpredictable results, in the following situation:

- ENABLE_ASH_EXPAND_PRMT is enabled
- heredoc is being parsed
- command substitution is used within heredoc

Examples where this bug crops up are (PS2 is set to "> "):

$ cat <<EOF
> `echo abc`
> EOF
-sh: O: not found

$ cat <<EOF
> $(echo abc)
> EOF
-sh: {garbage}: not found

The presumable reason is that setprompt_if() causes a nested expansion when
ENABLE_ASH_EXPAND_PRMT is enabled, therefore leaving "wordtext" in an unusable
state. However, when parseheredoc() is called, "tokpushback" is non-zero, which
causes the next call to xxreadtoken() to return TWORD, causing the caller to
use the invalid "wordtoken" instead of reading the next valid token.

The call chain is:

list()
-> peektoken() [sets tokpushback to 1]
-> parseheredoc()
   -> setprompt_if()
      -> pushstackmark()
      -> expandstr()
         -> readtoken1()
            [sets lasttoken to TWORD, wordtoken points to expanded prompt]
      -> popstackmark() [invalidates wordtoken, leaves lasttoken as is]
   -> readtoken1()
      -> ...parsebackq
         -> list()
            -> andor()
               -> pipeline()
                  -> readtoken()
                     -> xxreadtoken()
                        [tokpushback non-zero, reuse lasttoken and wordtext]

Note that in almost all other contexts, each call to setprompt_if() is preceded
by setting "tokpushback" to zero. One exception is "oldstyle" backquote parsing
in readtoken1(), but there "tokpushback" is reset afterwards. The other
exception is nlprompt(), but this function is only used within readtoken1()
(but in contexts where no nested calls to xxreadtoken() occur) and xxreadtoken()
(where "tokpushback" is guaranteed to be zero).

function                                             old     new   delta
parseheredoc                                         124     131      +7

Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2018-11-20 17:45:52 +01:00
..
ash_test ash: expand: Do not quote backslashes in unquoted parameter expansion 2018-08-07 18:58:02 +02:00
hush_test ash: expand: Do not quote backslashes in unquoted parameter expansion 2018-08-07 18:58:02 +02:00
ash_doc.txt ash: fix TRACE commands 2009-03-19 23:09:58 +00:00
ash_ptr_hack.c *: make GNU licensing statement forms more regular 2010-08-16 20:14:46 +02:00
ash.c ash: reset tokpushback before prompting while parsing heredoc 2018-11-20 17:45:52 +01:00
brace.txt hush: wait for cmd to complete, and immediately store its exitcode in $? 2009-11-15 19:58:19 +01:00
Config.src restore documentation on the build config language 2018-06-06 15:16:48 +02:00
cttyhack.c regularize format of source file headers, no code changes 2017-09-18 16:28:43 +02:00
hush_doc.txt hush: implement break and continue 2008-07-28 23:04:34 +00:00
hush_leaktool.sh hush: fix "export not_yet_defined_var", fix parsing of "cmd | }" 2009-04-19 23:07:51 +00:00
hush.c hush: correct description for HUSH_TICK config option 2018-11-14 11:35:58 +01:00
Kbuild.src Make it possible to select "sh" and "bash" aliases without selecting ash or hush 2016-12-23 16:56:43 +01:00
match.c hush: fix a='a\\'; echo "${a%\\\\}" 2018-03-02 20:48:36 +01:00
match.h hush: optimize #[#] and %[%] for speed. size -2 bytes. 2010-09-04 21:21:07 +02:00
math.c shell: handle $((NUM++...) like bash does. Closes 10706 2018-01-28 20:13:33 +01:00
math.h Make it possible to select "sh" and "bash" aliases without selecting ash or hush 2016-12-23 16:56:43 +01:00
random.c whitespace fixes 2018-07-17 15:04:17 +02:00
random.h ash,hush: improve randomness of $RANDOM, add easy-ish way to test it 2014-03-13 12:52:43 +01:00
README update shell/README 2010-05-20 12:56:14 +02:00
README.job hush: small code shrink; style fixes 2007-04-20 08:35:45 +00:00
shell_common.c ash,hush: fold shell_builtin_read() way-too-many params into a struct param 2018-08-05 18:11:15 +02:00
shell_common.h ash,hush: fold shell_builtin_read() way-too-many params into a struct param 2018-08-05 18:11:15 +02:00

http://www.opengroup.org/onlinepubs/9699919799/
Open Group Base Specifications Issue 7


http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html
Shell & Utilities

It says that any of the standard utilities may be implemented
as a regular shell built-in. It gives a list of utilities which
are usually implemented that way (and some of them can only
be implemented as built-ins, like "alias"):

alias
bg
cd
command
false
fc
fg
getopts
jobs
kill
newgrp
pwd
read
true
umask
unalias
wait


http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html
Shell Command Language

It says that shell must implement special built-ins. Special built-ins
differ from regular ones by the fact that variable assignments
done on special builtin are *PRESERVED*. That is,

VAR=VAL special_builtin; echo $VAR

should print VAL.

(Another distinction is that an error in special built-in should
abort the shell, but this is not such a critical difference,
and moreover, at least bash's "set" does not follow this rule,
which is even codified in autoconf configure logic now...)

List of special builtins:

. file
: [argument...]
break [n]
continue [n]
eval [argument...]
exec [command [argument...]]
exit [n]
export name[=word]...
export -p
readonly name[=word]...
readonly -p
return [n]
set [-abCefhmnuvx] [-o option] [argument...]
set [+abCefhmnuvx] [+o option] [argument...]
set -- [argument...]
set -o
set +o
shift [n]
times
trap n [condition...]
trap [action condition...]
unset [-fv] name...

In practice, no one uses this obscure feature - none of these builtins
gives any special reasons to play such dirty tricks.

However. This section also says that *function invocation* should act
similar to special built-in. That is, variable assignments
done on function invocation should be preserved after function invocation.

This is significant: it is not unthinkable to want to run a function
with some variables set to special values. But because of the above,
it does not work: variable will "leak" out of the function.