03ad7ae081
The parser reads from an already freed memory location, thereby causing unpredictable results, in the following situation: - ENABLE_ASH_EXPAND_PRMT is enabled - heredoc is being parsed - command substitution is used within heredoc Examples where this bug crops up are (PS2 is set to "> "): $ cat <<EOF > `echo abc` > EOF -sh: O: not found $ cat <<EOF > $(echo abc) > EOF -sh: {garbage}: not found The presumable reason is that setprompt_if() causes a nested expansion when ENABLE_ASH_EXPAND_PRMT is enabled, therefore leaving "wordtext" in an unusable state. However, when parseheredoc() is called, "tokpushback" is non-zero, which causes the next call to xxreadtoken() to return TWORD, causing the caller to use the invalid "wordtoken" instead of reading the next valid token. The call chain is: list() -> peektoken() [sets tokpushback to 1] -> parseheredoc() -> setprompt_if() -> pushstackmark() -> expandstr() -> readtoken1() [sets lasttoken to TWORD, wordtoken points to expanded prompt] -> popstackmark() [invalidates wordtoken, leaves lasttoken as is] -> readtoken1() -> ...parsebackq -> list() -> andor() -> pipeline() -> readtoken() -> xxreadtoken() [tokpushback non-zero, reuse lasttoken and wordtext] Note that in almost all other contexts, each call to setprompt_if() is preceded by setting "tokpushback" to zero. One exception is "oldstyle" backquote parsing in readtoken1(), but there "tokpushback" is reset afterwards. The other exception is nlprompt(), but this function is only used within readtoken1() (but in contexts where no nested calls to xxreadtoken() occur) and xxreadtoken() (where "tokpushback" is guaranteed to be zero). function old new delta parseheredoc 124 131 +7 Signed-off-by: Christoph Schulz <develop@kristov.de> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
http://www.opengroup.org/onlinepubs/9699919799/ Open Group Base Specifications Issue 7 http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html Shell & Utilities It says that any of the standard utilities may be implemented as a regular shell built-in. It gives a list of utilities which are usually implemented that way (and some of them can only be implemented as built-ins, like "alias"): alias bg cd command false fc fg getopts jobs kill newgrp pwd read true umask unalias wait http://www.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html Shell Command Language It says that shell must implement special built-ins. Special built-ins differ from regular ones by the fact that variable assignments done on special builtin are *PRESERVED*. That is, VAR=VAL special_builtin; echo $VAR should print VAL. (Another distinction is that an error in special built-in should abort the shell, but this is not such a critical difference, and moreover, at least bash's "set" does not follow this rule, which is even codified in autoconf configure logic now...) List of special builtins: . file : [argument...] break [n] continue [n] eval [argument...] exec [command [argument...]] exit [n] export name[=word]... export -p readonly name[=word]... readonly -p return [n] set [-abCefhmnuvx] [-o option] [argument...] set [+abCefhmnuvx] [+o option] [argument...] set -- [argument...] set -o set +o shift [n] times trap n [condition...] trap [action condition...] unset [-fv] name... In practice, no one uses this obscure feature - none of these builtins gives any special reasons to play such dirty tricks. However. This section also says that *function invocation* should act similar to special built-in. That is, variable assignments done on function invocation should be preserved after function invocation. This is significant: it is not unthinkable to want to run a function with some variables set to special values. But because of the above, it does not work: variable will "leak" out of the function.