Commit Graph

172 Commits

Author SHA1 Message Date
Rob Landley
dcc286607c Hiroshi found another bug. Currently sed's $ triggers at end of every file,
and with multiple files SuSv3 says it should only trigger at the end of the
LAST file.

The trivial fix I tried first broke if the last file is empty.  Fixing this
properly required restructuring things to create a file list (actually a
FILE * list), and then processing it all in one go.  (There's probably a
smaller way to do this, merging with append_list perhaps.  But let's get
the behavior correct first.)

Note that editing files in place (-i) needs the _old_ behavior, with $
triggering at the end of each file.

Here's a test of all the things this patch fixed.  gnu and busybox seds produce
the same results with this patch, and different without it.

echo -n -e "1one\n1two\n1three" > ../test1
echo -n > ../test2
echo -e "3one\n3two\n3three" > ../test3
sed -n "$ p" ../test1 ../test2 ../test3
sed -n "$ p" ../test1 ../test2
sed -i -n "$ p" ../test1 ../test2 ../test3
2004-11-25 07:21:47 +00:00
Rob Landley
ce4f0e982b Hiroshi Ito found some bugs. The 'c' command (cut and paste) was hardwired
to not put a newline at the end (which was backwards, it should have been
hardwired _to_ put a newline at the end, whether or not the input line
ended with a newline).  Test case for that:

echo | sed -e '$ctest'

And then this would segfault:

echo | sed -e 'g'

Because pattern_space got freed but the dead pointer was only overwritten
in an if statement that didn't trigger if the hold space was empty.  Oops.

While debugging it, I found out that the hold space is persistent between
multiple input files, so I promoted it to a global and added it to the
memory cleanup.  The relevant test case (to compare with That Other Sed) is:

echo -n woo > woo
sed -e h -e g woo
echo "fish" | sed -e '/woo/h' -e "izap" -e 's/woo/thingy/' -e '/fish/g' woo -

And somebody gratuitously stuck in a c99 int8_t type for something that's just
a flag, so I grouped the darn ints.
2004-10-30 06:54:19 +00:00
Eric Andersen
9855548a77 Rob Landley writes:
add sed -r support.

I bumped into a couple of things that want to use extended regular expressions
in sed, and it really isn't that hard to add.  Can't say I've extensively
tested it, but it's small and isn't going to break anything that doesn't use
it, so...

Rob
2004-05-26 10:03:33 +00:00
Glenn L McGrath
21d7d61de1 Use int instead of char for return type, in theory avoiding a cast 2004-05-16 02:35:49 +00:00
Glenn L McGrath
5d2edbf16d Fix for debian bug #248106, should use int for returned getopt value. 2004-05-10 08:59:17 +00:00
Glenn L McGrath
c6992feee3 Update my email address, document some of my tasks in the AUTHORS file 2004-04-25 05:11:19 +00:00
Eric Andersen
b94669543d This sed patch can only be described as "duh". Stat the source file, chmod
the _destination_ file.  (Ah hah!  That works _much_ better...)  I
implemented the behavior, I just forgot to test this corner of it.  My fault,
sorry...

No, gnu sed -i doesn't preverve ownership information.  I checked.
Permissions, yes, ownership info, no.

Rob
2004-04-21 00:57:14 +00:00
Eric Andersen
faa7d863fc So I'm building a linux from scratch system, using a working script to do this
that the _only_ change to is that gnu sed has been replaced with busybox sed.
And ncurses' install phase hangs.  I trace it down, and it's trying to run
gawk.  (Insert obligatory doubletake, but this is FSF code we're talking
about, so...)

It turns out gawk shells out to sed, ala "sed -f /tmp/blah file.h".  The
/tmp/blah file is basically empty (it contains one character, a newline).  So
basically, gawk is using sed as "cat".  With gnu sed, it works like cat,
anyway.

With busybox sed, it tests if its command list is empty after parsing the
command line, and if the list is empty it takes the first file argument as a
sed command string, and if that leaves the file list empty it tries to read
the data to operate on from stdin.  (Hence the hang, since nothing's coming
in on stdin...)

It _should_ be testing whether there were any instances of -f or -e, not
whether it actually got any commands.  Using sed as cat may be kind of
stupid, but it's valid and gawk relies on this behavior.

Here's a patch to fix it, turning a couple of ints into chars in hopes of
saving a bit of the space this adds.  Comments?

Rob
2004-04-21 00:56:22 +00:00
Eric Andersen
aff114c33d Larry Doolittle writes:
This is a bulk spelling fix patch against busybox-1.00-pre10.
If anyone gets a corrupted copy (and cares), let me know and
I will make alternate arrangements.

Erik - please apply.

Authors - please check that I didn't corrupt any meaning.

Package importers - see if any of these changes should be
passed to the upstream authors.

I glossed over lots of sloppy capitalizations, missing apostrophes,
mixed American/British spellings, and German-style compound words.

What is "pretect redefined for test" in cmdedit.c?

Good luck on the 1.00 release!

      - Larry
2004-04-14 17:51:38 +00:00
Rob Landley
25d82397f7 The last patch broke:
sed -i "/^boo/a fred" ipsec.conf

Which works in gnu sed.  (And is _supposed_ to strip all the whitespace before
"fred".)

It also broke:
sed -i -e "/^boo/a \\" -e "   fred" ipsec.conf

I.E. there can legally be spaces between the a and the backslash at the end of
the line.

And strangely enough, gnu sed accepts the following syntax as well:
sed -i "/^boo/a \\  fred" ipsec.conf

Which is a way of having the significant whitespace at the start of the line,
all on one line.  (But notice that the whitespace BEFORE the slash is still
stripped, as is the slash itself.  And notice that the naieve placement of
"\n" there doesn't work, it puts an n at the start of the appended line.  The
double slashing is for shell escapes because you could escape the quote, you
see.  It's turned into a single backslash.  But \n there is _not_ turned into
a newline by the shell.  So there.)

This makes all three syntaxes work in my tests.  I should probably start
writing better documentation at some point.  I posted my current sedtests.py
file to the list, which needs a lot more tests added as well...
2004-04-01 09:23:30 +00:00
Eric Andersen
46390ed829 Junio Hamano, junio at twinsun dot com writes:
The sed command in busybox 1.0.0-pre8 loses leading whitespace
in 'a' command ('i' and 'c' commands are also affected).  A
patch to fix this is attached at the end of this message.

The following is a transcript that reproduces the problem.  The
first run uses busybox 1.0.0-pre3 as "/bin/sed" command, which
gets the expected result.  Later in the test, /bin/sed symlink
is changed to point at busybox 1.0.0-pre8 and the test script is
run again, which shows the failure.

=== reproduction recipe ===
* Part 1.  Use busybox 1.0.0-pre3 as sed; this works.

root# cd /tmp
root# cat 1.sh
#!/bin/sh

cd /tmp
rm -f ipsec.conf ipsec.conf+
cat >ipsec.conf <<\EOF
version 2.0

config setup
        klipsdebug=none
        plutodebug=none
        plutostderrlog=/dev/null

conn %default
        keyingtries=1
        ...
EOF
sed -e '/^config setup/a\
	nat_traversal=yes' ipsec.conf >ipsec.conf+
mv -f ipsec.conf+ ipsec.conf
root# sh -x 1.sh
+ cd /tmp
+ rm -f ipsec.conf ipsec.conf+
+ cat
+ sed -e /^config setup/a\
        nat_traversal=yes ipsec.conf
+ mv -f ipsec.conf+ ipsec.conf
root# cat ipsec.conf
version 2.0

config setup
        nat_traversal=yes
        klipsdebug=none
        plutodebug=none
        plutostderrlog=/dev/null

conn %default
        keyingtries=1
        ...
root# sed --version
sed: invalid option -- -
BusyBox v1.00-pre3 (2004.02.26-18:47+0000) multi-call binary

Usage: sed [-nef] pattern [files...]

* Part 2.  Continuing from the above, use busybox 1.0.0-pre8
  as sed; this fails.

root# ln -s busybox-pre8 /bin/sed-8
root# mv /bin/sed-8 /bin/sed
root# sed --version
This is not GNU sed version 4.0
root# sed --
BusyBox v1.00-pre8 (2004.03.30-02:44+0000) multi-call binary

Usage: sed [-nef] pattern [files...]
root# sh -x 1.sh
+ cd /tmp
+ rm -f ipsec.conf ipsec.conf+
+ cat
+ sed -e /^config setup/a\
        nat_traversal=yes ipsec.conf
+ mv -f ipsec.conf+ ipsec.conf
root# cat ipsec.conf
version 2.0

config setup
nat_traversal=yes
        klipsdebug=none
        plutodebug=none
        plutostderrlog=/dev/null

conn %default
        keyingtries=1
        ...
root#
=== reproduction recipe ends here ===

This problem was introduced in 1.0.0-pre4.  The problem is that
the command argument parsing code strips leading whitespaces too
aggressively.  When running the above example, the piece of code
in question gets "\n\tnat_traversal=yes" as its argument in
cmdstr variable (shown part in the following patch).  What it
needs to do at this point is to strip the first newline and
nothing else, but it instead strips all the leading whitespaces
at the beginning of the string, thus losing the tab character.
The following patch fixes this.
2004-03-31 11:42:40 +00:00
Eric Andersen
c7bda1ce65 Remove trailing whitespace. Update copyright to include 2004. 2004-03-15 08:29:22 +00:00
Rob Landley
53302f80da Add -i option to sed, to edit files in-place. 2004-02-18 09:54:15 +00:00
Eric Andersen
c06f568dda Rob Landley writes:
While building glibc with busybox as part of the development environment, I
found a bug in glibc's regexec can throw sed into an endless loop.  This
fixes it.  Should I put an #ifdef around it or something?  (Note, this patch
also contains the "this is not gnu sed 4.0" hack I posted earlier, which is
also needed to build glibc...)
2004-02-04 10:57:46 +00:00
Rob Landley
40ec4aeb8e Thinko in s//options. (Whitespace skipping in the wrong place.) 2004-01-04 06:42:14 +00:00
Eric Andersen
52a3c2726e Patch from Matt Kraai:
sed is broken:

 busybox sed -n '/^a/,/^a/p' >output <<EOF
 a
 b
 a
 b
 EOF
 cmp -s output - <<EOF
 a
 b
 a
 EOF

The attached patch fixes it.
2003-12-23 08:53:51 +00:00
Eric Andersen
638da75f4b Fix some warnings that have crept in recently 2003-10-09 08:18:36 +00:00
Glenn L McGrath
586d86cc8c Comaptability with gcc-2.95 2003-10-09 07:22:59 +00:00
Glenn L McGrath
42c25735e6 Patch from Rob Landley;
Moving on to building diffutils, busybox sed needs this patch to get
past the first problem.  (Passing it a multi-line command line argument
with -e works, but if you don't use -e it doesn't break up the multiple
lines...)
2003-10-04 05:27:56 +00:00
Glenn L McGrath
0ad4daa54e Patch from Rob Landley to fix backrefs 2003-10-01 10:26:23 +00:00
Glenn L McGrath
738fb33994 Patch by Rob Landley, fix "newline after edit command" 2003-10-01 06:45:11 +00:00
Glenn L McGrath
aa5a602689 Patch by Rob Landley, work in progress update, fixes lots of bugs,
introduces a few others (but they are being worked on)
2003-10-01 03:06:16 +00:00
Glenn L McGrath
761ec20f81 Fix some typo's, remove some extra free statements 2003-09-24 10:23:39 +00:00
Glenn L McGrath
2570b43e82 Configuration option to define wether to follows GNU sed's behaviour
or the posix standard.
Put the cleanup code back the way it was.
2003-09-16 05:25:43 +00:00
Glenn L McGrath
204ff1cea4 Fix a bug that creapt in recently with substitution subprinting, and add
a test for it.
2003-09-16 01:46:36 +00:00
Glenn L McGrath
977451ef44 Fix a simple mistake with pattern space, and add a test for it 2003-09-15 12:07:48 +00:00
Glenn L McGrath
e3e28d3bb6 Fix some memory allocation problems
----------------------------------------------------------------------
2003-09-15 09:22:04 +00:00
Glenn L McGrath
2eed0e2d47 Add a test for the 'P' command and fix current implementation so it
doesnt permanently modify the pattern space.
2003-09-15 06:28:45 +00:00
Glenn L McGrath
6e5687abc3 A test and fix for the sed 'n' command 2003-09-15 06:12:53 +00:00
Glenn L McGrath
73116311e5 Fix for the sed-append-next-line test 2003-09-15 05:42:05 +00:00
Glenn L McGrath
640c1f547f Fix recursion problem 2003-09-15 04:55:29 +00:00
Glenn L McGrath
294d113adb Memory cleanups and fix for echo "foo" | sed 's/foo/bar/;H;q' 2003-09-14 16:28:08 +00:00
Glenn L McGrath
8417c8c38b Cleanup memory usage 2003-09-14 15:24:18 +00:00
Glenn L McGrath
edc388cf4e The previous fix for 's/a/1/;s/b/2/;t one;p;:one;p' broke the case of
echo fooba | ./busybox sed -n 's/foo//;s/bar/found/p'

I really need to start adding these tests to the testsuite.

keep the substituted and altered flags seperate
2003-09-14 08:52:53 +00:00
Glenn L McGrath
3fe475677a Preserve substitution flag value within the current line.
Fixed the following testcase
# cat strings |./busybox sed -n -f test3.sed
1
1
2
c
c
# cat strings
a
b
c
2003-09-14 07:59:28 +00:00
Glenn L McGrath
f4523562b6 Fix branching commands.
If a label isnt specified, jump to end of script, not the last command
in the script.

Print an error and exit if you try and jump to a non-existant label

Works for the following testcase
# cat strings
a
b
c
d
e
f
g
# cat strings | ./busybox sed -n '/d/b;p'
a
b
c
e
f
g
2003-09-14 06:01:14 +00:00
Glenn L McGrath
8aac05bfe5 Patch from Rob Landley
Fixed a memory leak in add_cmd/add_cmd_str by moving the allocation
of sed_cmd down to where it's actually first needed.
                                                                                
In get_address, if index_of_next_unescaped_regexp_delim ever failed, we
wouldn't notice because the return value was added to idx, which was 
already guaranteed to be > 0.  (This is buried in the changes made when 
I redid get_address to be based on pointer arithmetic, because all the tests 
were gratuitously dereferencing with a constant zero, which wasn't obvious.)
         
Comment in parse_regex_delim was wrong: 's' and 'y' both call it.
 
The reason "sed_cmd->num_backrefs = 0;" isn't needed is that sed_cmd was
allocated with cmalloc, which zeroes memory.

Different handling of space after \ in i...

Different handling of pattern "s/a/b s/c/d"

Cool, resursive reads don't cause a crash. :)

Fixed "sed -f blah filename - < filename" since GNU sed was handling 
both - and filenames on the same line.  (You can even list - more than 
once, although it's immediate EOF...)
2003-09-14 04:06:12 +00:00
Glenn L McGrath
7c59a83a77 Stupid typo 2003-09-14 02:37:46 +00:00
Glenn L McGrath
4dc1d25a30 Fix some memory allocation problems 2003-09-14 01:25:31 +00:00
Glenn L McGrath
f36635cec6 Fix the following testcase by disabling global substitution if the regex
is anchored to the start of line, there can be only one subst.
echo "aah" | sed 's/^a/b/g'
2003-09-13 15:12:22 +00:00
Glenn L McGrath
c18ce373a2 Fix the following testcase by storing the state of the adress match with
the command.
# cat strings
a
b
c
d
e
f
g
# ./busybox sed '1,2d;4,$d' <strings
c
# ./busybox sed '4,$d;1,2d' <strings
# sed '4,$d;1,2d' <strings
c
# sed '1,2d;4,$d' <strings
c
2003-09-13 06:57:39 +00:00
Glenn L McGrath
9b04f1841e Fix the substitution print subcommand, it should only print if its
own substitution matched, not previous ones.
e.g
echo fooba | sed -n 's/foo//;s/bar/found/p'
shouldnt print anything
2003-08-30 04:35:07 +00:00
Glenn L McGrath
91e1978ff0 New commands, 'G' and 'H' 2003-04-26 07:40:07 +00:00
Glenn L McGrath
fc4cb4dbb5 Fix logic error in grouped commands 2003-04-12 16:10:42 +00:00
Glenn L McGrath
d4185b0e15 Fix up indenting 2003-04-11 17:10:23 +00:00
Glenn L McGrath
d7fe39b587 Really fix the 'r' command 2003-04-09 15:52:32 +00:00
Glenn L McGrath
d87a7ac269 Fix the sed 'r' command 2003-04-09 15:26:14 +00:00
Glenn L McGrath
2410386611 fix substitution when replacing with &, we shouldnt check for an escape charcter. Its already been taken care of _somewhere_ else 2003-04-09 07:51:43 +00:00
Glenn L McGrath
bd9b32bc0d Label ends at a newline, update comments, rename linked list field 2003-04-09 01:43:54 +00:00
Glenn L McGrath
8d6395d41a Run through indent 2003-04-08 11:56:11 +00:00