getline(3) is much more readable than manually looping. It has some
overhead due to the allocation of a buffer, but that shouldn't be a
problem here. If that was a problem, we could reuse the buffer (thus
making the function non-reentrant), but I don't think that's worth the
extra complexity.
Using rpmatch(3) instead of a simple y/n test provides i18n to the
response checking. We have a fall-back minimalistic implementation for
systems that lack this function (e.g., musl libc).
While we're at it, apply some other minor improvements to this file:
- Remove comment saying which files use this function. That's likely
to get outdated. And anyway, it's just a grep(1) away, so it doesn't
really add any value.
- Remove unnecessary casts to (void) that were used to verbosely ignore
errors from stdio calls. They add clutter without really adding much
value to the code (or I don't see it).
- Remove comments from the function body. They make the function less
readable. Instead, centralize the description of the function into a
man-page-like comment before the function definition. This keeps the
function body short and sweet.
- Add '#include <stdbool.h>', which was missing.
- Minor whitespace style changes (it doesn't hurt the diff at this
point, since most of the affected lines were already touched by other
changes, so I applied my preferred style :).
Acked-by: Samanta Navarro <ferivoz@riseup.net>
Cc: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Do not stop after 79 characters. Read the complete line to avoid
arbitrary limitations.
Proof of Concept:
```
cat > passwd-poc << EOF
root❌0:0:root:/root:/bin/bash
root❌0:0:root:/root:/bin/bash
root❌0:0:root:/root:/bin/bash
EOF
python -c "print(80*'y')" | pwck passwd-poc
```
Two lines should still be within the file because we agreed only once
to remove a duplicated line.
Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
This commit will serve to document why we shouldn't worry about the
truncation in the call to strlcpy(3). Since we have one more byte in
tmptty than in full_tty, truncation will produce a string that is at
least one byte longer than full_tty. Such a string could never compare
equal, so we're actually handling the truncation in a clever way. Maybe
too clever, but that's why I'm documenting it here.
Now, about the simplification itself:
Since we made sure that both full_tty and tmptty are null-terminated, we
can call strcmp(3) instead of strncmp(3). We can also simplify the
return logic avoiding one branch.
Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Iker Pedrosa <ipedrosa@redhat.com>
* libmisc/utmp.c (is_my_tty): Declare the parameter as a char array,
not char *, as it is not necessarily null-terminated.
Avoid a read overrun when reading 'tty', which comes from
'ut_utname'.
Reported-by: Paul Eggert <eggert@cs.ucla.edu>
Co-developed-by: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Iker Pedrosa <ipedrosa@redhat.com>
* libmisc/date_to_str.c (date_to_str): Do not crash if gmtime(3)
returns NULL because the timestamp is far in the future.
Reported-by: Paul Eggert <eggert@cs.ucla.edu>
Co-developed-by: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Iker Pedrosa <ipedrosa@redhat.com>
gettime.c:25:30: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
/*@observer@*/time_t gettime ()
^
void
We can't use a pointer that was input to realloc(3), nor any pointers
that point to reallocated memory, without making sure that the memory
wasn't moved. If we do, the Behavior is Undefined.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Use of these macros, apart from the benefits mentioned in the commit
that adds the macros, has some other good side effects:
- Consistency in getting the size of the object from sizeof(type),
instead of a mix of sizeof(type) sometimes and sizeof(*p) other
times.
- More readable code: no casts, and no sizeof(), so also shorter lines
that we don't need to cut.
- Consistency in using array allocation calls for allocations of arrays
of objects, even when the object size is 1.
Cc: Valentin V. Bartenev <vbartenev@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
We'll expand the contents in a following commit, so let's move the file
to a more generic name, have a dedicated header, and update includes.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Use the new header for xstrdup()
Signed-off-by: Alejandro Colomar <alx@kernel.org>
On 2/19/23 18:09, David Mudrich wrote:
> I am working on a RAM based Linux OS from source, and try to use
> latest versions of all software. I found shadow needs libbsd's
> readpassphrase(3) as superior alternative to getpass(3). While
> considering if I a) include libbsd, or include libbsd's code of
> readpassphrase(3) into shadow, found, that libbsd's readpassphrase(3)
> never returns \n or \r
> <https://cgit.freedesktop.org/libbsd/tree/src/readpassphrase.c>
> line 122, while agetpass() uses a check for \n in agetpass.c line 108.
> I assume it always fails.
Indeed, it always failed. I made a mistake when writing agetpass(),
assuming that readpassphrase(3) would keep newlines.
>
> I propose a check of len == PASS_MAX - 1, with false positive error for
> exactly PASS_MAX - 1 long passwords.
Instead, I added an extra byte to the allocation to allow a maximum
password length of PASS_MAX (which is the maximum for getpass(3), which
we're replacing.
While doing that, I notice that my previous implementation also had
another bug (minor): The maximum password length was PASS_MAX - 1
instead of PASS_MAX. That's also fixed in this commit.
Reported-by: David Mudrich <dmudrich@gmx.de>
Fixes: 155c9421b9 ("libmisc: agetpass(), erase_pass(): Add functions for getting passwords safely")
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Some programs don't support `(uint16_t) -1` or `(uint32_t) -1` as user
or group IDs. This is because `-1` is used as an error code or as an
unspecified ID, e.g. in `chown(2)` parameters, and in the past, `gid_t`
and `uid_t` have changed width. For legacy reasons, those values have
been kept reserved in programs today (for example systemd does this; see
the documentation in the link below).
This should not be confused with catching overflow in the ID values,
since that is already caught by our ERANGE checks. This is about not
using reserved values that have been reserved for legacy reasons.
Link: <https://systemd.io/UIDS-GIDS/>
Reviewed-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These comments should indicate which functions they really wrap.
An alternative would be to remove the line completely to avoid
future copy&paste mistakes.
Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
strncat(3), strlcpy(3), and many other functions are often misused for
catenating strings, when they should never be used for that. strlcat(3)
is good. However, there's no equivalent to strlcat(3) similar to
snprintf(3). Let's add stpecpy(), which is similar to strlcat(3), but
it is also the only function compatible with stpeprintf(), which makes
it more useful than strlcat(3).
Signed-off-by: Alejandro Colomar <alx@kernel.org>
This function allows reducing error checking (since errors are
propagated across chained calls), and also simplifies the calculation of
the start and end of the buffer where the string should be written.
Moreover, the new code is more optimized, since many calls to strlen(3)
have been removed.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
[v]stpeprintf() are similar to [v]snprintf(3), but they allow chaining.
[v]snprintf(3) are very dangerous for catenating strings, since the
obvious ways to do it invoke Undefined Behavior, and the ways that avoid
UB are very error-prone.
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
The intention of the code is just to not report an error message when
'typefile' doesn't exist. If we call access(2) and then fopen(2),
there's a race. It's not a huge problem, and the worst thing that can
happen is reporting an error when the file has been removed after
access(2). It's not a problem, but we can fix the race and at the same
time clarify the intention of not warning about ENOENT and also remove
one syscall. Seems like a win-win.
Suggested-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
- Every non-const pointer converts automatically to void *.
- Every pointer converts automatically to void *.
- void * converts to any other pointer.
- const void * converts to any other const pointer.
- Integer variables convert to each other.
I changed the declaration of a few variables in order to allow removing
a cast.
However, I didn't attempt to edit casts inside comparisons, since they
are very delicate. I also kept casts in variadic functions, since they
are necessary, and in allocation functions, because I have other plans
for them.
I also changed a few casts to int that are better as ptrdiff_t.
This change has triggered some warnings about const correctness issues,
which have also been fixed in this patch (see for example src/login.c).
Signed-off-by: Alejandro Colomar <alx@kernel.org>
In variadic functions we still do the cast. In POSIX, it's not
necessary, since NULL is required to be of type 'void *', and 'void *'
is guaranteed to have the same alignment and representation as 'char *'.
However, since ISO C still doesn't mandate that, and moreover they're
doing dubious stuff by adding nullptr, let's be on the cautious side.
Also, C++ requires that NULL is _not_ 'void *', but either plain 0 or
some magic stuff.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
If the file referenced by ENV_TZ has a zero length string, then an out
of boundary write occurs. Also the result can be wrong because it is
assumed that the file will always end with a newline.
Only override a newline character with '\0' to avoid these cases.
This cannot be considered to be security relevant because login.defs
and its contained references to system files should be trusted to begin
with.
Proof of Concept:
1. Compile shadow's su with address sanitizer and --without-libpam
2. Setup your /etc/login.defs to contain ENV_TZ=/etc/tzname
3. Prepare /etc/tzname to contain a '\0' byte at the beginning
`python -c "print('\x00')" > /etc/tzname`
4. Use su
`su -l`
You can see the following output:
`tz.c:45:8: runtime error: index 18446744073709551615 out of bounds for type 'char [8192]'`
Signed-off-by: Samanta Navarro <ferivoz@riseup.net>
__int128, which is needed for optimizing that part of the range, is not
always available. We need the unoptimized version for portability
reasons.
Closes: <https://github.com/shadow-maint/shadow/issues/634>
Fixes: 1a0e13f94e ("Optimize csrand_uniform()")
Reported-by: Adam Sampson <ats@offog.org>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
We do need the unoptimized version of csrand_uniform() for high values
of `n`, since the optimized version depends on having __int128, and it's
not available on several platforms, including ARMv7, IA32, and MK68k.
This reverts commit 848f53c1d3c1362c86d3baab6906e1e4419d2634; however,
I applied some tweaks to the reverted commit.
Reported-by: Adam Sampson <ats@offog.org>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Now that we optimized csrand_uniform(), we don't need these functions.
This reverts commit 7c8fe291b1260e127c10562bfd7616961013730f.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Use a different algorithm to minimize rejection. This is essentially
the same algorithm implemented in the Linux kernel for
__get_random_u32_below(), but written in a more readable way, and
avoiding microopimizations that make it less readable.
Which (the Linux kernel implementation) is itself based on Daniel
Lemire's algorithm from "Fast Random Integer Generation in an Interval",
linked below. However, I couldn't really understand that paper very
much, so I had to reconstruct the proofs from scratch, just from what I
could understand from the Linux kernel implementation source code.
I constructed some graphical explanation of how it works, and why it
is optimal, because I needed to visualize it to understand it. It is
published in the GitHub pull request linked below.
Here goes a wordy explanation of why this algorithm based on
multiplication is better optimized than my original implementation based
on masking.
masking:
It discards the extra bits of entropy that are not necessary for
this operation. This works as if dividing the entire space of
possible csrand() values into smaller spaces of a size that is
a smaller power of 2. Each of those smaller spaces has a
rejection band, so we get as many rejection bands as spaces
there are. For smaller values of 'n', the size of each
rejection band is smaller, but having more rejection bands
compensates for this, and results in the same inefficiency as
for large values of 'n'.
multiplication:
It divides the entire space of possible random numbers in
chunks of size exactly 'n', so that there is only one rejection
band that is the remainder of `2^64 % n`. The worst case is
still similar to the masking algorithm, a rejection band that is
almost half the entire space (n = 2^63 + 1), but for lower
values of 'n', by only having one small rejection band, it is
much faster than the masking algorithm.
This algorithm, however, has one caveat: the implementation
is harder to read, since it relies on several bitwise tricky
operations to perform operations like `2^64 % n`, `mult % 2^64`,
and `mult / 2^64`. And those operations are different depending
on the number of bits of the maximum possible random number
generated by the function. This means that while this algorithm
could also be applied to get uniform random numbers in the range
[0, n-1] quickly from a function like rand(3), which only
produces 31 bits of (non-CS) random numbers, it would need to be
implemented differently. However, that's not a concern for us,
it's just a note so that nobody picks this code and expects it
to just work with rand(3) (which BTW I tried for testing it, and
got a bit confused until I realized this).
Finally, here's some light testing of this implementation, just to know
that I didn't goof it. I pasted this function into a standalone
program, and run it many times to find if it has any bias (I tested also
to see how many iterations it performs, and it's also almost always 1,
but that test is big enough to not paste it here).
int main(int argc, char *argv[])
{
printf("%lu\n", csrand_uniform(atoi(argv[1])));
}
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 1 | wc -l
341
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 1 | wc -l
339
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 1 | wc -l
338
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 2 | wc -l
336
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 2 | wc -l
328
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 2 | wc -l
335
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 0 | wc -l
332
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 0 | wc -l
331
$ seq 1 1000 | while read _; do ./a.out 3; done | grep 0 | wc -l
327
This isn't a complete test for a cryptographically-secure random number
generator, of course, but I leave that for interested parties.
Link: <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9a688bcb19348862afe30d7c85bc37c4c293471>
Link: <https://github.com/shadow-maint/shadow/pull/624#discussion_r1059574358>
Link: <https://arxiv.org/abs/1805.10941>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Björn Esser <besser82@fedoraproject.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Sam James <sam@gentoo.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
[Daniel Lemire: Added link to research paper in source code]
Cc: Daniel Lemire <daniel@lemire.me>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
The old code didn't produce very good random numbers. It had a bias.
And that was from performing some unnecessary floating-point
calculations that overcomplicate the problem.
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Björn Esser <besser82@fedoraproject.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Sam James <sam@gentoo.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
This API is similar to arc4random_uniform(3). However, for an input of
0, this function is equivalent to csrand(), while arc4random_uniform(0)
returns 0.
This function will be used to reimplement csrand_interval() as a wrapper
around this one.
The current implementation of csrand_interval() doesn't produce very good
random numbers. It has a bias. And that comes from performing some
unnecessary floating-point calculations that overcomplicate the problem.
Looping until the random number hits within bounds is unbiased, and
truncating unwanted bits makes the overhead of the loop very small.
We could reduce loop overhead even more, by keeping unused bits of the
random number, if the width of the mask is not greater than
ULONG_WIDTH/2, however, that complicates the code considerably, and I
prefer to be a bit slower but have simple code.
BTW, Björn really deserves the copyright for csrand() (previously known
as read_random_bytes()), since he rewrote it almost from scratch last
year, and I kept most of its contents. Since he didn't put himself in
the copyright back then, and BSD-3-Clause doesn't allow me to attribute
derived works, I won't add his name, but if he asks, he should be put in
the copyright too.
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Björn Esser <besser82@fedoraproject.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Sam James <sam@gentoo.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
These functions implement bit manipulation APIs, which will be added to
C23, so that in the far future, we will be able to replace our functions
by the standard ones, just by adding the stdc_ prefix, and including
<stdbit.h>.
However, we need to avoid UB for an input of 0, so slightly deviate from
C23, and use a different name (with _wrap) for distunguishing our API
from the standard one.
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
A set of APIs similar to arc4random(3) is complex enough to deserve its
own file.
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Björn Esser <besser82@fedoraproject.org>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: Joseph Myers <joseph@codesourcery.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
arc4random(3) returns a number.
arc4random_buf(3) fills a buffer.
arc4random_uniform(3) returns a number less than a bound.
and I'd add a hypothetical one which we use:
*_interval() should return a number within the interval [min, max].
In reality, the function being called csrand() in this patch is not
really cryptographically secure, since it had a bias, but a subsequent
patch will fix that.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
We were always casting the result to u_long. Better just use that type
in the function. Since we're returning u_long, it makes sense to also
specify the input as u_long. In fact, that'll help for doing bitwise
operations inside this function.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
I have plans to split this function in smaller functions that implement
bits of this functionallity, to simplify the implementation. So, let's
use names that distinguish them.
This one produces a number within an interval, so make that clear. Also
make clear that the function produces cryptographically-secure numbers.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Comparisons if different signedness can result in unexpected results.
Add casts to ensure operants are of the same type.
gettime.c: In function 'gettime':
gettime.c:58:26: warning: comparison of integer expressions of different signedness: 'long long unsigned int' and 'time_t' {aka 'long int'} [-Wsign-compare]
58 | } else if (epoch > fallback) {
| ^
Cast to time_t, since epoch is less than ULONG_MAX at this point.
idmapping.c: In function 'write_mapping':
idmapping.c:202:48: warning: comparison of integer expressions of different signedness: 'int' and 'long unsigned int' [-Wsign-compare]
202 | if ((written <= 0) || (written >= (bufsize - (pos - buf)))) {
| ^~
newgidmap.c: In function ‘main’:
newgidmap.c:178:40: warning: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Wsign-compare]
178 | if ((written <= 0) || (written >= sizeof(proc_dir_name))) {
| ^~
newuidmap.c: In function ‘main’:
newuidmap.c:107:40: warning: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Wsign-compare]
107 | if ((written <= 0) || (written >= sizeof(proc_dir_name))) {
| ^~