Add explanations of encrypted passwords, and fork vs vfork.
This commit is contained in:
parent
08a1b5095d
commit
b1b3cee831
@ -12,6 +12,11 @@
</ul>
<li><a href="#adding">Adding an applet to busybox</a></li>
<li><a href="#standards">What standards does busybox adhere to?</a></li>
<li><a href="#tips">Tips and tricks.</a></li>
<ul>
<li><a href="#tips_encrypted_passwords">Encrypted Passwords</a></li>
<li><a href="#tips_vfork">Fork and vfork</a></li>
</ul>
</ul>

<h2><b><a name="goals" />What are the goals of busybox?</b></h2>
@ -172,6 +177,116 @@ applet is otherwise finished. When polishing and testing a busybox applet,
we ensure we have at least the option of full standards compliance, or else
document where we (intentionally) fall short.</p>

<h2><a name="tips">Programming tips and tricks.</a></h2>

<p>Various things busybox uses that aren't particularly well documented
elsewhere.</p>

<h2><a name="tips_encrypted_passwords">Encrypted Passwords</a></h2>

<p>Password fields in /etc/passwd and /etc/shadow are in a special format.
If the first character isn't '$', then it's an old DES style password. If
the first character is '$' then the password is actually three fields
separated by '$' characters:</p>
<pre>
<b>$type$salt$encrypted_password</b>
</pre>

<p>The "type" indicates which encryption algorithm to use: 1 for MD5 and 2 for SHA1.</p>

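<p>As a rough illustration of the format described above, the following sketch
splits such a field into its three parts. The helper name
split_password_field() and its buffer handling are made up for this example and
are not part of busybox; the sample field used to exercise it would also be a
made-up value, not a real hash.</p>

```c
#include <string.h>

/* Hypothetical helper, not part of busybox: splits a password field of the
 * form "$type$salt$encrypted_password" into its three parts.
 * Returns 1 on success, 0 if the field is not in '$' format (for example
 * an old DES style password) or a part does not fit.
 * Each output buffer must be at least "size" bytes long. */
int split_password_field(const char *field, char *type, char *salt,
                         char *hash, size_t size)
{
    const char *type_start, *salt_start, *hash_start;
    size_t type_len, salt_len;

    if (field[0] != '$')
        return 0;                       /* old DES style password */

    type_start = field + 1;
    salt_start = strchr(type_start, '$');
    if (!salt_start)
        return 0;                       /* no second '$' */
    type_len = (size_t)(salt_start - type_start);
    salt_start++;

    hash_start = strchr(salt_start, '$');
    if (!hash_start)
        return 0;                       /* no third '$' */
    salt_len = (size_t)(hash_start - salt_start);
    hash_start++;

    if (type_len >= size || salt_len >= size || strlen(hash_start) >= size)
        return 0;                       /* would not fit */

    memcpy(type, type_start, type_len);
    type[type_len] = '\0';
    memcpy(salt, salt_start, salt_len);
    salt[salt_len] = '\0';
    strcpy(hash, hash_start);
    return 1;
}
```

<p>Given a field such as "$1$abcdefgh$0123456789abcdefghijkl", this would
report type "1", salt "abcdefgh", and the remaining 22 characters as the
encrypted password.</p>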
<p>The "salt" is a bunch of random characters (generally 8) the encryption
algorithm uses to perturb the password in a known and reproducible way (such
as by appending the random data to the unencrypted password, or combining
them with exclusive or). Salt is randomly generated when setting a password,
and then the same salt value is re-used when checking the password. (Salt is
thus stored unencrypted.)</p>

<p>The advantage of using salt is that the same cleartext password encrypted
with a different salt value produces a different encrypted value.
If each encrypted password uses a different salt value, an attacker is forced
to do the cryptographic math all over again for each password they want to
check. Without salt, they could simply produce a big dictionary of commonly
used passwords ahead of time, and look up each password in a stolen password
file to see if it's a known value. (Even if there are billions of possible
passwords in the dictionary, checking each one is just a binary search against
a file only a few gigabytes long.) With salt they can't even tell if two
different users share the same password without guessing what that password
is and encrypting the guess with each user's salt. They also can't precompute
the attack dictionary for a specific password until they know what the salt
value is.</p>

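<p>The effect of salt can be sketched with a toy example. The hash below is
plain FNV-1a, chosen only because it is short; it is not cryptographically
secure and is not what busybox uses, and the function names are invented for
this illustration.</p>

```c
#include <stdint.h>

/* Toy stand-in for a real password hash (FNV-1a, 64 bit). It is NOT
 * cryptographically secure and is NOT the algorithm busybox uses; it only
 * illustrates how the salt changes the stored value. The salt is mixed in
 * first, followed by the password. */
uint64_t toy_hash(const char *salt, const char *password)
{
    uint64_t h = 14695981039346656037ULL;    /* FNV-1a offset basis */
    const char *parts[2];
    int i;

    parts[0] = salt;
    parts[1] = password;
    for (i = 0; i < 2; i++) {
        const unsigned char *p = (const unsigned char *)parts[i];
        while (*p) {
            h ^= *p++;
            h *= 1099511628211ULL;           /* FNV-1a prime */
        }
    }
    return h;
}

/* Returns 1 if the two salt/password pairs produce the same stored value,
 * 0 otherwise. */
int same_stored_value(const char *salt_a, const char *pass_a,
                      const char *salt_b, const char *pass_b)
{
    return toy_hash(salt_a, pass_a) == toy_hash(salt_b, pass_b);
}
```

<p>Two users who both chose "secret" end up with different stored values as
soon as their salts differ, so a single precomputed dictionary no longer
matches both entries.</p>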
<p>The third field is the encrypted password (plus the salt). For MD5 this
is 22 bytes.</p>

<p>The busybox function to handle all this is pw_encrypt(clear, salt) in
"libbb/pw_encrypt.c". The first argument is the clear text password to be
encrypted, and the second is a string in "$type$salt$password" format, from
which the "type" and "salt" fields will be extracted to produce an encrypted
value. (Only the first two fields are needed; the third $ is equivalent to
the end of the string.) The return value is an encrypted password in
/etc/passwd format, with all three $ separated fields. It's stored in
a static buffer, 128 bytes long.</p>

<p>So when checking an existing password, if pw_encrypt(text,
old_encrypted_password) returns a string that compares identical to
old_encrypted_password, you've got the right password. When setting a new
password, generate a random 8 character salt string, put it in the right
format with sprintf(buffer, "$%c$%s", type, salt), and feed buffer as the
second argument to pw_encrypt(text,buffer).</p>

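<p>The check and set steps above might look roughly like this. So that the
sketch compiles on its own, it uses a stand-in called toy_pw_encrypt() that
only mimics the pw_encrypt(clear, salt) interface described above; its "hash"
is a toy checksum, not MD5 or SHA1. The names check_password(), set_password()
and make_salt() are likewise assumptions made for this example, and rand() is
used purely for brevity (it is not a good source of salt).</p>

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for busybox's pw_encrypt(clear, salt). It reads "$type$salt" from
 * its second argument and returns "$type$salt$hash" in a static buffer, but
 * the hash is a toy checksum, NOT real MD5 or SHA1. */
static char *toy_pw_encrypt(const char *clear, const char *salt_spec)
{
    static char result[128];
    char prefix[64];
    const char *end, *p;
    size_t len;
    unsigned long h = 5381;

    /* Keep "$type$salt": stop at the third '$' or at the end of the string. */
    end = (salt_spec[0] == '$') ? strchr(salt_spec + 1, '$') : NULL;
    if (end)
        end = strchr(end + 1, '$');
    len = end ? (size_t)(end - salt_spec) : strlen(salt_spec);
    if (len >= sizeof(prefix))
        len = sizeof(prefix) - 1;
    memcpy(prefix, salt_spec, len);
    prefix[len] = '\0';

    for (p = prefix; *p; p++)
        h = h * 33 + (unsigned char)*p;
    for (p = clear; *p; p++)
        h = h * 33 + (unsigned char)*p;

    snprintf(result, sizeof(result), "%s$%08lx", prefix, h & 0xffffffffUL);
    return result;
}

/* Fill salt[] with 8 random characters plus a terminating NUL. */
void make_salt(char salt[9])
{
    static const char alphabet[] =
        "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    int i;

    for (i = 0; i < 8; i++)
        salt[i] = alphabet[rand() % (sizeof(alphabet) - 1)];
    salt[8] = '\0';
}

/* Setting: build "$type$salt", encrypt with it, and copy the result out of
 * the static buffer into the caller's storage. */
void set_password(const char *typed, char type, const char *salt,
                  char *stored, size_t size)
{
    char buffer[64];

    snprintf(buffer, sizeof(buffer), "$%c$%s", type, salt);
    snprintf(stored, size, "%s", toy_pw_encrypt(typed, buffer));
}

/* Checking: re-encrypt using the stored value as the salt argument and
 * compare. Returns 1 on a match, 0 otherwise. */
int check_password(const char *typed, const char *stored)
{
    return strcmp(toy_pw_encrypt(typed, stored), stored) == 0;
}
```

<p>Because the result lives in a static buffer, set_password() copies it out
before anything else can call the encrypt function again.</p>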
<h2><a name="tips_vfork">Fork and vfork</a></h2>

<p>On systems that haven't got a Memory Management Unit, fork() is unreasonably
expensive to implement, so a less capable function called vfork() is used
instead.</p>

<p>The reason vfork() exists is that if you haven't got an MMU then you can't
simply set up a second set of page tables and share the physical memory via
copy-on-write, which is what fork() normally does. This means that actually
forking has to copy all the parent's memory (which could easily be tens of
megabytes). And you have to do this even though that memory gets freed again
as soon as the exec happens, so it's probably all a big waste of time.</p>

<p>This is not only slow and a waste of space, it also causes totally
unnecessary memory usage spikes based on how big the _parent_ process is (not
the child), and these spikes are quite likely to trigger an out of memory
condition on small systems (which is where nommu is common anyway). So
although you _can_ emulate a real fork on a nommu system, you really don't
want to.</p>

<p>In theory, vfork() is just a fork() that writeably shares the heap and stack
rather than copying them (so what one process writes the other one sees). In
practice, vfork() has to suspend the parent process until the child does exec
(or exits), at which point the parent wakes up and resumes by returning from
the call to vfork(). All modern kernel/libc combinations implement vfork() to
put the parent to sleep until the child does its exec. There's just no other
way to make it work: they're sharing the same stack, so if either one returns
from its function it stomps on the callstack so that when the other process
returns, hilarity ensues. In fact without suspending the parent there's no way
to even store separate copies of the return value (the pid) from the vfork()
call itself: both assignments write into the same memory location.</p>

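<p>A minimal sketch of the usual vfork() pattern follows. The helper name
run_command() is made up for this example. The child does nothing except exec,
or _exit() if the exec fails, because until then it is running on the parent's
memory and stack.</p>

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and return its
 * exit status, or -1 if something went wrong. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;                  /* vfork itself failed */

    if (pid == 0) {
        /* Child: we are borrowing the parent's memory and stack, so do
         * nothing here except exec, or _exit() if the exec fails.
         * Never return from this function in the child. */
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: resumes here once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

<p>Note that the child calls _exit() rather than exit() or return: anything
that unwinds the stack or flushes stdio buffers would be doing so on the
parent's behalf.</p>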
<p>One way to understand (and in fact implement) vfork() is this: imagine
the parent does a setjmp and then continues on (pretending to be the child)
until the exec() comes around, then the _exec_ does the actual fork, and the
parent does a longjmp back to the original vfork call and continues on from
there. (It thus becomes obvious why the child can't return, or modify
local variables it doesn't want the parent to see changed when it resumes.)</p>

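<p>The setjmp/longjmp picture can be tried out inside a single process. The
sketch below is only an analogy, not a vfork() implementation: no second
process is ever created, and the names pretend_exec() and demo_shared_locals()
are invented for the example.</p>

```c
#include <setjmp.h>

static jmp_buf vfork_point;     /* where the "parent" will resume */

/* Plays the role of exec(): in the analogy this is where the real fork
 * would happen, after which the parent jumps back to the vfork call. */
static void pretend_exec(void)
{
    longjmp(vfork_point, 1);
}

/* Returns the value of a local variable as the "parent" sees it after the
 * "child" has modified it. */
int demo_shared_locals(void)
{
    /* volatile because it is modified between setjmp and longjmp */
    volatile int local = 1;

    if (setjmp(vfork_point) == 0) {
        /* First return from setjmp: pretend to be the child. */
        local = 42;
        pretend_exec();         /* never returns */
    }

    /* Second return from setjmp: the parent resumes and sees the change
     * the "child" made to the shared stack. */
    return local;
}
```

<p>Here demo_shared_locals() returns 42 rather than 1, which is the same
surprise a vfork() child causes if it modifies the parent's local
variables.</p>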
<p>Note a common mistake: the need for vfork doesn't mean you can't have two
processes running at the same time. It means you can't have two processes
sharing the same memory without stomping all over each other. As soon as
the child calls exec(), the parent resumes.</p>

<p>(Now in theory, a nommu system could just copy the _stack_ when it forks
(which presumably is much shorter than the heap), and leave the heap shared.
In practice, you've just wound up in a multi-threaded situation and you can't
do a malloc() or free() on your heap without affecting the other process's
memory (and, if you don't have the proper locking for being threaded,
corrupting the heap when both of you try to do it at the same time and wind up
stomping on each other while traversing the free memory lists). The thing
about vfork is that it's a big red flag warning "there be dragons here" rather
than something subtle and thus even more dangerous.)</p>

<br>
<br>
<br>