expand documentation in the README

2018-08-30 11:07:20 -04:00
parent 3f80895822
commit dba330fde7
1 changed files with 48 additions and 4 deletions
--- a/README.md
+++ b/README.md
@@ -8,6 +8,10 @@ However, using more recent releases is highly recommended. Older versions of
 the dependencies may be compatible at the moment but are not tested and will
 explicitly not be supported.

+Ports to Android (Bionic libc) and musl libc will be created later. The initial
+target is glibc for ease of development, since it supports replacing the malloc
+implementation at runtime through dynamic linking, including via `LD_PRELOAD`.
+
 # Basic design

 The current design is very simple and will become a bit more sophisticated as
@@ -37,10 +41,16 @@ allocation and then unmapped on free.
 * Fully out-of-line metadata
 * Deterministic detection of any invalid free (unallocated, unaligned, etc.)
 * Isolated memory region for slab allocations
-    * Divided up into isolated inner regions for each size class, with a high
-      entropy random base for each one
-    * No deterministic offsets from one size class to another
+    * Divided up into isolated inner regions for each size class
+        * High entropy random base for each size class region
+        * No deterministic / low entropy offsets between allocations with
+          different size classes
    * Metadata is completely outside the slab allocation region
+        * No references to metadata within the slab allocation region
+        * No deterministic / low entropy offsets to metadata
+    * Entire slab region starts out non-readable and non-writable
+    * Slabs beyond the cache limit are purged and become non-readable and
+      non-writable memory again
 * Fine-grained randomization within memory regions
    * Randomly sized guard regions for large allocations
    * Random slot selection within slabs
@@ -63,11 +73,45 @@ allocation and then unmapped on free.
  leaving slab size class regions interspersed with guard pages
 * Zero size allocations are memory protected
 * [mostly in-progress] Protected allocator metadata
-* [in-progress] Extension for retrieving the size of allocations with fallback
+* Extension for retrieving the size of allocations with fallback
  to a sentinel for pointers not managed by the allocator
    * Can also return accurate values for pointers *within* small allocations
    * The same applies to pointers within the first page of large allocations,
      otherwise it currently has to return a sentinel
+* No alignment tricks interfering with ASLR, unlike jemalloc, PartitionAlloc, etc.
+* No usage of the legacy brk heap
+* Aggressive sanity checks
+    * Errors other than ENOMEM from mmap, munmap, mprotect and mremap treated
+      as fatal, which can help to detect memory management gone wrong elsewhere
+      in the program
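
The error handling policy in the last item could be sketched roughly as follows. This is only an illustration, not the project's actual code: the helper names `fatal_error`, `memory_map`, `memory_protect_rw` and `memory_unmap` are hypothetical, and it assumes a Linux/glibc target where `MAP_ANONYMOUS` is available.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

// Hypothetical helper: report the problem and abort the process.
static void fatal_error(const char *msg) {
    fprintf(stderr, "fatal allocator error: %s\n", msg);
    abort();
}

// Reserve an inaccessible (PROT_NONE) anonymous mapping.
// Returns NULL only for ENOMEM; any other failure is treated as fatal.
static void *memory_map(size_t size) {
    assert(size != 0); // a zero-size request is a caller bug
    void *p = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED) {
        if (errno != ENOMEM) {
            fatal_error("non-ENOMEM mmap failure");
        }
        return NULL;
    }
    return p;
}

// Make a range readable and writable. Returns 0 on success, 1 on ENOMEM.
static int memory_protect_rw(void *ptr, size_t size) {
    if (mprotect(ptr, size, PROT_READ | PROT_WRITE) != 0) {
        if (errno != ENOMEM) {
            fatal_error("non-ENOMEM mprotect failure");
        }
        return 1;
    }
    return 0;
}

// Unmap a range. Returns 0 on success, 1 on ENOMEM.
static int memory_unmap(void *ptr, size_t size) {
    if (munmap(ptr, size) != 0) {
        if (errno != ENOMEM) {
            fatal_error("non-ENOMEM munmap failure");
        }
        return 1;
    }
    return 0;
}
```

With wrappers like these, an `EINVAL` from `munmap` (for example, caused by an unaligned address) aborts the process immediately instead of being silently ignored, while `ENOMEM` is still reported to the caller as an ordinary allocation failure.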
+
+# Randomness
+
+The current implementation of random number generation for randomization-based
+mitigations is based on generating a keystream from a stream cipher (ChaCha8)
+in small chunks. A separate CSPRNG is used for each small size class, large
+allocations, etc. in order to fit into the existing fine-grained locking model
+without needing to waste memory per thread by having the CSPRNG state in Thread
+Local Storage. Similarly, the CSPRNG state is protected via the same approach
+taken for the rest of the metadata. The stream cipher is regularly reseeded from
+the OS to provide backtracking and prediction resistance at negligible cost. The
+reseed interval simply needs to be adjusted to the point that it stops
+registering as having any significant performance impact. The performance
+impact on recent Linux kernels is primarily from the high cost of system calls
+and locking, since the kernel's CSPRNG is quite efficient (ChaCha20), especially
+for just generating the key and nonce for another stream cipher (ChaCha8).
+
+ChaCha8 is a great fit because it's extremely fast across platforms without
+relying on hardware support or complex platform-specific code. The security
+margins of ChaCha20 would be completely overkill for the use case. Using
+ChaCha8 avoids needing to resort to a non-cryptographically secure PRNG or
+something without a lot of scrutiny. The current implementation is simply the
+reference implementation of ChaCha8 converted into a pure keystream by ripping
+out the XOR of the message into the keystream.
+
+The random range generation functions are highly optimized too. Traditional
+uniform random number generation within a range has very high overhead and can
+easily dwarf the cost of an efficient CSPRNG.
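
To illustrate why range generation can be made cheap, here is a minimal sketch of one well-known technique: multiply-and-shift with rejection (often attributed to Lemire), which avoids a division in the common case. The text above does not say which technique the project uses, so treat this as an assumption. The names `rng_state`, `next_u16` and `random_u16_uniform` are hypothetical, and the xorshift generator is only a stand-in for the ChaCha8 keystream; it is not cryptographically secure.

```c
#include <assert.h>
#include <stdint.h>

// Stand-in generator state (xorshift32). NOT a CSPRNG; a real allocator
// would draw these bits from its keystream instead. Seed must be nonzero.
struct rng_state {
    uint32_t s;
};

// Return 16 pseudo-random bits from the stand-in generator.
static uint16_t next_u16(struct rng_state *st) {
    uint32_t x = st->s;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    st->s = x;
    return (uint16_t)(x >> 16);
}

// Return a value uniformly distributed in [0, bound), for bound > 0.
// The high 16 bits of (random * bound) are the candidate result; the low
// 16 bits decide whether the draw must be rejected to avoid bias.
static uint16_t random_u16_uniform(struct rng_state *st, uint16_t bound) {
    assert(bound > 0);
    uint32_t m = (uint32_t)next_u16(st) * bound;
    uint16_t low = (uint16_t)m;
    if (low < bound) {
        // threshold = 2^16 mod bound, computed only on this rare path
        uint16_t threshold = (uint16_t)((65536u - bound) % bound);
        while (low < threshold) {
            m = (uint32_t)next_u16(st) * bound;
            low = (uint16_t)m;
        }
    }
    return (uint16_t)(m >> 16);
}
```

The common path costs one multiplication and one comparison. The modulo is only computed when the low half falls below `bound`, which is rare for small bounds such as the number of slots in a slab.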

 # Size classes