Skip to content

STL Hardening

Stephan T. Lavavej edited this page Jan 24, 2025 · 51 revisions

Terminology

  • CDL is _CONTAINER_DEBUG_LEVEL.
    • This was an earlier attempt at something resembling hardening which was never formally designed, and which was never adopted in practice. It is not enabled by default.
  • IDL is _ITERATOR_DEBUG_LEVEL.
    • Release mode defaults to IDL=0 (no checking).
    • Debug mode defaults to IDL=2 (exhaustive correctness checking, extremely expensive).
    • Attempting to enable IDL=2 in release mode is flatly forbidden with a hard #error.
    • Release mode IDL=1 (formerly known as _SECURE_SCL in VS 2005/2008) was the first attempt at something resembling a security-hardened mode. This is an ABI-breaking option, changing the representations of containers and iterators, and introduces dynamic memory allocations. It is quite expensive (2x perf penalty in the worst case), which users rejected. IDL=1 remains technically supported, but it is never enabled by default, virtually no users are aware of it, and we strongly discourage its use. It will be eliminated when we can break ABI in vNext.

Plan

  • Review:
  • Choose a consistent policy for what should be hardened
  • Design how this will be controlled, and its defaults
  • Design a termination mechanism
  • Bonus: Design destructor pointer tombstones
  • Make the code changes (should be the easy part as 90% of the work is already present)
    • We should not use [[likely]], and should remove our existing usage
    • Gaby suggested adding an #error if anyone attempts to define CDL after it has been removed
  • Optimizations to mitigate perf impact, once we have examples of where the perf impact will be
    • Loop in the compiler back-end
    • Additional library changes may also be necessary

Optimization issues

  • VSO-1556181 gsl::span CQ deficiency: predicate inference weakness #1
  • VSO-1556194 gsl::span CQ deficiency: useless multibyte copy
  • VSO-1556195 gsl::span CQ deficiency: predicate inference weakness #2
  • Stephan pinged Troy Johnson about the above issues on 2025-01-21.

Reference

  • #5090 "Implement a hardened mode"

P3471R1 Standard Library Hardening

Stephan's highlights:

Hardening needs to be lightweight enough to be used in production environments. In our experience, a debug-only approach is not sufficient to really move the needle security-wise. Thus, the goal is to identify the minimal set of checks that would deliver the most value to users while requiring minimal overhead, with a particular focus on memory safety as a major source of security vulnerabilities. Checking all preconditions in the library is an explicit non-goal, and would be impossible for many semantic requirements anyway.

To reiterate the last point, an important design principle is that hardening needs to be lightweight enough for production use by a wide variety of real-world programs. In our experience in libc++, a small set of checks that is widely used delivers far more value than a more extensive set of checks that is only enabled by select few users. Thankfully, many of the most valuable checks, such as checking for out-of-bounds access in standard containers, also happen to be relatively cheap.

In the initial proposal, we decide to focus on hardened preconditions that prevent out-of-bounds memory access, i.e., compromise the memory safety of the program. These are some of the most valuable for the user since they help prevent potential security vulnerabilities; many of them are also relatively cheap to implement. More hardened preconditions can potentially be added in the future, but the intent is for their number to be limited to keep hardening viable for production use. Specifically, the proposal is to add hardened preconditions to:

  • Accessors of sequence containers, std::span, std::mdspan, std::string, std::string_view and other similar classes that might attempt to access non-existent elements (e.g. back() on an empty container or operator[] with an invalid index).
  • Modifiers of sequence containers and string classes that assume the container to be non-empty (e.g. pop_back()).
  • Accessors of optional and expected that expect the object to be non-empty.

In our experience, hardening all of these operations is trivial to implement and provides significant security value.

This provides a consistent policy for what should be hardened.

Control Macros

Coarse-Grained

Setting the coarse-grained macro to 1 enables hardening; 0 disables hardening.

We plan to initially release this as disabled-by-default, as optimizer work will be ongoing. We intend to change it to enabled-by-default in a future release. (That way, 17.x will not have it enabled by default.)

  • _MSVC_STL_HARDENING

Fine-Grained

The fine-grained macros provide per-class control, and default to the value of the coarse-grained macro. Coarse-grained 1 + fine-grained 0 allows targeted opt-out for specific classes causing performance issues. Coarse-grained 0 + fine-grained 1 allows targeted opt-in for gradual adoption.

  • _MSVC_STL_HARDENING_VECTOR
  • _MSVC_STL_HARDENING_DEQUE
  • _MSVC_STL_HARDENING_LIST
  • _MSVC_STL_HARDENING_FORWARD_LIST
  • _MSVC_STL_HARDENING_ARRAY
  • _MSVC_STL_HARDENING_BASIC_STRING
  • _MSVC_STL_HARDENING_BASIC_STRING_VIEW
  • _MSVC_STL_HARDENING_SPAN
  • _MSVC_STL_HARDENING_MDSPAN
  • _MSVC_STL_HARDENING_OPTIONAL
  • _MSVC_STL_HARDENING_EXPECTED
  • _MSVC_STL_HARDENING_RANGES_VIEW_INTERFACE
  • _MSVC_STL_HARDENING_VALARRAY
  • _MSVC_STL_HARDENING_BITSET

Rationale

We've successfully used coarse-grained + fine-grained macros for deprecation warnings. They're simple to explain to users (only the coarse-grained macro needs to be initially mentioned), provide as much control to users as they desire, and have very little maintenance burden. Importantly, they don't lead to a combinatoric explosion of mode complexity - all hardening checks are independent, so the ability to enable arbitrary subsets of checks doesn't increase testing burden.

There are three possibilities for the default setting:

  • Disabled-by-default (i.e. opt-in) would lead to the least security improvement. Nobody discovers optional modes. This would be adopted internally, but by <1% of external users. It would lead to the least performance complaints but would not reduce the amount of optimizer work needed. It could actually make optimizer issues harder to discover, with less usage.
  • Controlled by some other setting, e.g. implied by /sdl.
  • Enabled-by-default (i.e. opt-out) would lead to the most security improvement. Some users will encounter performance issues, but a quick search will lead them to the opt-out macro, and the optimizer team will receive valuable reports to investigate.

Stephan believes that if we're serious about doing this, if security is the company's top priority, then STL Hardening needs to be enabled by default. There is the risk of a counterproductive backlash, but:

  • We've learned from _SECURE_SCL's mistakes and the perf impact will be far less, even before the optimizer team can address issues.
  • Range-based for-loops (which did not exist in the 2005-2008 timeframe) and range-based algorithms sidestep the cost of operator[] checking.
  • Being ABI-independent means that code can easily opt-out.

Termination Mechanism

_STL_VERIFY, currently used by CDL and IDL, is reasonable.

  • Defining _STL_CRT_SECURE_INVALID_PARAMETER to call a termination function will entirely replace the default behavior. Otherwise,
    • The termination function must not throw, and must not return. (Attempting to throw would often slam into noexcept.) It should be marked [[noreturn]]. Beyond that, it can do whatever the user wants in order to terminate.
  • Defining _STL_CALL_ABORT_INSTEAD_OF_INVALID_PARAMETER will call ::abort(). Otherwise,
  • This calls the CRT's ::_invalid_parameter_noinfo_noreturn(), which eventually calls __fastfail(FAST_FAIL_INVALID_ARG).

Audit of CDL checks

As of 2024-12-09. We haven't added new CDL checks since then.

Sequence containers

Class Member Notes
vector operator[] ✅ P3471R1's proposed wording
vector front ✅ P3471R1's design
vector back ✅ P3471R1's design
vector pop_back ✅ P3471R1's design
vector<bool> operator[] ✅ P3471R1's proposed wording
vector<bool> front ✅ P3471R1's design
vector<bool> back ✅ P3471R1's design
vector<bool> pop_back ✅ P3471R1's design
⚠️ Missing from CDL!
deque operator[] ✅ P3471R1's proposed wording
deque front ✅ P3471R1's design
deque back ✅ P3471R1's design
deque pop_front ✅ P3471R1's design
⚠️ Missing from CDL!
deque pop_back ✅ P3471R1's design
⚠️ Missing from CDL!
list front ✅ P3471R1's design
list back ✅ P3471R1's design
list pop_front ✅ P3471R1's design
list pop_back ✅ P3471R1's design
forward_list front ✅ P3471R1's design
forward_list pop_front ✅ P3471R1's design
⚠️ Missing from CDL!
array operator[] ✅ P3471R1's proposed wording
array<T, 0> operator[] ✅ P3471R1's proposed wording
array<T, 0> front ✅ P3471R1's design
array<T, 0> back ✅ P3471R1's design

Strings

Class Member Notes
basic_string operator[] ✅ P3471R1's proposed wording
basic_string front ✅ P3471R1's design
basic_string back ✅ P3471R1's design
basic_string resize_and_overwrite ❌ Should be excluded - not hardened by libc++, difficult to mess up the operation's returned size
basic_string pop_back ✅ P3471R1's design
⚠️ Missing from CDL!
basic_string_view basic_string_view(P, N) ❌ Should be excluded - checking for (null, non-zero) is not critical
basic_string_view operator[] ✅ P3471R1's proposed wording
basic_string_view front ✅ P3471R1's design
basic_string_view back ✅ P3471R1's design
basic_string_view remove_prefix ✅ Morally equivalent to pop_front
basic_string_view remove_suffix ✅ Morally equivalent to pop_back

Spans

Class Member Notes
span span(ARGS) ✅ Limited to spans with static extents, verifying valid construction is important to verify that elements actually exist
span first ✅ Similar to pop_back
span last ✅ Similar to pop_front
span subspan ✅ Similar to pop_front and pop_back
span size_bytes ❌ Should be excluded - size would have to be extremely bogus for size_bytes to overflow, i.e. the span is already doomed
span operator[] ✅ P3471R1's proposed wording
span front ✅ P3471R1's design
span back ✅ P3471R1's design
mdspan static_extent ❌ Should be excluded - not checking all library preconditions
mdspan mdspan(OTHER) ❌ Should be excluded - not checking all library preconditions
mdspan operator[] ✅ P3471R1's design
mdspan size ❌ Should be excluded - checking for a pathological overflow case

optional and expected

Class Member Notes
optional operator* ✅ P3471R1's proposed wording
optional operator-> ✅ P3471R1's proposed wording
expected<T, E> operator-> ✅ P3471R1's proposed wording
expected<T, E> operator* ✅ P3471R1's proposed wording
expected<T, E> error ✅ Logically follows from P3471R1's design
expected<void, E> operator* ✅ Precondition with no behavior; when has_value() is false, then an operation has failed
expected<void, E> error ✅ Logically follows from P3471R1's design

Ranges

Class Member Notes
ranges::view_interface operator[] ✅ P3471R1's design
ranges::view_interface front ✅ P3471R1's design
ranges::view_interface back ✅ P3471R1's design
iota_view iota_view(ARGS) ❌ Should be excluded - not checking all library preconditions
repeat_view::iterator operator++ ❌ Should be excluded - not checking all library preconditions
repeat_view::iterator operator-- ❌ Should be excluded - not checking all library preconditions
repeat_view::iterator operator+= ❌ Should be excluded - not checking all library preconditions
repeat_view::iterator operator-= ❌ Should be excluded - not checking all library preconditions
repeat_view repeat_view(ARGS) ❌ Should be excluded - not checking all library preconditions
filter_view pred ❌ Should be excluded - not checking all library preconditions
filter_view begin ❌ Should be excluded - not checking all library preconditions
take_view take_view(ARGS) ❌ Should be excluded - not checking all library preconditions
take_while_view pred ❌ Should be excluded - not checking all library preconditions
take_while_view end ❌ Should be excluded - not checking all library preconditions
drop_view drop_view(ARGS) ❌ Should be excluded - not checking all library preconditions
drop_while_view pred ❌ Should be excluded - not checking all library preconditions
drop_while_view begin ❌ Should be excluded - not checking all library preconditions
views::counted operator() ❌ Should be excluded - not checking all library preconditions
chunk_view chunk_view(ARGS) ❌ Should be excluded - not checking all library preconditions
slide_view slide_view(ARGS) ❌ Should be excluded - not checking all library preconditions
chunk_by_view pred ❌ Should be excluded - not checking all library preconditions
chunk_by_view begin ❌ Should be excluded - not checking all library preconditions
stride_view stride_view(ARGS) ❌ Should be excluded - not checking all library preconditions
cartesian_product_view size ❌ Should be excluded - not checking all library preconditions

generator

Class Member Notes
generator::iterator operator* ❌ Should be excluded - not checking all library preconditions
generator::iterator operator++ ❌ Should be excluded - not checking all library preconditions
generator begin ❌ Should be excluded - not checking all library preconditions

valarray and bitset

Class Member Notes
valarray operator[] ✅ P3471R1's design
valarray 28 binary ops ❌ Should be excluded - checking that the valarrays are the same size is not relevant for memory safety, as long as operator[] is hardened
bitset operator[] ✅ P3471R1's design
⚠️ Missing from CDL! (Thanks, libc++!)

mdspan support types

Class Member Notes
extents extents(ARGS) ❌ Should be excluded - not checking all library preconditions
extents static_extent ❌ Should be excluded - not checking all library preconditions
extents extent ❌ Should be excluded - not checking all library preconditions
layout_left::mapping mapping(ARGS) ❌ Should be excluded - not checking all library preconditions
layout_left::mapping stride ❌ Should be excluded - not checking all library preconditions
layout_left::mapping operator() ❌ Should be excluded - not checking all library preconditions
layout_right::mapping mapping(ARGS) ❌ Should be excluded - not checking all library preconditions
layout_right::mapping stride ❌ Should be excluded - not checking all library preconditions
layout_right::mapping operator() ❌ Should be excluded - not checking all library preconditions
layout_stride::mapping mapping(ARGS) ❌ Should be excluded - not checking all library preconditions
layout_stride::mapping stride ❌ Should be excluded - not checking all library preconditions
layout_stride::mapping operator() ❌ Should be excluded - not checking all library preconditions

Other types

Class Member Notes
condition_variable wait_until(ARGS) ❌ Should be excluded - not relevant for memory safety

Stephan's notes:

Hardening modes

  • They default to none, resulting in virtually no security improvement. Nobody discovers optional modes.
    • Vendors can set different defaults, though.
  • Their fast mode is similar to what we're doing: "a set of security-critical checks that can be done with relatively little overhead in constant time and are intended to be used in production".
  • Their extensive mode is vaguely similar to what CDL was doing. Few users would be motivated to seek out additional, non-security-critical, more-performance-impacting checks for production.
  • Their debug mode is similar to IDL=2, which we enable by default in debug mode.

Assertion categories for fast mode

  • valid-element-access is similar to P3471R1's design. (Excluding their discussion of ABI-breaking iterator checks.)
  • valid-input-range is not part of P3471R1's design. These checks don't appear to be especially valuable:
    • Reversed iterator pairs can be detected when they're random-access (we do this in IDL=2), but this is pretty hard to mess up. Most algorithms are called on entire containers, with v.begin(), v.end() directly in the source code. (Unlike vec[idx] where idx is some runtime value.) C++20 range-based algorithms make this even more difficult to mess up.
    • Detecting different-parent iterator pairs requires breaking ABI. (We do this in IDL=2.) It's also pretty hard to mess up.

Termination mechanism

They use LLVM's __builtin_trap() which emits a single instruction, or abort() if unavailable on the target. They say this minimizes the code size penalty.

For us, that would be analogous to directly using __fastfail. Calling ::_invalid_parameter_noinfo_noreturn(), which is separately compiled in the CRT, should have a reasonably small code size penalty (one function call, no arguments).

Vendors, but not users, can override their termination mechanism.

ABI

Their document discusses ABI-breaking bounded iterator modes. This is out of scope for us (ABI-breaking modes are effectively impossible to adopt).

Types

We should harden bitset, not currently covered by CDL.

Bonus: Destructor pointer tombstones

This is complementary to precondition hardening, which prevents access to non-existent elements. When containers and smart pointers are destroyed, their owned elements/objects no longer exist. To resist use-after-free mistakes, the destructors of containers and smart pointers should set their pointers to some invalid value, instead of continuing to point to deallocated memory.

Choosing a tombstone value

Null is an obvious possibility, and it's certainly safer than nothing, but it could tempt users into making non-Standard assumptions. (We previously encountered cases where users assumed that basic_string could be repeatedly destroyed without harm.)

Windows guarantees that the first 64K of address space is permanently bogus. Stephan believes that decimal 19937 is a good sentinel value, for the following reasons:

  • It's associated with the STL (mt19937), potentially making it easier to figure out "where did this value come from?" during debugging
  • It's in the middle-ish of 64K, ensuring that moderate-size additions and subtractions remain bogus
  • It's somewhat less than the 32K midpoint, leaving more room for addition offsets (which are much more likely when indexing)
  • It's odd (prime!), ensuring that it isn't usefully aligned for anything

Users may want to customize this, potentially to a value that's chosen at runtime.

Classes

  • The most important containers:
    • vector
    • basic_string
  • The smart pointers:
    • unique_ptr
    • shared_ptr
  • Vaguely like a smart pointer (can own a dynamically allocated function object):
    • function
  • Vaguely like a zero-or-one container:
    • optional
      • Setting it to empty will work with precondition hardening to prevent access to the object.
      • Even though a T& or T* could have been previously obtained, we should not attempt to scribble over T's bytes. That would be much more expensive, we don't know what bytes are valid for T (indeed, they may all be valid), and we should focus on cheap and significant interventions.

Control Macro

_MSVC_STL_DESTRUCTOR_TOMBSTONES

Defining this to 1 enables the feature; defining it to 0 disables the feature.

We plan to initially ship this as disabled-by-default, but plan to change it to enabled-by-default in a future release.

Value Macro

_MSVC_STL_UINTPTR_TOMBSTONE_VALUE

This defaults to uintptr_t{19937}, but can be entirely replaced. If replaced with a function call, it should be ::fully::qualified().

If users replace this with uintptr_t{0}, they should get destructor idempotency for vector and basic_string (which we don't recommend, but previously had to support).

Special considerations

This is applicable to raw pointers only. We'll use if constexpr to avoid messing with (extremely unusual) fancy pointers.

This is applicable to runtime only. We'll use _Is_constant_evaluated() to avoid interfering with constexpr evaluation (where we can't use reinterpret_cast).

We'll need to overhaul the VSO_0000000_wcfb01_idempotent_container_destructors test.

We should check whether the optimizer skips such assignments.

For basic_string and function, we should think about whether the class should be set to "small mode", or "large mode" with tombstone pointer.