===============
ShadowCallStack
===============

.. contents::
   :local:

Introduction
============

ShadowCallStack is an **experimental** instrumentation pass, currently only
implemented for x86_64 and aarch64, that protects programs against return
address overwrites (e.g. stack buffer overflows). It works by saving a
function's return address to a separately allocated 'shadow call stack'
in the function prolog and checking the return address on the stack against
the shadow call stack in the function epilog.

Comparison
----------

To optimize for memory consumption and cache locality, the shadow call stack
stores an index followed by an array of return addresses. This is in contrast
to other schemes, like :doc:`SafeStack`, that mirror the entire stack and
trade off higher memory consumption for shorter function prologs and epilogs
with fewer memory accesses. Similarly, `Return Flow Guard`_ consumes more memory
with shorter function prologs and epilogs than ShadowCallStack but suffers from
the same race conditions (see `Security`_). Intel `Control-flow Enforcement Technology`_
(CET) is a proposed hardware extension that would add native support for
using a shadow stack to store and check return addresses at call and return
time. It would not suffer from race conditions at calls and returns and would
not incur the overhead of function instrumentation, but it does require
operating system support.

.. _`Return Flow Guard`: https://xlab.tencent.com/en/2016/11/02/return-flow-guard/
.. _`Control-flow Enforcement Technology`: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

Compatibility
-------------

ShadowCallStack currently only supports x86_64 and aarch64. A runtime is not
currently provided in compiler-rt, so one must be provided by the compiled
application.

On aarch64, the instrumentation makes use of the platform register ``x18``.
On some platforms, ``x18`` is reserved, and on others, it is designated as
a scratch register. This generally means that any code that may run on the
same thread as code compiled with ShadowCallStack must either target one
of the platforms whose ABI reserves ``x18`` (currently Darwin, Fuchsia and
Windows) or be compiled with the flag ``-ffixed-x18``.

Security
========

ShadowCallStack is intended to be a stronger alternative to
``-fstack-protector``. It protects from non-linear overflows and arbitrary
memory writes to the return address slot; however, similarly to
``-fstack-protector``, this protection suffers from race conditions because of
the call-return semantics on x86_64. There is a short race between the call
instruction and the first instruction in the function that reads the return
address where an attacker could overwrite the return address and bypass
ShadowCallStack. Similarly, there is a time-of-check-to-time-of-use race in the
function epilog where an attacker could overwrite the return address after it
has been checked and before it has been returned to. Modifying the call-return
semantics to fix this on x86_64 would incur an unacceptable performance overhead
due to return branch prediction.

The instrumentation makes use of the ``gs`` segment register on x86_64,
or the ``x18`` register on aarch64, to reference the shadow call stack,
meaning that references to the shadow call stack do not have to be stored in
memory. This makes it possible to implement a runtime that avoids exposing
the address of the shadow call stack to attackers that can read arbitrary
memory. However, attackers could still try to exploit side channels exposed
by the operating system `[1]`_ `[2]`_ or processor `[3]`_ to discover the
address of the shadow call stack.

.. _`[1]`: https://eyalitkin.wordpress.com/2017/09/01/cartography-lighting-up-the-shadows/
.. _`[2]`: https://www.blackhat.com/docs/eu-16/materials/eu-16-Goktas-Bypassing-Clangs-SafeStack.pdf
.. _`[3]`: https://www.vusec.net/projects/anc/

On x86_64, leaf functions are optimized to store the return address in a
free register and avoid writing to the shadow call stack if a register is
available. Very short leaf functions are uninstrumented if their execution
is judged to be shorter than the race condition window intrinsic to the
instrumentation.

On aarch64, the architecture's call and return instructions (``bl`` and
``ret``) operate on a register rather than the stack, which means that
leaf functions are generally protected from return address overwrites even
without ShadowCallStack. It also means that ShadowCallStack on aarch64 is not
vulnerable to the same types of time-of-check-to-time-of-use races as x86_64.

Usage
=====

To enable ShadowCallStack, just pass the ``-fsanitize=shadow-call-stack``
flag to both compile and link command lines. On aarch64, you also need to pass
``-ffixed-x18`` unless your target already reserves ``x18``.

Low-level API
-------------

``__has_feature(shadow_call_stack)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In some cases one may need to execute different code depending on whether
ShadowCallStack is enabled. The macro ``__has_feature(shadow_call_stack)`` can
be used for this purpose.

.. code-block:: c

    #if defined(__has_feature)
    #  if __has_feature(shadow_call_stack)
    // code that builds only under ShadowCallStack
    #  endif
    #endif

``__attribute__((no_sanitize("shadow-call-stack")))``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use ``__attribute__((no_sanitize("shadow-call-stack")))`` on a function
declaration to specify that the shadow call stack instrumentation should not be
applied to that function, even if enabled globally.

Example
=======

The following example code:

.. code-block:: c++

    // bar() is defined in another translation unit.
    int bar();

    int foo() {
      return bar() + 1;
    }

generates the following x86_64 assembly when compiled with ``-O2``:

.. code-block:: gas

    push   %rax
    callq  bar
    add    $0x1,%eax
    pop    %rcx
    retq

or the following aarch64 assembly:

.. code-block:: none

    stp     x29, x30, [sp, #-16]!
    mov     x29, sp
    bl      bar
    add     w0, w0, #1
    ldp     x29, x30, [sp], #16
    ret

Adding ``-fsanitize=shadow-call-stack`` would output the following x86_64
assembly:

.. code-block:: gas

    mov    (%rsp),%r10
    xor    %r11,%r11
    addq   $0x8,%gs:(%r11)
    mov    %gs:(%r11),%r11
    mov    %r10,%gs:(%r11)
    push   %rax
    callq  bar
    add    $0x1,%eax
    pop    %rcx
    xor    %r11,%r11
    mov    %gs:(%r11),%r10
    mov    %gs:(%r10),%r10
    subq   $0x8,%gs:(%r11)
    cmp    %r10,(%rsp)
    jne    trap
    retq

  trap:
    ud2

or the following aarch64 assembly:

.. code-block:: none

    str     x30, [x18], #8
    stp     x29, x30, [sp, #-16]!
    mov     x29, sp
    bl      bar
    add     w0, w0, #1
    ldp     x29, x30, [sp], #16
    ldr     x30, [x18, #-8]!
    ret
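
The instrumented x86_64 sequence in this section amounts to a push of the
return address in the prolog and a pop-and-compare in the epilog. The C model
below is a minimal sketch of that logic only; it is not the code the compiler
emits and not a runtime. The names ``scs_model``, ``scs_push`` and
``scs_pop_and_check`` and the fixed capacity ``SCS_SLOTS`` are hypothetical,
and where the real instrumentation traps with ``ud2`` on a mismatch, the model
simply returns ``0``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity, chosen only for illustration. */
enum { SCS_SLOTS = 64 };

/* Model of the layout described above: an index followed by an array of
 * return addresses. Here the index is kept as an entry count rather than
 * the byte offset used by the generated assembly. */
struct scs_model {
    size_t count;
    uintptr_t slots[SCS_SLOTS];
};

/* Prolog: record the return address on the shadow call stack. */
static void scs_push(struct scs_model *s, uintptr_t return_address)
{
    assert(s->count < SCS_SLOTS); /* the model has no overflow handling */
    s->slots[s->count++] = return_address;
}

/* Epilog: pop the saved address and compare it with the one found on the
 * regular stack. Returns 1 on a match and 0 on a mismatch, which is the
 * point where the instrumented code would branch to its trap. */
static int scs_pop_and_check(struct scs_model *s, uintptr_t address_on_stack)
{
    assert(s->count > 0);
    uintptr_t saved = s->slots[--s->count];
    return saved == address_on_stack;
}
```

In this model, a call that pushes ``0x2000`` and later finds ``0x2000`` on the
regular stack passes the check, while finding any other value there fails it.
The model does not capture the race windows described under `Security`_,
because it treats the compare and the return as a single step.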