14418919fSjohnjiang..  SPDX-License-Identifier: BSD-3-Clause
24418919fSjohnjiang    Copyright(c) 2019 Intel Corporation.
34418919fSjohnjiang
44418919fSjohnjiangStack Library
54418919fSjohnjiang=============
64418919fSjohnjiang
74418919fSjohnjiangDPDK's stack library provides an API for configuration and use of a bounded
84418919fSjohnjiangstack of pointers.
94418919fSjohnjiang
104418919fSjohnjiangThe stack library provides the following basic operations:
114418919fSjohnjiang
124418919fSjohnjiang*  Create a uniquely named stack of a user-specified size and using a
134418919fSjohnjiang   user-specified socket, with either standard (lock-based) or lock-free
144418919fSjohnjiang   behavior.
154418919fSjohnjiang
164418919fSjohnjiang*  Push and pop a burst of one or more stack objects (pointers). These function
174418919fSjohnjiang   are multi-threading safe.
184418919fSjohnjiang
194418919fSjohnjiang*  Free a previously created stack.
204418919fSjohnjiang
214418919fSjohnjiang*  Lookup a pointer to a stack by its name.
224418919fSjohnjiang
234418919fSjohnjiang*  Query a stack's current depth and number of free entries.
244418919fSjohnjiang
254418919fSjohnjiangImplementation
264418919fSjohnjiang~~~~~~~~~~~~~~
274418919fSjohnjiang
284418919fSjohnjiangThe library supports two types of stacks: standard (lock-based) and lock-free.
294418919fSjohnjiangBoth types use the same set of interfaces, but their implementations differ.
304418919fSjohnjiang
31*2d9fd380Sjfb8856606.. _Stack_Library_Std_Stack:
32*2d9fd380Sjfb8856606
334418919fSjohnjiangLock-based Stack
344418919fSjohnjiang----------------
354418919fSjohnjiang
364418919fSjohnjiangThe lock-based stack consists of a contiguous array of pointers, a current
374418919fSjohnjiangindex, and a spinlock. Accesses to the stack are made multi-thread safe by the
384418919fSjohnjiangspinlock.
394418919fSjohnjiang
40*2d9fd380Sjfb8856606.. _Stack_Library_LF_Stack:
41*2d9fd380Sjfb8856606
424418919fSjohnjiangLock-free Stack
434418919fSjohnjiang------------------
444418919fSjohnjiang
454418919fSjohnjiangThe lock-free stack consists of a linked list of elements, each containing a
464418919fSjohnjiangdata pointer and a next pointer, and an atomic stack depth counter. The
474418919fSjohnjianglock-free property means that multiple threads can push and pop simultaneously,
484418919fSjohnjiangand one thread being preempted/delayed in a push or pop operation will not
494418919fSjohnjiangimpede the forward progress of any other thread.
504418919fSjohnjiang
514418919fSjohnjiangThe lock-free push operation enqueues a linked list of pointers by pointing the
524418919fSjohnjianglist's tail to the current stack head, and using a CAS to swing the stack head
534418919fSjohnjiangpointer to the head of the list. The operation retries if it is unsuccessful
544418919fSjohnjiang(i.e. the list changed between reading the head and modifying it), else it
554418919fSjohnjiangadjusts the stack length and returns.
564418919fSjohnjiang
574418919fSjohnjiangThe lock-free pop operation first reserves one or more list elements by
584418919fSjohnjiangadjusting the stack length, to ensure the dequeue operation will succeed
594418919fSjohnjiangwithout blocking. It then dequeues pointers by walking the list -- starting
604418919fSjohnjiangfrom the head -- then swinging the head pointer (using a CAS as well). While
614418919fSjohnjiangwalking the list, the data pointers are recorded in an object table.
624418919fSjohnjiang
634418919fSjohnjiangThe linked list elements themselves are maintained in a lock-free LIFO, and are
644418919fSjohnjiangallocated before stack pushes and freed after stack pops. Since the stack has a
654418919fSjohnjiangfixed maximum depth, these elements do not need to be dynamically created.
664418919fSjohnjiang
674418919fSjohnjiangThe lock-free behavior is selected by passing the *RTE_STACK_F_LF* flag to
684418919fSjohnjiangrte_stack_create().
694418919fSjohnjiang
704418919fSjohnjiangPreventing the ABA Problem
714418919fSjohnjiang^^^^^^^^^^^^^^^^^^^^^^^^^^
724418919fSjohnjiang
734418919fSjohnjiangTo prevent the ABA problem, this algorithm stack uses a 128-bit
744418919fSjohnjiangcompare-and-swap instruction to atomically update both the stack top pointer
754418919fSjohnjiangand a modification counter. The ABA problem can occur without a modification
764418919fSjohnjiangcounter if, for example:
774418919fSjohnjiang
784418919fSjohnjiang1. Thread A reads head pointer X and stores the pointed-to list element.
794418919fSjohnjiang2. Other threads modify the list such that the head pointer is once again X,
804418919fSjohnjiang   but its pointed-to data is different than what thread A read.
814418919fSjohnjiang3. Thread A changes the head pointer with a compare-and-swap and succeeds.
824418919fSjohnjiang
834418919fSjohnjiangIn this case thread A would not detect that the list had changed, and would
844418919fSjohnjiangboth pop stale data and incorrect change the head pointer. By adding a
854418919fSjohnjiangmodification counter that is updated on every push and pop as part of the
864418919fSjohnjiangcompare-and-swap, the algorithm can detect when the list changes even if the
874418919fSjohnjianghead pointer remains the same.
88