.. _Task_API:

Migrating from low-level task API
=================================

The low-level task API of Intel(R) Threading Building Blocks (TBB) was considered complex and hence
error-prone, which is the primary reason it was removed from oneAPI Threading Building Blocks
(oneTBB). This guide helps with the migration from TBB to oneTBB for the use cases where the
low-level task API is used.

Spawning of individual tasks
----------------------------
For most use cases, the spawning of individual tasks can be replaced with the use of either
``oneapi::tbb::task_group`` or ``oneapi::tbb::parallel_invoke``.

For example, suppose ``RootTask``, ``ChildTask1``, and ``ChildTask2`` are user-defined functors that
inherit from ``tbb::task`` and implement its interface. Spawning the ``ChildTask1`` and
``ChildTask2`` tasks so that they can execute in parallel with each other, and then waiting on the
``RootTask``, is implemented as:

.. code:: cpp

    #include <tbb/task.h>

    int main() {
        // Assuming RootTask, ChildTask1, and ChildTask2 are defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        ChildTask1& child1 = *new(root.allocate_child()) ChildTask1{/*params*/};
        ChildTask2& child2 = *new(root.allocate_child()) ChildTask2{/*params*/};

        // One reference per child plus one for wait_for_all().
        root.set_ref_count(3);

        tbb::task::spawn(child1);
        tbb::task::spawn(child2);

        root.wait_for_all();
    }


Using ``oneapi::tbb::task_group``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The code above can be rewritten using ``oneapi::tbb::task_group``:

.. code:: cpp

    #include <oneapi/tbb/task_group.h>

    int main() {
        // Assuming ChildTask1 and ChildTask2 are defined.
        oneapi::tbb::task_group tg;
        tg.run(ChildTask1{/*params*/});
        tg.run(ChildTask2{/*params*/});
        tg.wait();
    }

The code is now more concise. This approach also lets you use lambda functions and does not require
you to implement the ``tbb::task`` interface by overriding the ``tbb::task* tbb::task::execute()``
virtual method. Instead, you work with functors in the standard C++ way by implementing
``void operator()() const``:

.. code:: cpp

    struct Functor {
        // Member to be called when an object of this type is passed into
        // the oneapi::tbb::task_group::run() method
        void operator()() const {}
    };
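
Because ``oneapi::tbb::task_group::run`` accepts any callable object, the functor can also be a
lambda. The following is a minimal sketch rather than part of the original migration steps. The
helper name ``sum_in_two_tasks`` and the way the input is split into two halves are assumptions made
for illustration only:

.. code:: cpp

    #include <cstddef>
    #include <numeric>
    #include <vector>
    #include <oneapi/tbb/task_group.h>

    // Hypothetical helper: sums a vector using two tasks, one per half.
    long long sum_in_two_tasks(const std::vector<int>& values) {
        const auto middle = values.begin() + static_cast<std::ptrdiff_t>(values.size() / 2);
        long long left = 0;
        long long right = 0;

        oneapi::tbb::task_group tg;
        // Each lambda writes to its own variable, so no synchronization is needed.
        tg.run([&] { left = std::accumulate(values.begin(), middle, 0LL); });
        tg.run([&] { right = std::accumulate(middle, values.end(), 0LL); });
        tg.wait();  // Both partial sums are complete once wait() returns.

        return left + right;
    }

The locals captured by reference stay valid because ``tg.wait()`` is called before the function
returns.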
69
70
71Using ``oneapi::tbb::parallel_invoke``
72^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is also possible to use ``oneapi::tbb::parallel_invoke`` to rewrite the original code and make it
even more concise:

.. code:: cpp

    #include <oneapi/tbb/parallel_invoke.h>

    int main() {
        // Assuming ChildTask1 and ChildTask2 are defined.
        oneapi::tbb::parallel_invoke(
            ChildTask1{/*params*/},
            ChildTask2{/*params*/}
        );
    }
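
As a concrete illustration, the sketch below passes two lambdas to
``oneapi::tbb::parallel_invoke`` and reads their results right after the call returns. The helper
name ``min_and_max`` is hypothetical, and the input is assumed to be non-empty:

.. code:: cpp

    #include <algorithm>
    #include <utility>
    #include <vector>
    #include <oneapi/tbb/parallel_invoke.h>

    // Hypothetical helper: finds the minimum and the maximum concurrently.
    // Assumes that values is not empty.
    std::pair<int, int> min_and_max(const std::vector<int>& values) {
        int lowest = 0;
        int highest = 0;

        oneapi::tbb::parallel_invoke(
            [&] { lowest = *std::min_element(values.begin(), values.end()); },
            [&] { highest = *std::max_element(values.begin(), values.end()); }
        );

        // parallel_invoke has returned, so both results are ready here.
        return {lowest, highest};
    }

No explicit wait is needed, because the call itself returns only after both lambdas have finished.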


Adding more work during task execution
--------------------------------------
``oneapi::tbb::parallel_invoke`` follows a blocking style of programming, which means that it
completes only when all functors passed to the parallel pattern complete their execution.

In TBB, cases where the amount of work is not known in advance and work needs to be added during
the execution of a parallel algorithm were mostly covered by the ``tbb::parallel_do`` high-level
parallel pattern. The logic of the ``tbb::parallel_do`` algorithm can be implemented using the task
API as:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <tbb/task.h>

    // Assuming RootTask and OtherWork are defined and implement tbb::task interface.

    struct Task : public tbb::task {
        Task(tbb::task& root, int i)
            : m_root(root), m_i(i)
        {}

        tbb::task* execute() override {
            // ... do some work for item m_i ...

            // add_more_parallel_work is a placeholder for a user-defined condition.
            if (add_more_parallel_work) {
                // allocate_additional_child_of() also increments the reference count
                // of m_root, so the root keeps waiting for the new child.
                tbb::task& child = *new(tbb::task::allocate_additional_child_of(m_root)) OtherWork;
                tbb::task::spawn(child);
            }
            return nullptr;
        }

        tbb::task& m_root;
        int m_i;
    };

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{/*params*/};

        // One reference per initial child plus one for wait_for_all().
        root.set_ref_count(static_cast<int>(items.size()) + 1);

        for (std::size_t i = 0; i < items.size(); ++i) {
            Task& task = *new(root.allocate_child()) Task(root, items[i]);
            tbb::task::spawn(task);
        }

        root.wait_for_all();
        return 0;
    }

In oneTBB, the ``tbb::parallel_do`` interface was removed. Instead, the ability to add new work
was incorporated into the ``oneapi::tbb::parallel_for_each`` interface.

The previous use case can be rewritten in oneTBB as follows:

.. code:: cpp

    #include <vector>
    #include <oneapi/tbb/parallel_for_each.h>

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };

        oneapi::tbb::parallel_for_each(
            items.begin(), items.end(),
            [](int& i, oneapi::tbb::feeder<int>& feeder) {

                // ... do some work for item i ...

                // add_more_parallel_work is a placeholder for a user-defined condition.
                if (add_more_parallel_work)
                    feeder.add(i);
            }
        );
    }
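
To make the feeder behavior concrete, here is a small sketch under assumed rules that are not part
of the original guide: every item ``n`` greater than zero adds one more item, ``n - 1``, so an
initial item ``n`` results in ``n + 1`` body invocations. The helper name
``count_processed_items`` is hypothetical:

.. code:: cpp

    #include <atomic>
    #include <vector>
    #include <oneapi/tbb/parallel_for_each.h>

    // Hypothetical helper: returns how many times the body ran, counting both
    // the initial items and the items added through the feeder.
    int count_processed_items(std::vector<int> items) {
        std::atomic<int> processed{0};

        oneapi::tbb::parallel_for_each(
            items.begin(), items.end(),
            [&processed](int item, oneapi::tbb::feeder<int>& feeder) {
                ++processed;
                if (item > 0)
                    feeder.add(item - 1);  // New work discovered while running.
            }
        );

        // parallel_for_each returns only after the added items are processed too.
        return processed.load();
    }

For the input ``{3, 2}``, the body runs for 3, 2, 1, 0 and for 2, 1, 0, which is seven invocations
in total.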

Since both TBB and oneTBB support nested parallelism, you can run additional functors from within
an already running functor.

The previous use case can be rewritten using ``oneapi::tbb::task_group`` as:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <oneapi/tbb/task_group.h>

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };

        oneapi::tbb::task_group tg;
        for (std::size_t i = 0; i < items.size(); ++i) {
            tg.run([&item = items[i], &tg] {

                // ... do some work for item ...

                // add_more_parallel_work is a placeholder for a user-defined condition.
                if (add_more_parallel_work) {
                    // Assuming OtherWork is defined.
                    tg.run(OtherWork{});
                }
            });
        }
        tg.wait();
    }


Task recycling
--------------
You can re-run the functor by passing ``*this`` to the ``oneapi::tbb::task_group::run()``
method. The functor is copied in this case. However, its state can be shared among instances:

.. code:: cpp

    #include <memory>
    #include <oneapi/tbb/task_group.h>

    // Assuming Data is defined.

    struct SharedStateFunctor {
        std::shared_ptr<Data> m_shared_data;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // do some work processing m_shared_data

            // has_more_work is a placeholder for a user-defined condition.
            if (has_more_work)
                m_task_group.run(*this);

            // Note that the re-submitted copy might already be accessing
            // m_shared_data concurrently at this point
        }
    };

    int main() {
        std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/);
        oneapi::tbb::task_group tg;
        tg.run(SharedStateFunctor{data, tg});
        tg.wait();
    }

Such patterns are particularly useful when the work within a functor is not completed but there is a
need for the task scheduler to react to outer circumstances, such as cancellation of group
execution. To avoid issues with concurrent access, it is recommended to submit the functor for
re-execution as the last step:

.. code:: cpp

    #include <memory>
    #include <oneapi/tbb/task_group.h>

    // Assuming Data is defined.

    struct SharedStateFunctor {
        std::shared_ptr<Data> m_shared_data;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // do some work processing m_shared_data

            // need_to_yield is a placeholder for a user-defined condition.
            if (need_to_yield) {
                m_task_group.run(*this);
                return;
            }
        }
    };

    int main() {
        std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/);
        oneapi::tbb::task_group tg;
        tg.run(SharedStateFunctor{data, tg});
        tg.wait();
    }
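
The recycling pattern can be tried out with a small self-contained sketch. The names
``CountdownFunctor`` and ``run_countdown`` are hypothetical, and two shared atomic counters stand in
for the ``Data`` object used above. The functor re-submits itself as its last step until the
requested number of steps has run:

.. code:: cpp

    #include <atomic>
    #include <memory>
    #include <oneapi/tbb/task_group.h>

    // Hypothetical functor: runs a fixed number of steps by re-submitting itself.
    struct CountdownFunctor {
        std::shared_ptr<std::atomic<int>> m_remaining;   // Steps still to run.
        std::shared_ptr<std::atomic<int>> m_executions;  // Calls of operator() so far.
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            m_executions->fetch_add(1);
            // fetch_sub returns the previous value; more steps remain if it was > 1.
            if (m_remaining->fetch_sub(1) > 1) {
                // A copy is submitted; the shared counters are reused by the copy.
                m_task_group.run(*this);
                return;
            }
        }
    };

    // Hypothetical driver: assumes steps >= 1 and returns the number of executions.
    int run_countdown(int steps) {
        auto remaining = std::make_shared<std::atomic<int>>(steps);
        auto executions = std::make_shared<std::atomic<int>>(0);

        oneapi::tbb::task_group tg;
        tg.run(CountdownFunctor{remaining, executions, tg});
        tg.wait();  // Returns after the last re-submitted copy has finished.

        return executions->load();
    }

Because each copy touches the shared state only before it calls ``run(*this)``, at most one copy
works on the counters at any time.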


Recycling as child or continuation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In oneTBB, this kind of recycling is done manually. You have to track when it is time to run the
task:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <atomic>
    #include <cassert>
    #include <oneapi/tbb/task_group.h>

    // Assuming produce_item_for is defined.

    struct ContinuationTask {
        ContinuationTask(std::vector<int>& data, int& result)
            : m_data(data), m_result(result)
        {}

        void operator()() const {
            for (const auto& item : m_data)
                m_result += item;
        }

        std::vector<int>& m_data;
        int& m_result;
    };

    struct ChildTask {
        ChildTask(std::vector<int>& data, int& result,
                  std::atomic<std::size_t>& tasks_left, std::atomic<std::size_t>& tasks_done,
                  oneapi::tbb::task_group& tg)
            : m_data(data), m_result(result), m_tasks_left(tasks_left),
              m_tasks_done(tasks_done), m_tg(tg)
        {}

        void operator()() const {
            std::size_t index = --m_tasks_left;
            m_data[index] = produce_item_for(index);
            std::size_t done_num = ++m_tasks_done;
            if (index % 2 != 0) {
                // Recycling as child
                m_tg.run(*this);
                return;
            } else if (done_num == m_data.size()) {
                assert(m_tasks_left == 0);
                // Spawning a continuation that does reduction
                m_tg.run(ContinuationTask(m_data, m_result));
            }
        }
        std::vector<int>& m_data;
        int& m_result;
        std::atomic<std::size_t>& m_tasks_left;
        std::atomic<std::size_t>& m_tasks_done;
        oneapi::tbb::task_group& m_tg;
    };


    int main() {
        int result = 0;
        std::vector<int> items(10, 0);
        std::atomic<std::size_t> tasks_left{items.size()};
        std::atomic<std::size_t> tasks_done{0};

        oneapi::tbb::task_group tg;
        // Only half of the tasks are started here; the other half is produced by recycling.
        for (std::size_t i = 0; i < items.size(); i += 2) {
            tg.run(ChildTask(items, result, tasks_left, tasks_done, tg));
        }
        tg.wait();
    }


Scheduler bypass
----------------

The TBB ``task::execute()`` method can return a pointer to a task that the current thread can
execute next. This might reduce scheduling overhead compared to a direct ``spawn``. Similar to
``spawn``, the returned task is not guaranteed to be executed next by the current thread.

.. code:: cpp

    #include <tbb/task.h>

    // Assuming OtherTask is defined.

    struct Task : tbb::task {
        tbb::task* execute() override {
            // some work to do ...

            // parent() returns a pointer to the parent task.
            auto* other_p = new(this->parent()->allocate_child()) OtherTask{};
            // Account for the new child so that the parent keeps waiting.
            this->parent()->increment_ref_count();

            return other_p;
        }
    };

    int main() {
        // Assuming RootTask is defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        Task& child = *new(root.allocate_child()) Task{/*params*/};

        // One reference for the child plus one for wait_for_all().
        root.set_ref_count(2);

        tbb::task::spawn(child);

        root.wait_for_all();
    }

In oneTBB, this can be done using ``oneapi::tbb::task_group``.

.. code:: cpp

    #include <oneapi/tbb/task_group.h>

    // Assuming OtherTask is defined.

    int main() {
        oneapi::tbb::task_group tg;

        tg.run([&tg]() {
            // some work to do ...

            return tg.defer(OtherTask{});
        });

        tg.wait();
    }

Here, ``oneapi::tbb::task_group::defer`` adds a new task to ``tg``. However, instead of being placed
into the queue of tasks ready for execution via ``oneapi::tbb::task_group::run``, the task is passed
directly to the executing thread through the function return value.

.. note::
   In some oneTBB releases, returning a task handle from the task body may be available only as a
   preview feature that is enabled by defining the ``TBB_PREVIEW_TASK_GROUP_EXTENSIONS`` macro to
   ``1`` before including the header. Check the documentation for your oneTBB version.

Deferred task creation
----------------------
The TBB low-level task API separates task creation from the actual spawning. This separation makes
it possible to postpone spawning a task while preventing the parent task from completing, and the
final result from being produced, prematurely. For example, suppose ``RootTask``, ``ChildTask``, and
``CallBackTask`` are user-defined functors that inherit from ``tbb::task`` and implement its
interface. Preventing the ``RootTask`` from completing prematurely and waiting on it is implemented
as follows:

.. code:: cpp

    #include <tbb/task.h>

    int main() {
        // Assuming RootTask, ChildTask, and CallBackTask are defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        ChildTask&    child    = *new(root.allocate_child()) ChildTask{/*params*/};
        CallBackTask& cb_task  = *new(root.allocate_child()) CallBackTask{/*params*/};

        // One reference per child plus one for wait_for_all().
        root.set_ref_count(3);

        tbb::task::spawn(child);

        // Assuming register_callback is defined.
        register_callback([&cb_task]() {
            tbb::task::enqueue(cb_task);
        });

        root.wait_for_all();
        // Control flow will reach here only after both ChildTask and CallBackTask are executed,
        // i.e. after the callback is called
    }

In oneTBB, this can be done using ``oneapi::tbb::task_group``.

.. code:: cpp

    #include <utility>
    #include <oneapi/tbb/task_arena.h>
    #include <oneapi/tbb/task_group.h>

    int main() {
        oneapi::tbb::task_group tg;
        oneapi::tbb::task_arena arena;
        // Assuming ChildTask and CallBackTask are defined.

        auto cb = tg.defer(CallBackTask{/*params*/});

        // Assuming register_callback is defined and accepts move-only callables.
        // The lambda is mutable because the captured task handle is moved out of it.
        register_callback([c = std::move(cb), &arena]() mutable {
            arena.enqueue(std::move(c));
        });

        tg.run(ChildTask{/*params*/});

        tg.wait();
        // Control flow gets here once both ChildTask and CallBackTask are executed
        // i.e. after the callback is called
    }

Here, ``oneapi::tbb::task_group::defer`` adds a new task to ``tg``. However, the task is not spawned
until ``oneapi::tbb::task_arena::enqueue`` is called.

.. note::
   The call to ``oneapi::tbb::task_group::wait`` will not return control until both ``ChildTask`` and
   ``CallBackTask`` are executed.
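
The separation between creating and submitting a task can also be shown without an external
callback. The sketch below is an illustration under assumptions: the helper name
``run_deferred_then_submit`` is hypothetical, and it assumes a oneTBB version in which
``task_group::defer`` and the overload of ``task_group::run`` that accepts a task handle are
available:

.. code:: cpp

    #include <atomic>
    #include <utility>
    #include <oneapi/tbb/task_group.h>

    // Hypothetical helper: creates a task first and submits it later.
    int run_deferred_then_submit() {
        std::atomic<int> counter{0};
        oneapi::tbb::task_group tg;

        // The task is created here but is not yet submitted for execution.
        auto deferred = tg.defer([&counter] { counter += 10; });

        // Other work can be submitted in the meantime.
        tg.run([&counter] { counter += 1; });

        // Submit the deferred task; task handles are move-only.
        tg.run(std::move(deferred));

        tg.wait();  // Returns after both tasks have run.
        return counter.load();
    }

Once ``wait`` returns, both lambdas have run, so the counter holds the sum of both increments.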