.. _Task_API:

Migrating from low-level task API
=================================

The low-level task API of Intel(R) Threading Building Blocks (TBB) was considered complex and hence
error-prone, which was the primary reason it was removed from oneAPI Threading Building Blocks
(oneTBB). This guide helps with the migration from TBB to oneTBB for the use cases where the low-level
task API is used.

Spawning of individual tasks
----------------------------
For most use cases, the spawning of individual tasks can be replaced with the use of either
``oneapi::tbb::task_group`` or ``oneapi::tbb::parallel_invoke``.

For example, suppose ``RootTask``, ``ChildTask1``, and ``ChildTask2`` are user-side functors that
inherit ``tbb::task`` and implement its interface. Then spawning the ``ChildTask1`` and
``ChildTask2`` tasks, which can execute in parallel with each other, and waiting on the ``RootTask`` is
implemented as:

.. code:: cpp

    #include <tbb/task.h>

    int main() {
        // Assuming RootTask, ChildTask1, and ChildTask2 are defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        ChildTask1& child1 = *new(root.allocate_child()) ChildTask1{/*params*/};
        ChildTask2& child2 = *new(root.allocate_child()) ChildTask2{/*params*/};

        // Two children plus one reference for the wait itself.
        root.set_ref_count(3);

        tbb::task::spawn(child1);
        tbb::task::spawn(child2);

        root.wait_for_all();
    }


Using ``oneapi::tbb::task_group``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The code above can be rewritten using ``oneapi::tbb::task_group``:

.. code:: cpp

    #include <oneapi/tbb/task_group.h>

    int main() {
        // Assuming ChildTask1 and ChildTask2 are defined.
        oneapi::tbb::task_group tg;
        tg.run(ChildTask1{/*params*/});
        tg.run(ChildTask2{/*params*/});
        tg.wait();
    }

The code looks more concise now.
It also enables lambda functions and does not require you to
implement the ``tbb::task`` interface that overrides the ``tbb::task* tbb::task::execute()`` virtual
method. With this new approach, you work with functors in a C++-standard way by implementing ``void
operator()() const``:

.. code:: cpp

    struct Functor {
        // Member to be called when objects of this type are passed into
        // the oneapi::tbb::task_group::run() method
        void operator()() const {}
    };


Using ``oneapi::tbb::parallel_invoke``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is also possible to use ``oneapi::tbb::parallel_invoke`` to rewrite the original code and make it
even more concise:

.. code:: cpp

    #include <oneapi/tbb/parallel_invoke.h>

    int main() {
        // Assuming ChildTask1 and ChildTask2 are defined.
        oneapi::tbb::parallel_invoke(
            ChildTask1{/*params*/},
            ChildTask2{/*params*/}
        );
    }


Adding more work during task execution
--------------------------------------
``oneapi::tbb::parallel_invoke`` follows a blocking style of programming, which means that it
completes only when all functors passed to the parallel pattern complete their execution.

In TBB, cases where the amount of work is not known in advance and the work needs to be added during
the execution of a parallel algorithm were mostly covered by the ``tbb::parallel_do`` high-level
parallel pattern. The ``tbb::parallel_do`` algorithm logic may be implemented using the task API as:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <tbb/task.h>

    // Assuming RootTask and OtherWork are defined and implement tbb::task interface.

    struct Task : public tbb::task {
        Task(tbb::task& root, int i)
            : m_root(root), m_i(i)
        {}

        tbb::task* execute() override {
            // ... do some work for item m_i ...

            if (add_more_parallel_work) {
                tbb::task& child = *new(m_root.allocate_child()) OtherWork;
                tbb::task::spawn(child);
            }
            return nullptr;
        }

        tbb::task& m_root;
        int m_i;
    };

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{/*params*/};

        root.set_ref_count(items.size() + 1);

        for (std::size_t i = 0; i < items.size(); ++i) {
            Task& task = *new(root.allocate_child()) Task(root, items[i]);
            tbb::task::spawn(task);
        }

        root.wait_for_all();
        return 0;
    }

In oneTBB, the ``tbb::parallel_do`` interface was removed. Instead, the functionality of adding new work
was included in the ``oneapi::tbb::parallel_for_each`` interface.

The previous use case can be rewritten in oneTBB as follows:

.. code:: cpp

    #include <vector>
    #include <oneapi/tbb/parallel_for_each.h>

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };

        oneapi::tbb::parallel_for_each(
            items.begin(), items.end(),
            [](int& i, oneapi::tbb::feeder<int>& feeder) {

                // ... do some work for item i ...

                if (add_more_parallel_work)
                    feeder.add(i);
            }
        );
    }

Since both TBB and oneTBB support nested expressions, you can run additional functors from within an
already running functor.

The previous use case can be rewritten using ``oneapi::tbb::task_group`` as:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <oneapi/tbb/task_group.h>

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };

        oneapi::tbb::task_group tg;
        for (std::size_t i = 0; i < items.size(); ++i) {
            tg.run([&i = items[i], &tg] {

                // ... do some work for item i ...

                if (add_more_parallel_work)
                    // Assuming OtherWork is defined.
                    tg.run(OtherWork{});

            });
        }
        tg.wait();
    }


Task recycling
--------------
You can re-run the functor by passing ``*this`` to the ``oneapi::tbb::task_group::run()``
method. The functor will be copied in this case. However, its state can be shared among instances:

.. code:: cpp

    #include <memory>
    #include <oneapi/tbb/task_group.h>

    struct SharedStateFunctor {
        std::shared_ptr<Data> m_shared_data;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // do some work processing m_shared_data

            if (has_more_work)
                m_task_group.run(*this);

            // Note that the re-submitted copy might already be accessing
            // m_shared_data concurrently at this point
        }
    };

    int main() {
        // Assuming Data is defined.
        std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/);
        oneapi::tbb::task_group tg;
        tg.run(SharedStateFunctor{data, tg});
        tg.wait();
    }

Such patterns are particularly useful when the work within a functor is not completed but there is a
need for the task scheduler to react to outer circumstances, such as cancellation of group
execution. To avoid issues with concurrent access, it is recommended to submit the functor for
re-execution as the last step:

.. code:: cpp

    #include <memory>
    #include <oneapi/tbb/task_group.h>

    struct SharedStateFunctor {
        std::shared_ptr<Data> m_shared_data;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // do some work processing m_shared_data

            if (need_to_yield) {
                m_task_group.run(*this);
                return;
            }
        }
    };

    int main() {
        // Assuming Data is defined.
        std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/);
        oneapi::tbb::task_group tg;
        tg.run(SharedStateFunctor{data, tg});
        tg.wait();
    }


Recycling as child or continuation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In oneTBB, this kind of recycling is done manually. You have to track when it is time to run the
task:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <atomic>
    #include <cassert>
    #include <oneapi/tbb/task_group.h>

    struct ContinuationTask {
        ContinuationTask(std::vector<int>& data, int& result)
            : m_data(data), m_result(result)
        {}

        void operator()() const {
            for (const auto& item : m_data)
                m_result += item;
        }

        std::vector<int>& m_data;
        int& m_result;
    };

    struct ChildTask {
        ChildTask(std::vector<int>& data, int& result,
                  std::atomic<std::size_t>& tasks_left, std::atomic<std::size_t>& tasks_done,
                  oneapi::tbb::task_group& tg)
            : m_data(data), m_result(result), m_tasks_left(tasks_left), m_tasks_done(tasks_done), m_tg(tg)
        {}

        void operator()() const {
            std::size_t index = --m_tasks_left;
            m_data[index] = produce_item_for(index);
            std::size_t done_num = ++m_tasks_done;
            if (index % 2 != 0) {
                // Recycling as child
                m_tg.run(*this);
                return;
            } else if (done_num == m_data.size()) {
                assert(m_tasks_left == 0);
                // Spawning a continuation that does reduction
                m_tg.run(ContinuationTask(m_data, m_result));
            }
        }
        std::vector<int>& m_data;
        int& m_result;
        std::atomic<std::size_t>& m_tasks_left;
        std::atomic<std::size_t>& m_tasks_done;
        oneapi::tbb::task_group& m_tg;
    };


    int main() {
        int result = 0;
        std::vector<int> items(10, 0);
        std::atomic<std::size_t> tasks_left{items.size()};
        std::atomic<std::size_t> tasks_done{0};

        oneapi::tbb::task_group tg;
        for (std::size_t i = 0; i < items.size(); i += 2) {
            tg.run(ChildTask(items, result, tasks_left, tasks_done, tg));
        }
        tg.wait();
    }


Scheduler Bypass
----------------

The TBB ``task::execute()`` method can return a pointer to a task that can be executed next by the current thread.
This might reduce scheduling overheads compared to direct ``spawn``. Similar to ``spawn``, the returned task
is not guaranteed to be executed next by the current thread.

.. code:: cpp

    #include <tbb/task.h>

    // Assuming OtherTask is defined.

    struct Task : tbb::task {
        tbb::task* execute() override {
            // some work to do ...

            // parent() returns a pointer; account for the extra child before returning it.
            auto* other_p = new(parent()->allocate_child()) OtherTask{};
            parent()->increment_ref_count();

            return other_p;
        }
    };

    int main() {
        // Assuming RootTask is defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        Task& child = *new(root.allocate_child()) Task{/*params*/};

        // One child plus one reference for the wait itself.
        root.set_ref_count(2);

        tbb::task::spawn(child);

        root.wait_for_all();
    }

In oneTBB, this can be done using the preview feature of ``oneapi::tbb::task_group``.

.. code:: cpp

    #define TBB_PREVIEW_TASK_GROUP_EXTENSIONS 1
    #include <oneapi/tbb/task_group.h>

    // Assuming OtherTask is defined.

    int main() {
        oneapi::tbb::task_group tg;

        tg.run([&tg]() {
            // some work to do ...

            return tg.defer(OtherTask{});
        });

        tg.wait();
    }

Here ``oneapi::tbb::task_group::defer`` adds a new task to ``tg``. However, the task is not placed into the
queue of tasks ready for execution, as it would be with ``oneapi::tbb::task_group::run``. Instead, it is
passed directly to the executing thread through the function return value.

Deferred task creation
----------------------
The TBB low-level task API separates the task creation from the actual spawning.
This separation makes it possible to
postpone task spawning while the parent task and the production of the final result are prevented from
completing prematurely.
For example, suppose ``RootTask``, ``ChildTask``, and ``CallBackTask`` are user-side functors that
inherit ``tbb::task`` and implement its interface. Then, blocking the ``RootTask`` from leaving prematurely
and waiting on it is implemented as follows:

.. code:: cpp

    #include <tbb/task.h>

    int main() {
        // Assuming RootTask, ChildTask, and CallBackTask are defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        ChildTask& child = *new(root.allocate_child()) ChildTask{/*params*/};
        CallBackTask& cb_task = *new(root.allocate_child()) CallBackTask{/*params*/};

        root.set_ref_count(3);

        tbb::task::spawn(child);

        register_callback([&cb_task]() {
            tbb::task::enqueue(cb_task);
        });

        root.wait_for_all();
        // Control flow will reach here only after both ChildTask and CallBackTask are executed,
        // i.e. after the callback is called
    }

In oneTBB, this can be done using the preview feature of ``oneapi::tbb::task_group``.

.. code:: cpp

    #define TBB_PREVIEW_TASK_GROUP_EXTENSIONS 1
    #include <utility>
    #include <oneapi/tbb/task_arena.h>
    #include <oneapi/tbb/task_group.h>

    int main() {
        oneapi::tbb::task_group tg;
        oneapi::tbb::task_arena arena;
        // Assuming ChildTask and CallBackTask are defined.

        auto cb = tg.defer(CallBackTask{/*params*/});

        // The task handle is move-only, so the lambda must be mutable to hand it over.
        // Assuming register_callback accepts a move-only callable.
        register_callback([c = std::move(cb), &arena]() mutable {
            arena.enqueue(std::move(c));
        });

        tg.run(ChildTask{/*params*/});

        tg.wait();
        // Control flow gets here once both ChildTask and CallBackTask are executed,
        // i.e. after the callback is called
    }

Here ``oneapi::tbb::task_group::defer`` adds a new task to ``tg``. However, the task is not spawned until
``oneapi::tbb::task_arena::enqueue`` is called.

.. note::
   The call to ``oneapi::tbb::task_group::wait`` will not return control until both ``ChildTask`` and
   ``CallBackTask`` are executed.
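
The deferred-task pattern above can be tried out with the following minimal sketch. It makes a few
assumptions: the installed oneTBB version provides ``task_group::defer`` and the ``task_arena::enqueue``
overload for task handles, and a plain ``std::thread`` stands in for the ``register_callback`` mechanism.
The names ``run_deferred_demo`` and ``event_source`` are hypothetical and exist only for this
illustration.

.. code:: cpp

    #define TBB_PREVIEW_TASK_GROUP_EXTENSIONS 1
    #include <atomic>
    #include <thread>
    #include <utility>
    #include <oneapi/tbb/task_arena.h>
    #include <oneapi/tbb/task_group.h>

    // Hypothetical helper: returns how many tasks had run when tg.wait() returned.
    int run_deferred_demo() {
        std::atomic<int> executed{0};
        oneapi::tbb::task_group tg;
        oneapi::tbb::task_arena arena;

        // The deferred task is registered with tg but is not scheduled yet.
        auto cb = tg.defer([&executed] { ++executed; });

        // Stand-in for an external callback: another thread submits the handle later.
        std::thread event_source([&arena, h = std::move(cb)]() mutable {
            arena.enqueue(std::move(h));
        });

        // A regular task that is scheduled right away.
        tg.run([&executed] { ++executed; });

        // Blocks until both the regular task and the deferred task have finished.
        tg.wait();
        event_source.join();
        return executed.load();
    }

Calling ``run_deferred_demo()`` from ``main`` should yield ``2`` under these assumptions, because
``tg.wait()`` cannot return while the deferred task is still pending.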