1.. _Task_API: 2 3Migrating from low-level task API 4================================= 5 6The low-level task API of Intel(R) Threading Building Blocks (TBB) was considered complex and hence 7error-prone, which was the primary reason it had been removed from oneAPI Threading Building Blocks 8(oneTBB). This guide helps with the migration from TBB to oneTBB for the use cases where low-level 9task API is used. 10 11Spawning of individual tasks 12---------------------------- 13For most use cases, the spawning of individual tasks can be replaced with the use of either 14``oneapi::tbb::task_group`` or ``oneapi::tbb::parallel_invoke``. 15 16For example, ``RootTask``, ``ChildTask1``, and ``ChildTask2`` are the user-side functors that 17inherit ``tbb::task`` and implement its interface. Then spawning of ``ChildTask1`` and 18``ChildTask2`` tasks that can execute in parallel with each other and waiting on the ``RootTask`` is 19implemented as: 20 21.. code:: cpp 22 23 #include <tbb/task.h> 24 25 int main() { 26 // Assuming RootTask, ChildTask1, and ChildTask2 are defined. 27 RootTask& root = *new(tbb::task::allocate_root()) RootTask{}; 28 29 ChildTask1& child1 = *new(root.allocate_child()) ChildTask1{/*params*/}; 30 ChildTask2& child2 = *new(root.allocate_child()) ChildTask2{/*params*/}; 31 32 root.set_ref_count(3); 33 34 tbb::task::spawn(child1); 35 tbb::task::spawn(child2); 36 37 root.wait_for_all(); 38 } 39 40 41Using ``oneapi::tbb::task_group`` 42^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 43The code above can be rewritten using ``oneapi::tbb::task_group``: 44 45.. code:: cpp 46 47 #include <oneapi/tbb/task_group.h> 48 49 int main() { 50 // Assuming ChildTask1, and ChildTask2 are defined. 51 oneapi::tbb::task_group tg; 52 tg.run(ChildTask1{/*params*/}); 53 tg.run(ChildTask2{/*params*/}); 54 tg.wait(); 55 } 56 57The code looks more concise now. It also enables lambda functions and does not require you to 58implement ``tbb::task`` interface that overrides the ``tbb::task* tbb::task::execute()`` virtual 59method. With this new approach, you work with functors in a C++-standard way by implementing ``void 60operator() const``: 61 62.. code:: cpp 63 64 struct Functor { 65 // Member to be called when object of this type are passed into 66 // oneapi::tbb::task_group::run() method 67 void operator()() const {} 68 }; 69 70 71Using ``oneapi::tbb::parallel_invoke`` 72^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 73It is also possible to use ``oneapi::tbb::parallel_invoke`` to rewrite the original code and make it 74even more concise: 75 76.. code:: cpp 77 78 #include <oneapi/tbb/parallel_invoke.h> 79 80 int main() { 81 // Assuming ChildTask1, and ChildTask2 are defined. 82 oneapi::tbb::parallel_invoke( 83 ChildTask1{/*params*/}, 84 ChildTask2{/*params*/} 85 ); 86 } 87 88 89Adding more work during task execution 90-------------------------------------- 91``oneapi::tbb::parallel_invoke`` follows a blocking style of programming, which means that it 92completes only when all functors passed to the parallel pattern complete their execution. 93 94In TBB, cases when the amount of work is not known in advance and the work needs to be added during 95the execution of a parallel algorithm were mostly covered by ``tbb::parallel_do`` high-level 96parallel pattern. The ``tbb::parallel_do`` algorithm logic may be implemented using the task API as: 97 98.. code:: cpp 99 100 #include <cstddef> 101 #include <vector> 102 #include <tbb/task.h> 103 104 // Assuming RootTask and OtherWork are defined and implement tbb::task interface. 105 106 struct Task : public tbb::task { 107 Task(tbb::task& root, int i) 108 : m_root(root), m_i(i) 109 {} 110 111 tbb::task* execute() override { 112 // ... do some work for item m_i ... 113 114 if (add_more_parallel_work) { 115 tbb::task& child = *new(m_root.allocate_child()) OtherWork; 116 tbb::task::spawn(child); 117 } 118 return nullptr; 119 } 120 121 tbb::task& m_root; 122 int m_i; 123 }; 124 125 int main() { 126 std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 }; 127 RootTask& root = *new(tbb::task::allocate_root()) RootTask{/*params*/}; 128 129 root.set_ref_count(items.size() + 1); 130 131 for (std::size_t i = 0; i < items.size(); ++i) { 132 Task& task = *new(root.allocate_child()) Task(root, items[i]); 133 tbb::task::spawn(task); 134 } 135 136 root.wait_for_all(); 137 return 0; 138 } 139 140In oneTBB ``tbb::parallel_do`` interface was removed. Instead, the functionality of adding new work 141was included into the ``oneapi::tbb::parallel_for_each`` interface. 142 143The previous use case can be rewritten in oneTBB as follows: 144 145.. code:: cpp 146 147 #include <vector> 148 #include <oneapi/tbb/parallel_for_each.h> 149 150 int main() { 151 std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 }; 152 153 oneapi::tbb::parallel_for_each( 154 items.begin(), items.end(), 155 [](int& i, tbb::feeder<int>& feeder) { 156 157 // ... do some work for item i ... 158 159 if (add_more_parallel_work) 160 feeder.add(i); 161 } 162 ); 163 } 164 165Since both TBB and oneTBB support nested expressions, you can run additional functors from within an 166already running functor. 167 168The previous use case can be rewritten using ``oneapi::tbb::task_group`` as: 169 170.. code:: cpp 171 172 #include <cstddef> 173 #include <vector> 174 #include <oneapi/tbb/task_group.h> 175 176 int main() { 177 std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 }; 178 179 oneapi::tbb::task_group tg; 180 for (std::size_t i = 0; i < items.size(); ++i) { 181 tg.run([&i = items[i], &tg] { 182 183 // ... do some work for item i ... 184 185 if (add_more_parallel_work) 186 // Assuming OtherWork is defined. 187 tg.run(OtherWork{}); 188 189 }); 190 } 191 tg.wait(); 192 } 193 194 195Task recycling 196-------------- 197You can re-run the functor by passing ``*this`` to the ``oneapi::tbb::task_group::run()`` 198method. The functor will be copied in this case. However, its state can be shared among instances: 199 200.. code:: cpp 201 202 #include <memory> 203 #include <oneapi/tbb/task_group.h> 204 205 struct SharedStateFunctor { 206 std::shared_ptr<Data> m_shared_data; 207 oneapi::tbb::task_group& m_task_group; 208 209 void operator()() const { 210 // do some work processing m_shared_data 211 212 if (has_more_work) 213 m_task_group.run(*this); 214 215 // Note that this might be concurrently accessing m_shared_data already 216 } 217 }; 218 219 int main() { 220 // Assuming Data is defined. 221 std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/); 222 oneapi::tbb::task_group tg; 223 tg.run(SharedStateFunctor{data, tg}); 224 tg.wait(); 225 } 226 227Such patterns are particularly useful when the work within a functor is not completed but there is a 228need for the task scheduler to react to outer circumstances, such as cancellation of group 229execution. To avoid issues with concurrent access, it is recommended to submit it for re-execution 230as the last step: 231 232.. code:: cpp 233 234 #include <memory> 235 #include <oneapi/tbb/task_group.h> 236 237 struct SharedStateFunctor { 238 std::shared_ptr<Data> m_shared_data; 239 oneapi::tbb::task_group& m_task_group; 240 241 void operator()() const { 242 // do some work processing m_shared_data 243 244 if (need_to_yield) { 245 m_task_group.run(*this); 246 return; 247 } 248 } 249 }; 250 251 int main() { 252 // Assuming Data is defined. 253 std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/); 254 oneapi::tbb::task_group tg; 255 tg.run(SharedStateFunctor{data, tg}); 256 tg.wait(); 257 } 258 259 260Recycling as child or continuation 261^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 262In oneTBB this kind of recycling is done manually. You have to track when it is time to run the 263task: 264 265.. code:: cpp 266 267 #include <cstddef> 268 #include <vector> 269 #include <atomic> 270 #include <cassert> 271 #include <oneapi/tbb/task_group.h> 272 273 struct ContinuationTask { 274 ContinuationTask(std::vector<int>& data, int& result) 275 : m_data(data), m_result(result) 276 {} 277 278 void operator()() const { 279 for (const auto& item : m_data) 280 m_result += item; 281 } 282 283 std::vector<int>& m_data; 284 int& m_result; 285 }; 286 287 struct ChildTask { 288 ChildTask(std::vector<int>& data, int& result, 289 std::atomic<std::size_t>& tasks_left, std::atomic<std::size_t>& tasks_done, 290 oneapi::tbb::task_group& tg) 291 : m_data(data), m_result(result), m_tasks_left(tasks_left), m_tasks_done(tasks_done), m_tg(tg) 292 {} 293 294 void operator()() const { 295 std::size_t index = --m_tasks_left; 296 m_data[index] = produce_item_for(index); 297 std::size_t done_num = ++m_tasks_done; 298 if (index % 2 != 0) { 299 // Recycling as child 300 m_tg.run(*this); 301 return; 302 } else if (done_num == m_data.size()) { 303 assert(m_tasks_left == 0); 304 // Spawning a continuation that does reduction 305 m_tg.run(ContinuationTask(m_data, m_result)); 306 } 307 } 308 std::vector<int>& m_data; 309 int& m_result; 310 std::atomic<std::size_t>& m_tasks_left; 311 std::atomic<std::size_t>& m_tasks_done; 312 oneapi::tbb::task_group& m_tg; 313 }; 314 315 316 int main() { 317 int result = 0; 318 std::vector<int> items(10, 0); 319 std::atomic<std::size_t> tasks_left{items.size()}; 320 std::atomic<std::size_t> tasks_done{0}; 321 322 oneapi::tbb::task_group tg; 323 for (std::size_t i = 0; i < items.size(); i+=2) { 324 tg.run(ChildTask(items, result, tasks_left, tasks_done, tg)); 325 } 326 tg.wait(); 327 } 328 329 330Scheduler Bypass 331---------------- 332 333TBB ``task::execute()`` method can return a pointer to a task that can be executed next by the current thread. 334This might reduce scheduling overheads compared to direct ``spawn``. Similar to ``spawn``, the returned task 335is not guaranteed to be executed next by the current thread. 336 337.. code:: cpp 338 339 #include <tbb/task.h> 340 341 // Assuming OtherTask is defined. 342 343 struct Task : tbb::task { 344 task* execute(){ 345 // some work to do ... 346 347 auto* other_p = new(this->parent().allocate_child()) OtherTask{}; 348 this->parent().add_ref_count(); 349 350 return other_p; 351 } 352 }; 353 354 int main(){ 355 // Assuming RootTask is defined. 356 RootTask& root = *new(tbb::task::allocate_root()) RootTask{}; 357 358 Task& child = *new(root.allocate_child()) Task{/*params*/}; 359 360 root.add_ref_count(); 361 362 tbb::task_spawn(child); 363 364 root.wait_for_all(); 365 } 366 367In oneTBB, this can be done using ``oneapi::tbb::task_group``. 368 369.. code:: cpp 370 371 #include <oneapi/tbb/task_group.h> 372 373 // Assuming OtherTask is defined. 374 375 int main(){ 376 oneapi::tbb::task_group tg; 377 378 tg.run([&tg](){ 379 //some work to do ... 380 381 return tg.defer(OtherTask{}); 382 }); 383 384 tg.wait(); 385 } 386 387Here ``oneapi::tbb::task_group::defer`` adds a new task into the ``tg``. However, the task is not put into a 388queue of tasks ready for execution via ``oneapi::tbb::task_group::run``, but bypassed to the executing thread directly 389via function return value. 390 391Deferred task creation 392---------------------- 393The TBB low-level task API separates the task creation from the actual spawning. This separation allows to 394postpone the task spawning, while the parent task and final result production are blocked from premature leave. 395For example, ``RootTask``, ``ChildTask``, and ``CallBackTask`` are the user-side functors that 396inherit ``tbb::task`` and implement its interface. Then, blocking the ``RootTask`` from leaving prematurely 397and waiting on it is implemented as follows: 398 399.. code:: cpp 400 401 #include <tbb/task.h> 402 403 int main() { 404 // Assuming RootTask, ChildTask, and CallBackTask are defined. 405 RootTask& root = *new(tbb::task::allocate_root()) RootTask{}; 406 407 ChildTask& child = *new(root.allocate_child()) ChildTask{/*params*/}; 408 CallBackTask& cb_task = *new(root.allocate_child()) CallBackTask{/*params*/}; 409 410 root.set_ref_count(3); 411 412 tbb::task::spawn(child); 413 414 register_callback([cb_task&](){ 415 tbb::task::enqueue(cb_task); 416 }); 417 418 root.wait_for_all(); 419 // Control flow will reach here only after both ChildTask and CallBackTask are executed, 420 // i.e. after the callback is called 421 } 422 423In oneTBB, this can be done using ``oneapi::tbb::task_group``. 424 425.. code:: cpp 426 427 #include <oneapi/tbb/task_group.h> 428 429 int main(){ 430 oneapi::tbb::task_group tg; 431 oneapi::tbb::task_arena arena; 432 // Assuming ChildTask and CallBackTask are defined. 433 434 auto cb = tg.defer(CallBackTask{/*params*/}); 435 436 register_callback([&tg, c = std::move(cb), &arena]{ 437 arena.enqueue(c); 438 }); 439 440 tg.run(ChildTask{/*params*/}); 441 442 443 tg.wait(); 444 // Control flow gets here once both ChildTask and CallBackTask are executed 445 // i.e. after the callback is called 446 } 447 448Here ``oneapi::tbb::task_group::defer`` adds a new task into the ``tg``. However, the task is not spawned until 449``oneapi::tbb::task_arena::enqueue`` is called. 450 451.. note:: 452 The call to ``oneapi::tbb::task_group::wait`` will not return control until both ``ChildTask`` and 453 ``CallBackTask`` are executed. 454