.. _Task_API:

Migrating from low-level task API
=================================

The low-level task API of Intel(R) Threading Building Blocks (TBB) was considered complex and hence
error-prone, which was the primary reason it was removed from oneAPI Threading Building Blocks
(oneTBB). This guide helps with the migration from TBB to oneTBB for the use cases where the
low-level task API is used.

Spawning of individual tasks
----------------------------
For most use cases, the spawning of individual tasks can be replaced with the use of either
``oneapi::tbb::task_group`` or ``oneapi::tbb::parallel_invoke``.

For example, assume ``RootTask``, ``ChildTask1``, and ``ChildTask2`` are user-defined classes that
inherit ``tbb::task`` and implement its interface. Then spawning the ``ChildTask1`` and
``ChildTask2`` tasks so that they can execute in parallel with each other, and waiting for them
through the ``RootTask``, is implemented as:

.. code:: cpp

    #include <tbb/task.h>

    int main() {
        // Assuming RootTask, ChildTask1, and ChildTask2 are defined.
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{};

        ChildTask1& child1 = *new(root.allocate_child()) ChildTask1{/*params*/};
        ChildTask2& child2 = *new(root.allocate_child()) ChildTask2{/*params*/};

        // One reference per child task plus one for the wait itself.
        root.set_ref_count(3);

        tbb::task::spawn(child1);
        tbb::task::spawn(child2);

        root.wait_for_all();
    }


Using ``oneapi::tbb::task_group``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The code above can be rewritten using ``oneapi::tbb::task_group``:

.. code:: cpp

    #include <oneapi/tbb/task_group.h>

    int main() {
        // Assuming ChildTask1 and ChildTask2 are defined as callable objects.
        oneapi::tbb::task_group tg;
        tg.run(ChildTask1{/*params*/});
        tg.run(ChildTask2{/*params*/});
        tg.wait();
    }

The code is more concise now. This approach also enables the use of lambda functions and does not
require you to implement the ``tbb::task`` interface with its ``tbb::task* tbb::task::execute()``
virtual method. Instead, you work with functors in a standard C++ way by implementing
``void operator()() const``:

.. code:: cpp

    struct Functor {
        // Called when an object of this type is passed to the
        // oneapi::tbb::task_group::run() method.
        void operator()() const {}
    };

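For instance, the work of ``ChildTask1`` and ``ChildTask2`` could also be submitted directly as
lambda expressions. The snippet below is a minimal sketch of this; the lambda bodies are
placeholders for the actual work:

.. code:: cpp

    #include <oneapi/tbb/task_group.h>

    int main() {
        oneapi::tbb::task_group tg;
        // The lambda bodies stand in for the work that ChildTask1 and
        // ChildTask2 would otherwise perform.
        tg.run([] { /* ... work of ChildTask1 ... */ });
        tg.run([] { /* ... work of ChildTask2 ... */ });
        tg.wait();
    }
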
Using ``oneapi::tbb::parallel_invoke``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is also possible to use ``oneapi::tbb::parallel_invoke`` to rewrite the original code and make it
even more concise:

.. code:: cpp

    #include <oneapi/tbb/parallel_invoke.h>

    int main() {
        // Assuming ChildTask1 and ChildTask2 are defined as callable objects.
        oneapi::tbb::parallel_invoke(
            ChildTask1{/*params*/},
            ChildTask2{/*params*/}
        );
    }


Adding more work during task execution
--------------------------------------
``oneapi::tbb::parallel_invoke`` follows a blocking style of programming, which means that it
completes only when all functors passed to the parallel pattern complete their execution.

In TBB, cases where the amount of work is not known in advance and work needs to be added during
the execution of a parallel algorithm were mostly covered by the ``tbb::parallel_do`` high-level
parallel pattern. The ``tbb::parallel_do`` algorithm logic may be implemented using the task API as:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <tbb/task.h>

    // Assuming RootTask and OtherWork are defined and implement the tbb::task interface;
    // add_more_parallel_work is a placeholder for a user-defined condition.

    struct Task : public tbb::task {
        Task(tbb::task& root, int i)
            : m_root(root), m_i(i)
        {}

        tbb::task* execute() override {
            // ... do some work for item m_i ...

            if (add_more_parallel_work) {
                // Account for the dynamically added child before spawning it.
                m_root.increment_ref_count();
                tbb::task& child = *new(m_root.allocate_child()) OtherWork;
                tbb::task::spawn(child);
            }
            return nullptr;
        }

        tbb::task& m_root;
        int m_i;
    };

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };
        RootTask& root = *new(tbb::task::allocate_root()) RootTask{/*params*/};

        // One reference per item task plus one for the wait itself.
        root.set_ref_count(items.size() + 1);

        for (std::size_t i = 0; i < items.size(); ++i) {
            Task& task = *new(root.allocate_child()) Task(root, items[i]);
            tbb::task::spawn(task);
        }

        root.wait_for_all();
        return 0;
    }

In oneTBB, the ``tbb::parallel_do`` interface was removed. Instead, the ability to add new work
during execution was included into the ``oneapi::tbb::parallel_for_each`` interface.

The previous use case can be rewritten in oneTBB as follows:

.. code:: cpp

    #include <vector>
    #include <oneapi/tbb/parallel_for_each.h>

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };

        oneapi::tbb::parallel_for_each(
            items.begin(), items.end(),
            [](int& i, oneapi::tbb::feeder<int>& feeder) {

                // ... do some work for item i ...

                if (add_more_parallel_work)
                    feeder.add(i);
            }
        );
    }

Since both TBB and oneTBB support nested parallelism, you can run additional functors from within an
already running functor.

The previous use case can be rewritten using ``oneapi::tbb::task_group`` as:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <oneapi/tbb/task_group.h>

    int main() {
        std::vector<int> items = { 0, 1, 2, 3, 4, 5, 6, 7 };

        oneapi::tbb::task_group tg;
        for (std::size_t i = 0; i < items.size(); ++i) {
            tg.run([&i = items[i], &tg] {

                // ... do some work for item i ...

                if (add_more_parallel_work)
                    // Assuming OtherWork is defined as a callable object.
                    tg.run(OtherWork{});

            });
        }
        tg.wait();
    }


Task recycling
--------------
You can re-run a functor by passing ``*this`` to the ``oneapi::tbb::task_group::run()`` method. The
functor is copied in this case, but its state can be shared among the copies:

.. code:: cpp

    #include <memory>
    #include <oneapi/tbb/task_group.h>

    // Assuming Data is defined; has_more_work is a placeholder for a user-defined condition.
    struct SharedStateFunctor {
        std::shared_ptr<Data> m_shared_data;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // ... do some work processing m_shared_data ...

            if (has_more_work)
                m_task_group.run(*this);

            // Note that the re-submitted copy might already be accessing
            // m_shared_data concurrently at this point.
        }
    };

    int main() {
        std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/);
        oneapi::tbb::task_group tg;
        tg.run(SharedStateFunctor{data, tg});
        tg.wait();
    }

Such patterns are particularly useful when the work within a functor is not yet completed but the
task scheduler needs a chance to react to outer circumstances, such as cancellation of the group
execution. To avoid issues with concurrent access to the shared state, it is recommended to submit
the functor for re-execution as the last step before returning:

.. code:: cpp

    #include <memory>
    #include <oneapi/tbb/task_group.h>

    // Assuming Data is defined; need_to_yield is a placeholder for a user-defined condition.
    struct SharedStateFunctor {
        std::shared_ptr<Data> m_shared_data;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // ... do some work processing m_shared_data ...

            if (need_to_yield) {
                m_task_group.run(*this);
                return;
            }
        }
    };

    int main() {
        std::shared_ptr<Data> data = std::make_shared<Data>(/*params*/);
        oneapi::tbb::task_group tg;
        tg.run(SharedStateFunctor{data, tg});
        tg.wait();
    }

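To illustrate how this re-submission point lets the group react to cancellation, here is a minimal,
self-contained sketch. The ``ChunkedWork`` functor, the chunk counter, and the "three chunks"
cancellation condition are invented for illustration; the point is that the functor re-submits
itself as its last step, and that ``oneapi::tbb::task_group::wait()`` reports whether the group
finished normally or was canceled:

.. code:: cpp

    #include <atomic>
    #include <iostream>
    #include <oneapi/tbb/task_group.h>

    // Processes work in small portions, re-submitting itself between them.
    struct ChunkedWork {
        std::atomic<int>& m_chunks_done;
        oneapi::tbb::task_group& m_task_group;

        void operator()() const {
            // ... process one portion of work ...
            int done = ++m_chunks_done;

            if (done == 3) {
                // React to some condition detected during execution by
                // canceling the whole group.
                m_task_group.cancel();
                return;
            }

            // Re-submit as the last step, as recommended above.
            m_task_group.run(*this);
        }
    };

    int main() {
        std::atomic<int> chunks_done{0};
        oneapi::tbb::task_group tg;
        tg.run(ChunkedWork{chunks_done, tg});

        // wait() reports whether the group completed normally or was canceled.
        if (tg.wait() == oneapi::tbb::task_group_status::canceled)
            std::cout << "Canceled after " << chunks_done.load() << " chunks\n";
    }
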
Recycling as child or continuation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In TBB, a task could be reused through methods such as ``tbb::task::recycle_as_child_of()`` and
``tbb::task::recycle_as_continuation()``. In oneTBB, this kind of recycling is done manually: you
have to track yourself when it is time to run the continuation. In the example below, two atomic
counters take over the role of the task reference count: ``tasks_left`` hands out the remaining item
indices, while ``tasks_done`` detects the moment the last item has been produced, so that the task
finishing last submits the continuation that performs the reduction.

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <atomic>
    #include <cassert>
    #include <oneapi/tbb/task_group.h>

    // Performs the reduction once all items have been produced.
    struct ContinuationTask {
        ContinuationTask(std::vector<int>& data, int& result)
            : m_data(data), m_result(result)
        {}

        void operator()() const {
            for (const auto& item : m_data)
                m_result += item;
        }

        std::vector<int>& m_data;
        int& m_result;
    };

    // Assuming produce_item_for is defined.
    struct ChildTask {
        ChildTask(std::vector<int>& data, int& result,
                  std::atomic<std::size_t>& tasks_left, std::atomic<std::size_t>& tasks_done,
                  oneapi::tbb::task_group& tg)
            : m_data(data), m_result(result), m_tasks_left(tasks_left), m_tasks_done(tasks_done), m_tg(tg)
        {}

        void operator()() const {
            std::size_t index = --m_tasks_left;
            m_data[index] = produce_item_for(index);
            std::size_t done_num = ++m_tasks_done;
            if (index % 2 != 0) {
                // Recycling as child: re-submit this functor to process one more item.
                m_tg.run(*this);
                return;
            } else if (done_num == m_data.size()) {
                assert(m_tasks_left == 0);
                // Spawning a continuation that does the reduction.
                m_tg.run(ContinuationTask(m_data, m_result));
            }
        }

        std::vector<int>& m_data;
        int& m_result;
        std::atomic<std::size_t>& m_tasks_left;
        std::atomic<std::size_t>& m_tasks_done;
        oneapi::tbb::task_group& m_tg;
    };


    int main() {
        int result = 0;
        std::vector<int> items(10, 0);
        std::atomic<std::size_t> tasks_left{items.size()};
        std::atomic<std::size_t> tasks_done{0};

        oneapi::tbb::task_group tg;
        // Submit one task per two items; tasks that draw an odd index re-submit themselves.
        for (std::size_t i = 0; i < items.size(); i += 2) {
            tg.run(ChildTask(items, result, tasks_left, tasks_done, tg));
        }
        tg.wait();
    }
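
A manual continuation like the one above is only needed when the follow-up work must itself be
scheduled as a task. When it is enough to run it once the whole group finishes, the code placed
after ``oneapi::tbb::task_group::wait()`` already plays that role, because ``wait()`` returns only
after all tasks submitted to the group have completed. A simplified sketch of the example above
under that assumption, with a trivial stand-in for the item production:

.. code:: cpp

    #include <cstddef>
    #include <vector>
    #include <oneapi/tbb/task_group.h>

    int main() {
        int result = 0;
        std::vector<int> items(10, 0);

        oneapi::tbb::task_group tg;
        for (std::size_t i = 0; i < items.size(); ++i) {
            // The assignment is a trivial stand-in for produce_item_for(i);
            // each task writes its own element, so no extra synchronization is needed.
            tg.run([i, &items] { items[i] = static_cast<int>(i); });
        }
        tg.wait();

        // The "continuation": executed strictly after all producing tasks have finished.
        for (int item : items)
            result += item;
        return 0;
    }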