.. _Task-Based_Programming:

Task-Based Programming
======================


When striving for performance, programming in terms of threads can be a
poor way to do multithreaded programming. It is much better to formulate
your program in terms of *logical tasks*, not threads, for several
reasons.


-  Matching parallelism to available resources


-  Faster task startup and shutdown


-  More efficient evaluation order


-  Improved load balancing


-  Higher-level thinking


The following paragraphs explain these points in detail.


The threads you create with a threading package are *logical threads*,
which map onto the *physical threads* of the hardware. For computations
that do not wait on external devices, the highest efficiency usually
occurs when there is exactly one running logical thread per physical
thread. Otherwise, the mismatch causes inefficiencies.
*Undersubscription* occurs when there are not enough running logical
threads to keep the physical threads working. *Oversubscription* occurs
when there are more running logical threads than physical threads.
Oversubscription usually leads to *time-sliced* execution of logical
threads, which incurs overheads as discussed in Appendix A, *Costs of
Time Slicing*. The task scheduler tries to avoid oversubscription by
keeping one logical thread per physical thread and mapping tasks to
logical threads in a way that tolerates interference by other threads
from the same or other processes.

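
As a rough illustration of matching logical threads to physical threads,
the sketch below compares the hardware thread count reported by the C++
standard library with the concurrency level that oneTBB reports for the
calling thread's task arena. The ``ConcurrencyReport`` type and the
``query_concurrency`` helper are hypothetical names introduced only for
this example. On an otherwise unconstrained process the two values are
often equal, but that is an assumption, not a guarantee.

.. code:: cpp

   #include <oneapi/tbb/task_arena.h>
   #include <thread>

   // Hypothetical helper type, for illustration only.
   struct ConcurrencyReport {
       unsigned hardware_threads;  // 0 if the standard library cannot tell
       int arena_concurrency;      // concurrency of the current task arena
   };

   // Collects both values so that they can be compared side by side.
   inline ConcurrencyReport query_concurrency() {
       ConcurrencyReport report;
       report.hardware_threads = std::thread::hardware_concurrency();
       report.arena_concurrency =
           oneapi::tbb::this_task_arena::max_concurrency();
       return report;
   }

A program that relies on the task scheduler normally does not need these
numbers at all. They are shown here only to make the mapping visible.
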

The key advantage of tasks versus logical threads is that tasks are much
*lighter weight* than logical threads. On Linux systems, starting and
terminating a task is about 18 times faster than starting and
terminating a thread. On Windows systems, the ratio is more than 100.
This is because a thread has its own copy of a lot of resources, such as
register state and a stack. On Linux, a thread even has its own process
ID. A task in |full_name|, in contrast, is typically a small routine
that cannot be preempted at the task level (though its logical thread
can be preempted).


Tasks in oneTBB are also efficient because *the scheduler is unfair*.
Thread schedulers typically distribute time slices in a round-robin
fashion. This distribution is called "fair" because each logical thread
gets its fair share of time. Thread schedulers are typically fair
because fairness is the safest strategy to adopt without understanding
the higher-level organization of a program. In task-based programming,
the task scheduler does have some higher-level information, and so can
sacrifice fairness for efficiency. Indeed, it often delays starting a
task until that task can make useful progress.


The scheduler does *load balancing*. In addition to using the right
number of threads, it is important to distribute work evenly across
those threads. As long as you break your program into enough small
tasks, the scheduler usually does a good job of assigning tasks to
threads to balance the load. With thread-based programming, you are
often stuck dealing with load balancing yourself, which can be tricky to
get right.


.. tip::
   Design your programs to try to create many more tasks than there are
   threads, and let the task scheduler choose the mapping from tasks to
   threads.

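
One way to follow this tip is sketched below with
``oneapi::tbb::parallel_for`` over a ``blocked_range``. The function name
``square_all`` and the grain size of 1024 are assumptions chosen for
illustration; the grain size is not a tuned value. For a large vector,
the range is split into many more chunks than there are threads, and the
scheduler decides which thread runs each chunk.

.. code:: cpp

   #include <oneapi/tbb/blocked_range.h>
   #include <oneapi/tbb/parallel_for.h>
   #include <cstddef>
   #include <vector>

   // Hypothetical example: squares every element of the vector in place.
   inline void square_all(std::vector<double>& data) {
       using range_t = oneapi::tbb::blocked_range<std::size_t>;
       // The third argument is the grain size. Ranges larger than this
       // may be split further, so a large input yields many small tasks.
       oneapi::tbb::parallel_for(
           range_t(0, data.size(), 1024),
           [&data](const range_t& r) {
               // Each invocation handles one contiguous chunk.
               for (std::size_t i = r.begin(); i != r.end(); ++i) {
                   data[i] *= data[i];
               }
           });
   }

The code never mentions a thread count. It describes only the work and
how finely the work may be divided.
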

Finally, the main advantage of using tasks instead of threads is that
they let you think at a higher, task-based level. With thread-based
programming, you are forced to think at the low level of physical
threads to get good efficiency, because you need one logical thread per
physical thread to avoid undersubscription or oversubscription. You also
have to deal with the relatively coarse grain of threads. With tasks,
you can concentrate on the logical dependences between tasks, and leave
the efficient scheduling to the scheduler.

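
To make the idea of logical dependences concrete, here is a minimal
sketch that uses ``oneapi::tbb::task_group``. The function
``parallel_sum`` and the cutoff of 4096 elements are hypothetical and
chosen only for illustration. The code states one dependence, namely that
the total needs both partial sums, and leaves the placement of the work
on threads to the scheduler.

.. code:: cpp

   #include <oneapi/tbb/task_group.h>
   #include <cstddef>
   #include <numeric>
   #include <vector>

   // Hypothetical example: sums the elements of v in [begin, end).
   inline long long parallel_sum(const std::vector<int>& v,
                                 std::size_t begin, std::size_t end) {
       const std::size_t cutoff = 4096;  // illustrative, not tuned
       if (end - begin <= cutoff) {
           // Small ranges are summed serially.
           return std::accumulate(v.begin() + begin, v.begin() + end, 0LL);
       }
       const std::size_t mid = begin + (end - begin) / 2;
       long long left = 0;
       oneapi::tbb::task_group group;
       // The left half becomes a task that any worker thread may run.
       group.run([&] { left = parallel_sum(v, begin, mid); });
       // The right half is computed by the current thread.
       const long long right = parallel_sum(v, mid, end);
       // The total depends on both halves, so wait for the left one.
       group.wait();
       return left + right;
   }

Calling ``parallel_sum(v, 0, v.size())`` sums the whole vector. Nothing
in the function refers to threads or to their number.
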