1*67c11716SAlexey Oralov.. _Task-Based_Programming: 2*67c11716SAlexey Oralov 3*67c11716SAlexey OralovTask-Based Programming 4*67c11716SAlexey Oralov====================== 5*67c11716SAlexey Oralov 6*67c11716SAlexey Oralov 7*67c11716SAlexey OralovWhen striving for performance, programming in terms of threads can be a 8*67c11716SAlexey Oralovpoor way to do multithreaded programming. It is much better to formulate 9*67c11716SAlexey Oralovyour program in terms of *logical tasks*, not threads, for several 10*67c11716SAlexey Oralovreasons. 11*67c11716SAlexey Oralov 12*67c11716SAlexey Oralov 13*67c11716SAlexey Oralov- Matching parallelism to available resources 14*67c11716SAlexey Oralov 15*67c11716SAlexey Oralov 16*67c11716SAlexey Oralov- Faster task startup and shutdown 17*67c11716SAlexey Oralov 18*67c11716SAlexey Oralov 19*67c11716SAlexey Oralov- More efficient evaluation order 20*67c11716SAlexey Oralov 21*67c11716SAlexey Oralov 22*67c11716SAlexey Oralov- Improved load balancing 23*67c11716SAlexey Oralov 24*67c11716SAlexey Oralov 25*67c11716SAlexey Oralov- Higher–level thinking 26*67c11716SAlexey Oralov 27*67c11716SAlexey Oralov 28*67c11716SAlexey OralovThe following paragraphs explain these points in detail. 29*67c11716SAlexey Oralov 30*67c11716SAlexey Oralov 31*67c11716SAlexey OralovThe threads you create with a threading package are *logical* threads, 32*67c11716SAlexey Oralovwhich map onto the *physical threads* of the hardware. For computations 33*67c11716SAlexey Oralovthat do not wait on external devices, highest efficiency usually occurs 34*67c11716SAlexey Oralovwhen there is exactly one running logical thread per physical thread. 35*67c11716SAlexey OralovOtherwise, there can be inefficiencies from the mismatch\ *. 36*67c11716SAlexey OralovUndersubscription* occurs when there are not enough running logical 37*67c11716SAlexey Oralovthreads to keep the physical threads working. *Oversubscription* occurs 38*67c11716SAlexey Oralovwhen there are more running logical threads than physical threads. 39*67c11716SAlexey OralovOversubscription usually leads to *time sliced* execution of logical 40*67c11716SAlexey Oralovthreads, which incurs overheads as discussed in Appendix A, *Costs of 41*67c11716SAlexey OralovTime Slicing*. The scheduler tries to avoid oversubscription, by having 42*67c11716SAlexey Oralovone logical thread per physical thread, and mapping tasks to logical 43*67c11716SAlexey Oralovthreads, in a way that tolerates interference by other threads from the 44*67c11716SAlexey Oralovsame or other processes. 45*67c11716SAlexey Oralov 46*67c11716SAlexey Oralov 47*67c11716SAlexey OralovThe key advantage of tasks versus logical threads is that tasks are much 48*67c11716SAlexey Oralov*lighter weight* than logical threads. On Linux systems, starting and 49*67c11716SAlexey Oralovterminating a task is about 18 times faster than starting and 50*67c11716SAlexey Oralovterminating a thread. On Windows systems, the ratio is more than 100. 51*67c11716SAlexey OralovThis is because a thread has its own copy of a lot of resources, such as 52*67c11716SAlexey Oralovregister state and a stack. On Linux, a thread even has its own process 53*67c11716SAlexey Oralovid. A task in |full_name|, in contrast, is 54*67c11716SAlexey Oralovtypically a small routine, and also, cannot be preempted at the task 55*67c11716SAlexey Oralovlevel (though its logical thread can be preempted). 56*67c11716SAlexey Oralov 57*67c11716SAlexey Oralov 58*67c11716SAlexey OralovTasks in oneTBB are efficient too because *the scheduler is unfair*. Thread schedulers typically 59*67c11716SAlexey Oralovdistribute time slices in a round-robin fashion. This distribution is 60*67c11716SAlexey Oralovcalled "fair", because each logical thread gets its fair share of time. 61*67c11716SAlexey OralovThread schedulers are typically fair because it is the safest strategy 62*67c11716SAlexey Oralovto undertake without understanding the higher-level organization of a 63*67c11716SAlexey Oralovprogram. In task-based programming, the task scheduler does have some 64*67c11716SAlexey Oralovhigher-level information, and so can sacrifice fairness for efficiency. 65*67c11716SAlexey OralovIndeed, it often delays starting a task until it can make useful 66*67c11716SAlexey Oralovprogress. 67*67c11716SAlexey Oralov 68*67c11716SAlexey Oralov 69*67c11716SAlexey OralovThe scheduler does *load balancing*. In addition to using the right 70*67c11716SAlexey Oralovnumber of threads, it is important to distribute work evenly across 71*67c11716SAlexey Oralovthose threads. As long as you break your program into enough small 72*67c11716SAlexey Oralovtasks, the scheduler usually does a good job of assigning tasks to 73*67c11716SAlexey Oralovthreads to balance load. With thread-based programming, you are often 74*67c11716SAlexey Oralovstuck dealing with load-balancing yourself, which can be tricky to get 75*67c11716SAlexey Oralovright. 76*67c11716SAlexey Oralov 77*67c11716SAlexey Oralov 78*67c11716SAlexey Oralov.. tip:: 79*67c11716SAlexey Oralov Design your programs to try to create many more tasks than there are 80*67c11716SAlexey Oralov threads, and let the task scheduler choose the mapping from tasks to 81*67c11716SAlexey Oralov threads. 82*67c11716SAlexey Oralov 83*67c11716SAlexey Oralov 84*67c11716SAlexey OralovFinally, the main advantage of using tasks instead of threads is that 85*67c11716SAlexey Oralovthey let you think at a higher, task-based, level. With thread-based 86*67c11716SAlexey Oralovprogramming, you are forced to think at the low level of physical 87*67c11716SAlexey Oralovthreads to get good efficiency, because you have one logical thread per 88*67c11716SAlexey Oralovphysical thread to avoid undersubscription or oversubscription. You also 89*67c11716SAlexey Oralovhave to deal with the relatively coarse grain of threads. With tasks, 90*67c11716SAlexey Oralovyou can concentrate on the logical dependences between tasks, and leave 91*67c11716SAlexey Oralovthe efficient scheduling to the scheduler. 92*67c11716SAlexey Oralov 93