1*67c11716SAlexey Oralov.. _Task-Based_Programming:
2*67c11716SAlexey Oralov
3*67c11716SAlexey OralovTask-Based Programming
4*67c11716SAlexey Oralov======================
5*67c11716SAlexey Oralov
6*67c11716SAlexey Oralov
7*67c11716SAlexey OralovWhen striving for performance, programming in terms of threads can be a
8*67c11716SAlexey Oralovpoor way to do multithreaded programming. It is much better to formulate
9*67c11716SAlexey Oralovyour program in terms of *logical tasks*, not threads, for several
10*67c11716SAlexey Oralovreasons.
11*67c11716SAlexey Oralov
12*67c11716SAlexey Oralov
13*67c11716SAlexey Oralov-  Matching parallelism to available resources
14*67c11716SAlexey Oralov
15*67c11716SAlexey Oralov
16*67c11716SAlexey Oralov-  Faster task startup and shutdown
17*67c11716SAlexey Oralov
18*67c11716SAlexey Oralov
19*67c11716SAlexey Oralov-  More efficient evaluation order
20*67c11716SAlexey Oralov
21*67c11716SAlexey Oralov
22*67c11716SAlexey Oralov-  Improved load balancing
23*67c11716SAlexey Oralov
24*67c11716SAlexey Oralov
25*67c11716SAlexey Oralov-  Higher–level thinking
26*67c11716SAlexey Oralov
27*67c11716SAlexey Oralov
28*67c11716SAlexey OralovThe following paragraphs explain these points in detail.
29*67c11716SAlexey Oralov
30*67c11716SAlexey Oralov
31*67c11716SAlexey OralovThe threads you create with a threading package are *logical* threads,
32*67c11716SAlexey Oralovwhich map onto the *physical threads* of the hardware. For computations
33*67c11716SAlexey Oralovthat do not wait on external devices, highest efficiency usually occurs
34*67c11716SAlexey Oralovwhen there is exactly one running logical thread per physical thread.
35*67c11716SAlexey OralovOtherwise, there can be inefficiencies from the mismatch\ *.
36*67c11716SAlexey OralovUndersubscription* occurs when there are not enough running logical
37*67c11716SAlexey Oralovthreads to keep the physical threads working. *Oversubscription* occurs
38*67c11716SAlexey Oralovwhen there are more running logical threads than physical threads.
39*67c11716SAlexey OralovOversubscription usually leads to *time sliced* execution of logical
40*67c11716SAlexey Oralovthreads, which incurs overheads as discussed in Appendix A, *Costs of
41*67c11716SAlexey OralovTime Slicing*. The scheduler tries to avoid oversubscription, by having
42*67c11716SAlexey Oralovone logical thread per physical thread, and mapping tasks to logical
43*67c11716SAlexey Oralovthreads, in a way that tolerates interference by other threads from the
44*67c11716SAlexey Oralovsame or other processes.
45*67c11716SAlexey Oralov
46*67c11716SAlexey Oralov
47*67c11716SAlexey OralovThe key advantage of tasks versus logical threads is that tasks are much
48*67c11716SAlexey Oralov*lighter weight* than logical threads. On Linux systems, starting and
49*67c11716SAlexey Oralovterminating a task is about 18 times faster than starting and
50*67c11716SAlexey Oralovterminating a thread. On Windows systems, the ratio is more than 100.
51*67c11716SAlexey OralovThis is because a thread has its own copy of a lot of resources, such as
52*67c11716SAlexey Oralovregister state and a stack. On Linux, a thread even has its own process
53*67c11716SAlexey Oralovid. A task in |full_name|, in contrast, is
54*67c11716SAlexey Oralovtypically a small routine, and also, cannot be preempted at the task
55*67c11716SAlexey Oralovlevel (though its logical thread can be preempted).
56*67c11716SAlexey Oralov
57*67c11716SAlexey Oralov
58*67c11716SAlexey OralovTasks in oneTBB are efficient too because *the scheduler is unfair*. Thread schedulers typically
59*67c11716SAlexey Oralovdistribute time slices in a round-robin fashion. This distribution is
60*67c11716SAlexey Oralovcalled "fair", because each logical thread gets its fair share of time.
61*67c11716SAlexey OralovThread schedulers are typically fair because it is the safest strategy
62*67c11716SAlexey Oralovto undertake without understanding the higher-level organization of a
63*67c11716SAlexey Oralovprogram. In task-based programming, the task scheduler does have some
64*67c11716SAlexey Oralovhigher-level information, and so can sacrifice fairness for efficiency.
65*67c11716SAlexey OralovIndeed, it often delays starting a task until it can make useful
66*67c11716SAlexey Oralovprogress.
67*67c11716SAlexey Oralov
68*67c11716SAlexey Oralov
69*67c11716SAlexey OralovThe scheduler does *load balancing*. In addition to using the right
70*67c11716SAlexey Oralovnumber of threads, it is important to distribute work evenly across
71*67c11716SAlexey Oralovthose threads. As long as you break your program into enough small
72*67c11716SAlexey Oralovtasks, the scheduler usually does a good job of assigning tasks to
73*67c11716SAlexey Oralovthreads to balance load. With thread-based programming, you are often
74*67c11716SAlexey Oralovstuck dealing with load-balancing yourself, which can be tricky to get
75*67c11716SAlexey Oralovright.
76*67c11716SAlexey Oralov
77*67c11716SAlexey Oralov
78*67c11716SAlexey Oralov.. tip::
79*67c11716SAlexey Oralov   Design your programs to try to create many more tasks than there are
80*67c11716SAlexey Oralov   threads, and let the task scheduler choose the mapping from tasks to
81*67c11716SAlexey Oralov   threads.
82*67c11716SAlexey Oralov
83*67c11716SAlexey Oralov
84*67c11716SAlexey OralovFinally, the main advantage of using tasks instead of threads is that
85*67c11716SAlexey Oralovthey let you think at a higher, task-based, level. With thread-based
86*67c11716SAlexey Oralovprogramming, you are forced to think at the low level of physical
87*67c11716SAlexey Oralovthreads to get good efficiency, because you have one logical thread per
88*67c11716SAlexey Oralovphysical thread to avoid undersubscription or oversubscription. You also
89*67c11716SAlexey Oralovhave to deal with the relatively coarse grain of threads. With tasks,
90*67c11716SAlexey Oralovyou can concentrate on the logical dependences between tasks, and leave
91*67c11716SAlexey Oralovthe efficient scheduling to the scheduler.
92*67c11716SAlexey Oralov
93