1*67c11716SAlexey Oralov.. _Parallelizing_Flow_Graph:
2*67c11716SAlexey Oralov
3*67c11716SAlexey OralovParallelizing Data Flow and Dependency Graphs
4*67c11716SAlexey Oralov=============================================
5*67c11716SAlexey Oralov
6*67c11716SAlexey Oralov
7*67c11716SAlexey OralovIn addition to loop parallelism, the |full_name| library also supports graph parallelism.
8*67c11716SAlexey OralovIt's possible to create graphs that are highly scalable, but it is also possible to
9*67c11716SAlexey Oralovcreate graphs that are completely sequential.
10*67c11716SAlexey Oralov
11*67c11716SAlexey Oralov
12*67c11716SAlexey OralovUsing graph parallelism, computations are represented by nodes and the
13*67c11716SAlexey Oralovcommunication channels between these computations are represented by
14*67c11716SAlexey Oralovedges. When a node in the graph receives a message, a task is spawned to
15*67c11716SAlexey Oralovexecute its body object on the incoming message. Messages flow through
16*67c11716SAlexey Oralovthe graph across the edges that connect the nodes. The following
17*67c11716SAlexey Oralovsections present two examples of applications that can be expressed as
18*67c11716SAlexey Oralovgraphs.
19*67c11716SAlexey Oralov
20*67c11716SAlexey Oralov
21*67c11716SAlexey OralovThe following figure shows a *streaming* or *data flow* application
22*67c11716SAlexey Oralovwhere a sequence of values is processed as each value passes through the
23*67c11716SAlexey Oralovnodes in the graph. In this example, the sequence is created by a
24*67c11716SAlexey Oralovfunction F. For each value in the sequence, G squares the value and H
25*67c11716SAlexey Oralovcubes the value. J then takes each of the squared and cubed values and
26*67c11716SAlexey Oralovadds them to a global sum. After all values in the sequence are
27*67c11716SAlexey Oralovcompletely processed, sum is equal to the sum of the sequence of squares
28*67c11716SAlexey Oralovand cubes from 1 to 10. In a streaming or data flow graph, the values
29*67c11716SAlexey Oralovactually flow across the edges; the output of one node becomes the input
30*67c11716SAlexey Oralovof its successor(s).
31*67c11716SAlexey Oralov
32*67c11716SAlexey Oralov
33*67c11716SAlexey Oralov.. container:: fignone
34*67c11716SAlexey Oralov   :name: simple_data_flow_title
35*67c11716SAlexey Oralov
36*67c11716SAlexey Oralov
37*67c11716SAlexey Oralov   **Simple Data Flow Graph**
38*67c11716SAlexey Oralov
39*67c11716SAlexey Oralov
40*67c11716SAlexey Oralov   .. container:: imagecenter
41*67c11716SAlexey Oralov
42*67c11716SAlexey Oralov
43*67c11716SAlexey Oralov      |image0|
44*67c11716SAlexey Oralov
45*67c11716SAlexey Oralov
46*67c11716SAlexey OralovThe following graphic shows a different form of graph application. In
47*67c11716SAlexey Oralovthis example, a dependence graph is used to establish a partial ordering
48*67c11716SAlexey Oralovamong the steps for making a peanut butter and jelly sandwich. In this
49*67c11716SAlexey Oralovpartial ordering, you must first get the bread before spreading the
50*67c11716SAlexey Oralovpeanut butter or jelly on the bread. You must spread on the peanut
51*67c11716SAlexey Oralovbutter before you put away the peanut butter jar, and likewise spread on
52*67c11716SAlexey Oralovthe jelly before you put away the jelly jar. And, you need to spread on
53*67c11716SAlexey Oralovboth the peanut butter and jelly before putting the two slices of bread
54*67c11716SAlexey Oralovtogether. This is a partial ordering because, for example, it doesn't
55*67c11716SAlexey Oralovmatter if you spread on the peanut butter first or the jelly first. It
56*67c11716SAlexey Oralovalso doesn't matter if you finish making the sandwich before putting
57*67c11716SAlexey Oralovaway the jars.
58*67c11716SAlexey Oralov
59*67c11716SAlexey Oralov
60*67c11716SAlexey Oralov.. container:: fignone
61*67c11716SAlexey Oralov   :name: dependence_graph_make_sandwitch
62*67c11716SAlexey Oralov
63*67c11716SAlexey Oralov
64*67c11716SAlexey Oralov   **Dependence Graph for Making a Sandwich**
65*67c11716SAlexey Oralov
66*67c11716SAlexey Oralov
67*67c11716SAlexey Oralov   .. container:: imagecenter
68*67c11716SAlexey Oralov
69*67c11716SAlexey Oralov
70*67c11716SAlexey Oralov      |image1|
71*67c11716SAlexey Oralov
72*67c11716SAlexey Oralov
73*67c11716SAlexey OralovWhile it can be inferred that resources, such as the bread, or the jelly
74*67c11716SAlexey Oralovjar, are shared between ordered steps, it is not explicit in the graph.
75*67c11716SAlexey OralovInstead, only the required ordering of steps is explicit in a dependence
76*67c11716SAlexey Oralovgraph. For example, you must "Put jelly on 1 slice" **before** you "Put
77*67c11716SAlexey Oralovaway jelly jar".
78*67c11716SAlexey Oralov
79*67c11716SAlexey Oralov
80*67c11716SAlexey OralovThe flow graph interface in the oneTBB library allows you to express
81*67c11716SAlexey Oralovdata flow and dependence graphs such as these, as well as more
82*67c11716SAlexey Oralovcomplicated graphs that include cycles, conditionals, buffering and
83*67c11716SAlexey Oralovmore. If you express your application using the flow graph interface,
84*67c11716SAlexey Oralovthe runtime library spawns tasks to exploit the parallelism that is
85*67c11716SAlexey Oralovpresent in the graph. For example, in the first example above, perhaps
86*67c11716SAlexey Oralovtwo different values might be squared in parallel, or the same value
87*67c11716SAlexey Oralovmight be squared and cubed in parallel. Likewise in the second example,
88*67c11716SAlexey Oralovthe peanut butter might be spread on one slice of bread in parallel with
89*67c11716SAlexey Oralovthe jelly being spread on the other slice. The interface expresses what
90*67c11716SAlexey Oralovis legal to execute in parallel, but allows the runtime library to
91*67c11716SAlexey Oralovchoose at runtime what will be executed in parallel.
92*67c11716SAlexey Oralov
93*67c11716SAlexey Oralov
94*67c11716SAlexey OralovThe support for graph parallelism is contained within the namespace
95*67c11716SAlexey Oralov``oneapi::tbb::flow`` and is defined in the ``flow_graph.h`` header file.
96*67c11716SAlexey Oralov
97*67c11716SAlexey Oralov
98*67c11716SAlexey Oralov.. |image0| image:: Images/flow_graph.jpg
99*67c11716SAlexey Oralov.. |image1| image:: Images/flow_graph_complex.jpg
100*67c11716SAlexey Oralov   :width: 440px
101*67c11716SAlexey Oralov   :height: 337px
102*67c11716SAlexey Oralov
103