1.. _estimate_flow_graph_performance:
2
3Estimating Flow Graph Performance
4=================================
5
6
7The performance or scalability of a flow graph is not easy to predict.
8However there are a few key points that can guide you in estimating the
9limits on performance and speedup of some graphs.
10
11
12.. container:: section
13
14
15   .. rubric:: The Critical Path Limits the Scalability in a Dependence
16      Graph
17      :class: sectiontitle
18
19   A critical path is the most time consuming path from a node with no
20   predecessors to a node with no successors. In a dependence graph, the
21   execution of the nodes along a path cannot be overlapped since they
22   have a strict ordering. Therefore, for a dependence graph, the
23   critical path limits scalability.
24
25
26   More formally, let T be the total time consumed by all of the nodes
27   in your graph if executed sequentially. Then let C be the time
28   consumed along the path that takes the most time. The nodes along
29   this path cannot be overlapped even in a parallel execution.
30   Therefore, even if all other paths are executed in parallel with C,
31   the wall clock time for the parallel execution is at least C, and the
32   maximum possible speedup (ignoring microarchitectural and memory
33   effects) is T/C.
34
35
36.. container:: section
37
38
39   .. rubric:: There is Overhead in Spawning a Node's Body as a Task
40      :class: sectiontitle
41
42   The bodies of ``input_nodes``, ``function_nodes``, ``continue_nodes`` and
43   ``multifunction_nodes`` execute within spawned tasks by default. This
44   means that you need to take into account the overhead of task
45   scheduling when estimating the time it takes for a node to execute
46   its body. All of the rules of thumb for determining the appropriate
47   granularity of tasks therefore also apply to node bodies as well. If
48   you have many fine-grained nodes in your flow graph, the impact of
49   these overheads can noticeably impact your performance. However,
50   depending on the graph structure, you can reduce such overheads by
51   using lightweight policy with these nodes.
52
53