Lines Matching refs:a
7 Chunking is controlled by a *partitioner* and a *grainsize.*\ To gain
35 void ParallelApplyFoo( float a[], size_t n ) {
36 parallel_for(blocked_range<size_t>(0,n,G), ApplyFoo(a),
41 The grainsize sets a minimum threshold for parallelization. The
44 iterations in a chunk. Using ``simple_partitioner`` guarantees that
56 chunks with less than [G/2] iterations. Specifying a range with an
82 useful work as the gray area inside a brown border that represents
84 shows how too small a grainsize leads to a relatively high proportion of
85 overhead. Case B shows how a large grainsize reduces this proportion, at
86 the cost of reducing potential parallelism. The overhead as a fraction
89 number of processors when setting a grainsize.
93 should take at least 100,000 clock cycles to execute. For example, if a
103 step 3 will guide you to a much smaller value.
113 A drawback of setting a grainsize too high is that it can reduce
117 on the side of being a little too high instead of a little too low,
118 because too low a value hurts serial performance, which in turns hurts
128 versus grainsize, based on the floating point ``a[i]=b[i]*c``
129 computation over a million indices. There is little work per iteration.
130 The times were collected on a four-socket machine with eight hardware
143 that with a grainsize of one, most of the overhead is parallel
144 scheduling overhead, not useful work. An increase in grainsize brings a
146 because the parallel overhead becomes insignificant for a sufficiently
149 threads. Notice that a grainsize over the wide range 100-100,000 works
156 iteration of an outer loop is likely to provide a bigger grain of