MATLAB parfor is slower than for — what is wrong?

The overhead of partitioning the work and gathering the results from the several workers/cores is high relative to the work itself for small values of nt. This is normal: you would not partition data for easy tasks that can be performed quickly in a simple loop.

Make sure the loop body does enough work to be worth the partitioning overhead. Here is a nice introduction to parallel programming.

The workers come from a pool that is already running, so the cost of starting them should not be part of the overhead. But to build the partial results, n matrices of the same size as bistar must be created, all the partial results computed, and then all these partial results added together (recombining). In a plain loop, the accumulation is very likely done in place, with no extra allocations.
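To make the recombination step concrete, here is a minimal sketch of the same accumulation written as a for loop and as a parfor loop. The sizes nx, ny, nt and the array data are made up for illustration and are not taken from the question. It assumes the Parallel Computing Toolbox is available; without it, parfor simply runs serially.

```matlab
% Illustrative sizes only (assumptions, not from the original post).
nx = 200; ny = 200; nt = 50;
data = rand(nx, ny, nt);

% Plain loop: a single accumulator is updated on every iteration.
accSerial = zeros(nx, ny);
for n = 1:nt
    accSerial = accSerial + data(:, :, n);
end

% parfor: accPar is a reduction variable. Each worker builds its own
% partial sum, and the partial sums are combined after the loop.
accPar = zeros(nx, ny);
parfor n = 1:nt
    accPar = accPar + data(:, :, n);
end

% The order of additions may differ, so compare with a tolerance
% instead of testing for exact equality.
maxDiff = max(abs(accSerial(:) - accPar(:)));
fprintf('largest difference: %g\n', maxDiff);
```

Both loops compute the same sum up to rounding, but the parfor version also has to send slices of data to the workers and bring the partial sums back, which is the extra cost described above.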

The complete statement in the help (thanks for the link below) is:

If the time to compute f, g, and h is large, parfor will be significantly faster than the corresponding for statement, even if n is relatively small.

So they mean exactly what I mean: for small values of n, the overhead is only worth paying if what you do in the loop is complex or time-consuming enough.
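A rough way to check this on your own machine is to time a cheap and an expensive loop body with the same small iteration count. This is only a sketch: the iteration count, the matrix size, and the choice of svd as the "expensive" work are arbitrary assumptions, and the actual timings depend on your hardware and number of workers.

```matlab
% Open the pool before timing, so that pool startup is not measured.
if isempty(gcp('nocreate'))
    parpool;
end

n = 8;              % small iteration count (illustrative)
out = zeros(1, n);

% Cheap body: very little work per iteration.
tic
for k = 1:n
    out(k) = k^2;
end
tForCheap = toc;

tic
parfor k = 1:n
    out(k) = k^2;
end
tParCheap = toc;

% Expensive body: each iteration does a sizeable computation.
tic
for k = 1:n
    out(k) = max(svd(rand(600)));
end
tForHeavy = toc;

tic
parfor k = 1:n
    out(k) = max(svd(rand(600)));
end
tParHeavy = toc;

fprintf('cheap body: for %.4f s, parfor %.4f s\n', tForCheap, tParCheap);
fprintf('heavy body: for %.4f s, parfor %.4f s\n', tForHeavy, tParHeavy);
```

With the cheap body you would typically expect parfor to lose, since nearly all of its time goes to distributing work and collecting results. With the heavy body that fixed cost is spread over much more computation.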
