Your work function finishes too quickly:
In [2]: %timeit func(1)
335 µs ± 12.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
So you are basically measuring the overhead of multiprocessing.
Change your work function to do more work, for example looping 1000 * 1000 times rather than 1000 times, and you will see it scale again. 1000000 loops cost roughly 0.4s on my Mac, which is high enough compared to the overhead.
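If you want to check the per-call cost on your own machine outside IPython, here is a minimal sketch using the standard `timeit` module. The `work` function and `per_call_seconds` helper are hypothetical stand-ins of mine, not your actual code, and the timings will differ from the numbers above.

```python
import timeit


def work(n_loops):
    """Hypothetical stand-in for the work function: a pure CPU-bound loop."""
    total = 0
    for i in range(n_loops):
        total += i
    return total


def per_call_seconds(n_loops, repeat=5):
    """Best-of-`repeat` wall time, in seconds, for a single work(n_loops) call."""
    return min(timeit.repeat(lambda: work(n_loops), number=1, repeat=repeat))


if __name__ == "__main__":
    # Compare a call that is too short with one that does real work.
    for n_loops in (1_000, 1_000_000):
        print(f"work({n_loops}): {per_call_seconds(n_loops):.6f} s per call")
```

If the smaller figure is in the same range as the cost of dispatching one task to a worker, the parallel version has nothing to gain.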
I tested different values of n on my Mac, using Pool(4) since I have 4 cores. Each test ran only once rather than multiple times as with %timeit, because the run-to-run difference is insignificant.
You can see that the speedup ratio increases proportionally with n, because the overhead of multiprocessing is shared across the work function calls.
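A comparison along these lines could be sketched as follows. This is not the exact script I ran: `work`, `compare`, the task count of 8 and the chosen loop counts are all assumptions for illustration, and the parallel timing here also includes the cost of starting the pool.

```python
import time
from multiprocessing import Pool


def work(n_loops):
    """Hypothetical stand-in for the work function: a pure CPU-bound loop."""
    total = 0
    for i in range(n_loops):
        total += i
    return total


def compare(n_loops, n_tasks=8, processes=4):
    """Return (serial_seconds, parallel_seconds) for n_tasks calls of work(n_loops)."""
    args = [n_loops] * n_tasks

    # Serial baseline: plain calls in the current process.
    start = time.perf_counter()
    serial = [work(a) for a in args]
    serial_s = time.perf_counter() - start

    # Parallel run: includes pool startup and per-task dispatch overhead.
    start = time.perf_counter()
    with Pool(processes) as pool:
        parallel = pool.map(work, args)
    parallel_s = time.perf_counter() - start

    assert serial == parallel  # both paths must compute the same results
    return serial_s, parallel_s


if __name__ == "__main__":
    for n_loops in (1_000, 100_000, 1_000_000):
        serial_s, parallel_s = compare(n_loops)
        ratio = serial_s / parallel_s
        print(f"n={n_loops:>9}: serial {serial_s:.4f}s, "
              f"parallel {parallel_s:.4f}s, ratio {ratio:.2f}")
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module.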
The math behind it, assuming the per-call overhead is equal:
If we want ratio > 1:
Approximately:
Which means that if the work function runs too fast compared with the per-call overhead, multiprocessing does not scale.
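One way to write this reasoning out is sketched below. The symbols are my own assumptions: m is the number of work function calls, t_w the time one call takes (proportional to n), t_o the per-call multiprocessing overhead, and p > 1 the number of worker processes. It is a simplified model that ignores pool startup and uneven scheduling.

```latex
\begin{align*}
T_{\text{serial}}   &= m \, t_w \\
T_{\text{parallel}} &\approx \frac{m \,(t_w + t_o)}{p} \\
\text{ratio} &= \frac{T_{\text{serial}}}{T_{\text{parallel}}}
              \approx \frac{p \, t_w}{t_w + t_o} \\
\text{ratio} > 1 &\iff (p - 1)\, t_w > t_o \iff t_w > \frac{t_o}{p - 1} \\
t_w \ll t_o &\implies \text{ratio} \approx \frac{p \, t_w}{t_o}
\end{align*}
```

Under this model, with p = 4 the work per call has to exceed about a third of the per-call overhead before multiprocessing pays off at all. The last line matches the observation that the ratio grows roughly in proportion to n while the work per call is still small.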