cProfile causes pickling error when running multiprocessing Python code

The problem is that when you run with -mcProfile, the __main__ module is cProfile (the actual entry point of the program), not your script. cProfile compensates by making sure your script sees __name__ as "__main__" when it runs, so the script knows it’s being run rather than imported, but sys.modules['__main__'] is still the cProfile module.
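To make the mismatch concrete, here is a rough simulation of it. This is not cProfile’s actual code; it only assumes that a runner executes the script source in fresh globals whose __name__ is "__main__" while leaving sys.modules['__main__'] alone. The script body and the name worker are hypothetical.

```python
import sys

# Hypothetical script body, standing in for the file handed to cProfile.
SCRIPT_SOURCE = "def worker(x):\n    return x * 2\n"

# Simulate the runner: execute the source in fresh globals that claim to be
# "__main__", without touching sys.modules['__main__'].
script_globals = {"__name__": "__main__"}
exec(compile(SCRIPT_SOURCE, "<hypothetical script>", "exec"), script_globals)

worker_fn = script_globals["worker"]

# The function records "__main__" as its home module ...
claims_main = worker_fn.__module__ == "__main__"
# ... but the module actually registered under that name does not expose it.
visible_on_main = getattr(sys.modules["__main__"], "worker", None) is worker_fn

print(claims_main, visible_on_main)
```

The function claims to live in __main__, yet looking it up on the registered __main__ module does not find it, which is the situation described above.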

The catch is that pickle handles functions by reference: it pickles only the module name and qualified name (plus some boilerplate to say it’s a function in the first place). To make sure the reference will survive the round trip, it always double checks that the qualified name can be looked up on the module registered in sys.modules under that module name. So when you do pickle.dumps(_apply_along_axis_palmers) (explicitly, or implicitly in this case by passing it as the mapper function), where _apply_along_axis_palmers is defined in your main script, pickle checks that sys.modules['__main__']._apply_along_axis_palmers exists. It doesn’t, because sys.modules['__main__'] is cProfile, and cProfile._apply_along_axis_palmers doesn’t exist.
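The lookup behavior described above can be reproduced without cProfile at all. The sketch below uses a throwaway module named hypothetical_script (an invented name, as is the function work) in place of __main__: pickling works while the registered module exposes the function, and fails once a different module sits under the same name.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the user's script module.
script = types.ModuleType("hypothetical_script")
exec("def work(x):\n    return x * 2\n", script.__dict__)
sys.modules["hypothetical_script"] = script

# Registered module exposes `work`: pickling succeeds, and the payload holds
# only the module name and function name, not the function's code.
payload = pickle.dumps(script.work)
names_only = b"hypothetical_script" in payload and b"work" in payload

# Swap in a different module under the same name (the role cProfile plays
# for "__main__"). It has no `work`, so pickle's sanity check fails.
sys.modules["hypothetical_script"] = types.ModuleType("hypothetical_script")
try:
    pickle.dumps(script.work)
    failed = False
except (pickle.PicklingError, AttributeError) as exc:
    failed = True
    print("pickling failed:", exc)
finally:
    # Clean up the fake registration.
    del sys.modules["hypothetical_script"]
```

The function object itself never changed between the two pickle.dumps calls; only the entry in sys.modules did.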

I don’t know of a good solution for this. The best I can come up with is to manually fix up sys.modules so that it exposes your module and its contents correctly. I haven’t tested this completely, so there may be some quirks, but the workaround I’ve found is to change a module named mymodule.py of the form:

# imports...
# function/class/global defs...

if __name__ == '__main__':
    main()  # Or series of statements

to:

# imports...
import sys
# function/class/global defs...

if __name__ == '__main__':
    import cProfile
    # if check avoids hackery when not profiling
    # Optional; hackery *seems* to work fine even when not profiling, it's just wasteful
    if sys.modules['__main__'].__file__ == cProfile.__file__:
        import mymodule  # Imports you again (does *not* use cache or execute as __main__)
        globals().update(vars(mymodule))  # Replaces current contents with newly imported stuff
        sys.modules['__main__'] = mymodule  # Ensures pickle lookups on __main__ find matching version
    main()  # Or series of statements

From there on out, sys.modules['__main__'] refers to your own module, not cProfile. In my limited testing, cProfile still works despite this, and pickling finds your functions as expected. The only real cost is reimporting your module, but if you’re doing enough real work, that cost should be fairly small.
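If you want a quick sanity check that pickling really does find a function after the fix-up, something like the helper below could be called before starting the pool. The name survives_pickle is made up for illustration, and it only confirms the by-reference lookup in the current process, not anything about the worker processes.

```python
import pickle


def survives_pickle(func):
    """Return True if func pickles by reference and loads back as itself."""
    try:
        return pickle.loads(pickle.dumps(func)) is func
    except (pickle.PicklingError, AttributeError):
        # Lookup of the function's qualified name on its module failed.
        return False
```

For example, calling survives_pickle(_apply_along_axis_palmers) just before main() should return False under -mcProfile without the fix-up and True with it.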
