Threading in Python [closed]

In order of increasing complexity:

Use the threading module

Pros:

  • It’s really easy to run any function (any callable in fact) in its
    own thread.
  • Sharing data is, if not easy (locks are never easy :), at
    least simple.

Cons:

  • As mentioned by Juergen, Python threads cannot actually access interpreter state concurrently (there’s one big lock, the infamous Global Interpreter Lock). In practice this means threads are useful for I/O-bound tasks (networking, writing to disk, and so on), but not for speeding up CPU-bound computation.
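To make the "easy to run a function in its own thread" and "locks" points concrete, here is a minimal sketch. The names add_many and run_threads are made up for illustration; only threading.Thread and threading.Lock come from the standard library.

```python
import threading

def add_many(counter, lock, n):
    """Increment counter['value'] n times, taking the lock for each update."""
    for _ in range(n):
        with lock:  # only one thread at a time may touch the shared dict
            counter["value"] += 1

def run_threads(num_threads=4, n=10_000):
    """Start num_threads threads running add_many and return the final count."""
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, n))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish before reading the result
    return counter["value"]

if __name__ == "__main__":
    print(run_threads())
```

Because every update happens under the lock, the result is always num_threads * n. The work here is pure computation, so the threads do not make it any faster; the example only shows the mechanics.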

Use the multiprocessing module

In the simple use case this looks exactly like using threading, except that each task runs in its own process rather than its own thread. (Almost literally: if you take Eli’s example and replace threading with multiprocessing, Thread with Process, and Queue (the module) with multiprocessing.Queue, it should run just fine.)

Pros:

  • Actual concurrency for all tasks (no Global Interpreter Lock).
  • Scales to multiple processors, can even scale to multiple machines.

Cons:

  • Processes are slower to start and to communicate with than threads.
  • Data sharing between processes is trickier than with threads.
  • Memory is not implicitly shared. You either have to explicitly share it or you have to pickle variables and send them back and forth. This is safer, but harder. (If it matters, the Python developers increasingly seem to be pushing people in this direction.)
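A rough sketch of the same idea with processes, assuming a small CPU-bound function. The names sum_of_squares, worker and main are invented for this example. Results come back through a multiprocessing.Queue, which pickles them behind the scenes, since the child processes do not share memory with the parent.

```python
import multiprocessing

def sum_of_squares(n):
    """CPU-bound stand-in: sum of i*i for i in 0..n-1."""
    return sum(i * i for i in range(n))

def worker(n, results):
    """Runs in a child process; sends (input, result) back to the parent."""
    results.put((n, sum_of_squares(n)))

def main():
    results = multiprocessing.Queue()
    inputs = [10, 100, 1000]
    procs = [
        multiprocessing.Process(target=worker, args=(n, results))
        for n in inputs
    ]
    for p in procs:
        p.start()
    # Read one result per process before joining, so no child is left
    # blocked trying to flush data into the queue.
    collected = dict(results.get() for _ in inputs)
    for p in procs:
        p.join()
    for n in inputs:
        print(n, collected[n])

if __name__ == "__main__":
    # The guard matters: on platforms that start children by re-importing
    # this module, unguarded code would try to spawn processes again.
    main()
```

Results arrive in whatever order the processes finish, which is why each one is tagged with its input.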

Use an event model, such as Twisted

Pros:

  • You get extremely fine control over priority, over what executes when.

Cons:

  • Even with a good library, asynchronous programming is usually harder than threaded programming, both in terms of understanding what’s supposed to happen and in terms of debugging what is actually happening.
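For a feel of the event-driven style without installing anything, here is a small sketch using asyncio from the standard library as a stand-in. It is not Twisted, and Twisted’s own API looks different. The names fetch and main are hypothetical, and asyncio.sleep stands in for waiting on the network.

```python
import asyncio

async def fetch(name, delay, log):
    """Pretend to do I/O: record start, wait, record completion."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # hands control back to the event loop
    log.append(f"{name} done")
    return name

async def main():
    log = []
    # Both coroutines run on one thread; the event loop switches between
    # them only at the await points.
    results = await asyncio.gather(
        fetch("slow", 0.2, log),
        fetch("fast", 0.05, log),
    )
    return results, log

if __name__ == "__main__":
    results, log = asyncio.run(main())
    print(results)
    print(log)
```

gather returns results in the order the coroutines were passed in, while the log should show "fast" finishing before "slow". Everything runs on a single thread, so no locks are needed, but any call that blocks without awaiting stalls every task.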

In all cases I’m assuming you already understand many of the issues involved with multitasking, specifically the tricky issue of how to share data between tasks. If for some reason you don’t know when and how to use locks and conditions, start with those. Multitasking code is full of subtleties and gotchas, and it’s really best to have a good understanding of the concepts before you start.
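As a starting point for conditions, here is a minimal sketch of one thread waiting for data from another with threading.Condition. The Mailbox class and demo function are invented for illustration; in real code queue.Queue already does this for you.

```python
import threading
from collections import deque

class Mailbox:
    """Hypothetical one-way channel guarded by a condition variable."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake one waiting consumer, if any

    def get(self):
        with self._cond:
            # Re-check in a loop: a wakeup does not guarantee data is there.
            while not self._items:
                self._cond.wait()  # releases the lock while sleeping
            return self._items.popleft()

def demo():
    """Send 0, 1, 2 to a consumer thread and return what it received."""
    box = Mailbox()
    received = []

    def consume():
        for _ in range(3):
            received.append(box.get())

    consumer = threading.Thread(target=consume)
    consumer.start()
    for i in range(3):
        box.put(i)
    consumer.join()
    return received

if __name__ == "__main__":
    print(demo())
```

The while loop around wait() is the part people most often get wrong: using a plain if there is a classic source of intermittent bugs.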
