shared_ptr: horrible speed

shared_ptr are the most complicated type of pointer ever:

  • Ref counting takes time
  • Multiple allocation (there are 3 parts: the object, the counter, the deleter)
  • A number of virtual methods (in the counter and the deleter) for type erasure
  • Works among multiple threads (thus synchronization)

There are 2 ways to make them faster:

  • use make_shared to allocate them, because (unfortunately) the normal constructor allocates two different blocks: one for the object and one for the counter and deleter.
  • don’t copy them if you don’t need to: methods should accept shared_ptr<T> const&

But there are also many ways NOT to use them.

Looking at your code it looks like your doing a LOT of memory allocation, and I can’t help but wonder if you couldn’t find a better strategy. I must admit I didn’t got the full figure, so I may be heading straight into a wall but…

Usually code is much simpler if you have an owner for each of the objects. Therefore, shared_ptr should be a last resort measure, employed when you can’t get a single owner.

Anyway, we’re comparing apples and oranges here, the original code is buggy. You take care of deleting the memory (good) but you forgot that these objects were also referenced from other points in the program e1->setNextEdge(e21) which now holds pointers to destructed objects (in a free’d memory zone). Therefore I guess that in case of exception you just wipe out the entire list ? (Or somehow bet on undefined behavior to play nice)

So it’s hard to judge on performances since the former doesn’t recover well from exceptions while the latter does.

Finally: Have you thought about using intrusive_ptr ? It could give you some boost (hehe) if you don’t synchronize them (single thread) and you would avoid a lot of stuff performed by the shared_ptr as well as gain on locality of reference.

Leave a Comment