Remove 100,000+ nodes from a Boost graph

In your removal branch you re-tie() the iterators:

boost::tie(vi, vi_end) = boost::vertices(m_graph);

This restarts the loop from the first vertex every time you remove one, so each removal re-scans everything that was already checked. This is exactly Schlemiel the Painter's algorithm.

I’ll find out whether you can rely on remove_vertex not triggering a reallocation. If so, it’s easily fixed. Otherwise, you’d want an index-based loop instead of an iterator-based one. Or you might be able to work on the raw container (it’s a private member, though, as I remember).

Update: Using vecS as the container for vertices is going to cause bad performance here. The Boost documentation for remove_vertex says:

If the VertexList template parameter of the adjacency_list was vecS, then all vertex descriptors, edge descriptors, and iterators for the graph are invalidated by this operation. <…> If you need to make frequent use of the remove_vertex() function the listS selector is a much better choice for the VertexList template parameter.

This small benchmark (test.cpp) compares three variants:

  • with -DSTABLE_IT (listS)

    $ ./stable 
    Generated 100000 vertices and 5000 edges in 14954ms
    The graph has a cycle? false
    starting selective removal...
    Done in 0ms
    After: 99032 vertices and 4916 edges
  • without -DSTABLE_IT (vecS)

    $ ./unstable 
    Generated 100000 vertices and 5000 edges in 76ms
    The graph has a cycle? false
    starting selective removal...
    Done in 396ms
    After: 99032 vertices and 4916 edges
  • using filtered_graph (thanks @cv_and_he in the comments)

    Generated 100000 vertices and 5000 edges in 15ms
    The graph has a cycle? false
    starting selective removal...
    Done in 0ms
    After: 99032 vertices and 4916 edges
    Done in 13ms

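You can clearly see that removal is much faster with listS (0ms versus 396ms for vecS), but generating the graph is much slower (14954ms versus 76ms). The filtered_graph variant avoids both costs in this run, because it hides vertices instead of erasing them.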
