Doxygen is Slow

Tag files are typically the way to go if:

  1. you have a number of logically coherent groups of source files (let’s call them components), and
  2. you know the dependencies between the components, e.g. component A uses components B and C, and component B only uses C, and
  3. it is OK (or even preferred) that the index files (e.g. the list of files/classes/functions) are limited to a single component, and
  4. you are interested in HTML output.

A tag file is basically just a structured list of symbols with links to the location in the documentation. Tag files allow doxygen to make links from the documentation of one component to that of another.
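To make "a structured list of symbols with links" concrete, here is a small sketch that reads a tag file and prints the link each symbol would resolve to. The sample XML is hand-written and heavily simplified (real tag files carry many more fields, and older doxygen versions may omit the `.html` extension in `filename`), and the `list_links` helper and the `Parser` class are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample of a tag file; not real doxygen output.
SAMPLE = """<tagfile>
  <compound kind="class">
    <name>Parser</name>
    <filename>classParser.html</filename>
    <member kind="function">
      <name>parse</name>
      <anchorfile>classParser.html</anchorfile>
      <anchor>a0123456789abcdef</anchor>
    </member>
  </compound>
</tagfile>
"""


def list_links(tag_xml, html_dir):
    """Return (symbol, link) pairs for every compound and member in a tag file."""
    root = ET.fromstring(tag_xml)
    links = []
    for compound in root.iter("compound"):
        cname = compound.findtext("name")
        links.append((cname, f"{html_dir}/{compound.findtext('filename')}"))
        for member in compound.iter("member"):
            # A member links to an anchor inside its compound's page.
            url = (f"{html_dir}/{member.findtext('anchorfile')}"
                   f"#{member.findtext('anchor')}")
            links.append((f"{cname}::{member.findtext('name')}", url))
    return links


if __name__ == "__main__":
    for symbol, link in list_links(SAMPLE, "path/to/compB/htmldocs"):
        print(symbol, "->", link)
```

The second argument plays the role of the location you give doxygen next to the tag file: the tag file only knows page names and anchors, not where the HTML actually lives.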

Using tag files is a two-step process:

  1. First you run doxygen on each component to generate the tag file for that component. You can do this by disabling all output formats and setting GENERATE_TAGFILE. So for component A, a Doxyfile.tagonly would have the following settings:

    GENERATE_HTML         = NO
    GENERATE_LATEX        = NO
    GENERATE_RTF          = NO
    GENERATE_MAN          = NO
    GENERATE_TAGFILE      = compA.tag
    

    You’ll notice that running doxygen this way is very fast.

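  2. The second step is to generate the actual documentation. For component A you need a Doxyfile which includes the tag files of components B and C, since we determined that A depends on them. Each TAGFILES entry has the form tagfile=location, where location is where that component's HTML documentation can be found: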

    GENERATE_HTML         = YES
    GENERATE_LATEX        = NO
    GENERATE_RTF          = NO
    GENERATE_MAN          = NO
    TAGFILES              = path/to/compB/compB.tag=path/to/compB/htmldocs \
                            path/to/compC/compC.tag=path/to/compC/htmldocs
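
Since both Doxyfiles follow directly from the dependency list, they are easy to generate. The sketch below is one way to do it, not the script used for the numbers in this post. It assumes the `path/to/<component>/` layout from the example above; the function names are made up, and a real Doxyfile would also need settings such as INPUT and OUTPUT_DIRECTORY.

```python
from typing import Dict, List

# Direct dependencies from the example: A uses B and C, B only uses C.
DEPS: Dict[str, List[str]] = {
    "compA": ["compB", "compC"],
    "compB": ["compC"],
    "compC": [],
}

# Output formats that stay disabled in both steps.
COMMON = [
    "GENERATE_LATEX        = NO",
    "GENERATE_RTF          = NO",
    "GENERATE_MAN          = NO",
]


def tagonly_doxyfile(name: str) -> str:
    """Step 1: disable all output formats and only write <name>.tag."""
    lines = ["GENERATE_HTML         = NO", *COMMON,
             f"GENERATE_TAGFILE      = {name}.tag"]
    return "\n".join(lines) + "\n"


def html_doxyfile(deps: List[str], root: str = "path/to") -> str:
    """Step 2: HTML output, linked to the tag files of the direct dependencies."""
    lines = ["GENERATE_HTML         = YES", *COMMON]
    if deps:
        # Assumed layout: <root>/<dep>/<dep>.tag next to <root>/<dep>/htmldocs.
        entries = [f"{root}/{d}/{d}.tag={root}/{d}/htmldocs" for d in deps]
        continuation = " \\\n" + " " * 24  # align under the first entry
        lines.append("TAGFILES              = " + continuation.join(entries))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(tagonly_doxyfile("compA"))
    print(html_doxyfile(DEPS["compA"]))
```

Only direct dependencies are listed, so a component with no dependencies gets no TAGFILES line at all.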
    

Using this approach I have been able to generate documentation for 20M+ lines of code distributed over 1500+ components in under 3 hours on a standard desktop PC (Core i5 with 8 GB RAM and 64-bit Linux), including source browsing, full call graphs, and UML-style diagrams of all data structures. Note that the first step only took 10 minutes.

To accomplish this I wrote a script to generate the Doxyfiles for each component based on the list of components and their direct dependencies. In the first step I run 8 instances of doxygen in parallel (using http://www.gnu.org/s/parallel/). In the second step I run 4 instances of doxygen in parallel.
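
If you would rather not depend on GNU parallel, the same two-phase run can be sketched in Python. This is a stand-in, not the setup described above: the `run_all` helper, the component list, and the `<component>/Doxyfile.tagonly` and `<component>/Doxyfile` file names are assumptions for illustration. The only doxygen behaviour it relies on is that `doxygen <configfile>` runs with the given configuration file.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def run_all(doxyfiles: Sequence[str], jobs: int,
            command: Sequence[str] = ("doxygen",)) -> List[int]:
    """Run `command <doxyfile>` for every file, at most `jobs` at a time.

    Returns the exit codes in the same order as `doxyfiles`.
    """
    def run_one(doxyfile: str) -> int:
        return subprocess.run([*command, doxyfile]).returncode

    # Threads are enough here: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_one, doxyfiles))


if __name__ == "__main__":
    components = ["compA", "compB", "compC"]  # hypothetical component list

    # Step 1 has to finish for every component before step 2 starts,
    # because step 2 reads the tag files that step 1 writes.
    codes = run_all([f"{c}/Doxyfile.tagonly" for c in components], jobs=8)
    if any(codes):
        sys.exit("tag file generation failed")

    codes = run_all([f"{c}/Doxyfile" for c in components], jobs=4)
    sys.exit(1 if any(codes) else 0)
```

Within each step the components are independent of each other, which is why no ordering by dependency is needed: the tag files carry everything step 2 needs.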

See http://www.doxygen.nl/manual/external.html for more info about tag files.
