There are multiple solutions to your problem; I’ll let you choose the one that suits you best. They are presented below, from the cleanest to the ugliest (in my opinion, and with regard to commonly followed best practices).
1. Make it a service
If you end up calling it often, it may be worth exposing pandoc as an (HTTP) API. Some images already do that, for example metal3d/pandoc-server (which I have used successfully, but I’m sure you can find others).
In this case, you just run a container with pandoc + pdflatex once and you’re set!
2. Use image inheritance!
Make 2 images: one with pandoc only, and the other one with pandoc + pdflatex, inheriting from the first one with the FROM directive in the Dockerfile.
This addresses your concerns about size while still letting you run pandoc without having to fetch pdflatex too. Then, if you need to pull the image with pdflatex, it will just be an extra layer, not the entire image.
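As a rough sketch of that two-image layout, the snippet below writes one Dockerfile per image. The name my-pandoc-image comes from your post; the debian:bookworm-slim base, the Debian package names, the my-pandoc-pdflatex-image tag and the write_dockerfiles helper are all assumptions made for illustration, and pandoc’s default LaTeX template may need more TeX packages than texlive-latex-base alone.

```shell
#!/bin/sh
# Sketch: write two Dockerfiles, the second one inheriting from the first.
# Base image, package names and the second image tag are assumptions.
write_dockerfiles() {
    dir="$1"
    mkdir -p "$dir/pandoc" "$dir/pandoc-pdflatex"

    # Image 1: pandoc only.
    cat > "$dir/pandoc/Dockerfile" <<'EOF'
FROM debian:bookworm-slim
RUN apt-get update \
 && apt-get install -y --no-install-recommends pandoc \
 && rm -rf /var/lib/apt/lists/*
EOF

    # Image 2: adds pdflatex as extra layers on top of image 1.
    cat > "$dir/pandoc-pdflatex/Dockerfile" <<'EOF'
FROM my-pandoc-image
RUN apt-get update \
 && apt-get install -y --no-install-recommends texlive-latex-base \
 && rm -rf /var/lib/apt/lists/*
EOF
}

# Usage (build the parent first so the child's FROM line resolves locally):
#   write_dockerfiles ./images
#   docker build -t my-pandoc-image ./images/pandoc
#   docker build -t my-pandoc-pdflatex-image ./images/pandoc-pdflatex
```

Anyone who already has my-pandoc-image only downloads the layers added by the second Dockerfile when pulling the pdflatex variant.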
You can also do it the other way around, with a pdflatex base image and another image adding pandoc to it, if you find yourself often using the pdflatex image alone and rarely using the pandoc image without pdflatex. You could also make 3 images (pandoc, pdflatex, and pdflatex + pandoc) to cover every need you might have, but then at least one image won’t be linked in any way to the other 2 (an image can’t inherit from a “child” image), making it a bit harder to maintain.
3. Docker client in my-pandoc-image + Docker socket mount
This is the solution that you mentioned at the end of your post, and it is probably the most generic and straightforward way of calling other containerized commands, setting aside your specific use case of pandoc + pdflatex.
Just add the Docker client to your image my-pandoc-image and pass the Docker socket as a volume at runtime using docker run -v /var/run/docker.sock:/var/run/docker.sock. And if your concern is not being able to make pandoc call docker run ... instead of pdflatex directly, just add a small wrapper called pdflatex in /usr/local/bin/, which will be responsible for doing the docker run.
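Such a wrapper could look roughly like the sketch below. It assumes a separate image containing pdflatex, here called my-pdflatex-image (a made-up name), and it assumes the container hostname is still the container ID, which is Docker’s default unless --hostname is set. The PDFLATEX_IMAGE and DOCKER_BIN variables and the pdflatex_in_container function are hypothetical helpers, not anything Docker or pandoc defines.

```shell
#!/bin/sh
# Sketch of a wrapper to install as /usr/local/bin/pdflatex in my-pandoc-image.
# Assumed container start:
#   docker run -v /var/run/docker.sock:/var/run/docker.sock my-pandoc-image ...
# PDFLATEX_IMAGE and DOCKER_BIN are hypothetical knobs, not Docker settings.
pdflatex_in_container() {
    # --volumes-from re-mounts this container's volumes in the sibling
    # container; it relies on the hostname being the container ID.
    "${DOCKER_BIN:-docker}" run --rm \
        --volumes-from "${HOSTNAME:-$(hostname)}" \
        -w "$PWD" \
        "${PDFLATEX_IMAGE:-my-pdflatex-image}" \
        pdflatex "$@"
}

# Last line of the installed wrapper script:
#   pdflatex_in_container "$@"
```

Files that pdflatex has to read or write must live in a volume, since the sibling container does not see this container’s own filesystem.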
4. Use volumes-from to get the binary
This one is probably the least clean I’ll present here. You could try getting either the pandoc binary into a pdflatex container or the pdflatex binary into a pandoc container using --volumes-from, to keep everything packaged in its own Docker image. But honestly, it’s more duct tape than a real solution.
Conclusion
You can choose the solution that best fits your needs, but I would recommend the first 2 and strongly discourage the last one.