How to use python-docx to replace text in a Word document and save

UPDATE: There are a couple of paragraph-level functions that do a good job of this and can be found on the GitHub site for python-docx.

One of them replaces each regex match with a replacement string. The replacement appears with the same formatting as the first character of the matched text.
The other isolates a run so that formatting can be applied to just that word or phrase, for example highlighting each occurrence of “foobar” in the text, or making it bold or setting it in a larger font.
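To give a feel for how a paragraph-level replace of this kind can work, here is a simplified sketch. It is not the function from the GitHub site: `replace_across_runs` is a hypothetical helper name, the replacement is inserted literally (no backreferences), and the code only assumes the paragraph exposes `.runs` whose items have a writable `.text`, which python-docx paragraphs do.

```python
import re


def replace_across_runs(paragraph, pattern, replacement):
    """Replace each regex match in a paragraph by editing its runs in place.

    Hypothetical helper, not part of python-docx. The replacement text lands
    in the run that holds the first matched character, so it inherits that
    run's formatting. Returns the number of replacements made.
    """
    runs = list(paragraph.runs)
    # Join the run texts ourselves so offsets line up with the runs we edit.
    full_text = "".join(run.text for run in runs)
    # Skip empty matches; they have no first character to take formatting from.
    spans = [m.span() for m in re.finditer(pattern, full_text) if m.end() > m.start()]

    # Work right to left so offsets of earlier matches stay valid.
    for start, end in reversed(spans):
        offset = 0
        first_overlap = True
        for run in runs:
            text = run.text
            run_start, run_end = offset, offset + len(text)
            offset = run_end
            if run_end <= start:
                continue  # run lies entirely before the match
            if run_start >= end:
                break  # run lies entirely after the match
            head = text[: max(start - run_start, 0)]
            tail = text[end - run_start:]
            if first_overlap:
                run.text = head + replacement + tail
                first_overlap = False
            else:
                # Later runs only keep whatever follows the match.
                run.text = head + tail
    return len(spans)
```

Calling `replace_across_runs(paragraph, r"sea", "ocean")` on each paragraph would then catch a match even when Word has split “sea” across two runs. Runs emptied by the replacement are left in place rather than removed.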

The current version of python-docx does not have a search() function or a replace() function. These are requested fairly frequently, but an implementation for the general case is quite tricky and it hasn’t risen to the top of the backlog yet.

Several folks have had success, though, getting what they need done with the facilities already present. Here’s an example. It has nothing to do with sections, by the way 🙂

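from docx import Document

document = Document("input.docx")  # placeholder path to an existing .docx file
for paragraph in document.paragraphs:
    if 'sea' in paragraph.text:
        print(paragraph.text)
        # Assigning to .text replaces the whole paragraph, including its run formatting.
        paragraph.text = "new text containing ocean"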

To search in tables as well, you would need to use something like:

for table in document.tables:
    for row in table.rows:
        for cell in row.cells:
            for paragraph in cell.paragraphs:
                if 'sea' in paragraph.text:
                    paragraph.text = paragraph.text.replace("sea", "ocean")
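If both loops are needed, they can be folded into one generator. This is only a sketch: `iter_paragraphs` is a hypothetical helper name, and it relies on both the document and each table cell exposing `.paragraphs` and `.tables`, which also lets it descend into nested tables.

```python
def iter_paragraphs(container):
    """Yield every paragraph in a document or cell, including table contents.

    Hypothetical helper, not part of python-docx. Paragraphs directly in the
    container come first, then those inside its tables, so the order is not
    strict document order. A merged cell can appear more than once in
    row.cells, so its paragraphs may be yielded more than once.
    """
    for paragraph in container.paragraphs:
        yield paragraph
    for table in container.tables:
        for row in table.rows:
            for cell in row.cells:
                # A cell can hold paragraphs and further tables, so recurse.
                yield from iter_paragraphs(cell)


# Usage with python-docx, given an opened `document`:
#   for paragraph in iter_paragraphs(document):
#       if 'sea' in paragraph.text:
#           paragraph.text = paragraph.text.replace("sea", "ocean")
```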

If you pursue this path, you’ll probably discover pretty quickly what the complexities are. If you replace the entire text of a paragraph, that will remove any character-level formatting, like a word or phrase in bold or italic.
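One way to limit that formatting loss is to replace text run by run instead of on the whole paragraph. The sketch below uses a hypothetical helper name, `replace_in_runs`, and has a known gap: a match that Word has split across two runs is missed. The file names in the usage comment are placeholders.

```python
def replace_in_runs(paragraph, old, new):
    """Replace `old` with `new` inside each run, leaving run formatting alone.

    Hypothetical helper, not part of python-docx. Only matches that sit
    entirely within a single run are found. Returns the number of replacements.
    """
    count = 0
    for run in paragraph.runs:
        hits = run.text.count(old)
        if hits:
            # Only the run's text changes; bold, italic, font, etc. stay as set.
            run.text = run.text.replace(old, new)
            count += hits
    return count


# Usage with python-docx (file names are placeholders):
#   from docx import Document
#   document = Document("input.docx")
#   for paragraph in document.paragraphs:
#       replace_in_runs(paragraph, "sea", "ocean")
#   document.save("output.docx")
```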

By the way, the code from @wnnmaw’s answer is for the legacy version of python-docx and won’t work at all with versions after 0.3.0.
