How to remove sensitive data from a file in GitHub history

I would recommend using the newer git filter-repo, which replaces BFG and git filter-branch.

Note: if you get the following error message when running the commands below:

Error: need a version of `git` whose `diff-tree` command has the `--combined-all-paths` option

it means you have to update git.
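
A quick way to check this up front is to compare your installed version against the minimum. This is only a rough sketch: as far as I know the `--combined-all-paths` option appeared in Git 2.22, the helper name `version_at_least` is made up for this example, and `sort -V` assumes GNU coreutils.

```shell
# Succeeds if version "$1" is at least version "$2" (uses `sort -V`, GNU coreutils).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# `git --version` prints something like "git version 2.39.2"; field 3 is the number.
installed="$(git --version | awk '{print $3}')"

# 2.22.0 is an assumption about when --combined-all-paths was introduced.
if version_at_least "$installed" 2.22.0; then
  echo "git $installed should be recent enough for git filter-repo"
else
  echo "git $installed looks too old; upgrade to 2.22 or later"
fi
```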


First: do that on a copy of your local repo (a new clone)
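
Something along these lines should give you a throwaway clone to work in. The helper name `fresh_clone`, the URL and the directory name are placeholders of mine, not anything from the tool itself. As far as I know, git filter-repo also refuses to run on a repository that does not look like a fresh clone unless you pass --force, so a new clone avoids that too.

```shell
# Hypothetical helper: clone "$1" into "$2", refusing to reuse an existing path
# so the rewrite never touches a working copy you care about.
fresh_clone() {
  if [ -e "$2" ]; then
    echo "refusing to reuse existing path: $2" >&2
    return 1
  fi
  git clone "$1" "$2"
}

# Example (placeholder URL and directory name):
# fresh_clone https://github.com/your-user/your-repo.git your-repo-scrub
# cd your-repo-scrub
```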

See “Content based filtering” in the git filter-repo manual, excerpted below.

At the end, if you are the only one working on that repository, you can do a git push --force.
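
A sketch of that final push, under a couple of assumptions: to my knowledge git filter-repo removes the origin remote after rewriting, so it has to be added back first, and you probably want to overwrite every branch and tag rather than just the current one. The helper name `force_push_rewrite` and the URL are placeholders.

```shell
# Hypothetical helper: point "origin" at "$1" and overwrite its branches and tags.
force_push_rewrite() {
  # Re-add origin only if it is missing (filter-repo normally removes it).
  git remote get-url origin >/dev/null 2>&1 || git remote add origin "$1"
  # Overwrite all branches first, then all tags.
  git push --force --all origin &&
  git push --force --tags origin
}

# Example (placeholder URL), run from inside the rewritten clone:
# force_push_rewrite https://github.com/your-user/your-repo.git
```

Anyone else who has already cloned the repository would still hold the old history, which is why this is only safe when you are the sole contributor.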

If you want to modify file contents, you can do so based on a list of expressions in a file, one per line.
For example, with a file named expressions.txt containing:

p455w0rd
foo==>bar
glob:*666*==>
regex:\bdriver\b==>pilot
literal:MM/DD/YYYY==>YYYY-MM-DD
regex:([0-9]{2})/([0-9]{2})/([0-9]{4})==>\3-\1-\2

then running

git filter-repo --replace-text expressions.txt

will go through and replace:

  • p455w0rd with ***REMOVED***,
  • foo with bar,
  • any line containing 666 with a blank line,
  • the word driver with pilot (but not if it has letters before or after; e.g. drivers will be unmodified),
  • the exact text MM/DD/YYYY with YYYY-MM-DD and
  • date strings of the form MM/DD/YYYY with ones of the form YYYY-MM-DD.
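
Once the rewrite has run, it is worth checking that the secret is really gone before pushing anything. One possible check, using git's pickaxe search (`git log -S`), is sketched below. The helper name `secret_in_history` is hypothetical, and it only looks at commits reachable from your local refs, so treat it as a sanity check rather than proof.

```shell
# Hypothetical helper: succeed if some commit reachable from any ref added or
# removed an occurrence of the literal string "$1".
secret_in_history() {
  [ -n "$(git log --all --oneline -S"$1")" ]
}

# Run from inside the rewritten clone.
if secret_in_history 'p455w0rd'; then
  echo "p455w0rd is still present in history"
else
  echo "no trace of p455w0rd in reachable commits"
fi
```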
