Usage

git forget-blob file_to_forget

Installation

Get the script from GitHub and make it executable.

To do it in one step, paste the following into your terminal:

sudo wget https://raw.githubusercontent.com/nachoparker/git-forget-blob/master/git-forget-blob.sh -O /usr/local/bin/git-forget-blob
sudo chmod +x /usr/local/bin/git-forget-blob

If you don't like installing to /usr/local/bin using sudo, just copy git-forget-blob wherever you like. It will work as long as the file sits in a directory listed in $PATH and has execute permissions.
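If the script has already been downloaded, a per-user install could look like the sketch below. The function name install_git_forget_blob and the default target $HOME/.local/bin are assumptions made for illustration, not part of the project; any directory on $PATH will do.

```shell
# Hypothetical helper: copy a downloaded git-forget-blob.sh into a
# user-writable directory instead of /usr/local/bin.
install_git_forget_blob() {
  local src="$1"                           # path to the downloaded script
  local dest_dir="${2:-$HOME/.local/bin}"  # assumed default target directory
  mkdir -p "$dest_dir" || return 1
  cp "$src" "$dest_dir/git-forget-blob" || return 1
  chmod +x "$dest_dir/git-forget-blob" || return 1
  # Warn if the target directory is not on the PATH yet.
  case ":$PATH:" in
    *":$dest_dir:"*) ;;
    *) echo "note: add $dest_dir to your PATH" ;;
  esac
}
```

Calling install_git_forget_blob ./git-forget-blob.sh would then place the script under $HOME/.local/bin without needing sudo.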

Details

Be it by mistake or by a change of mind, sooner or later we all deal with the problem of making a git repository forget about a file.

We soon realize that git rm will not suffice, as git remembers that the file existed once in our history, and thus will keep a reference to it.
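This is easy to reproduce in a throwaway repository. The sketch below uses a placeholder file name (big.bin) and a placeholder identity; it records the blob id before removing the file and then shows that the object is still stored and still reachable.

```shell
# Throwaway repository showing that `git rm` does not delete the blob.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo
git config user.name "Demo"

printf 'pretend this is a large binary\n' > big.bin
blob_id=$(git hash-object big.bin)          # object id the file gets when added
git add big.bin
git commit -q -m "add big.bin"

git rm -q big.bin
git commit -q -m "remove big.bin"

# The working tree no longer has the file, but the object is still in the
# object store and still reachable from the first commit.
object_type=$(git cat-file -t "$blob_id")
reachable=$(git rev-list --objects --all | grep -c "^$blob_id")
echo "type=$object_type reachable_entries=$reachable"
```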

To make things worse, rebasing alone is not enough either, because any reference to the blob will prevent git's garbage collector from reclaiming the space. This includes remote references and reflog references.
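To get a feeling for how many references are involved, we can list every ref whose history touches a given path. This is only a sketch: refs_touching_path is a hypothetical helper name, and it looks at refs (branches, tags, remote-tracking refs) but not at reflog entries, which also keep objects alive.

```shell
# Hypothetical helper: print every ref from which a commit touching the
# given path is reachable.
refs_touching_path() {
  local path="$1" ref
  git for-each-ref --format='%(refname)' | while read -r ref; do
    # rev-list -1 prints the newest commit reachable from the ref that
    # changed the path, or nothing if there is none.
    if [ -n "$(git rev-list -1 "$ref" -- "$path" 2>/dev/null)" ]; then
      echo "$ref"
    fi
  done
}
```

Running refs_touching_path file_to_forget inside a repository prints one ref name per line.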

Typically, we run into this problem whenever there is some chunky binary blob that our repository needs to hold, even worse if we have to update it from time to time. This can result in our repository quickly growing in size.
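To find out which files are worth forgetting, one possible approach is to list the largest blobs in the history. The helper name largest_blobs is made up for this sketch; it relies only on git rev-list --objects and git cat-file --batch-check.

```shell
# Hypothetical helper: print "size path" for the N largest blobs
# reachable from any ref (default N = 10).
largest_blobs() {
  local n="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |   # keep blobs only, drop the type column
    sort -rn |
    head -n "$n"
}
```

Each output line holds the size in bytes followed by the path under which the blob was recorded.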

Enter git-forget-blob

# Completely remove a file from a git repository history
#
# Copyleft 2017 by Ignacio Nunez Hernanz <nacho _a_t_ ownyourbits _d_o_t_ com>
# GPL licensed (see end of file)
#  * Use at your own risk!
#
# Usage:
#   git-forget-blob file_to_forget
#
# Notes:
#   It rewrites history, therefore will change commit references

function git-forget-blob()
{
  git repack -A
  ls .git/objects/pack/*.idx &>/dev/null || {
    echo "there is nothing to be forgotten in this repo" && return;
  }
  local BLOBS=( $( git verify-pack -v .git/objects/pack/*.idx | grep blob | \
    awk '{ print $1 }' ) )
  for ref in ${BLOBS[@]}; do
    local FILE="$( git rev-list --objects --all | grep $ref | awk '{ print $2 }' )"
    [[ "$FILE" == "$1" ]] && break
    unset FILE
  done
  [[ "$FILE" == "" ]] && { echo "$1 not found in repo history" && return; }

  git tag | xargs git tag -d
  git filter-branch --index-filter "git rm --cached --ignore-unmatch $FILE"
  rm -rf .git/refs/original/ .git/refs/remotes/ .git/*_HEAD .git/logs/
  git for-each-ref --format="%(refname)" refs/original/ | \
    xargs -n1 --no-run-if-empty git update-ref -d
  git reflog expire --expire-unreachable=now --all
  git repack -A -d
  git prune
}

# License
#
# This script is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This script is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this script; if not, write to the
# Free Software Foundation, Inc., 59 Temple Place, Suite 330,
# Boston, MA  02111-1307  USA

In a nutshell, this

uses git filter-branch to apply git rm to every single commit

then, it removes all possible references including remotes, tags and reflog

next, it deletes unreferenced packs, and

finally, it garbage-collects the objects that are left unreachable, using git repack -A -d and git prune.
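After the steps above have run, we may want to confirm that the file is really gone. The following is a minimal sketch with made-up helper names (path_in_history, check_forgotten); it only inspects objects reachable from refs, and then prints the repository size figures reported by git count-objects.

```shell
# Hypothetical helper: succeed if any object reachable from any ref is
# recorded under exactly this path.
path_in_history() {
  git rev-list --objects --all |
    awk -v p="$1" '{ sub(/^[^ ]+ ?/, "") }   # strip the object id column
                   $0 == p { found = 1 }
                   END { exit !found }'
}

# Hypothetical helper: report whether the path is gone and show sizes.
check_forgotten() {
  if path_in_history "$1"; then
    echo "$1 is still reachable"
    return 1
  fi
  echo "$1 is no longer reachable"
  git count-objects -vH    # size-pack is the packed size of the repository
}
```

Comparing the size-pack value before and after running git forget-blob gives a rough idea of how much space was reclaimed.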

Things to keep in mind:

This rewrites history, so forced pushes, merges, conflicts and such niceties will happen.

For the same reasons, tags will be lost and commit hashes will change.
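The need for forced pushes can be seen in a small self-contained demo. The sketch below uses a local bare repository as a stand-in for the remote and git commit --amend as a stand-in for the history rewrite; all paths and names are placeholders.

```shell
# Demo: after history is rewritten, a plain push is refused and a forced
# push is needed.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"       # stand-in for the real remote
git init -q "$work/local"
cd "$work/local" || exit 1
git config user.email "demo@example.com"    # placeholder identity
git config user.name "Demo"
branch=$(git symbolic-ref --short HEAD)     # whatever the default branch is

echo one > file.txt
git add file.txt
git commit -q -m "first"
git remote add origin "$work/remote.git"
git push -q origin "$branch"

# Rewrite the commit so that its hash changes, as a history rewrite would.
git commit -q --amend -m "first (rewritten)"

# The remote branch is no longer an ancestor of ours, so this is refused.
if git push -q origin "$branch" 2>/dev/null; then
  plain_push="accepted"
else
  plain_push="rejected"
fi
echo "plain push: $plain_push"

# Overwrite the remote branch with the rewritten history.
git push -q --force origin "$branch"
```

Everyone else who cloned the old history will have to re-clone or reset onto the rewritten branches.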

Remember to keep a separate copy of the repo before trying this, and use the script with care.
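One simple way to keep such a copy is to duplicate the whole directory, .git included, so that tags, reflogs and remote references survive in the copy. The helper name backup_repo and the naming scheme of the copy are assumptions made for this sketch.

```shell
# Hypothetical helper: copy a repository (working tree and .git) to a
# sibling directory and print the path of the copy.
backup_repo() {
  local repo dest
  repo=$(cd "${1:-.}" && pwd) || return 1            # absolute path of the repo
  dest="${2:-$repo.backup-$(date +%Y%m%d%H%M%S)}"    # assumed naming scheme
  cp -a "$repo" "$dest" && echo "$dest"
}
```

For example, running backup_repo from inside the repository before git forget-blob file_to_forget leaves an untouched copy next to it.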