Rewrite git history to modify a file
To remove a large unwanted file from git history, you can use filter-branch to rewrite the index (the list of files in the repo) of each commit, so that the file was never added.
git filter-branch --index-filter "git rm --cached --ignore-unmatch path/to/offending_file.wav" --tag-name-filter cat -- --all
But what if you want to keep the file and just make it a lot smaller (e.g. imagine an icon was accidentally stored as a huge image)? I tried this approach:
First, add the replacement file to git's database:
hash=`git hash-object -w /tmp/replacement.png`
Also note the file you want to replace:
file="path/to/icon.png"
Now filter the index as follows. First check that the file exists at this commit:
git cat-file -e :"$file"
If it does, remove it from the index:
git rm --cached "$file"
and add a reference to our replacement under the same filename:
git update-index --add --cacheinfo "100644,$hash,$file"
Putting it together:
git filter-branch --index-filter "if git cat-file -e :$file ; then git rm --cached $file ; git update-index --add --cacheinfo 100644,$hash,$file ; fi" --tag-name-filter cat -- --all
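One way to sanity-check the rewrite, as a rough sketch: list every distinct blob hash the path has had on the rewritten branches and tags. The helper name blobs_for_path is made up for illustration. It deliberately looks only at --branches and --tags, since the backup refs under refs/original and any remote refs still point at the old history.

```shell
# Hypothetical helper: print each distinct blob hash that a path has had
# in any commit reachable from local branches and tags.
blobs_for_path() {
    target="$1"
    git rev-list --branches --tags | while read -r commit; do
        # ls-tree prints "<mode> <type> <hash><TAB><path>", or nothing
        # when the path does not exist in that commit.
        git ls-tree "$commit" "$target" | awk '{ print $3 }'
    done | sort -u
}
```

After the filter-branch run, blobs_for_path "$file" should ideally print a single line matching $hash.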
This seems to work, and doesn't print any errors that are too scary. However, no matter how many git gc and prune commands I try, the original blob still exists in the repository. If I clone the repo to a new place, it still exists.
I suspect this is because the remote refs, and the original refs that filter-branch creates, still point to the old tree, so the original file is still referenced.
I did try removing them with a hack like this:
for ref in `git show-ref | cut -c 42- | grep original` ; do git update-ref -d $ref ; done
and the same for the remotes, but the blob is still there.
So, questions:
- Is there a way to see why a blob isn't garbage collected? I.e. which parent objects in the graph point to it?
- Is there a non-hacky way to remove the originals refs (and maybe the remotes), including branches and tags?
- Is there anything else I'm missing?
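For the first question, one possible approach is sketched below, assuming Git 2.16 or later (for --find-object). The helper name who_references is invented here. It finds the commits that add or remove the blob, then lists the refs that contain any of those commits. It does not look at the reflog, which can also keep objects alive, and it may miss a blob introduced only by a merge commit.

```shell
# Hypothetical helper: list refs whose history still reaches a given blob.
who_references() {
    blob="$1"
    # --find-object selects commits that change the number of
    # occurrences of the blob (i.e. add or remove it).
    git log --all --format=%H --find-object="$blob" |
    while read -r commit; do
        # Any ref that contains such a commit keeps the blob reachable.
        git for-each-ref --format='%(refname)' --contains "$commit"
    done | sort -u
}
```

Running who_references with the old blob's hash after a rewrite would be expected to show entries under refs/original/ and refs/remotes/ if those are what keep it alive.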
Aha, I've done it! I think.
Here are the steps. First, it's a good idea to note the hash of the blob you want to remove at the start, so you can check whether it still exists with:
git cat-file -t 949abcd....
OK, first I cleared the reflog, since it still has a reference to the original clone:
git reflog expire --expire=now --all
Next I removed the origin remote, since it still has a reference to the original tree. I guess if you push the new hashes (you'd probably need to force push) this step is unnecessary, and the file should be GCed anyway.
git remote rm origin
Next I removed the original refs (that filter-branch creates). I didn't find a less hacky way:
for ref in `git show-ref | cut -c 42- | grep original` ; do git update-ref -d $ref ; done
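A possibly less hacky variant, as a sketch: ask git for-each-ref for just the refs under refs/original/ instead of cutting columns out of git show-ref and grepping. The function name delete_original_refs is made up for illustration.

```shell
# Delete every backup ref that filter-branch left under refs/original/.
delete_original_refs() {
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
}
```

This avoids depending on the hash length (the cut -c 42- offset) and won't match an unrelated branch or tag that merely has "original" in its name.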
Finally, garbage collect. I'm not sure whether --aggressive is required, but --prune=now is, because otherwise git gc only garbage collects old unwanted objects, for safety.
git gc --aggressive --prune=now
After these steps git cat-file reports that the blob is gone! I haven't experimented with pushing the result to origin (after re-adding it), and I'm not 100% sure all of the above steps are necessary, but it has seemed to work so far.
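The local cleanup steps can be strung together roughly like this. It is only a sketch: the function name purge_and_check is hypothetical, it leaves out --aggressive, and it assumes the origin remote has already been removed (or updated), since it does not touch remotes itself.

```shell
# Hypothetical wrapper: expire reflogs, drop filter-branch backup refs,
# garbage collect, then report whether the given blob still exists.
purge_and_check() {
    blob="$1"
    git reflog expire --expire=now --all
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
    git gc --prune=now --quiet
    # cat-file -e exits non-zero when the object is not in the repository.
    if git cat-file -e "$blob" 2>/dev/null; then
        echo "blob $blob still exists"
        return 1
    fi
    echo "blob $blob is gone"
}
```

Call it with the hash noted at the start. If it reports the blob still exists, some ref (a remote, a tag, a stash) is presumably still pointing at the old history.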