
bitbucket - git filter-branch doesn't remove files

I have accidentally left a database backup inside the tree, causing my Bitbucket repository to fill up.

Bitbucket says "7.26 GB of 2 GB".

On the web server the entire folder is 6.2G and .git is 5.6G, leaving 600M of actual current files.

I followed the https://support.atlassian.com/bitbucket-cloud/docs/maintain-a-git-repository/ instructions.

I'm using the Git shell on Windows.

$ du -sh .git
5.6G    .git

$ ./git_find_big.sh
All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file.
size     pack     SHA                                       location
3690053  3611690  0a0bfa9facc2aea79ebbfaf9ce6221a0b093a115  dbbak/DATABASE_shop.zip
1633941  206040   7599e51f805d2a5a58ef85cc3111ff97b96c7f8c  dbbak/DATABASE_shop.bak
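
For reference, the git_find_big.sh listing from the Atlassian guide can be approximated with plain Git plumbing; a minimal sketch of the same idea (not the exact script) is:

# list the largest objects in the pack files, sorted by object size (bytes)
$ git verify-pack -v .git/objects/pack/pack-*.idx | sort -k 3 -n | tail -5
# map a blob SHA from that output back to a path
$ git rev-list --objects --all | grep <blob-sha>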

$ git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch dbbak' HEAD
Rewrite 0d26f893e5159bafa22637efb67ad15441c363c2 (16/21) (8 seconds passed, remaining 2 predicted)    rm 'dbbak/DATABASE_shop.bak'
rm 'dbbak/DATABASE_shop.zip'
Rewrite de5bf4e33b2ed8a735d5a310f677134e116c6935 (16/21) (8 seconds passed, remaining 2 predicted)    rm 'dbbak/DATABASE_shop.zip'

Ref 'refs/heads/master' was rewritten

$ du -sh .git
5.6G    .git  <-- still same amount used
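
Note that this filter-branch invocation only rewrites HEAD, so other local branches, tags and the remote-tracking refs under refs/remotes/ still point at the old commits and keep the big blobs reachable locally. The variant usually quoted for this job rewrites every ref and tag (shown only for comparison, not as the accepted fix here):

$ git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch dbbak' \
    --prune-empty --tag-name-filter cat -- --all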

$ git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
(no output)
$ git reflog expire --expire=now --all
(no output)
$ git gc --prune=now
Enumerating objects: 16861, done.
Counting objects: 100% (16861/16861), done.
Delta compression using up to 8 threads
Compressing objects: 100% (9896/9896), done.
Writing objects: 100% (16861/16861), done.
Total 16861 (delta 6373), reused 16843 (delta 6367), pack-reused 0

$ du -sh .git
5.6G    .git <-- Still same amount used 
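
As a cross-check on what du is measuring, Git's own object accounting can be queried directly:

$ git count-objects -vH
# size-pack reports the on-disk size of the pack files; if it still shows
# roughly 5.6G, the old objects really are still sitting in the local packs.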

$ git push --all --force
$ git push --tags --force 
// Doesn't alter the space used. I didn't expect it to.

If I re-run ./git_find_big.sh, the big files are still there :-(

If I clone from Bitbucket into a new folder, the entire folder is 1.3G and .git is 571M.

git log shows the entire commit log.

I am tempted to just delete the entire repository on Bitbucket and re-upload the slim 1.3G/571M version.

What am I missing?

ADDITION: Now I get

$ ./git_find_big.sh
All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file.
size     pack     SHA                                       location
3690053  3611690
1633941  206040
1417160  1381048
165448   164633   8ba397bd7aabe9c09b365e0eb9f79ccdc9a7dce5  dymo/DLS8Setup.8.7.1.exe

I.e. the SHAs and filenames are gone, but the bits are still there. (I omitted some of the files before to avoid clutter.)

WTF...
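
To check which commits (and through them, which refs) still reference one of those blobs, something like this should work (assuming Git 2.16+ for --find-object; the SHA is from the first listing above):

$ git log --all --oneline --find-object=0a0bfa9facc2aea79ebbfaf9ce6221a0b093a115
# --all includes the remote-tracking refs under refs/remotes/origin/, which
# filter-branch does not rewrite, so blobs can stay reachable locally even
# after the rewritten branches have been pushed.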

Question from: https://stackoverflow.com/questions/65713044/git-filter-branch-doesnt-remove-files

1 Answer


Instead of filter-branch, try git filter-repo.

Install it first:

$ python3 -m pip install --user git-filter-repo

Then, for example:

$ git filter-repo --strip-blobs-bigger-than 10M
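
Since the offending directory is known here, it can also be targeted by path instead of by size (a sketch using filter-repo's path options; run it on a fresh clone, and the remote URL below is a placeholder):

$ git filter-repo --path dbbak --invert-paths
# filter-repo removes the origin remote as a safety measure, so re-add it
# and force-push the rewritten history afterwards
$ git remote add origin git@bitbucket.org:<user>/<repo>.git
$ git push --all --force
$ git push --tags --force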
