[kwlug-disc] How GIT stores stuff.

Bob B bob at softscape.ca
Fri May 6 12:22:08 EDT 2022


Chris,

Thanks for the pointers and the analysis of that output.

I like understanding what goes on under the covers; I find it helps me work with the tools and make better decisions about how to use them. This should be a helpful read.

I'm just looking at that book now. For others reading this thread: the book seems to be online in its entirety!

I especially like the porcelain/plumbing analogy you used. I'm going to add that to my professional vocabulary 😊

BB



> -----Original Message-----
> From: kwlug-disc <kwlug-disc-bounces at kwlug.org> On Behalf Of Chris Frey
> Sent: May 4, 2022 12:32 AM
> To: KWLUG discussion <kwlug-disc at kwlug.org>
> Subject: Re: [kwlug-disc] How GIT stores stuff.
> 
> On Tue, May 03, 2022 at 03:53:26PM -0400, Bob B wrote:
> > Any idea how to interpret what it actually did from this:
> >
> > tower:.password-store bob$ git gc
> > Counting objects: 4031, done.
> > Delta compression using up to 24 threads.
> > Compressing objects: 100% (2261/2261), done.
> > Writing objects: 100% (4031/4031), done.
> > Total 4031 (delta 1839), reused 3683 (delta 1703)
> 
> These are common statistics you'll see when you fetch and push
> from/to remote repositories, and here with garbage collection, and
> also with fsck.  It's basically reporting how many objects you have,
> and its progress as it works through the data.
> 
> I had never looked into the last line in detail until now, but I did
> find an explanation:
> 
> 	https://stackoverflow.com/questions/9379714/what-do-the-numbers-in-the-total-line-of-git-gc-git-repack-output-mean
> 
> > Total 4031 (delta 1839), reused 3683 (delta 1703)
> 
> Taking an educated guess, I would read this as saying you have 4031
> total objects in your git repo.  I assume 1839 of them could be
> compressed in a diff-like way.  You already had 3683 objects in a
> pack (perhaps from a previous git-gc run, which sometimes happens
> automatically for you), and of those, 1703 were able to be compressed
> diff-style.
> 
> If you run git-gc again, before and after numbers should be alike.
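To make that reading of the "Total" line concrete, here is a minimal Python sketch that pulls the four counters out of the line and derives two more numbers from them. The field names (total, delta, reused, reused_delta) and the helper parse_total_line are made up for illustration, and the interpretation in the comments follows the educated guess above rather than anything authoritative.

```python
import re

# Matches the summary line printed at the end of git gc / git repack output.
# The group names are illustrative labels, not terms git itself uses.
TOTAL_LINE = re.compile(
    r"Total (?P<total>\d+) \(delta (?P<delta>\d+)\), "
    r"reused (?P<reused>\d+) \(delta (?P<reused_delta>\d+)\)"
)


def parse_total_line(line):
    """Return the four counters from a 'Total ...' line as a dict of ints."""
    match = TOTAL_LINE.search(line)
    if match is None:
        raise ValueError(f"not a Total line: {line!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


stats = parse_total_line("Total 4031 (delta 1839), reused 3683 (delta 1703)")

# Objects that were not carried over from an existing pack.
newly_packed = stats["total"] - stats["reused"]

# Objects not counted as deltas, i.e. presumably stored whole.
stored_whole = stats["total"] - stats["delta"]

print(stats)
print("newly packed:", newly_packed)
print("stored whole:", stored_whole)
```

Under that reading, most of the objects were already sitting in a pack, and only a few hundred were picked up fresh on this run.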
> 
> 
> > So many questions! Do you have a good learning resource for git or is
> > it just a matter of using it more and learning nuances like this as
> > you come across them?
> 
> The book I usually recommend to anyone who wants to understand git
> is "Git from the bottom up" by John Wiegley:
> 
> 	https://jwiegley.github.io/git-from-the-bottom-up/
> 
> In the early days of git, I used to follow the mailing list, but
> that was a long time ago.  But the bottom-up details usually stick
> once understanding is achieved.
> 
> I also have found the manpages to be very good if you have the time
> to crunch through them.  Read them like a novel, and they will
> reward you later. :-)
> 
> Especially the git-rev-parse page, which documents how to specify
> revisions and ranges, a syntax that works across many other git
> commands, like git-log.
> 
> For example, if you know what these mean, you'll have a great handle
> on git:
> 
> 	master^
> 	..master
> 	HEAD^^
> 	branchname:./path/to/file
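As a rough cheat sheet for the four expressions above, the Python sketch below pairs each one with a git command you could try it with and a plain-English reading. The pairing of commands is my own choice for illustration, "branchname" and "path/to/file" are the placeholders from the list, and the script only prints the table; it does not run git.

```python
# Each entry: (revision expression, command to try it with, what it selects).
# The commands are illustrative choices; any command that accepts a revision
# or a range would do.
EXAMPLES = [
    ("master^",
     ["git", "rev-parse", "master^"],
     "first parent of the commit that master points to"),
    ("..master",
     ["git", "log", "--oneline", "..master"],
     "commits reachable from master but not from HEAD"),
    ("HEAD^^",
     ["git", "rev-parse", "HEAD^^"],
     "first parent of the first parent of HEAD"),
    ("branchname:./path/to/file",
     ["git", "show", "branchname:./path/to/file"],
     "the file at that path (relative to the current directory) "
     "as stored in branchname"),
]

for expression, argv, meaning in EXAMPLES:
    print(expression)
    print("    try:   ", " ".join(argv))
    print("    means: ", meaning)
```

The gitrevisions manpage is the place to confirm each of these readings.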
> 
> Git commands are split into porcelain (user-friendly high level
> commands), and plumbing (low level data access commands).  The
> 'man git' manpage lists which is which.  The plumbing commands have
> good manpages too, but fewer safety mechanisms, so while you may not
> actually use a plumbing command, reading its manpage can help you
> understand what the porcelain is doing.
> 
> And reading through the list of all available commands (after you
> understand git from the bottom up) is a good way to get a feel for
> what's available.
> 
> It's a fun fun fun fun world. :-)
> 
> - Chris
> 
> 
> On Tue, May 03, 2022 at 03:53:26PM -0400, Bob B wrote:
> > Chris,
> >
> > Cool! Good to know.
> >
> > I ran this on a copy of my password-store tree (all gpg encrypted
> > files) and it reduced it by about 1M (~2%)
> >
> > Any idea how to interpret what it actually did from this:
> >
> > tower:.password-store bob$ git gc
> > Counting objects: 4031, done.
> > Delta compression using up to 24 threads.
> > Compressing objects: 100% (2261/2261), done.
> > Writing objects: 100% (4031/4031), done.
> > Total 4031 (delta 1839), reused 3683 (delta 1703)
> >
> > tower:.password-store bob$ du -sh . ~/.password-store/
> >  41M	.
> >  42M	/Users/bob/.password-store/
> >
> >
> > So many questions! Do you have a good learning resource for git or is
> > it just a matter of using it more and learning nuances like this as
> > you come across them?
> >
> > BB
> >
> >
> > > -----Original Message-----
> > > From: kwlug-disc <kwlug-disc-bounces at kwlug.org> On Behalf Of Chris Frey
> > > Sent: May 3, 2022 2:24 PM
> > > To: KWLUG discussion <kwlug-disc at kwlug.org>
> > > Subject: Re: [kwlug-disc] How GIT stores stuff.
> > >
> > > On Tue, May 03, 2022 at 10:36:23AM -0400, Bob B wrote:
> > > > It has some references to deeper information that look enticing,
> > > > but in summary I think it confirms what I said in that GIT stores
> > > > complete files, not deltas. At least not deltas as 'diffs' of
> > > > text files.
> > >
> > > This is correct.  At least until you run 'git gc' which then turns
> > > those individual files into packs, which do store things in
> > > diff-like ways.
> > >
> > > When you do a fresh clone from a remote repository, you will
> > > download the pack, which you can see if you look inside the .git
> > > directory.  If you start your own git repo, you will see many
> > > individual files in the objects subdirectory until you run git-gc
> > > and git decides it is time to pack things up for optimization
> > > purposes.
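For anyone curious what one of those individual files in the objects subdirectory looks like before packing, here is a small Python sketch of how a loose blob is named and stored. It assumes the default SHA-1 object format (not a SHA-256 repository), and the function names git_blob_id and loose_object_bytes are made up for illustration; the result should agree with what "git hash-object" reports for the same content.

```python
import hashlib
import zlib


def blob_payload(content: bytes) -> bytes:
    """Prefix the file content with the header git uses for blobs."""
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def git_blob_id(content: bytes) -> str:
    """Object id of a blob: SHA-1 over the header plus the whole content."""
    return hashlib.sha1(blob_payload(content)).hexdigest()


def loose_object_bytes(content: bytes) -> bytes:
    """Bytes of the loose object file: header plus content, zlib-compressed.

    The complete file is stored; no diff against another version is involved.
    """
    return zlib.compress(blob_payload(content))


content = b"hello world\n"
object_id = git_blob_id(content)

# Loose objects live under a directory named after the first two hex digits.
print(f".git/objects/{object_id[:2]}/{object_id[2:]}")
print(len(loose_object_bytes(content)), "bytes on disk (approximately)")
```

Two versions of a file that differ by one byte get two unrelated ids and two complete compressed copies, which is why packing can save space later.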
> > >
> > > - Chris
> > >
> > >
> > > _______________________________________________
> > > kwlug-disc mailing list
> > > kwlug-disc at kwlug.org
> > > https://kwlug.org/mailman/listinfo/kwlug-disc_kwlug.org
