[kwlug-disc] Best DIY git service options

Tue Dec 8 15:43:02 EST 2015

On December 7, 2015 2:53:33 PM EST, "B.S." <bs27975 at yahoo.ca> wrote:
>> I use docker to get a test system 
>
>This thread has been interesting in the sense that it feels like git is
>essentially being used as a CMS, or back end central distributed
>storage repository, for more than just code. It had never occurred to
>me to think of git in that way.

Some people think of git as a generic key-value store. GitHub project wikis are themselves git repos, fronted with:
https://github.com/gollum/gollum
It gives you history and sync for free, on top of a filesystem which doesn't. I have a git repo for each entire course because then I *know* I have the same copies of notes and assignments everywhere, and of course when I am coding (or writing in that bastard LaTeX) I have an undo button.
I use git to sync Emacs to-do files between NoNonsenseNotes and my desktop (but it's tedious because I need to plug in the phone. If NNN supported server sync that wasn't Google I wouldn't need this.)

BTW, Gilles, I highly recommend this setup to any student. Since GitLab allows easy private repos, it is legal for students to put coursework in it without tripping academic integrity. I bet a quick way to get a lot of people using git is to give handouts on how to replace Dropbox with it, especially if you catch the first years before they're too into Dropbox. Push that Dropbox doesn't have Undo and, presumably, UW has more storage available?

But I think using git as a generic syncer is the wrong path. It is tuned for plain text where lines are meaningful. It gets along really badly with CSV and JSON files, and binary blobs are opaque to it. (By the way, a common misconception: in git's object model every version of a file is stored as a complete file; it doesn't do the svn thing where it reconstructs files from diffs. Packfiles do delta-compress similar objects, but that is a storage optimization underneath the model.)
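
To make the "complete files" point concrete, here is a minimal sketch of how git names a file's contents, assuming a repo that uses the default SHA-1 object format. The function name git_blob_id is mine, not anything from git itself. The key is a hash over a short header plus the whole file, which is also why people call git a key-value store:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git gives a blob with these contents (SHA-1 repos only)."""
    # git hashes a header "blob <size>\0" followed by the complete file contents.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two versions of a file that differ by one line get unrelated keys,
# and each key names the whole file, not a patch against the other.
v1 = b"alpha\nbeta\n"
v2 = b"alpha\ngamma\n"
print(git_blob_id(v1))
print(git_blob_id(v2))
```

Running `git hash-object` on a file with the same bytes should print the same id.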

In contrast, btrfs and zfs have revisioning and undo built in at the filesystem level, and btrfs at least can do syncs (but only one way? I've never used it actually). I think that is a wiser way to go about things.

But I have my own idea for data which would handle this better. Instead of using opaque streams of bytes with names, treat the filesystem as your data structure: if every object is essentially JSON, then treat directories as objects and files as atoms (strings and ints and floats). Git should work great on that. Tables suck under it though, unless you rewrite them as a list-of-dicts, but sometimes you won't.
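
A rough sketch of what I mean, nothing more. The names write_tree and read_tree are made up for illustration, and I am assuming every key is a valid file name. Dicts become directories and atoms become one-line files, so a changed value shows up in git as a one-line change to one small file:

```python
import json
import tempfile
from pathlib import Path


def write_tree(value, path: Path) -> None:
    """Store a JSON-like value: dicts become directories, atoms become files."""
    if isinstance(value, dict):
        path.mkdir(parents=True, exist_ok=True)
        for key, child in value.items():
            write_tree(child, path / key)
    elif isinstance(value, list):
        # Assumption: a list is stored as a directory keyed by zero-padded
        # position, so it reads back as a dict, not as a list.
        write_tree({f"{i:06d}": item for i, item in enumerate(value)}, path)
    else:
        # Atoms (strings, ints, floats, booleans, null) are JSON-encoded
        # so that their type survives the round trip.
        path.write_text(json.dumps(value) + "\n")


def read_tree(path: Path):
    """Inverse of write_tree for dicts and atoms."""
    if path.is_dir():
        return {child.name: read_tree(child) for child in sorted(path.iterdir())}
    return json.loads(path.read_text())


# Hypothetical data, just to show the round trip.
course = {"title": "notes", "week": 3, "weights": {"a1": 0.1, "final": 0.5}}
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "course"
    write_tree(course, root)
    print(read_tree(root) == course)
```

One catch: git does not track empty directories, so an empty object would vanish on commit unless you drop a placeholder file in it.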

Urbit takes the idea further: the filesystem has revisions built in, and since the OS is a distributed system anyone can sync to anyone automatically.  To deal with the binary blob UX sucking, Urbit *types* the files, and for each known type you have the option of writing a custom diff and patch algorithm, so this means tables don't have to suck: http://urbit.org/docs/user/clay
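
To show what a per-type diff and patch could look like for tables, here is a sketch in Python. This is not Urbit's actual interface, just the idea. I am assuming a table is a list of dicts with a fixed set of columns and a unique "id" column; diff_table and patch_table are hypothetical names:

```python
def diff_table(old, new, key="id"):
    """Describe how to turn `old` into `new`, row by row, ignoring row order."""
    old_rows = {row[key]: row for row in old}
    new_rows = {row[key]: row for row in new}
    return {
        "removed": sorted(k for k in old_rows if k not in new_rows),
        "added": [new_rows[k] for k in sorted(new_rows) if k not in old_rows],
        # For rows present on both sides, keep only the cells that changed.
        "changed": {
            k: {c: v for c, v in new_rows[k].items() if old_rows[k].get(c) != v}
            for k in sorted(new_rows)
            if k in old_rows and new_rows[k] != old_rows[k]
        },
    }


def patch_table(old, delta, key="id"):
    """Apply a delta from diff_table; the result is sorted by key."""
    rows = {r[key]: dict(r) for r in old if r[key] not in delta["removed"]}
    for k, cells in delta["changed"].items():
        rows[k].update(cells)
    for row in delta["added"]:
        rows[row[key]] = dict(row)
    return [rows[k] for k in sorted(rows)]


# Hypothetical marks table: one row removed, one cell changed, one row added.
old = [{"id": 1, "name": "a1", "mark": 70}, {"id": 2, "name": "a2", "mark": 80}]
new = [{"id": 2, "name": "a2", "mark": 85}, {"id": 3, "name": "a3", "mark": 90}]
delta = diff_table(old, new)
print(delta)
print(patch_table(old, delta) == sorted(new, key=lambda r: r["id"]))
```

The delta talks about rows and cells instead of lines, so reordering rows produces an empty diff and a single changed mark produces a single changed cell.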
-- 
Nick Guenther
4B Joint Stats/CS
University of Waterloo