In my previous curmudgeon post, Writing Without Distractions, I gave version control only a brief mention, and promised a follow-up post. That would be this one. This post is intended for people who are not in the software industry, including not only poets but other writers, students, people who program as a hobby, and programmers who have been in suspended animation for the last decade or three and are just now waking up.

The Wikipedia article on version control gives a pretty good overview, but it suffers from being way too general, and at the same time too focused on software development. This post is aimed at poets and other writers, and will be using the most popular version control system, git. (The Wikipedia article on git shares many of the same flaws as the one on version control.) My earlier post, Git: The other blockchain, was aimed at software developers and blockchain enthusiasts.

What is version control and why should I use it?

A version control system, also called a software configuration management (SCM) system, is a system for keeping track of changes in a collection of files. (The two terms have slightly different connotations and are used in different contexts, but it's like "writer" and "author" -- a distinction without much of a difference. For what it's worth, git's official website is git-scm.com/, but the first line of text on the site says that "Git is a free and open source distributed version control system". Then in the next paragraph they use the initialism SCM when they want to shorten it. Maybe it's easier to type? Go figure.)

So what does the ability to "track changes" really get you?

  • Obviously, you can see what changes you've made, and when you made them. If you're collaborating with someone (your editor perhaps?), you can see what changes they made, and when, on a line-by-line basis. (If you're writing prose, you might want to wrap lines with hard line breaks to make that easier.)
  • You can see the cumulative changes between any two dates.
  • You can get back a copy of a file -- or your whole project -- as it was at any point in its history. Very handy if you deleted a file last year by mistake. Also very handy if you have cats.
  • You can make a clone of your entire project. I used this when I was working on splitting up my collection of song lyrics into four separate collections: mine, public domain, other songs in my repertoire, and work in progress.
  • You can make some experimental changes in a side branch, and either merge them back in if you like them or throw them away if you don't. You can make revisions fearlessly because you know you can undo them any time. It helps if you break your project down into small files -- individual poems, songs, stories, chapters, or even sections. That way you can experiment with putting them in a different order (great when you're planning an album) or in more than one place (great for set-lists and anthologies).
  • You can make a branch for the published version, and immediately start working on the next edition. If you make corrections before it goes to print, you can pull those changes back into the working copy. When your book is finally published, you can attach a tag to it so that you can find those exact versions again.
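To make that list a little more concrete, here is a minimal sketch of a few of those operations, meant to be pasted into a shell as-is. The file name, the branch and tag names, and the author identity are all made up for the illustration, and it compares commits rather than dates to keep things short.

```shell
# Scratch project in a temporary directory; every name below is hypothetical.
cd "$(mktemp -d)"
git init -q
git config user.name "A. Poet"             # demo identity, local to this repo only
git config user.email "poet@example.com"

echo "first draft" > poem.txt
git add poem.txt
git commit -q -m "first draft"

echo "second draft" > poem.txt
git commit -q -a -m "second draft"

git log --oneline -- poem.txt              # what changed, newest first
git diff HEAD~1 HEAD -- poem.txt           # line-by-line changes between two versions

git checkout -q HEAD~1 -- poem.txt         # get the file back as it was one commit ago
cat poem.txt                               # prints: first draft
git checkout -q HEAD -- poem.txt           # and return to the latest version

git checkout -q -b experiment              # side branch for fearless revisions
echo "wild idea" >> poem.txt
git commit -q -a -m "try a wild idea"
git checkout -q -                          # back to the branch we started on
git merge -q experiment                    # keep the idea (or: git branch -D experiment)

git tag first-edition                      # mark the published version
```

If the experiment had been a bad idea, deleting the branch instead of merging it would have left poem.txt at the second draft.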

Now we get into what you can do with a distributed version control system like git.

  • Every copy of your project includes its complete history, so you can work offline without a net connection. Great if inspiration strikes you on the beach or in a coffeehouse.
  • You can pull changes from one copy to another, or push changes to one central copy and pull from that. I use this a lot, because I have one laptop on my desk and another in my bedroom. Git's ability to handle merges gracefully helps, too.
  • You can have copies in as many different places as you like, and either keep them all in sync or not, as you prefer. If they get out of sync because changes were made on two copies, you can merge the changes.
  • You can keep a copy on a server somewhere that you and your collaborators can all use. You don't have to pay for a server; GitHub, GitLab, and Bitbucket all let you make repositories for free, including private ones.
  • GitHub and GitLab also let you serve a static website directly out of your repo.
  • You can update your website just by pushing changes to your web server's copy, by means of "hooks" -- little scripts that get run when something happens.
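The push-and-pull arrangement with one central copy can be sketched on a single machine. In this sketch a directory called hub.git stands in for the server, and desk and bedroom stand in for two laptops. All of those names, the file name, and the author identity are assumptions made for the illustration; a real setup would use a URL for the central copy instead of a local path.

```shell
# Demo identity via git's standard environment variables (values are made up).
export GIT_AUTHOR_NAME="A. Poet" GIT_AUTHOR_EMAIL="poet@example.com"
export GIT_COMMITTER_NAME="A. Poet" GIT_COMMITTER_EMAIL="poet@example.com"

cd "$(mktemp -d)"
git init -q --bare hub.git                 # stands in for the central copy

git clone -q hub.git desk                  # first laptop (git warns the repo is empty)
cd desk
echo "verse one" > song.txt
git add song.txt
git commit -q -m "verse one"
git push -q -u origin HEAD                 # send the current branch to the hub
cd ..

git clone -q hub.git bedroom               # second laptop gets the full history
cd bedroom
echo "verse two" >> song.txt
git commit -q -a -m "verse two"
git push -q                                # publish the new verse
cd ../desk

git pull -q                                # the desk copy catches up
cat song.txt                               # both verses are here now
```

Neither laptop talks to the other directly here; each one only pushes to and pulls from the hub, which is usually the easiest arrangement to keep straight.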

Git Terminology

First, some basic vocabulary. If you've used another version-control system, you might find git's terminology and commands to be different enough to be confusing, so you probably ought to read this. If you haven't used any kind of version control except possibly saving files with names like chapter-2-version-3, you will definitely want to read this.

We'll start with...

Tree: A tree is a directory (you may know them as "folders" if you started using a computer after the Mac was introduced in 1984) and all of the files and subdirectories contained in it. That's because if you start with a complete listing in outline form, like a table of contents, and rotate it 45 degrees clockwise, it looks like an upside-down tree, with the root at the top and the leaves at the bottom. Inside of the computer is a world where magic works and trees grow from their root down. Eventually you get used to it.

The tree that contains your entire project is called the working tree for the project, and that's the collection of files that git is going to be controlling. You can have as many working trees as you want, and their contents can be anything from a single poem to the complete works of Shakespeare to a copy of Wikipedia (or more -- some software projects are big). Git works best if you make lots of smallish projects, maybe at the level of a book (or maybe a trilogy).

Repo: The next thing you need is a git repository, usually shortened to repo. The repo for a project is located in the working tree, in a directory called .git. The initial dot keeps most directory listings from including it, but you can use ls -a to include it. You create a git repo with the command git init. If you already have a git repo or working tree somewhere -- including on some other computer -- you get a local copy with git clone.

Sometimes you want a repository that isn't attached to a working tree; that's called a bare repository. That's the kind of repo you'd put on a backup device or a shared server. You get one by adding the --bare option to git init or git clone. If your project is called, say, website, you'd probably want to name your bare repo website.git.
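Here is a small sketch of the commands from the last two paragraphs, run in a temporary directory so it can't disturb anything. The project is empty, so git will warn about cloning an empty repository; that's expected here. The name website-copy is made up for the example.

```shell
cd "$(mktemp -d)"
mkdir website && cd website
git init -q                                # creates the hidden .git directory
ls                                         # prints nothing: .git is hidden
ls -a                                      # now .git shows up
cd ..

git clone -q --bare website website.git    # bare copy: history only, no working tree
ls website.git                             # HEAD, config, objects, refs, and so on

git clone -q website.git website-copy      # ordinary clone: working tree plus .git
ls -a website-copy
```

Notice that the bare repo contains directly what an ordinary project keeps inside its .git directory.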

Commit: This is both a verb and a noun. We'll take the verb first -- to commit a set of changes means to put a copy of them in the repository as a new version. This is essentially the same meaning that it has when you commit to a plan, or commit a mad relative to an institution. (Helping them escape is a little tricky, but you can do it with git revert.)

As a noun, a commit is the small chunk of data that gives a unique identity to the changes you committed. It includes the author's name and email address, the date, any comments you care to make about what you did and why, an optional digital signature, and the unique identifier of the previous commit.
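To see both senses of the word at once, here is a hedged sketch that makes two commits, prints the second one as git stores it, and then uses git revert to undo it. The file name, the messages, and the author identity are invented for the example.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "A. Poet"             # demo identity, local to this repo only
git config user.email "poet@example.com"

echo "roses are red" > verse.txt
git add verse.txt
git commit -q -m "first line"
echo "violets are plaid" >> verse.txt
git commit -q -a -m "second line (a mistake)"

git cat-file -p HEAD         # the commit itself: tree, parent, author, committer, message
git revert --no-edit HEAD    # a new commit that undoes the mistake
cat verse.txt                # prints: roses are red
git log --oneline            # three commits: the mistake and its undoing both stay on record
```

The revert doesn't erase anything. It adds a third commit whose changes are the opposite of the second one's, so the escape is documented too.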

Hash: Everything in a git repository -- commits, files, trees, and tags -- has a unique identifier called its hash. (I discussed hashes last month in Git: The other blockchain, so I won't bore you by repeating it.)
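One thing worth seeing for yourself is that a hash is computed from the content, so identical content always gets the identical identifier. A minimal sketch, using git's hash-object command on three made-up files:

```shell
cd "$(mktemp -d)"
printf 'like frozen flowers\n' > a.txt
printf 'like frozen flowers\n' > b.txt
printf 'like frozen flower\n'  > c.txt     # one letter shorter

git hash-object a.txt        # a long hexadecimal identifier
git hash-object b.txt        # same content, so the same identifier
git hash-object c.txt        # one letter different, and the identifier is nothing alike
```

This works even outside a repo, because hash-object only computes the identifier; it doesn't store anything unless you ask it to.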

Index: Git's index is one of its more unusual (some people would say confusing) features. It's also called the staging area -- it's where git keeps track of what is going to be in the next commit. So first you add the files that you've changed, then you commit the changes. (You can add your whole tree; git can tell what's changed.) The index is particularly useful if you've made changes in two different files, and you want to commit them separately with their own descriptions.
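The two-files case can be sketched like this. The file names, the commit messages, and the author identity are assumptions for the illustration.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "A. Poet"             # demo identity, local to this repo only
git config user.email "poet@example.com"

echo "draft" > poem.txt
echo "draft" > song.txt
git add .                                  # add the whole tree
git commit -q -m "two drafts"

echo "new stanza" >> poem.txt              # now change both files
echo "new chorus" >> song.txt

git add poem.txt                           # stage only the poem
git status --short                         # "M  poem.txt" is staged, " M song.txt" is not
git commit -q -m "poem: add a stanza"

git add song.txt                           # then the song, with its own description
git commit -q -m "song: add a chorus"
git log --oneline
```

In the short status output, an M in the first column means the change is in the index, and an M in the second column means it is still only in the working tree.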

Basic Git (for Poets)

Note: this section is not a tutorial -- you can find some of those in the Resources, and there is also git help tutorial. But it will get you started with the basic commands, and if you're the type of person who likes to dive right in without reading the instructions, it will be enough to show which end of the pool to dive into. I may write a real tutorial later. If you want to just try stuff, I suggest installing git.

Whenever you get stuck, you can use git help. Follow it with the name of a git command to get the full reference manual ("man page") for that command. (You can also use the man command.) There's a pretty good overview, from a developer's perspective, in Everyday Git, which is also available with git help everyday.

Note that commands are preceded by a dollar sign and space ($ ), which is the default shell prompt, and that comments are preceded by an octothorpe (#), a convention that predates Twitter's "hashtags" by several decades.

$ mkdir Poems
$ cd Poems
$ git init                              # there is now a git repo in Poems
Initialized empty Git repository in /home/steve/vv/.../Journals/Poems/.git/
$ echo "Git history is" > Git_Haiku       # use echo command to avoid using an editor
$ git add .
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   Git_Haiku
# git helpfully tells you how to unstage the file if you didn't mean to add it
$ git commit -m "start a haiku.  It needs 2 more lines"
[master (root-commit) c97c99d] start a haiku.  It needs 2 more lines
 1 file changed, 1 insertion(+)
 create mode 100644 Git_Haiku
$ echo "a daisy chain with commits" >> Git_Haiku
$ echo "like frozen flowers" >> Git_Haiku
$ git commit -m "add missing lines"
On branch master
Changes not staged for commit:
	modified:   Git_Haiku

no changes added to commit
$ git add .
$ git commit -m "add missing lines"
[master 345a112] add missing lines
 1 file changed, 2 insertions(+)
$ git log --oneline
345a112 (HEAD -> master) add missing lines
c97c99d start a haiku.  It needs 2 more lines

Notice that the first time I tried to commit the two new lines, git refused: the file had been modified, but the changes had not been staged. Changes have to be added (i.e. put in the index) before you can commit them.

If you try this, the numbers you will get in the log will be different, because you are not me and the time is not now. And possibly because I cheated and rewrote the poem without actually re-running the whole thing.
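If you want to look at staged and unstaged changes before committing, git diff will show you both. Here is a minimal sketch, written without the prompt so it can be pasted directly; the commit messages and author identity are made up. It also uses the -a option of git commit, which stages every modified file that git already tracks and so saves a separate git add.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "A. Poet"             # demo identity, local to this repo only
git config user.email "poet@example.com"

echo "Git history is" > Git_Haiku
git add Git_Haiku
git commit -q -m "start a haiku"

echo "a daisy chain with commits" >> Git_Haiku
git diff                     # unstaged changes: shows the new line
git diff --cached            # staged changes: nothing yet

# Nothing is staged, so this commit fails; the echo just reports it.
git commit -q -m "second line" || echo "nothing staged, so nothing committed"

git commit -q -a -m "second line"          # -a stages tracked, modified files first
git log --oneline
```

The -a shortcut only picks up files git already knows about, so a brand-new file still needs git add.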

...and Finally

The part you've been waiting for -- the end. This post is already long, so I'll just refer you to the resources for now. Expect another installment, though, and please feel free to suggest future topics.

Resources

Tutorials

Digging Deeper