In my previous curmudgeon post, Writing Without Distractions, I gave version control only a brief mention, and promised a follow-up post. That would be this one. This post is intended for people who are not in the software industry, including not only poets but other writers, students, people who program as a hobby, and programmers who have been in suspended animation for the last decade or three and are just now waking up.
The Wikipedia article on version control gives a pretty good overview, but it suffers from being way too general, and at the same time too focused on software development. This post is aimed at poets and other writers, and will be using the most popular version control system, git. (The Wikipedia article on git shares many of the same flaws as the one on version control.) My earlier post, Git: The other blockchain, was aimed at software developers and blockchain enthusiasts.
What is version control and why should I use it?
A version control system, also called a software configuration management (SCM) system, is a system for keeping track of changes in a collection of files. (The two terms have slightly different connotations and are used in different contexts, but it's like "writer" and "author" -- a distinction without much of a difference. For what it's worth, git's official website is git-scm.com/, but the first line of text on the site says that "Git is a free and open source distributed version control system". Then in the next paragraph they use the initialism SCM when they want to shorten it. Maybe it's easier to type? Go figure.)
So what does the ability to "track changes" really get you?
- Obviously, you can see what changes you've made, and when you made them. If you're collaborating with someone (your editor perhaps?), you can see what changes they made, and when, on a line-by-line basis. (If you're writing prose, you might want to wrap lines with hard line breaks to make that easier.)
- You can see the cumulative changes between any two dates.
- You can get back a copy of a file -- or your whole project -- as it was at any point in its history. Very handy if you deleted a file last year by mistake. Also very handy if you have cats.
- You can make a clone of your entire project. I used this when I was working on splitting up my collection of song lyrics into four separate collections: mine, public domain, other songs in my repertoire, and work in progress.
- You can make some experimental changes in a side branch, and either merge them back in if you like them or throw them away if you don't. You can make revisions fearlessly because you know you can undo them any time. It helps if you break your project down into small files -- individual poems, songs, stories, chapters, or even sections. That way you can experiment with putting them in a different order (great when you're planning an album) or in more than one place (great for set-lists and anthologies).
- You can make a branch for the published version, and immediately start working on the next edition. If you make corrections before it goes to print, you can pull those changes back into the working copy. When your book is finally published, you can attach a tag to it so that you can find those exact versions again.
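Here's a rough sketch of a few items from that list: an experiment on a side branch, a tag for the published version, and rescuing a deleted file. It's written as a script (so no $ prompts), and everything in it is made up for illustration: the scratch directory, the file name poem.txt, the branch name experiment, the tag first-edition, and the placeholder name and email.

```shell
set -e
# Hypothetical scratch project in a temporary directory
work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "A. Poet"             # placeholder identity, local to this repo
git config user.email "poet@example.com"

echo "Roses are red" > poem.txt
git add poem.txt
git commit -q -m "first line"
main=$(git symbolic-ref --short HEAD)      # whatever your git calls its default branch

# Try a revision on a side branch
git checkout -q -b experiment
echo "Violets are blue" >> poem.txt
git commit -q -a -m "try a second line"

# Like it? Merge it back. (Don't like it? "git branch -D experiment" throws it away.)
git checkout -q "$main"
git merge -q experiment

# Mark the published version so you can find it again later
git tag first-edition

# Delete the file by mistake (or by cat), then get it back
rm poem.txt
git checkout -- poem.txt
cat poem.txt
```

To see the cumulative changes since publication, git diff first-edition would be the place to start.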
Now we get into what you can do with a distributed version control system:
- Every copy of your project includes its complete history, so you can work offline without a net connection. Great if inspiration strikes you on the beach or in a coffeehouse.
- You can pull changes from one copy to another, or push changes to one central copy and pull from that. I use this a lot, because I have one laptop on my desk and another in my bedroom. Git's ability to handle merges gracefully helps, too.
- You can have copies in as many different places as you like, and either keep them all in sync or not, as you prefer. If they get out of sync because changes were made on two copies, you can merge the changes.
- You can keep a copy on a server somewhere that you and your collaborators can all use. You don't have to pay for a server; GitHub, GitLab, and Bitbucket all let you make repositories for free, including private ones.
- GitHub and GitLab also let you serve a static website directly out of your repo.
- You can update your website just by pushing changes to your web server's copy, by means of "hooks" -- little scripts that get run when something happens.
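The push-and-pull arrangement can be sketched on a single machine, with three directories standing in for a central copy and two laptops. This is only an illustration under assumptions: the directory names central.git, desk, and bedroom are invented, the identity is a placeholder, and a real setup would replace the local path with the address of a server.

```shell
set -e
base=$(mktemp -d)                    # scratch directory; all names below are made up
cd "$base"

# The central copy, standing in for a server or a backup drive
git init -q --bare central.git

# First working copy: the laptop on the desk
git clone -q central.git desk        # git warns that the clone is empty; that's fine
cd desk
git config user.name "A. Poet"       # placeholder identity
git config user.email "poet@example.com"
echo "Git history is" > Git_Haiku
git add Git_Haiku
git commit -q -m "start a haiku"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"         # send the commit to the central copy

# Second working copy: the laptop in the bedroom
cd ..
git clone -q central.git bedroom
cd bedroom
git config user.name "A. Poet"
git config user.email "poet@example.com"
echo "a daisy chain with commits" >> Git_Haiku
git commit -q -a -m "add second line"
git push -q origin "$branch"

# Back at the desk, pull in what was written in the bedroom
cd ../desk
git pull -q --ff-only origin "$branch"
cat Git_Haiku
```

Each of the three directories ends up with the complete history, which is the point of "distributed".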
First, some basic vocabulary. If you've used another version-control system, you might find git's terminology and commands to be different enough to be confusing, so you probably ought to read this. If you haven't used any kind of version control except possibly saving files with names like chapter-2-version-3, you will definitely want to read this.
We'll start with...
Tree: A tree is a directory (you may know them as "folders" if you started using a computer after the Mac was introduced in 1984) and all of the files and subdirectories contained in it. That's because if you start with a complete listing in outline form, like a table of contents, and rotate it 45 degrees clockwise, it looks like an upside-down tree, with the root at the top and the leaves at the bottom. Inside of the computer is a world where magic works and trees grow from their root down. Eventually you get used to it.
The tree that contains your entire project is called the working tree for the project, and that's the collection of files that git is going to be controlling. You can have as many working trees as you want, and their contents can be anything from a single poem to the complete works of Shakespeare to a copy of Wikipedia (or more -- some software projects are big). Git works best if you make lots of smallish projects, maybe at the level of a book (or maybe a trilogy).
Repo: The next thing you need is a git repository, usually shortened to repo. The repo for a project is located in the working tree, in a directory called .git. The initial dot keeps most directory listings from including it, but you can use ls -a to include it. You create a git repo with the command git init. If you already have a git repo or working tree somewhere -- including on some other computer -- you get a local copy with git clone.
Sometimes you want a repository that isn't attached to a working tree; that's called a bare repository. That's the kind of repo you'd put on a backup device or a shared server. You get one by adding the --bare option to git init or git clone. If your project is called, say, website, you'd probably want to name your bare repo website.git.
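As a small sketch of the difference between the two kinds of repo, the script below makes an ordinary one and then a bare copy of it. The project name website comes from the text above; the scratch directory, the file index.html, and the placeholder identity are assumptions for the example, and the .git suffix on the bare copy is just the usual convention.

```shell
set -e
base=$(mktemp -d)                    # scratch directory for the example
cd "$base"

mkdir website
cd website
git init -q                          # creates the hidden .git directory
ls -a                                # the listing now includes .git

git config user.name "A. Poet"       # placeholder identity
git config user.email "poet@example.com"
echo "<p>Hello</p>" > index.html     # hypothetical first page
git add index.html
git commit -q -m "first page"

cd ..
# A bare copy: all of the history, none of the working files
git clone -q --bare website website.git
ls website.git                       # git's own files, but no index.html
```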
Commit: This is both a verb and a noun. We'll take the verb first -- to commit a set of changes means to put a copy of them in the repository as a new version. This is essentially the same meaning that it has when you commit to a plan, or commit a mad relative to an institution. (Helping them escape is a little tricky, but you can do it.)
As a noun, a commit is the small chunk of data that gives a unique identity to the changes you committed. It includes the author's name and email address, the date, any comments you care to make about what you did and why, an optional digital signature, and the unique identifier of the previous commit.
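If you're curious what that chunk of data looks like, git will show it to you. The sketch below makes two commits in a scratch repo (the directory, file name, and identity are placeholders) and then prints the second one in raw form with git cat-file -p.

```shell
set -e
work=$(mktemp -d)                    # scratch directory for the example
cd "$work"
git init -q
git config user.name "A. Poet"       # placeholder identity
git config user.email "poet@example.com"

echo "Git history is" > Git_Haiku
git add Git_Haiku
git commit -q -m "start a haiku"
echo "a daisy chain with commits" >> Git_Haiku
git commit -q -a -m "add second line"

# Print the latest commit as git stores it: a tree line, a parent line,
# author and committer lines with dates, a blank line, and the message
git cat-file -p HEAD
```

The parent line holds the identifier of the previous commit; the very first commit in a repo doesn't have one.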
Hash: Everything in a git repository -- commits, files, trees, and tags -- has a unique identifier called its hash. (I discussed hashes last month in Git: The other blockchain, so I won't bore you by repeating it.)
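One thing worth seeing for yourself is that a file's hash is computed from its contents, not its name. A minimal sketch, using git hash-object on two made-up files (a.txt and b.txt) in a scratch directory:

```shell
set -e
work=$(mktemp -d)                    # scratch directory for the example
cd "$work"
git init -q

echo "like frozen flowers" > a.txt
echo "like frozen flowers" > b.txt
hash_a=$(git hash-object a.txt)
hash_b=$(git hash-object b.txt)
echo "$hash_a"
echo "$hash_b"                       # same contents, so the same hash

echo "like frozen flowers?" > b.txt  # change one character...
git hash-object b.txt                # ...and the hash changes completely
```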
Index: Git's index is one of its more unusual (some people would say confusing) features. It's also called the staging area -- it's where git keeps track of what is going to be in the next commit. So first you add the files that you've changed, then you commit the changes. (You can add your whole tree; git can tell what's changed.) The index is particularly useful if you've made changes in two different files, and you want to commit them separately with their own descriptions.
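Here is roughly how that last case plays out: two files changed, two separate commits. The file names sonnet.txt and limerick.txt, the commit messages, and the identity are invented for the example.

```shell
set -e
work=$(mktemp -d)                    # scratch directory for the example
cd "$work"
git init -q
git config user.name "A. Poet"       # placeholder identity
git config user.email "poet@example.com"

echo "draft" > sonnet.txt
echo "draft" > limerick.txt
git add .
git commit -q -m "two drafts"

# Change both files
echo "a volta" >> sonnet.txt
echo "a punchline" >> limerick.txt

# Stage and commit only the sonnet
git add sonnet.txt
git status --short                   # M in the first column = staged, in the second = not yet
git commit -q -m "sonnet: add the turn"

# Now the limerick gets its own commit and its own description
git add limerick.txt
git commit -q -m "limerick: add the punchline"
git log --oneline
```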
Basic Git (for Poets)
Note: this section is not a tutorial -- you can find some of those in the Resources, and there is also git help tutorial. But it will get you started with the basic commands, and if you're the type of person who likes to dive right in without reading the instructions, it will be enough to show which end of the pool to dive into. I may write a real tutorial later.
If you want to just try stuff, I suggest installing git.
Whenever you get stuck, you can use git help. Follow it with the name of a git command to get the full reference manual ("man page") for that command. (You can also use the man command.)
There's a pretty good overview, from a developer's perspective, in Everyday Git, which is also
git help everyday.
Note that commands are preceded by a dollar sign and space ($ ), which is the default shell prompt, and that comments are preceded by an octothorpe (#), a convention that predates Twitter's "hashtags" by several decades.
    $ mkdir Poems
    $ cd Poems
    $ git init    # there is now a git repo in Poems
    Initialized empty Git repository in /home/steve/vv/.../Journals/Poems/.git/
    $ echo "Git history is" > Git_Haiku    # use echo command to avoid using an editor
    $ git add .
    $ git status
    On branch master

    No commits yet

    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   Git_Haiku

    # git helpfully tells you how to unstage the file if you didn't mean to add it
    $ git commit -m "start a haiku. It needs 2 more lines"
    [master (root-commit) c97c99d] start a haiku. It needs 2 more lines
     1 file changed, 1 insertion(+)
     create mode 100644 Git_Haiku
    $ echo "a daisy chain with commits" >> Git_Haiku
    $ echo "like frozen flowers" >> Git_Haiku
    $ git commit -m "add missing lines"
    On branch master
    Changes not staged for commit:
            modified:   Git_Haiku

    no changes added to commit
    $ git add .
    $ git commit -m "add missing lines"
    [master 345a112] add missing lines
     1 file changed, 2 insertions(+)
    $ git log --oneline
    345a112 (HEAD -> master) add missing lines
    c97c99d start a haiku. It needs 2 more lines
Notice that the first time I tried to commit the missing lines, git complained because nothing had been staged for the commit. Changed files have to be added (i.e. put in the index) before you can commit them.
If you try this, the numbers you will get in the log will be different, because you are not me and the time is not now. And possibly because I cheated and rewrote the poem without actually re-running the whole thing.
The part you've been waiting for -- the end. This post is already long, so I'll just refer you to the resources for now. Expect another installment, though, and please feel free to suggest future topics.
Resources
- git help tutorial and git help tutorial-2. You may also have the Git User's Manual installed on your computer.
- Everyday Git.
- Learn Enough Git to Be Dangerous. This is where I usually point newcomers to git. This is actually the third volume in the Learn Enough to Be Dangerous series; the first is about the command line and is a good place to start if you're a complete beginner with command-line tools.
- GitHub's Git and GitHub learning resources page has links to some good tutorials, too.
- Pro Git - Book. Everything is in here, including installation instructions (in Chapter 1) and a tutorial (Chapter 2).
- Pro Git - Book. The rest of Pro Git will take you as deep as you want to go.
- Git User's Manual.
- Git. Git's official website.
- Git - Reference. This is the official reference manual, in the form of Unix "man" pages. That means that you can use the man command to read them, but it's easier to follow links on the web.
- git help -- that's right, the manual is also available by way of git itself. That can be very convenient.
- Git - Wikipedia
- Version control - Wikipedia
- Software configuration management - Wikipedia
- Git: The other blockchain