Learn Git from Git: the first commit
Start with an introduction
I saw a post a while back suggesting that one could reveal the inner workings of Git by creating a repository and nesting it within another one. Let’s try it!
In true Git fashion, I will write all of the main headings in the imperative mood.
We’re going to need two Git repositories to get started: an inner and an outer.
Set up the outer Git repository
First let’s make a directory and initialise a Git repository:
mkdir outer
cd outer
git init
Checking the hidden folder contents shows our git repository:
ls -a
.git
Checking the status of the repository:
git status
On branch main
No commits yet
nothing to commit (create/copy files and use "git add" to track)
Set up the inner Git repository
In the outer folder, we’ll make a new folder that will hold the inner repository; this will be the subject of our investigations:
mkdir inner
cd inner
git init
We now have enough to get started!
Check the initial .git folder
Let’s start by investigating inner/.git:
cd .git
ls -a
config description HEAD hooks info objects refs
We have the following folders:
- hooks - contains only .sample files
- info - contains an exclude file
- objects - contains info and pack folders, which are empty at this stage
- refs - contains heads and tags folders, which are also empty
And the following files:
- config - contains some configuration for the repository, which isn’t very interesting to us
- description - contains the text Unnamed repository; edit this file 'description' to name the repository.
- HEAD - contains the text ref: refs/heads/main
Create aliases
By design, Git does not allow you to do crazy things like track the .git folder of a nested repository. That would be a terrible idea.
Let’s do it anyway. The following aliases allow us to rename inner/.git so that our outer repository will track it for us.
alias git-mode='cd inner && mv git-internals .git'
alias wtf-mode='mv .git git-internals && cd ..'
Just remember:
- Whenever we are doing git operations on the inner repo, we need to be in git-mode; the inner folder needs to have .git
- When we are making commits in the outer repo, inner/.git should be named git-internals; I’ll call this wtf-mode
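For anyone who would rather script the switch than rely on shell aliases, here is a minimal Python sketch of the same idea. The function names are my own invention, and it only covers the rename; unlike the aliases, it does not change your working directory.

```python
from pathlib import Path

# Folder names taken from the aliases above; the helper names are hypothetical.
GIT_DIR = ".git"
PARKED_DIR = "git-internals"


def enter_wtf_mode(inner: Path) -> None:
    """Rename inner/.git to inner/git-internals so the outer repo can track it."""
    (inner / GIT_DIR).rename(inner / PARKED_DIR)


def enter_git_mode(inner: Path) -> None:
    """Rename inner/git-internals back to inner/.git so Git works in inner again."""
    (inner / PARKED_DIR).rename(inner / GIT_DIR)


def current_mode(inner: Path) -> str:
    """Report the mode of the inner folder based on which folder name exists."""
    if (inner / GIT_DIR).is_dir():
        return "git-mode"
    if (inner / PARKED_DIR).is_dir():
        return "wtf-mode"
    return "unknown"
```

Calling current_mode(Path("inner")) from the outer folder is a quick way to check which mode you are in before running a Git command.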
Prepare before any inner commits
Inner
What happens when we run wtf-mode from the inner repo?
cd inner
wtf-mode
# now we're back in outer/
git status -uall
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
inner/git-internals/HEAD
inner/git-internals/config
inner/git-internals/description
inner/git-internals/hooks/applypatch-msg.sample
inner/git-internals/hooks/commit-msg.sample
inner/git-internals/hooks/fsmonitor-watchman.sample
inner/git-internals/hooks/post-update.sample
inner/git-internals/hooks/pre-applypatch.sample
inner/git-internals/hooks/pre-commit.sample
inner/git-internals/hooks/pre-merge-commit.sample
inner/git-internals/hooks/pre-push.sample
inner/git-internals/hooks/pre-rebase.sample
inner/git-internals/hooks/pre-receive.sample
inner/git-internals/hooks/prepare-commit-msg.sample
inner/git-internals/hooks/push-to-checkout.sample
inner/git-internals/hooks/sendemail-validate.sample
inner/git-internals/hooks/update.sample
inner/git-internals/info/exclude
We’ve renamed the .git folder to git-internals. We are in wtf-mode and are now ready to proceed.
Outer
The outer commit is just our starting point and sets a baseline that we can track:
# you should be in outer/
git add .
git commit -m "before any inner commits"
Stage for the inner initial commit
Inner
We need to be in git-mode.
We make a file in the inner repository:
# inner/
touch initial.txt
This isn’t too exciting. If you can be bothered to switch to wtf-mode, you’ll see that the creation of initial.txt was the only change.
Still in git-mode, though, let’s stage our file and return to wtf-mode:
# inner/
git add .
wtf-mode
# outer/
git status -uall
On branch main
Untracked files:
(use "git add <file>..." to include in what will be committed)
inner/git-internals/index
inner/git-internals/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
inner/initial.txt
This is more like it.
We now see two files that Git has created inside its internal folder:
- inner/git-internals/index - this represents the staging area
- inner/git-internals/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 - a blob file
Index
You can check the contents of the index, even though it’s a binary file:
cat inner/git-internals/index
DIRCi[�l4W��i[�l4V�S�m����⛲��CK�)�wZ���S�
initial.txt���d_��-de
�|1뀥�%
Not very useful. But using a Git command, we can see more details:
# git-mode
# inner/
git ls-files --stage
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 initial.txt
So the index records, for each staged file, its path and the hash of its corresponding blob.
100644 is the file mode: a regular, non-executable file. Git records the mode alongside the content, so making the file executable (mode 100755) would be perceived as a change to our repository.
The 0 is the stage number. Zero is the normal case; during a merge conflict, the index can hold several entries for a single filename, with stage numbers 1 to 3 (the common ancestor, our version, and their version).
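To make the binary file a little less mysterious, here is a rough Python sketch that pulls the same fields out of an index file. It assumes index version 2 and SHA-1 hashes, ignores any extensions and the trailing checksum, and the function name parse_index is my own; treat it as an illustration rather than a replacement for git ls-files.

```python
import struct

# Layout assumed here: a 12-byte header, then one record per staged file.
HEADER = struct.Struct(">4sII")    # signature "DIRC", version, entry count
ENTRY = struct.Struct(">10I20sH")  # ten 32-bit stat fields, SHA-1, flags


def parse_index(data: bytes) -> list:
    """Return a list of dicts with the mode, hash, stage and path of each entry."""
    signature, version, count = HEADER.unpack_from(data, 0)
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    if version != 2:
        raise ValueError("this sketch only handles index version 2")

    entries = []
    offset = HEADER.size
    for _ in range(count):
        fields = ENTRY.unpack_from(data, offset)
        mode, size, sha, flags = fields[6], fields[9], fields[10], fields[11]
        name_start = offset + ENTRY.size
        name_end = data.index(b"\0", name_start)  # the path is NUL-terminated
        entries.append({
            "mode": format(mode, "o"),      # e.g. "100644"
            "hash": sha.hex(),
            "stage": (flags >> 12) & 0x3,   # two bits of the flags field
            "path": data[name_start:name_end].decode("utf-8"),
            "size": size,
        })
        # Each entry is padded with 1 to 8 NUL bytes to a multiple of 8 bytes.
        entry_len = ENTRY.size + (name_end - name_start)
        offset += (entry_len + 8) // 8 * 8
    return entries
```

In wtf-mode you could try it from the outer folder with parse_index(open("inner/git-internals/index", "rb").read()) and compare the result with the git ls-files --stage output.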
Blob file
How do we know it’s a blob file?
# git-mode
# inner/
git cat-file -t e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
# > blob
git cat-file -p e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
# >
A blob (binary large object) stores a file’s contents as unstructured, arbitrary data; it does not store the filename. This one is empty, though, because initial.txt is empty.
So here we’ve seen that this blob relates to our initial.txt file.
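Where does that hash come from? As a small sketch, assuming a repository that uses SHA-1 (the default), the blob's name is the SHA-1 of a short header followed by the file contents. The helper names below are my own.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash file contents the way Git names a blob: sha1("blob <size>\\0" + content)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_hash: str) -> str:
    """Split a hash into the objects/<first two chars>/<rest> layout seen above."""
    return f"objects/{object_hash[:2]}/{object_hash[2:]}"


empty = git_blob_hash(b"")
print(empty)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty))  # objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the hash depends only on the contents, every empty file in every SHA-1 repository maps to this same blob, whatever the file is called.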
Outer
Let’s add the files so far to our outer repository.
# wtf-mode
# outer/
git add .
git commit -m "stage inner"
Make the inner initial commit
Inner
Now it’s time to commit that staged file.
# git-mode
# inner/
git commit -m "initial commit"
[main (root-commit) 499a73e] initial commit
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 initial.txt
Outer
Let’s check what’s happened as a result of that commit:
# wtf-mode
# outer/
git status -uall
On branch main
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: inner/git-internals/index
Untracked files:
(use "git add <file>..." to include in what will be committed)
inner/git-internals/COMMIT_EDITMSG
inner/git-internals/logs/HEAD
inner/git-internals/logs/refs/heads/main
inner/git-internals/objects/49/9a73e03c3871ff8510a1663ab5507ab31651ea
inner/git-internals/objects/e7/893ca4219c40722057826f2419bd4794702385
inner/git-internals/refs/heads/main
Our outer Git repository shows it quite nicely. We have an updated index and then some more files. Let’s investigate!
A quick digression: HEAD
Before we check the changes and additions, it’s worth spending a little bit of time reviewing the file at .git/HEAD (not to be confused with the one at .git/logs/HEAD).
If we open up the file at .git/HEAD:
# git-mode
# inner/
cat .git/HEAD
ref: refs/heads/main
The HEAD acts as a pointer, usually to the current branch you are working on. In this case we see that the head is pointing to the main branch.
You can check that this makes sense by running the following and looking for main (or whatever your branch name is):
# git-mode
# inner/
git branch
# (press q to quit)
Note that .git/HEAD can also contain a commit hash directly, instead of a reference to a branch. This is known as a “detached HEAD” state, and it typically happens when you check out a specific commit or a tag rather than a branch. You might see this message occasionally.
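The two possible shapes of the HEAD file are easy to tell apart in code. This is a minimal sketch, assuming 40-character SHA-1 hashes; describe_head is a made-up name, not a Git command.

```python
import re


def describe_head(head_text: str) -> tuple:
    """Classify the contents of a HEAD file as ("symbolic", ref) or ("detached", hash)."""
    text = head_text.strip()
    if text.startswith("ref: "):
        return ("symbolic", text[len("ref: "):])
    if re.fullmatch(r"[0-9a-f]{40}", text):
        return ("detached", text)
    raise ValueError(f"unrecognised HEAD contents: {text!r}")


print(describe_head("ref: refs/heads/main\n"))  # ('symbolic', 'refs/heads/main')
```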
Changes
Index
First we’ll check the state of the index:
# git-mode
# inner/
git ls-files --stage
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 initial.txt
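The listing is unchanged, yet our outer Git repository clearly noticed a change to the index file. That’s because the index holds more than this listing shows: it caches file metadata such as timestamps, and after a commit Git typically also caches tree information there.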
We can check those here:
# git-mode
# inner/
git ls-files --debug
initial.txt
ctime: 1767626860:878153203
mtime: 1767626860:878086995
dev: 16777230 ino: 13593872
uid: 501 gid: 20
size: 0 flags: 0
ctime and mtime indicate when the metadata and content have changed respectively. (It’s slightly confusing that ctime refers to metadata and mtime refers to content).
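The fields in that debug output are ordinary file metadata, so you can read the same values yourself. Here is a small sketch using Python's os.stat; the function name is hypothetical, and it assumes a Unix-like system (on Windows, ctime means something different).

```python
import os


def index_style_stat(path: str) -> dict:
    """Collect the stat fields that git ls-files --debug prints for an entry."""
    st = os.stat(path)
    return {
        "ctime": divmod(st.st_ctime_ns, 10**9),  # (seconds, nanoseconds)
        "mtime": divmod(st.st_mtime_ns, 10**9),  # (seconds, nanoseconds)
        "dev": st.st_dev,
        "ino": st.st_ino,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
    }
```

A quick experiment: call it on a file, run chmod +x on that file, and call it again. The ctime should move forward while the mtime stays put, because only the metadata changed.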
Additions
COMMIT_EDITMSG
Not very exciting:
# git-mode
# inner/
cat .git/COMMIT_EDITMSG
initial commit
Git opens this file in your editor when you run git commit without the -m flag. If you do pass the -m flag, the editor never opens, but Git still writes the message to this file in the background.
Refs
You will notice that Git has created a file at inner/.git/refs/heads/main:
# git-mode
# inner/
cat .git/refs/heads/main
499a73e03c3871ff8510a1663ab5507ab31651ea
So we can see that:
- The HEAD file in .git/ points to refs/heads/main
- The hash in refs/heads/main points to an object which is also newly-created
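That two-step chain (HEAD, then the branch file, then a hash) can be followed with a few lines of Python. This is only a sketch with a hypothetical function name: it reads loose ref files like the one we just saw and does not handle refs that Git has packed into a packed-refs file.

```python
from pathlib import Path
from typing import Optional


def resolve_head(git_dir: Path) -> Optional[str]:
    """Follow HEAD to a commit hash, or return None if the branch has no commits yet."""
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        return head  # detached HEAD: the file already holds a hash
    ref_file = git_dir / head[len("ref: "):]
    if not ref_file.exists():
        return None  # no commits yet (or the ref is packed, which this sketch ignores)
    return ref_file.read_text().strip()
```

Running resolve_head(Path("inner/git-internals")) in wtf-mode works just as well as pointing it at inner/.git in git-mode, since only the folder name differs.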
Objects
We have two files in inner/.git/objects/. Let’s inspect them:
git cat-file -t 499a73e03c3871ff8510a1663ab5507ab31651ea
commit
This is therefore the commit that refs/heads/main points to, indicating that this is the latest commit on our main branch.
Checking the other file:
git cat-file -t e7893ca4219c40722057826f2419bd4794702385
tree
If we check the commit using our Git command:
git cat-file -p 499a73e03c3871ff8510a1663ab5507ab31651ea
tree e7893ca4219c40722057826f2419bd4794702385
author Adam Parr <[email protected]> 1767628898 +0000
committer Adam Parr <[email protected]> 1767628898 +0000
initial commit
So the commit stores the author, the committer, and the commit message. Note that the author is the person who originally wrote the change, and the committer is the person who created this commit object; the two can differ, for example when someone else applies a patch or rebases the work.
A commit can also include the hashes of one or more parent commits. A typical commit (for example, if we modified initial.txt) would have one parent. A merge commit would have two or more parents. Our current commit has no parents, as it is the root commit.
The commit also points to the hash of a tree. The tree stores a snapshot of the repository’s files at that commit.
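The text of a commit object is simple enough to parse by hand. Here is a rough sketch, assuming the usual layout of header lines, a blank line, and then the message; it skips less common headers such as signatures, and parse_commit is my own name for it.

```python
def parse_commit(text: str) -> dict:
    """Split the output of git cat-file -p <commit> into its main parts."""
    header_part, _, message = text.partition("\n\n")
    commit = {
        "tree": None,
        "parents": [],  # empty for a root commit, two or more for a merge
        "author": None,
        "committer": None,
        "message": message.rstrip("\n"),
    }
    for line in header_part.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)
        elif key in ("tree", "author", "committer"):
            commit[key] = value
    return commit
```

For our root commit the parents list comes back empty; after a second commit it would hold one hash.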
Checking the tree:
git cat-file -p e7893ca4219c40722057826f2419bd4794702385
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 initial.txt
If you search for this blob hash (e69de29…) in this article, you will find it much earlier, when we added initial.txt to the index.
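What cat-file -p prints for a tree is a friendly rendering; on disk the entries are binary. The sketch below builds and parses that binary form under a few assumptions: SHA-1 hashes, files only (Git orders directories slightly differently), and helper names that I made up.

```python
import hashlib


def build_tree_body(entries: list) -> bytes:
    """Encode (mode, name, hex_hash) tuples as "<mode> <name>\\0" plus 20 raw hash bytes."""
    body = b""
    for mode, name, hex_hash in sorted(entries, key=lambda e: e[1].encode()):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_hash)
    return body


def parse_tree_body(body: bytes) -> list:
    """Decode a tree body back into (mode, name, hex_hash) tuples."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        hex_hash = body[nul + 1:nul + 21].hex()
        entries.append((mode, name, hex_hash))
        pos = nul + 21
    return entries


def git_object_hash(obj_type: str, body: bytes) -> str:
    """Name an object the way Git does: sha1("<type> <size>\\0" + body)."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


body = build_tree_body(
    [("100644", "initial.txt", "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391")]
)
print(git_object_hash("tree", body))
```

Comparing the printed value with the tree hash shown above is a quick way to check whether these assumptions about the format hold.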
Logs (HEAD)
Finally, we’ll check out those two log files.
The first is another HEAD file:
# git-mode
# inner/
cat .git/logs/HEAD
0000000000000000000000000000000000000000 499a73e03c3871ff8510a1663ab5507ab31651ea Adam Parr <[email protected]> 1767628898 +0000 commit (initial): initial commit
Looks familiar? We can pick out the commit hash, the author, the timestamp, and the action description. It could be a commit, a rebase, a checkout, or something else.
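The first hash is the value of HEAD before the action, and the second is its value afterwards. The series of zeroes shows that there was no previous commit, which is what we expect for a root commit. The next entry would show 499a73e03c3871ff8510a1663ab5507ab31651ea in its place.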
Why is this useful? This file tracks every movement of the HEAD. If you ever get yourself into a “detached HEAD” state, or think you’ve lost something, you should be able to rediscover it using git reflog:
git reflog
499a73e (HEAD -> main) HEAD@{0}: commit (initial): initial commit
This file will grow and grow (it is a log, after all). At the moment, there is only one entry, as we’ve only made one commit.
You can also run git reflog for a particular branch:
git reflog feature
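By default, reflog entries older than 90 days are expired (30 days for entries that are no longer reachable from the current tip), but you can change this with the gc.reflogExpire and gc.reflogExpireUnreachable settings.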
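Each line of the log file follows the same pattern, so it can be picked apart with a regular expression. This is a sketch under the assumption of 40-character SHA-1 hashes; the identity in the usage example is a placeholder, and parse_reflog_line is not part of Git.

```python
import re

ZERO_HASH = "0" * 40

# old hash, new hash, name <email>, Unix timestamp, timezone, then the message
# (separated from the rest by a tab in the file itself; a space is accepted too).
REFLOG_LINE = re.compile(
    r"^(?P<old>[0-9a-f]{40}) (?P<new>[0-9a-f]{40}) "
    r"(?P<name>.*) <(?P<email>[^>]*)> "
    r"(?P<timestamp>\d+) (?P<tz>[+-]\d{4})"
    r"[\t ](?P<message>.*)$"
)


def parse_reflog_line(line: str) -> dict:
    """Split one line of .git/logs/HEAD into named fields."""
    match = REFLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unrecognised reflog line: {line!r}")
    entry = match.groupdict()
    entry["timestamp"] = int(entry["timestamp"])
    entry["is_first_entry"] = entry["old"] == ZERO_HASH
    return entry
```

For example, feeding it a line shaped like the one above, with a placeholder identity such as A. Author <author@example.com>, gives back the old and new hashes, the timestamp as an integer, and the message commit (initial): initial commit.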
Logs (refs)
Checking the remaining file:
cat .git/logs/refs/heads/main
0000000000000000000000000000000000000000 499a73e03c3871ff8510a1663ab5507ab31651ea Adam Parr <[email protected]> 1767628898 +0000 commit (initial): initial commit
This looks the same as the other set of logs. So what’s the difference?
The difference is in what each file follows. .git/logs/HEAD records every movement of HEAD, including switching branches. This file records only the changes to the tip of the main branch: it gains an entry each time refs/heads/main is updated, for example by a commit, a reset, or a merge.
You’ll therefore have a file under .git/logs/refs/heads/ for every branch you’ve created.
Make a conclusion
You don’t really need to use Git to explore Git. But with a couple of simple aliases and some creative thinking, an outer repository makes it easy to see the effect that your commands are having.
Git is remarkably simple under the hood, although there are other points that haven’t been discussed here, such as tags, and the effect of commands like git cherry-pick. Hopefully you have enough to explore further (or maybe I will).