Git Version Control Cookbook: 90 hands-on recipes that will increase your productivity when using Git as a version control system

Git Version Control Cookbook

Chapter 1. Navigating Git

In this chapter, we will cover the following topics:

  • Git's objects
  • The three stages
  • Viewing the DAG
  • Extracting fixed issues
  • Getting a list of the changed files
  • Viewing the history with Gitk
  • Finding commits in the history
  • Searching through the history code

Introduction

In this chapter, we will take a look at Git's data model. We will learn how Git references its objects and how the history is recorded. We will learn how to navigate the history, from finding certain text snippets in commit messages to the introduction of a certain string in the code.

The data model of Git is different from other common version control systems (VCSs) in the way Git handles its data. Traditionally, a VCS will store its data as an initial file followed by a list of patches for each new version of the file.

Git is different; instead of the regular file and patches list, Git records a snapshot of all the files tracked by Git and their paths relative to the repository root, that is, the files tracked by Git in the file system tree. Each commit in Git records the full tree state. If a file does not change between commits, Git will not store the file once more; instead, Git stores a link to the file.

This is what makes Git different from most other VCSs, and in the following chapters, we will explore some of the benefits of this powerful model.

The way Git references the files and directories it tracks is directly built into the data model. In short, the Git data model can be summarized as shown in the following diagram:

[Diagram: the Git data model]

The commit object points to the root tree. The root tree points to subtrees and files. Branches and tags point to a commit object and the HEAD object points to the branch that is currently checked out. So for every commit, the full tree state and snapshot are identified by the root tree.

Git's objects

Now that you know Git stores every commit as a full tree state or snapshot, let's look closer at the objects Git stores in the repository.

Git's object storage is a key-value store: the key is the ID of the object and the value is the object itself. The key is the SHA-1 hash of the object's content together with a short header that records additional information such as the object's type and size. There are four types of objects in Git. In addition, there are branches (which are not objects, but are important) and the special HEAD pointer, which refers to the branch or commit currently checked out. The four object types are as follows:

  • Files, or blobs as they are also called in the Git context
  • Directories, or trees in the Git context
  • Commits
  • Tags
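
As a rough illustration of how the key is derived, the following sketch reproduces the ID of a blob by hand. It assumes SHA-1 object IDs (the default) and a sha1sum utility on the PATH, which is an assumption about your system (on macOS, shasum is the usual equivalent). The variable names are only for illustration:

```shell
# Content to store; printf makes the bytes exactly "hello\n" (6 bytes).
content='hello'

# The ID as computed by Git itself (this also works outside a repository).
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

# The same ID computed by hand: a "blob <size>\0" header followed by the content.
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')
manual_id=$({ printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d ' ' -f 1)

echo "git hash-object: $git_id"
echo "manual SHA-1:    $manual_id"
```

Both lines should show the same 40-character ID, which is why identical content is only ever stored once in the database.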

We will start by looking at the most recent commit object in the example repository, keeping in mind that the special HEAD pointer points to the branch currently checked out.

Getting ready

To view the objects in the Git database, we first need a repository to be examined. For this recipe, we will clone an example repository located here:

$ git clone https://github.com/dvaske/data-model.git
$ cd data-model

Now you are ready to look at the objects in the database. We will start with the commit object, then look at the trees, the files, and finally the branches and tags.

How to do it...

Let's take a closer look at the objects Git stores in the repository.

The commit object

The special HEAD reference always points to the current snapshot/commit, so we can use it as the target when asking for the commit we want to look at:

$ git cat-file -p HEAD
tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
author Aske Olsson <[email protected]> 1386933960 +0100
committer Aske Olsson <[email protected]> 1386941455 +0100

This is the subject line of the commit message

It should be followed by a blank line then the body, which is this text. Here you can have multiple paragraphs etc. and explain your commit. It's like an email with subject and body, so get people's attention in the subject

The cat-file command with the -p option pretty prints the object given on the command line; in this case, HEAD, which points to master, which in turn points to the most-recent commit on the branch.

We can now see the commit object, consisting of the root tree (tree), the parent commit object's ID (parent), author and timestamp information (author), committer and timestamp information (committer), and the commit message.

The tree object

To see the tree object, we can run the same command on the tree, but with the tree ID (34fa038544bcd9aed660c08320214bafff94150b) as the target:

$ git cat-file -p 34fa038544bcd9aed660c08320214bafff94150b 
100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15    README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8    a_sub_directory
100644 blob b50f80ac4d0a36780f9c0636f43472962154a11a    another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727    cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd    hello_world.c

We can also specify that we want the tree object from the commit pointed to by HEAD by running git cat-file -p HEAD^{tree}, which gives the same result as the previous command. The special notation HEAD^{tree} means that, starting from the given reference (HEAD), Git recursively dereferences the object until a tree object is found. The first tree object found is the root tree of the commit pointed to by the master branch, which is pointed to by HEAD. A generic form of the notation is <rev>^{<type>}; it returns the first object of <type> found by recursively dereferencing <rev>.
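
The peeling notation can be tried out in a throwaway repository. The following is a minimal sketch; the temporary directory, the file name, and the placeholder identity are assumptions made for the example and are not part of the data-model repository:

```shell
# Throwaway repository so the example does not touch real work.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
echo 'hello' > file.txt
git add file.txt
git commit -q -m 'Initial commit'

# The tree ID recorded in the first header line of the commit object...
tree_from_commit=$(git cat-file -p HEAD | sed -n '1s/^tree //p')
# ...and the ID found by peeling HEAD until a tree object is reached.
tree_from_peel=$(git rev-parse 'HEAD^{tree}')

echo "from commit object: $tree_from_commit"
echo "from HEAD^{tree}:   $tree_from_peel"
git cat-file -t 'HEAD^{tree}'                # prints the object type: tree
```

The notation is quoted here only to keep the shell from interpreting the braces; Git sees the same HEAD^{tree} argument.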

From the tree object, we can see what it contains: file type/permissions, type (tree/blob), ID, and pathname:

Type/Permissions   Type   ID/SHA-1                                   Pathname
100644             blob   f21dc2804e888fee6014d7e5b1ceee533b222c15   README.md
040000             tree   abc267d04fb803760b75be7e665d3d69eeed32f8   a_sub_directory
100644             blob   b50f80ac4d0a36780f9c0636f43472962154a11a   another-file.txt
100644             blob   92f046f17079aa82c924a9acf28d623fcb6ca727   cat-me.txt
100644             blob   bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd   hello_world.c

The blob object

Now, we can investigate the blob (file) object. We can do it using the same command, giving the blob ID as target for the cat-me.txt file:

$ git cat-file -p 92f046f17079aa82c924a9acf28d623fcb6ca727

This is the content of the file: "cat-me.txt."

Not really that exciting, huh?

This is simply the content of the file, which we will also get by running a normal cat cat-me.txt command. So, the objects are tied together, blobs to trees, trees to other trees, and the root tree to the commit object, all by the SHA-1 identifier of the object.

The branch

A branch is not really like the other Git objects; you can't print it using the cat-file command as we can with the others (if you specify the -p option to pretty print, you'll just get the commit object it points to):

$ git cat-file master
usage: git cat-file (-t|-s|-e|-p|<type>|--textconv) <object>
   or: git cat-file (--batch|--batch-check) < <list_of_objects>

<type> can be one of: blob, tree, commit, tag.
...
$ git cat-file -p master
tree 34fa038544bcd9aed660c08320214bafff94150b
parent a90d1906337a6d75f1dc32da647931f932500d83
...

Instead, we can take a look at the branch inside the .git folder where the whole Git repository is stored. If we open the text file .git/refs/heads/master, we can actually see the commit ID the master branch points to. We can do this using cat as follows:

$ cat .git/refs/heads/master
34acc370b4d6ae53f051255680feaefaf7f7850d 

We can verify that this is the latest commit by running git log -1:

$ git log -1
commit 34acc370b4d6ae53f051255680feaefaf7f7850d
Author: Aske Olsson <[email protected]>
Date:   Fri Dec 13 12:26:00 2013 +0100

    This is the subject line of the commit message
...

We can also see that HEAD is pointing to the active branch by using cat with the .git/HEAD file:

$ cat .git/HEAD
ref: refs/heads/master

The branch object is simply a pointer to a commit, identified by its SHA-1 hash.
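
The same lookups can also be done with Git commands instead of reading files under .git directly, which keeps working when Git stores its references in a packed form rather than as one file per branch. The following is a small sketch in a throwaway repository; the directory and the placeholder identity are assumptions for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
git commit -q --allow-empty -m 'Initial commit'

# HEAD is a symbolic reference that names the checked-out branch.
branch_ref=$(git symbolic-ref HEAD)          # for example, refs/heads/master
# The branch itself resolves to a plain commit ID.
branch_commit=$(git rev-parse "$branch_ref")
head_commit=$(git rev-parse HEAD)

echo "HEAD -> $branch_ref -> $branch_commit"
git cat-file -t "$branch_commit"             # prints the object type: commit
```

The name of the initial branch depends on the Git version and configuration, which is why the sketch asks Git for it rather than assuming master.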

The tag object

The last object to be analyzed is the tag object. There are three different kinds of tags: a lightweight (just a label) tag, an annotated tag, and a signed tag. In the example repository, there are two annotated tags:

$ git tag
v0.1
v1.0

Let's take a closer look at the v1.0 tag:

$ git cat-file -p v1.0
object 34acc370b4d6ae53f051255680feaefaf7f7850d
type commit
tag v1.0
tagger Aske Olsson <[email protected]> 1386941492 +0100

We got the hello world C program merged, let's call that a release 1.0

As you can see, the tag consists of an object, which in this case is the latest commit on the master branch, the object's type (commits, blobs, and trees can all be tagged), the tag name, the tagger and timestamp, and finally a tag message.
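
To see the difference between a lightweight and an annotated tag, we can create one of each in a throwaway repository and ask Git for the type of object each name refers to. This is only a sketch; the tag names and the placeholder identity are made up for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
git commit -q --allow-empty -m 'Initial commit'

git tag light-tag                                  # lightweight: just a reference
git tag -a annotated-tag -m 'An annotated tag'     # annotated: creates a tag object

light_type=$(git cat-file -t light-tag)            # the name refers directly to the commit
annotated_type=$(git cat-file -t annotated-tag)    # the name refers to a tag object
peeled=$(git rev-parse 'annotated-tag^{commit}')   # peel the tag to the commit it tags

echo "light-tag:     $light_type"
echo "annotated-tag: $annotated_type"
echo "tagged commit: $peeled"
```

Only the annotated tag has a tag object of its own; the lightweight tag is a plain pointer to the commit, much like a branch.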

How it works...

The Git command git cat-file -p pretty prints the object given as input. It is not normally part of everyday Git use, but it is quite useful for investigating how Git ties the objects together. We can also verify the output of git cat-file by rehashing it with the Git command git hash-object; for example, if we want to verify the commit object at HEAD (34acc370b4d6ae53f051255680feaefaf7f7850d), we can run the following command:

$ git cat-file -p HEAD | git hash-object -t commit --stdin
34acc370b4d6ae53f051255680feaefaf7f7850d

The output should be the same commit hash that HEAD points to; you can verify this with git log -1.

There's more...

There are many ways to see the objects in the Git database. The git ls-tree command can easily show the contents of trees and subtrees and git show can show the Git objects, but in a different way.

See also

  • For further information about Git plumbing, see Chapter 11, Git Plumbing and Attributes, almost at the end of this book.

The three stages

We have seen the different objects in Git but how do we create them? In this example, we'll see how to create a blob, tree, and commit object in the repository. We'll learn about the three stages of creating a commit.

Getting ready

We'll use the same data-model repository as seen in the last recipe:

$ git clone https://github.com/dvaske/data-model.git
$ cd data-model

How to do it…

First, we'll make a small change to a file and check git status:

$ echo "Another line" >> another-file.txt
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

  modified:   another-file.txt

no changes added to commit (use "git add" and/or "git commit -a")

This, of course, just tells us that we have modified another-file.txt and we need to use git add to stage it. Let's add the another-file.txt file and run git status again:

$ git add another-file.txt
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

  modified:   another-file.txt

The file is now ready to be committed, just as you have probably seen before. But what happened during the add command? The add command, generally speaking, copies the file's content from the working directory to the staging area, but more than this happens behind the scenes. When a file is staged, the SHA-1 hash of its content is computed and a blob object is written to Git's database. This happens for every file added and every time a file is added; if the content of a file has not changed, the blob is already stored in the database and nothing new is written. At first, it might seem that the database would grow quickly, but this is not the case. Garbage collection kicks in at times, compressing and cleaning up the database and keeping only the objects that are required.
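
We can watch the blob being written by checking for it before and after git add. The following sketch uses a throwaway repository; git cat-file -e only tests whether an object exists, and the variable names are chosen for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
echo 'This is just another file' > another-file.txt

# Compute the ID the blob would get, without writing anything to the database.
blob_id=$(git hash-object another-file.txt)

# Before staging, the object does not exist yet.
if git cat-file -e "$blob_id" 2>/dev/null; then before=present; else before=missing; fi

git add another-file.txt

# After staging, the blob has been written, even though nothing is committed.
if git cat-file -e "$blob_id" 2>/dev/null; then after=present; else after=missing; fi

echo "before add: $before"
echo "after add:  $after"
```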

We can edit the file again and run git status:

$ echo 'Whoops almost forgot this' >> another-file.txt
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

  modified:   another-file.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

  modified:   another-file.txt

Now, the file shows up both in the Changes to be committed and Changes not staged for commit sections. This looks a bit weird at first, but there is of course an explanation. When we added the file the first time, its content was hashed and stored in Git's database. The second change to the file has not yet been hashed and written to the database; it exists only in the working directory. Therefore, the file shows up in both the Changes to be committed and Changes not staged for commit sections; the first change is ready to be committed, the second is not. Let's also add the second change:
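
The two sections of git status correspond to two different comparisons, which we can run separately. The following is a sketch in a throwaway repository that repeats the edits from this recipe; the placeholder identity is an assumption for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
echo 'This is just another file' > another-file.txt
git add another-file.txt
git commit -q -m 'Add another-file.txt'

echo 'Another line' >> another-file.txt
git add another-file.txt                               # the first change is staged
echo 'Whoops almost forgot this' >> another-file.txt   # the second change is not

staged=$(git diff --cached --name-only)      # staging area compared with HEAD
unstaged=$(git diff --name-only)             # working directory compared with staging area

echo "Changes to be committed:        $staged"
echo "Changes not staged for commit:  $unstaged"
```

The same file is listed by both comparisons, which is exactly what git status reports.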

$ git add another-file.txt
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

  modified:   another-file.txt

Now, all the changes we have made to the file are ready to be committed and we can record a commit:

$ git commit -m 'Another change to another file'
[master 55e29e4] Another change to another file
 1 file changed, 2 insertions(+)

How it works…

As we learned previously, the add command creates the blob object; the tree and commit objects, however, are created when we run the commit command. We can view these objects with the cat-file command, as we saw in the previous recipe:

$ git cat-file -p HEAD
tree 162201200b5223d48ea8267940c8090b23cbfb60
parent 34acc370b4d6ae53f051255680feaefaf7f7850d
author Aske Olsson <[email protected]> 1401744547 +0200
committer Aske Olsson <[email protected]> 1401744547 +0200

Another change to another file

The root-tree object from the commit is:

$ git cat-file -p HEAD^{tree}
100644 blob f21dc2804e888fee6014d7e5b1ceee533b222c15  README.md
040000 tree abc267d04fb803760b75be7e665d3d69eeed32f8  a_sub_directory
100644 blob 35d31106c5d6fdb38c6b1a6fb43a90b183011a4b  another-file.txt
100644 blob 92f046f17079aa82c924a9acf28d623fcb6ca727  cat-me.txt
100644 blob bb2fe940924c65b4a1cefcbdbe88c74d39eb23cd  hello_world.c

From the previous recipe, we know that the SHA-1 of the root tree was 34fa038544bcd9aed660c08320214bafff94150b and that of the another-file.txt file was b50f80ac4d0a36780f9c0636f43472962154a11a; as expected, both changed in our latest commit when we updated the another-file.txt file. We added the same file, another-file.txt, twice before we created the commit that recorded the changes in the history of the repository. We also learned that the add command creates a blob object when called. So in the Git database, there must be an object holding the content of another-file.txt as it was the first time we added the file to the staging area. We can use the git fsck command to check for dangling objects, that is, objects that are not referred to by other objects or references:

$ git fsck --dangling
Checking object directories: 100% (256/256), done.
dangling blob ad46f2da274ed6c79a16577571a604d3281cd6d9

Let's check the contents of the blob using the following command:

$ git cat-file -p ad46f2da274ed6c79a16577571a604d3281cd6d9
This is just another file
Another line

The blob, as expected, holds the content of another-file.txt as it was when we added it to the staging area the first time.
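
The whole sequence can be reproduced from scratch to see where the dangling blob comes from. The following is a sketch in a throwaway repository; the placeholder identity and the variable names are assumptions for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
echo 'This is just another file' > another-file.txt
git add another-file.txt
git commit -q -m 'Add another-file.txt'

echo 'Another line' >> another-file.txt
first_blob=$(git hash-object another-file.txt)   # ID of the blob the first add will write
git add another-file.txt
echo 'Whoops almost forgot this' >> another-file.txt
git add another-file.txt                         # replaces the staged version
git commit -q -m 'Another change to another file'

# The blob from the first add is no longer referenced by the index or any commit.
dangling=$(git fsck --dangling --no-progress 2>/dev/null)
echo "$dangling"
git cat-file -p "$first_blob"                    # content as it was at the first add
```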

The following diagram describes the three stages and the commands used to move between the stages:

[Diagram: the three stages and the commands used to move between them]

See also

  • For more examples and information on the cat-file, fsck, and other plumbing commands, see Chapter 11, Git Plumbing and Attributes.

Viewing the DAG

The history in Git is formed from the commit objects; as development advances, branches are created and merged, and the history will create a directed acyclic graph, the DAG, due to the way Git ties a commit to its parent commit. The DAG makes it easy to see the development of a project based on the commits. Please note that the arrows in the following diagram are dependency arrows, meaning that each commit points to its parent commit(s), hence the arrows point in the opposite direction of time:

A graph of the example repository with abbreviated commit IDs

Viewing the history (the DAG) is built into Git by its git log command. There are also a number of visual Git tools that can graphically display the history. This section will show some features of git log.

Getting ready

We will use the example repository from the last section and ensure that the master branch is pointing to 34acc37:

$ git checkout master && git reset --hard 34acc37

In the previous command, we only use the first seven characters (34acc37) of the commit ID; this is fine as long as the abbreviated ID used is unique in the repository.
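
If we are unsure how short an ID can be, Git can both abbreviate an ID and expand an abbreviation again. The following is a small sketch in a throwaway repository; the placeholder identity is an assumption for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
git commit -q --allow-empty -m 'Initial commit'

full_id=$(git rev-parse HEAD)
# --short prints a unique prefix, no shorter than the configured minimum length.
short_id=$(git rev-parse --short HEAD)
# Expanding the abbreviation gives back the full ID, as long as it is unique.
resolved=$(git rev-parse --verify "$short_id")

echo "full:     $full_id"
echo "short:    $short_id"
echo "resolved: $resolved"
```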

How to do it...

The simplest way to see the history is to use the git log command; this will display the history in reverse chronological order. The output is paged through less and can be further limited, for example, by providing only the number of commits in history to be displayed:

$ git log -3

This will display the following result:

commit 34acc370b4d6ae53f051255680feaefaf7f7850d
Author: Aske Olsson <[email protected]>
Date:   Fri Dec 13 12:26:00 2013 +0100

    This is the subject line of the commit message.


    It should be followed by a blank line then the body, which is this text. Here 
    you can have multiple paragraphs etc. and explain your commit. It's like an 
    email with subject and body, so get people's attention in the subject

commit a90d1906337a6d75f1dc32da647931f932500d83
Author: Aske Olsson <[email protected]>
Date:   Fri Dec 13 12:17:42 2013 +0100

    Instructions for compiling hello_world.c

commit 485884efd6ac68cc7b58c643036acd3cd208d5c8
Merge: 44f1e05 0806a8b
Author: Aske Olsson <[email protected]>
Date:   Fri Dec 13 12:14:49 2013 +0100

    Merge branch 'feature/1'

    Adds a hello world C program.

Tip

Turn on colors in the Git output by running git config --global color.ui auto.

By default, git log prints the commit, author's name and e-mail ID, timestamp, and the commit message. However, the information isn't very graphical, especially if you want to see branches and merges.

To display this information and limit some of the other data, you can use the following options with git log:

$ git log --decorate --graph --oneline --all

The previous command will show one commit per line (--oneline) identified by its abbreviated commit ID and the commit message subject. A graph will be drawn between the commits depicting their dependency (--graph). The --decorate option shows the branch names after the abbreviated commit ID, and the --all option shows all the branches, instead of just the current one(s).

$ git log --decorate --graph --oneline --all
* 34acc37 (HEAD, tag: v1.0, origin/master, origin/HEAD, master) This is the sub...
* a90d190 Instructions for compiling hello_world.c
*   485884e Merge branch 'feature/1'
...

This output, however, gives neither the timestamp nor author information, due to the way the --oneline option formats the output.

Fortunately, the log command lets us create our own output format. So, we can make a history view similar to the previous one, including the author and timestamp information, and with some colors to display it nicely. The colors are made with the %C<color-name>text-to-be-colored%Creset syntax:

$ git log --all --graph --pretty=format:\
'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%ci) %C(bold blue)<%an>%Creset'

This is a bit cumbersome to write, but luckily it can be made as an alias so you only have to write it once:

$ git config --global alias.graph "log --all --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%ci) %C(bold blue)<%an>%Creset'"

Tip

Now, all you need to do is call git graph to show the history as you saw previously.

How it works...

Git traverses the DAG by following the parent IDs (hashes) from the given commit(s). The options passed to git log can format the output in different ways; this can serve several purposes, for example, to give a nice graphical view of the history, branches, and tags, as seen previously, or to extract specific information from the history of a repository to use, for example, in a script.
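
The parent links that Git follows can be printed directly, which is also a simple case of extracting information for a script. The following sketch builds a small history with one merge in a throwaway repository; the branch name, commit messages, and placeholder identity are made up for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
git commit -q --allow-empty -m 'Initial commit'
git checkout -q -b feature/1
git commit -q --allow-empty -m 'Work on the feature'
git checkout -q -                            # back to the branch we started on
git commit -q --allow-empty -m 'Work on the main line'
git merge -q --no-ff -m "Merge branch 'feature/1'" feature/1

# %h = abbreviated commit ID, %p = abbreviated parent IDs, %s = subject
git log --format='%h parents: %p  %s'

# The merge commit followed by its parents, as full IDs on one line.
merge_parents=$(git rev-list --parents -n 1 HEAD)
echo "$merge_parents"
```

The merge commit lists two parents, the ordinary commits list one, and the initial commit lists none; following these IDs is all that is needed to walk the DAG.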

See also

  • For more information about configuration and aliases, see Chapter 2, Configuration.

Extracting fixed issues

A common use case when creating a release is to create a release note, containing among other things, the bugs fixed in the release. A good practice is to write in the commit message if a bug is fixed by the commit. A better practice is to have a standard way of doing it, for example, a line with the string "Fixes-bug: " followed by the bug identifier in the last part of the commit message. This makes it easy to compile a list of bugs fixed for a release note. The JGit project is a good example of this; their bug identifier in the commit messages is a simple "Bug: " string followed by the bug ID.

This recipe will show you how to limit the output of git log to list just the commits since the last release (tag) that contain a bug fix.

Getting ready

Clone the JGit repository using the following command lines:

$ git clone https://git.eclipse.org/r/jgit/jgit
$ cd jgit

If you want the exact same output as in this example, reset your master branch to the following commit, b14a93971837610156e815ae2eee3baaa5b7a44b:

$ git checkout master && git reset --hard b14a939

How to do it...

You are now ready to look through the commit log for commit messages that describe the bugs fixed. First, let's limit the log to only look through the history since the last tag (release). To find the last tag, we can use git describe:

$ git describe 
v3.1.0.201310021548-r-96-gb14a939 

The preceding output tells us three things:

  • The last tag was v3.1.0.201310021548-r
  • The number of commits since the tag was 96
  • The current commit in abbreviated form is b14a939

Now, the log can be parsed from HEAD to v3.1.0.201310021548-r. But just running git log v3.1.0.201310021548-r..HEAD will give us all 96 commits, and we just want the commits with commit messages that contain "Bug: xxxxxx" for our release note. The xxxxxx is an identifier for the bug, for example, a number. We can use the --grep option with git log for this purpose: git log --grep "Bug: ". This will give us all the commits with "Bug: " in the commit message; all we need now is to format the output into something we can use for our release note.

Let's say we want the release note format to look like the following template:

Commit-id: Commit subject
Fixes-bug: xxx

Our command line so far is as follows:

$ git log --grep "Bug: " v3.1.0.201310021548-r..HEAD

This gives us all the bug fix commits, but we can format this to a format that is easily parsed with the --pretty option. First, we will print the abbreviated commit ID %h, followed by a separator of our choice |, then the commit subject %s, (first line of the commit message), followed by a new line %n, and the body, %b:

--pretty="%h|%s%n%b"

The output of course needs to be parsed, but that's easy with regular Linux tools such as grep and sed:

First, we just want the lines that contain "|" or "Bug: ":

grep -E "\||Bug: "

Then, we replace these with sed:

sed -e 's/|/: /' -e 's/Bug:/Fixes-bug:/'

The entire command put together gives:

$ git log --grep "Bug: " v3.1.0.201310021548-r..HEAD --pretty="%h|%s%n%b" \
| grep -E "\||Bug: " | sed -e 's/|/: /' -e 's/Bug:/Fixes-bug:/'

The previous set of commands gives the following output:

f86a488: Implement rebase.autostash 
Fixes-bug: 422951 
7026658: CLI status should support --porcelain 
Fixes-bug: 419968 
e0502eb: More helpful InvalidPathException messages (include reason) 
Fixes-bug: 413915 
f4dae20: Fix IgnoreRule#isMatch returning wrong result due to missing reset 
Fixes-bug: 423039 	
7dc8a4f: Fix exception on conflicts with recursive merge 
Fixes-bug: 419641 
99608f0: Fix broken symbolic links on Cygwin. 
Fixes-bug: 419494 
...

Now, we can extract the bug information from the bug tracker and put the preceding output in the release note as well, if necessary.

How it works...

First, we limit the git log command to show only the range of commits we are interested in; then we further limit the output by filtering for the "Bug: " string in the commit message. We pretty print the result so that we can easily format it to the style we need for the release note, and finally we filter and replace with grep and sed to completely match the style of the release note.
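
The pipeline can be tried without cloning JGit. The following sketch builds a tiny throwaway repository with a tag and a few commits, some of which carry a "Bug: " line, and then runs the same commands against it. The tag name, commit messages, bug numbers, and placeholder identity are all invented for the example:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
git commit -q --allow-empty -m 'Initial commit'
git tag v1.0                                 # stands in for the last release

# Each extra -m adds a paragraph to the commit message body.
git commit -q --allow-empty -m 'Fix crash on empty input' -m 'Bug: 101'
git commit -q --allow-empty -m 'Update documentation'
git commit -q --allow-empty -m 'Handle long paths' -m 'Bug: 202'

# Same pipeline as in the recipe, limited to the commits after the tag.
notes=$(git log --grep 'Bug: ' v1.0..HEAD --pretty='%h|%s%n%b' \
    | grep -E '\||Bug: ' | sed -e 's/|/: /' -e 's/Bug:/Fixes-bug:/')
echo "$notes"
```

The output should contain two subject lines, each followed by a Fixes-bug line; the documentation commit is left out because its message has no "Bug: " line.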

There's more...

If we just wanted to extract the bug IDs from the commit messages and didn't care about the commit IDs, we could have just used grep after the git log command, still limiting the log to the last tag:

$ git log  v3.1.0.201310021548-r..HEAD | grep "Bug: "

If we just want the commit IDs and their subjects but not the actual bug IDs, we can use the --oneline feature of git log combined with the --grep option:

$ git log --grep "Bug: " --oneline  v3.1.0.201310021548-r..HEAD

Getting a list of the changed files

As seen in the previous recipe where a list of fixed issues was extracted from the history, a list of all the files that have been changed since the last release can also easily be extracted. The files can be further filtered to find those that have been added, deleted, modified, and so on.

Getting ready

The same repository and HEAD position (HEAD pointing to b14a939) as seen in the previous recipe will be used. The release is also the same, which is v3.1.0.201310021548-r.

How to do it...

The following command lists all the files changed since the last release (v3.1.0.201310021548-r):

$ git diff --name-only v3.1.0.201310021548-r..HEAD
org.eclipse.jgit.packaging/org.eclipse.jgit.target/jgit-4.3.target 
org.eclipse.jgit.packaging/org.eclipse.jgit.target/jgit-4.4.target 
org.eclipse.jgit.pgm.test/tst/org/eclipse/jgit/pgm/DescribeTest.java 
org.eclipse.jgit.pgm.test/tst/org/eclipse/jgit/pgm/FetchTest.java 
org.eclipse.jgit.pgm/src/org/eclipse/jgit/pgm/Describe.java 
...

How it works...

The git diff command takes the same revision specification as git log did in the previous recipe; for git diff, this means comparing the two endpoints, the tag and HEAD. By specifying --name-only, Git outputs only the paths of the files that differ between them.

There's more...

The output of the command can be further filtered; if we only want to show which files have been deleted in the repository since the last release, we can use the --diff-filter switch with git diff:

$ git diff --name-only --diff-filter=D  v3.1.0.201310021548-r..HEAD 
org.eclipse.jgit.junit/src/org/eclipse/jgit/junit/SampleDataRepositoryTestCase.java 
org.eclipse.jgit.packaging/org.eclipse.jgit.target/org.eclipse.jgit.target.target 
org.eclipse.jgit.test/tst/org/eclipse/jgit/internal/storage/file/GCTest.java

There are also switches for the files that have been added (A), copied (C), deleted (D), modified (M), renamed (R), and so on.
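
The filters can be compared side by side in a small throwaway repository. The following is a sketch; the file names, the tag, and the placeholder identity are invented for the example, and --no-renames is added so that a deleted file and a new file are never reported as a rename:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name 'Example'               # placeholder identity, local to this repo
git config user.email 'example@example.com'
echo 'unchanged so far' > keep.txt
echo 'alpha beta gamma' > old.txt
git add keep.txt old.txt
git commit -q -m 'Initial commit'
git tag v1.0                                 # stands in for the last release

git rm -q old.txt                            # one file deleted
echo '0123456789' > new.txt                  # one file added
echo 'one more line' >> keep.txt             # one file modified
git add keep.txt new.txt
git commit -q -m 'Add, delete, and modify files'

deleted=$(git diff --name-only --no-renames --diff-filter=D v1.0..HEAD)
added=$(git diff --name-only --no-renames --diff-filter=A v1.0..HEAD)
modified=$(git diff --name-only --no-renames --diff-filter=M v1.0..HEAD)

echo "deleted:  $deleted"
echo "added:    $added"
echo "modified: $modified"
```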

See also

  • For more information, visit the help page by running git help diff

Viewing history with Gitk

We saw earlier how we can view the history (the DAG) and visualize it with the use of git log. However, as the history grows, the terminal representation of the history can be a bit cumbersome to navigate. Fortunately, there are a lot of graphical tools around Git, one of them being Gitk, which works on multiple platforms (Linux, Mac, and Windows).

This recipe will show you how to get started with Gitk.

Getting ready

Make sure you have Gitk installed:

$ which gitk
/usr/local/bin/gitk

If nothing shows up, Gitk is not installed on your system, or at least is not available on your $PATH.

Change the directory to the data-model repository from the objects and DAG examples. Make sure the master branch is checked out and pointing to 34acc37:

$ git checkout master && git reset --hard 34acc37

How to do it...

In the repository, run gitk --all & to bring up the Gitk interface. You can also specify the commit range or branches you want similar to git log or provide --all to see everything:

$ gitk --all &

How it works...

Gitk parses the information for every commit and the objects attached to it to provide an easy graphical information screen that shows a graph of the history, author, and timestamp for each commit. In the bottom half, the commit message and the patches for each file changed and the list of files changed by the selected commit are displayed.

Though very lightweight and fast, Gitk is a very powerful tool. There are many different context menus, depending on whether you click on a commit, a branch, or a tag in the history view. You can create and delete branches, revert and cherry-pick commits, diff selected commits, and much more.

There's more...

From the interface, you can perform a find and search. Find looks through the history and search looks through the information displayed in the lower half of Gitk for the currently highlighted commit.

Finding commits in history

You already saw in an earlier recipe how we can filter the output of git log to only list commits with the string "Bug: " in the commit message. In this example, we will use the same technique to find specific commits in the entire history.

Getting ready

Again, we will use the JGit repository, trying to find commits related to the keyword "Performance". In this recipe, we will look through the entire history, so we don't need the master branch to point to a specific commit.

How to do it...

As we tried earlier, we can use the --grep option to find specific strings in commit messages. In this recipe, we look at the entire history and search every commit that has "Performance" in its commit message:

$ git log --grep "Performance" --oneline --all 
9613b04 Merge "Performance fixes in DateRevQueue" 
84afea9 Performance fixes in DateRevQueue 
7cad0ad DHT: Remove per-process ChunkCache 
d9b224a Delete DiffPerformanceTest 
e7a3e59 Reuse DiffPerformanceTest support code to validate algorithms 
fb1c7b1 Wait for JIT optimization before measuring diff performance 

How it works...

In this example, we specifically ask Git to consider all of the commits in the history, by supplying the --all switch. Git runs through the DAG and checks whether the "Performance" string is included in the commit message. For an easy overview of the results, the --oneline switch is also used to limit the output to just the subject of the commit message. Hopefully then the commit(s) we needed to find can be identified from this much shorter list of commits.

Note that the search is case sensitive; had we searched for "performance" (all in lower case), the list of commits would have been very different:

$ git log --grep "performance" --oneline --all
5ef6d69 Use the new FS.exists method in commonly occuring places
2be6927 Always allocate the PackOutputStream copyBuffer
437be8d Simplify UploadPack by parsing wants separately from haves
e6883df Enable writing bitmaps during GC by default.
374406a Merge "Fix RefUpdate performance for existing Refs"
f1dea3e Fix RefUpdate performance for existing Refs
84afea9 Performance fixes in DateRevQueue
8a9074f Implement core.checkstat = minimal
130ad4e Delete storage.dht package
d4fed9c Refactored method to find branches from which a commit is reachable
...
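If both spellings are wanted in one list, git log also accepts the -i (--regexp-ignore-case) switch, which makes the --grep pattern match regardless of case. The following is a small sketch in a throwaway repository; the two commit messages are invented for the example:

```shell
# Throwaway repository; the commit messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "Performance fixes in the queue"
git commit -q --allow-empty -m "Fix update performance for existing refs"

# Case-sensitive search: only the capitalized message matches
sensitive=$(git log --grep "Performance" --oneline | wc -l | tr -d ' ')
# -i ignores case, so both messages match
insensitive=$(git log -i --grep "Performance" --oneline | wc -l | tr -d ' ')

echo "case sensitive: $sensitive, case insensitive: $insensitive"
```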

There's more...

We could also have used the find feature in Gitk to find the same commits. Open Gitk with the --all switch, type Performance in the Find field, and hit Enter. This will highlight the matching commits in the history view, and you can navigate to the previous/next result by pressing Shift + up arrow, Shift + down arrow, or the buttons next to the Find field. You will still, however, be able to see the entire history in the view, with the matching commits highlighted.

Searching through the history code

Sometimes it is not enough to just look through the commit messages in the history; you may want to know which commits touched a specific method or variable. This is also possible using git log. You can perform a search for a string, for example, a variable or method, and git log will give you the commits that add or delete the string. In this way, you can easily get the full commit context for the piece of code.

Getting ready

Again, we will use the JGit repository with the master branch pointing to b14a939:

$ git checkout master && git reset --hard b14a939

How to do it...

We would like to find all the commits that have changes made to lines that contain the method "isOutdated". Again, we will just display the commits on one line each; we can then check them individually later:

$ git log -G"isOutdated" --oneline 
f32b861 JGit 3.0: move internal classes into an internal subpackage 
c9e4a78 Add isOutdated method to DirCache 
797ebba Add support for getting the system wide configuration 
ad5238d Move FileRepository to storage.file.FileRepository 
4c14b76 Make lib.Repository abstract and lib.FileRepository its implementation 
c9c57d3 Rename Repository 'config' as 'repoConfig' 
5c780b3 Fix unit tests using MockSystemReader with user configuration 
cc905e7 Make Repository.getConfig aware of changed config

Eight commits have patches that involve the string "isOutdated".

How it works...

Git traverses the history, the DAG, looking at each commit for the string "isOutdated" in the patch between the parent commit and the current commit. This method is quite convenient to find out when a given string was introduced or deleted and to get the full context and commit at that point in time.
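To find the point where a string was introduced, the same search can be combined with the --reverse switch, which lists the oldest match first. The following is a minimal sketch in a throwaway repository; the file name Cache.java and the commit messages are assumptions made for the example:

```shell
# Throwaway repository; file name and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "Initial commit"

# One commit adds a line containing the string ...
printf 'boolean isOutdated() { return false; }\n' > Cache.java
git add Cache.java
git commit -q -m "Add isOutdated method"

# ... and a later commit removes it again
printf '// nothing here anymore\n' > Cache.java
git commit -q -am "Remove isOutdated method"

# git log lists the newest commit first; --reverse flips the order
introduced=$(git log -G"isOutdated" --format=%s --reverse | head -n 1)
last_touched=$(git log -G"isOutdated" --format=%s | head -n 1)

echo "introduced in:   $introduced"
echo "last touched in: $last_touched"
```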

There's more...

The -G option used with git log looks for patches whose added or deleted lines match the given pattern (the argument is treated as a regular expression). However, these lines could also have been added or removed due to some other refactoring/renaming of a variable or method. There is another option that can be used with git log, -S, which looks through the patch in a similar way to the -G option, but only matches commits where the number of occurrences of the specified string changes, that is, the string is added or removed, but not both added and removed in the same commit.

Let's see the output of the -S option:

$ git log -S"isOutdated" --oneline 
f32b861 JGit 3.0: move internal classes into an internal subpackage
c9e4a78 Add isOutdated method to DirCache
797ebba Add support for getting the system wide configuration
ad5238d Move FileRepository to storage.file.FileRepository
4c14b76 Make lib.Repository abstract and lib.FileRepository its implementation
5c780b3 Fix unit tests using MockSystemReader with user configuation
cc905e7 Make Repository.getConfig aware of changed config

The search matches seven commits, whereas the search with the -G option matches eight commits. The difference is the commit with the ID c9c57d3, which is only found with the -G option in the first list. A closer look at this commit shows that the isOutdated string is only touched due to the renaming of another object, and this is why it is filtered away from the list of matching commits when using the -S option. We can see the content of the commit with the git show command, and use grep -C4 to limit the output to just the four lines before and after the search string:

$ git show c9c57d3 | grep -C4 "isOutdated"
@@ -417,14 +417,14 @@ public FileBasedConfig getConfig() {
           throw new RuntimeException(e);
         }
       }
-    if (config.isOutdated()) {
+    if (repoConfig.isOutdated()) {
         try {
-              loadConfig();
+              loadRepoConfig();
         } catch (IOException e) {
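
The same difference between -G and -S can be reproduced on a small scale. The following is a sketch in a throwaway repository that mimics the rename seen in c9c57d3; the file name Repo.java and the commit messages are made up for the example:

```shell
# Throwaway repository; file name and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "Initial commit"

# This commit introduces the string (0 -> 1 occurrences)
printf 'if (config.isOutdated()) {\n  loadConfig();\n}\n' > Repo.java
git add Repo.java
git commit -q -m "Add isOutdated check"

# This commit only renames the variable on the same line (1 -> 1 occurrences)
printf 'if (repoConfig.isOutdated()) {\n  loadRepoConfig();\n}\n' > Repo.java
git commit -q -am "Rename config to repoConfig"

# -G matches any commit whose added or removed lines contain the string
g_count=$(git log -G"isOutdated" --oneline | wc -l | tr -d ' ')
# -S matches only commits that change the number of occurrences
s_count=$(git log -S"isOutdated" --oneline | wc -l | tr -d ' ')

echo "-G matches $g_count commits, -S matches $s_count"
```

The rename commit removes one line containing isOutdated and adds another, so -G reports it, while -S skips it because the number of occurrences stays the same.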

Description

This practical guide contains a wide variety of recipes, taking you through all the topics you need to know about to fully utilize the most advanced features of the Git system. If you are a software developer or a build and release engineer who uses Git in your daily work and want to take your Git knowledge to the next level, then this book is for you. To understand and follow the recipes included in this book, basic knowledge of Git command-line code is mandatory.

What you will learn

  • Understand the Git data model and how you can navigate the database with simple commands
  • Learn how you can recover lost commits/files
  • Discover how you can force rebase on some branches and use regular Git merge on other branches
  • Extract metadata from a Git repository
  • Familiarize yourself with Git notes
  • Discover how you can work offline with Git
  • Debug with Git and use various techniques to find the faulty commit

Product Details

Publication date: Jul 24, 2014
Length: 340 pages
Edition: 1st
Language: English
ISBN-13: 9781782168454
Vendor: GitHub

Table of Contents

13 Chapters
1. Navigating Git
2. Configuration
3. Branching, Merging, and Options
4. Rebase Regularly and Interactively, and Other Use Cases
5. Storing Additional Information in Your Repository
6. Extracting Data from the Repository
7. Enhancing Your Daily Work with Git Hooks, Aliases, and Scripts
8. Recovering from Mistakes
9. Repository Maintenance
10. Patching and Offline Sharing
11. Git Plumbing and Attributes
12. Tips and Tricks
Index

Customer reviews

Rating distribution
4 out of 5 (3 Ratings)
5 star 33.3%
4 star 33.3%
3 star 33.3%
2 star 0%
1 star 0%
Kevin P Sep 22, 2014
5 out of 5
I thought I was great with Git, until I read this book. It made me realize I have a lot to learn and encouraged me to do so. If you already know Git and want to push yourself to the next level, I can definitely recommend this book. I already applied some of the techniques described in the book at my company to improve our work flow and all the developers love it.
It's not the book you should get if you're new to Git or version control systems. There are no deep explanations to the techniques and the authors seem to assume you've worked with the command line and Git before. So if that's you, and you want to get better, then this is the book for you.
Amazon Verified review
Jascha Casadio Jan 05, 2016
4 out of 5
Git is one of those technologies that has been there since like forever and, for a developer, it is one of the best things invented since sliced bread. Among the most widely used version control systems, it stands out for being distributed and for how easy it makes it to create, merge and destroy branches. The spotlight it deserves since years resulted in many blog posts introducing the technology, as well as in more advanced ones covering how to best use it depending on the size and distribution of a team. On top of this, it also resulted in many titles made available to us at any decent book store. While there are indeed tons of titles to choose from, only a very limited number of them are really outstanding and deserve the title of must have. Git Version Control Cookbook is the first book that tackles the subject with the winning problem-solution approach, and is thus a good candidate to be part of that short list.
Before getting into the details of the book, which, spoiler, deserves some praise, a quick note: as the title itself suggests, it's a cookbook, not an introductory text. As such, it does not teach the reader what is Git and how to clone a remote repository. The reader is expected to have a good knowledge of Git and know by heart how to clone, branch, merge, fast-forward and tag, among other things.
Spanning through some twelve chapters, this co-authored book is one of those that you won't finish in an afternoon. Not unless you simply walk quickly through the pages. This book, as typical of a cookbook, is best used when sitting in front of a terminal, with a coffee cup next to the keyboard and enough time to try out the examples, writing down precious notes. As a cookbook, it delivers. The authors follow the consolidated problem-solution approach and cover different subjects, ranging from the global configuration up to patching, passing through edgy topics such as rebasing.
Each recipe follows a specific pattern: the problem is introduced; the solution is presented and then explained. Finally, each recipe ends with a paragraph where the authors extend the solution adding more flavors, redirecting the reader to either online resources or the man pages.
Technically the book is well written, easy to follow (as long as the concepts are already known). Proofreaders did their job. While not all the recipes will be interesting to everyone, anyone, independently of his skills, will walk away learning something new. Among the concepts that I have particularly enjoyed is pruning. Very clear and exhaustive.
Despite the many good things about this book, and the fact that overall is a good pick, there are a couple of things that I did not like: first, I think way too much time is dedicated to the configuration, which is something very basic; similarly, more often than not, the same recipe is presented twice, one solved with the terminal and one with a GUI, which is instead something that I would have added to that extra paragraph at the end of each recipe. Some recipe, moreover, felt way too edge case to happen in real life.
Before tying it all up, a final note: this title, just like most of the other books covering Git out there, lacks something: presenting different real life scenarios where, based on the project, team and distribution, we are presented with guidelines and best strategies to model branches and deliveries. But maybe this would deserve a book on its own.
A very good book, no doubts. While not outstanding, it definitely serves well anyone working with Git.
As usual, you can find more reviews on my personal blog: http://books.lostinmalloc.com. Feel free to pass by and share your thoughts!
Amazon Verified review
Venkatesh Aug 27, 2019
3 out of 5
Having read "Git in Practice" and "Git for Teams", I was familiar with most of the content of this book. That said, I did learn some new bits in this book, e.g., git scripts. However, there are two issues with this book. First, the way the code is typeset in the book was a big obstacle to learning from the book. I hope PacktPub will improve their code typesetting schemes -- do not use the same font/color/styling for both commands and their outputs. Second, the exposition seemed hard to follow. Again, this is more an artifact of the format of Cookbooks from PacktPub. I'd rather they used a cookbook format (i.e., problem, solution, discussion) used by O'Reilly and Manning publishers.
Amazon Verified review