image

How to git bisect: finding the bug-introducing commit

Have you ever faced a nasty bug that was hard to localize? Then git bisect will be your very good friend, let me show you how to use it!

In this article, I will describe the logic of git bisect and show you an example repository that is prepared to help you understand it. After that, I will show you the manual way and the automated, script-driven way for finding your bugs too! Are you ready to get more productive?

Sometimes it is easier to find out when (in which commit) a specific bug was introduced than to find out the root cause of it with debugging. Maybe you have multiple layers of abstraction or a complex computation or query, and it’s far from obvious what causes a certain behavior.

In cases like this, when you can reproduce the bug, it can be a very good shortcut to get to know which commit caused it. Because that might make it obvious for you what causes the bug, or at least you get to know who committed it and hopefully the commit message will uncover the reasons behind that commit.

Of course, there might be no point in finding who changed a button’s color: just change it to whatever it should be now. Or if you get a specific exception pinpointing a trivial coding error. Git bisect is not for these trivial cases. Also, if you can’t reproduce it, then you should work on that first anyhow!

Background

Git bisect is a built-in git feature, that allows you to jump between your commits in a binary search manner, and this way looking up a specific commit in the least amount of steps.

Let’s see how does this work: say you have commits 001, 002 - 500. You are at commit 500 (HEAD) currently, and you know that you have a bug right now that you didn’t always have.

You can try to go commit by commit back and check every commit, but that would be way too tedious. Instead of this, you can locate one good and one bad commit. As a wild guess, you check out commit 100 and conclude that it is good, and you already know that commit 500 is bad. So we know, that the bug is introduced between commit 100 and 500.

The next commit that you should check is not 499, 498, 497… but the one that is halfway between them, which is in this case: 300.

You might find, that it is still wrong. Now you can ignore commits after it, and you know that the bug is between 100 and 300.

Or you might find, that it is working now. Then you can ignore everything before it, and conclude that the bug is introduced between 300 and 500.

Now you need to repeat this a few times, and you will get the commit in question.

The main takeaway from this is, that at every jump you need to determine if it is still working or not. In most cases that is enough to conclude that we are after or before the introduction of the bug, therefore halving down the range of commits we need to tackle in every step.

This would be a good trick in and of itself, but thanks to git, git can do this exact thing for you! Let’s dive deeper…

Playground

I’ve created a playground for you so that you would be able to experiment with it.

It is right here: https://github.com/KopiasCsaba/blog-git-bisect

This repository contains an add.py, that adds together the two supplied numbers. Also, there are 500 commits there, and the 250th introduced a nasty bug, causing this script to return something weird and exit. What a bummer!

Using git bisect manually

# Check out the repository
git clone https://github.com/KopiasCsaba/blog-git-bisect.git bisecttest
cd bisecttest

# Verify, that this is wrong
python3 add.py 2 3
# 42.25

# Lets start bisecting!
git bisect start

Now, before you would go on, you need to tell git bisect if it is currently good or not. As we’ve seen previously, it is not working anymore. Register this with git bisect:

git bisect bad

We now need to find one commit, that was working. Think about which release was the last one that was working? When was that? Or if you don’t know, just try to step back a week or a month’s worth of commits. Just make an educated guess and try. Nothing goes wrong if you jump to a commit that is still bad, don’t worry just go back even further.

Let’s say, that for example “I guess it was working 400 commits ago”. Jump back that many commits:

git checkout HEAD~400
# HEAD is now at b3743d4 Update add.py: 100

Let’s see whether or not we still have the bug:

python3 add.py 2 3
# 5

Looks decent, let’s continue!

git bisect good
# Bisecting: 199 revisions left to test after this (roughly 8 steps)
# \[3f9791395981f39638721b04d315b8b00c6d600d\] Update add.py: 300

We told git bisect, that now this commit counts as “good”. And git bisect automatically jumped to the next commit, which is the 300th one. Why?

  • We have 500 commits.
  • We started on HEAD that was the 500th commit, and we marked it as bad.
  • We jumped back 400 commits, then we were at the 100th commit, and we marked that as good.
  • Now git bisect knows that the bug is between commits 100 and 500, which is 400 commit.
  • So 100+(400/2) = 300, which is the middle of our current range.

We are getting informed by git bisect that we need to repeat this for 8 times. So we’ll sort through 400 commits in 8 steps, that’s usually way better than going back sequentially commit by commit.

Let’s continue with our git bisect already in progress:

python3 add.py 2 3
# 42.25
git bisect bad
# Bisecting: 99 revisions left to test after this (roughly 7 steps)
# \[d18d9a01d56d2e532daa1b9631b4c337be786a87\] Update add.py: 200

python3 add.py 2 3
# 5
git bisect good
# Bisecting: 49 revisions left to test after this (roughly 6 steps)
# \[4fec816b74c553df5cfc7f4f8462fcb05c283947\] Implementing bug in add.py

python3 add.py 2 3
# 42.25
git bisect bad
# Bisecting: 24 revisions left to test after this (roughly 5 steps)
# \[36d125b572491c98b6aba0fc79655338c8fa0b51\] Update add.py: 225

python3 add.py 2 3
# 5
git bisect good
# Bisecting: 12 revisions left to test after this (roughly 4 steps)
# \[776fd74bbdb0e44906b35d96689bdcdd6c1b76a1\] Update add.py: 237

python3 add.py 2 3
# 5
git bisect good
# Bisecting: 6 revisions left to test after this (roughly 3 steps)
# \[78af1f0b58445b7c323e2a06c957a395ea7c2bb7\] Update add.py: 243

python3 add.py 2 3
# 5
git bisect good
# Bisecting: 3 revisions left to test after this (roughly 2 steps)
# \[ba549093a1547c132eb00634414ac82c27929918\] Update add.py: 246

python3 add.py 2 3
# 5
git bisect good
# Bisecting: 1 revision left to test after this (roughly 1 step)
# \[7d9fcceb579167113ca32deb18ba85c769862620\] Update add.py: 248

python3 add.py 2 3
# 5
git bisect good
# Bisecting: 0 revisions left to test after this (roughly 0 steps)
# \[7a46ffac9fce7a42b814af34f83046bc0ef20cae\] Update add.py: 249

python3 add.py 2 3
# 5
git bisect good
# 4fec816b74c553df5cfc7f4f8462fcb05c283947 is the first bad commit
# commit 4fec816b74c553df5cfc7f4f8462fcb05c283947
# Author: Kopiás Csaba <email@example.com>
# Date:   Sun Oct 20 16:45:11 2019 +0200
# 
#     Implementing bug in add.py
# 
# :100644 100644 114269dcca8d84cdea9d8f19780e08ec2f11103f a00350a750e304ae86ef31580d999734236d1379 M      add.py

And thats it, we have found who and when introduced the bug!

But as you’ve seen, this is still a bit more tedious than it could be.

Using git bisect with a script

If you can make a script that can determine the existence of your bug, then you can automate all this we have covered above!

The rules for the script:

  • Exit with code 125: if the current commit can not be tested. E.g.? it doesn’t even build for some non-related reason, etc.
  • Exit with code 0: if this commit should be taken as “good”
  • Exit with code 1 to 127 (except 125): if this commit is “bad”

Usually, you would want to have this script outside of your repository, but for simplicity, I have included it in the repository mentioned above. It’s called tester.sh. It is introduced and unchanged since the first commit in the repo.

Let’s prepare our scripted bug hunting process! It goes similarly to our previous warm-up commands:

git checkout master
# Start from the beginning

git bisect start

# Test...
./tester.sh 
# Nah, it's bad.

# The exit code of the previous command
echo $? 
# 1
git bisect bad 
 
git checkout HEAD~400
# ...
# HEAD is now at b3743d4 Update add.py: 100

# Test...
./tester.sh 
# It works!
echo $?
# 0

git bisect good
# Bisecting: 199 revisions left to test after this (roughly 8 steps)
# \[3f9791395981f39638721b04d315b8b00c6d600d\] Update add.py: 300

There was not much new until this point. We just got introduced to tester.sh, which returns 0 or 1 depending on the presence of the bug.

Now let’s start the fun! All the above is a result of a single command, and about 2 seconds of execution time.

git bisect run ./tester.sh
# running ./tester.sh
# Nah, it's bad.
# 
# Bisecting: 99 revisions left to test after this (roughly 7 steps)
# \[d18d9a01d56d2e532daa1b9631b4c337be786a87\] Update add.py: 200
# running ./tester.sh
# It works!
# 
# Bisecting: 49 revisions left to test after this (roughly 6 steps)
# \[4fec816b74c553df5cfc7f4f8462fcb05c283947\] Implementing bug in add.py
# running ./tester.sh
# Nah, it's bad.
# 
# Bisecting: 24 revisions left to test after this (roughly 5 steps)
# \[36d125b572491c98b6aba0fc79655338c8fa0b51\] Update add.py: 225
# running ./tester.sh
# It works!
# 
# Bisecting: 12 revisions left to test after this (roughly 4 steps)
# \[776fd74bbdb0e44906b35d96689bdcdd6c1b76a1\] Update add.py: 237
# running ./tester.sh
# It works!
# 
# Bisecting: 6 revisions left to test after this (roughly 3 steps)
# \[78af1f0b58445b7c323e2a06c957a395ea7c2bb7\] Update add.py: 243
# running ./tester.sh
# It works!
# 
# Bisecting: 3 revisions left to test after this (roughly 2 steps)
# \[ba549093a1547c132eb00634414ac82c27929918\] Update add.py: 246
# running ./tester.sh
# It works!
# 
# Bisecting: 1 revision left to test after this (roughly 1 step)
# \[7d9fcceb579167113ca32deb18ba85c769862620\] Update add.py: 248
# running ./tester.sh
# It works!
# 
# Bisecting: 0 revisions left to test after this (roughly 0 steps)
# \[7a46ffac9fce7a42b814af34f83046bc0ef20cae\] Update add.py: 249
# running ./tester.sh
# It works!
# 
# 4fec816b74c553df5cfc7f4f8462fcb05c283947 is the first bad commit
# commit 4fec816b74c553df5cfc7f4f8462fcb05c283947
# Author: Kopiás Csaba <metamorfozis@gmail.com>
# Date:   Sun Oct 20 16:45:11 2019 +0200
# 
#     Implementing bug in add.py
# 
# :100644 100644 114269dcca8d84cdea9d8f19780e08ec2f11103f a00350a750e304ae86ef31580d999734236d1379 M      add.py
# bisect run success

See? We have saved a lot of time and effort. If your app builds for a long time, this can be really useful, because you would not need to pay close attention to the process waiting your build to finish at every step, to verify that commit.

Remarks

git reset

With git reset you can quit from the bisecting process.

git skip

This command skips the current commit from git bisect: you can use it when your app would not compile for example.

When you marked good as bad or bad as good

If it is painful to start over (e.g. long build times or hard to reproduce bugs), and you by accident messed up marking a commit as good or bad, there is nothing to be afraid of.

git bisect log > /tmp/bisect.log
# Remove the messed up marking from the file

# Replay it:
git bisect reset
git bisect replay /tmp/bisect.log

Documentation

For more information on this, check out the documentation!

That’s it!

Thank you, for following this through with me! If you like this kind of content, please subscribe (on the right) & share it with your friends!

Have a beautiful day!