ProbableOdyssey

How I review pull requests

Collaboration is often the bottleneck in software development. As addressed in the classic “Mythical Man-Month” by Fred Brooks, Moore’s law leads to faster hardware each decade, but in spite of that the rate of software delivery stays relatively constant. The bottleneck is never the hardware — it’s the people and communication processes.

Git is the most widely used tool for distributed version control, and Github provides a platform for collaboration on git repositories. Our workflows generally consist of gathering requirements, studying existing codebases, and making changes without toppling the proverbial Jenga tower. We make changes on a new branch on the git repository, and then we make a pull request (PR) before merging the branch into the main branch of the project so that someone else on the team can check the change is appropriate to merge.

For many teams, this is often where things come to a halt. Code review is an essential part of the process to ensure that contributions are agreed upon across the author and team members. It’s also a spot for reviewers to interrogate the code and check that the contributions won’t break anything.

Today I want to share a bit about my review process to try and articulate what goes into a good review, and infer strategies for submitting better pull requests.

The description

Reading the title of the PR is always a great first step. When the title is written like a conventional commit, I know exactly what context I need to switch to. I also find assigning tags such as Low impact, Medium impact, and High impact great for judging at a glance how much time and effort I’ll need to set aside to review the PR.

The description should have a short paragraph or two (or some dot-points) describing the change in more depth. In my opinion, a great PR description tells me:

This is also a great spot to document what didn’t work. These decisions are rarely documented, yet they often contain valuable insights!

If the description is too long, that’s a great indication that the change is too big — and will thus take much longer to review. In keeping with “Curly’s law”:

Things should do one thing, and one thing only

Excellent advice for writing good functions, classes, and modules. I can review PRs that align with this principle much faster.

Testing

When there’s evidence that the change was tested by the author, I feel more confident about the PR. But, if the author additionally provides me with instructions on how to test the PR myself, I feel even more confident! Explaining how to test a change forces the author to think critically, often surfacing issues before the reviewer even sees the code.

Unit tests also give me more confidence in the change, especially if they’re well written. This is a topic for another post, but “Test Driven Development: By Example” by Kent Beck has some great advice on this.

Reading the code

I’ve studied the PR description and I’ve ensured that it’s well tested. I’ve also checked out the branch on my machine and checked that it works. If no red flags have emerged, I move on to reading the code.

Ideally, I’d like to review the change in Github. If the change is small enough, the diffs in the PR don’t put too much cognitive load on me and thus I can review it faster. That being said, tiny changes can often result in higher failure rates. I think both authors and reviewers can get over-confident in changes that only span a few lines, and thus they don’t get scrutinised as much. This is a bias we all have, and we just need to be a bit more vigilant about that!

Most of the time, I’ll open a text file to jot down my thoughts during a review. In Vim I’ll run:

:read!git diff main --name-only
:norm! I- [ ]

Which gives a nice little checklist in my file:

- [ ] README.md
- [ ] src/floob.py
- [ ] src/paths.py
- [ ] test/test_floob.py

Then I’ll open each file one by one, jump around definitions and references. Each comment and question that comes to mind, I’ll keep track of them in my running document:

src/floob.py
Optional style comment, early returns are great
~~~python
if condition:
    LOGGER.info("Floobed is already wumboed")
    return

LOGGER.info("No wumbo found")
df = foo(floob, bar=2)
df["wumbo"] = ...
~~~

This gives me a good way to edit all my comments before posting the more important ones on the PR review on Github.

Lessons

What can we do proactively to ensure that we write small PRs so that we can get changes reviewed faster? I used to start a branch, solve the problem, and submit a PR. But this approach is often haphazard, because it’s difficult to be vigilant about the scope and keep all the commits and diffs on-topic. As tempting as it is to bundle some refactors in to my PRs, this almost always adds friction to the review process.

Nowadays I try spending more time planning before diving into the code:

Sometimes I write the PR description before I even start coding. This keeps the scope more focussed, and I can catch myself when I start to veer off-topic.

I still don’t always get it right — sometimes my PRs are too big, or my explanations too vague. But thinking intentionally about the process has helped me a lot. Code review is more than just checking for bugs, it’s about building shared understanding. By planning better, scoping changes thoughtfully, and writing clear PRs, we make collaboration smoother for everyone. It takes practice, but the payoff is real: better software, better teams.

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.

  • C.A.R. Hoare

Reply to this post by email ↪