4 questions to ask when reviewing a PR

Our team approves 80-100 PRs a week on a 10 year old large codebase. We deploy to production 40-50 times a week. These are the questions I ask myself during each PR review. They are based on Kent Beck’s user story criteria.

1. Does the code work?

When reviewing a PR, I don’t have to ask if the new and existing tests pass and if there is satisfactory test coverage. Automated CI tools take care of that. However, I will ask questions like:

  • Is the code going in the wrong direction to meet the business requirements?
  • Are there missing test cases?
  • What possible error states could occur if this new code was submitted?
  • Have performance problems been introduced e.g. slow queries, N+1 queries?
  • Are there assumptions about data e.g. is the developer aware that this field is optional in the database and a null pointer exception could occur?
  • Could merging this PR corrupt some data?

If the answers to the above questions are no, then it’s probably a good sign that the code works. If it works, we can ship it.

PR Example #1: “Under construction” work

This example addresses the user getting into an erroneous state. A new “preview blog post” feature is in development and the developer is going to build the feature in several PRs. The submitted PR consists of just a link within an existing HTML view to a route and a test which written to demonstrates the existence of the link.

# posts.html.erb
<!-- New code -->
<%= link_to 'Preview post', post_path(@post) %>

# post_controller_test.rb
test 'index' do
  get posts_path
  assert_select "a[href='/posts/here-is-my-post']" # check the existence of the new link 
end

First review

Upon inspection, the controller and view for the show post action have not been written yet. This PR, if shipped, will present the user with a link that 404s. Therefore, it is going to break the system.

I add the comment:

Either feature flag the link or submit the matching controller and view. Otherwise the user will click on a dead link

Second review

The code is resubmitted with a user based feature flag wrapping the existence of the link

# posts.html.erb
<% if @user.feature_flags.preview_post %>
  <%= link_to 'Preview post', post_path(@post) %>
<% end %>

# post_controller_test.rb
test 'index' do
  current_user.feature_flags.update!(preview_post: false)
  get posts_path
  assert_select "a[href='/posts/here-is-my-post']", false 

  current_user.feature_flags.update!(preview_post: true)
  get posts_path
  assert_select "a[href='/posts/here-is-my-post']" 
end

A boolean feature flag will now hide this link from all users by default. Testers of the system can have the link on if they desire by enabling the feature flag. The system is no longer broken, and future updates to the preview blog post feature can be added safely while it is still in development.

2. Has the code complexity been minified while remaining readable?

PR Example #2: Format some model data into a comma delimited string

The code submitted will produce a string like “Finance, Sales, Technology” when given a collection of category models

def categories_description(categories)
  return "" if categories.size <= 0
  description = ""
  categories.each do |category|
     description += category.description + ", "
  end  
  description.rstrip[0..-2]
end

First review:

Before I consider making refactoring suggestions, I first consider if this code works (See Question 1). If I’m satisfied that this is functioning code with adequate test coverage then I will tell the developer that it is good to ship even if there are refactoring opportunities.

Ship. Made some refactoring suggestions for a follow up PR

I rarely see all of the refactoring moves in one go. I normally see 1-2 moves ahead, so it may take 2-3 passes (for me) to get to the most simplified code. I write the following comment:

  • Replace .each with map + join. Bin rstrip and string truncation
  • Guard statement not needed
  • Double -> Single quotes

Second review:

The code is resubmitted and looks like

def categories_description(categories)
  description = categories.map do |category|
    category.description    
  end.join(', ')
  description
end

Now that the weeds have been removed, I can spot the simpler solution

  • Remove local variable
  • Use symbol to proc

Third review:

def categories_description(categories)
  categories.map(&:description).join(', ')
end

The simplified code is pushed up for review. The tests pass, I can’t make it any more simple or readable, so it is approved for shipping

3. Is there any duplication?

Enough has already been rewritten on the topic of duplication. Yet it still happens every day.

PR Example #3: Blatant copying and pasting

def categories_description(categories)
  categories.map(&:description).join(', ')
end

# new code
def departments_description(departments)
  departments.map(&:description).join(', ')
end

First review:

The code works, the supplied tests and usages (not featured here) look good, so the code is approved for shipping despite duplication

I comment:

Extract method

and leave it at that

Second review:

A follow up PR is submitted and looks like:

def categories_description(categories)
  job_components_description(categories)
end

def departments_description(departments)
  job_components_description(departments)
end

private

def job_components_description(job_components)
  job_components.map(&:description).join(', ')
end

The two public methods aren’t doing a whole lot. I look through the usages of categories_description and departments_description and make a comment:

Suggest making job_components_description public and inlining categories_description and departments_description

Third review

The methods have been inlined, meaning their usages have been changed to use job_components_description directly. The only public description method is:

def job_components_description(job_components)
  job_components.map(&:description).join(', ')
end

There’s nothing more to do with this code, so it is good for shipping

Example #2: Duplication of test code

Duplication of test code often happens when there are production classes that share the same interface or implementation i.e. there is an inheritance hierarchy. The tests will often have similar code.

# csv_parser_test.rb 
users = CsvParser.new('testfile.csv').parse_users
user = users[0]
assert_equal 'Bill', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'bill@example.com', user.email
assert_equal '123 Main Street', user.address 

# new code submitted: json_parser_test.rb
users = JsonParser.new('testfile.json').parse_users
user = users[0]
assert_equal 'Bill', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'bill@example.com', user.email 
assert_equal '123 Main Street', user.address

# new code submitted: xml_parser_test.rb
users = XmlParser.new('testfile.xml').parse_users
user = users[0]
assert_equal 'Tom', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'tom@example.com', user.email 
assert_equal '123 Main Street', user.address

First review

A little duplication in tests is often not a problem, and sometimes desired. In this case, if we start to parse a new field like “phone number”, then there will be 3 test files to update with the new assertion. Most likely the extra assertion will be forgotten about for at least one of the test files.

I write a comment on the PR:

Introduce test helper method. Example assert_user(expected_first_name: 'Bill', ..., user: user)

Second review

The code is resubmitted and looks like:

module ParserTestHelper
  def assert_user(expected_first_name: 'Bill', expected_last_name: 'Simpson', expected_email: 'bill@example.com', expected_address: '123 Main Street', user:)
    assert_equal 'Bill', user.first_name
    assert_equal 'Simpson', user.last_name
    assert_equal 'bill@example.com', user.email 
    assert_equal '123 Main Street', user.address  
  end
end

# csv_parser_test.rb 
users = CsvParser.new('testfile.csv').parse_users
assert_user(user: users[0])

# json_parser_test.rb
users = JsonParser.new('testfile.json').parse_users
assert_user(user: users[0])

# xml_parser_test.rb
users = XmlParser.new('testfile.xml').parse_users
assert_user(expected_first_name: 'Tom', expected_email: 'tom@example.com', user: users[0])

This has future proofed the tests in the event that a new field is introduced. It is good to ship.

4. Is there any extra code?

Example #1. Extra branches

The job_components_description has been updated to have safety operators

def job_components_description(job_components)
  job_components&.map(&:description)&.join(', ')
end

First review

The safety operators are short-hand for this code:

def job_components_description(job_components)
  if job_components
    descriptions = job_components.map(&:description)
    if descriptions
      descriptions.join(', ')
    end 
  end  
end

Two new branches have been added to the code base. I notice that no new tests have been added and the build is green. My suspicion is that job_components will always be an ActiveRecord query and will never be nil. I add a question to the PR:

Will `job_components` ever be nil?

Second review

The owner of the PR verifies all callers of job_components_description and concludes that the input parameter job_components will never be nil. The code to this method gets reverted.

Summary:

Kent Beck’s criteria for a passing user story are:

  • Tests pass
  • Reveals Intention
  • No duplication
  • No extras

These questions are a variant of that.

Updated: