4 questions to ask when reviewing a PR
Our team approves 80-100 PRs a week on a 10 year old large codebase. We deploy to production 40-50 times a week. These are the questions I ask myself during each PR review. They are based on Kent Beck’s user story criteria.
1. Does the code work?
When reviewing a PR, I don’t have to ask if the new and existing tests pass and if there is satisfactory test coverage. Automated CI tools take care of that. However, I will ask questions like:
- Is the code going in the wrong direction to meet the business requirements?
- Are there missing test cases?
- What possible error states could occur if this new code was submitted?
- Have performance problems been introduced e.g. slow queries, N+1 queries?
- Are there assumptions about data e.g. is the developer aware that this field is optional in the database and a null pointer exception could occur?
- Could merging this PR corrupt some data?
If the answers to the above questions are no, then it’s probably a good sign that the code works. If it works, we can ship it.
PR Example #1: “Under construction” work
This example addresses the user getting into an erroneous state. A new “preview blog post” feature is in development and the developer is going to build the feature in several PRs. The submitted PR consists of just a link within an existing HTML view to a route and a test which written to demonstrates the existence of the link.
# posts.html.erb
<!-- New code -->
<%= link_to 'Preview post', post_path(@post) %>
# post_controller_test.rb
test 'index' do
get posts_path
assert_select "a[href='/posts/here-is-my-post']" # check the existence of the new link
end
First review
Upon inspection, the controller and view for the show post action have not been written yet. This PR, if shipped, will present the user with a link that 404s. Therefore, it is going to break the system.
I add the comment:
Either feature flag the link or submit the matching controller and view. Otherwise the user will click on a dead link
Second review
The code is resubmitted with a user based feature flag wrapping the existence of the link
# posts.html.erb
<% if @user.feature_flags.preview_post %>
<%= link_to 'Preview post', post_path(@post) %>
<% end %>
# post_controller_test.rb
test 'index' do
current_user.feature_flags.update!(preview_post: false)
get posts_path
assert_select "a[href='/posts/here-is-my-post']", false
current_user.feature_flags.update!(preview_post: true)
get posts_path
assert_select "a[href='/posts/here-is-my-post']"
end
A boolean feature flag will now hide this link from all users by default. Testers of the system can have the link on if they desire by enabling the feature flag. The system is no longer broken, and future updates to the preview blog post feature can be added safely while it is still in development.
2. Has the code complexity been minified while remaining readable?
PR Example #2: Format some model data into a comma delimited string
The code submitted will produce a string like “Finance, Sales, Technology” when given a collection of category models
def categories_description(categories)
return "" if categories.size <= 0
description = ""
categories.each do |category|
description += category.description + ", "
end
description.rstrip[0..-2]
end
First review:
Before I consider making refactoring suggestions, I first consider if this code works (See Question 1). If I’m satisfied that this is functioning code with adequate test coverage then I will tell the developer that it is good to ship even if there are refactoring opportunities.
Ship. Made some refactoring suggestions for a follow up PR
I rarely see all of the refactoring moves in one go. I normally see 1-2 moves ahead, so it may take 2-3 passes (for me) to get to the most simplified code. I write the following comment:
- Replace .each with map + join. Bin rstrip and string truncation
- Guard statement not needed
- Double -> Single quotes
Second review:
The code is resubmitted and looks like
def categories_description(categories)
description = categories.map do |category|
category.description
end.join(', ')
description
end
Now that the weeds have been removed, I can spot the simpler solution
- Remove local variable
- Use symbol to proc
Third review:
def categories_description(categories)
categories.map(&:description).join(', ')
end
The simplified code is pushed up for review. The tests pass, I can’t make it any more simple or readable, so it is approved for shipping
3. Is there any duplication?
Enough has already been rewritten on the topic of duplication. Yet it still happens every day.
PR Example #3: Blatant copying and pasting
def categories_description(categories)
categories.map(&:description).join(', ')
end
# new code
def departments_description(departments)
departments.map(&:description).join(', ')
end
First review:
The code works, the supplied tests and usages (not featured here) look good, so the code is approved for shipping despite duplication
I comment:
Extract method
and leave it at that
Second review:
A follow up PR is submitted and looks like:
def categories_description(categories)
job_components_description(categories)
end
def departments_description(departments)
job_components_description(departments)
end
private
def job_components_description(job_components)
job_components.map(&:description).join(', ')
end
The two public methods aren’t doing a whole lot. I look through the usages of categories_description
and departments_description
and make a comment:
Suggest making job_components_description
public and inlining categories_description
and departments_description
Third review
The methods have been inlined, meaning their usages have been changed to use job_components_description
directly. The only public description method is:
def job_components_description(job_components)
job_components.map(&:description).join(', ')
end
There’s nothing more to do with this code, so it is good for shipping
Example #2: Duplication of test code
Duplication of test code often happens when there are production classes that share the same interface or implementation i.e. there is an inheritance hierarchy. The tests will often have similar code.
# csv_parser_test.rb
users = CsvParser.new('testfile.csv').parse_users
user = users[0]
assert_equal 'Bill', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'bill@example.com', user.email
assert_equal '123 Main Street', user.address
# new code submitted: json_parser_test.rb
users = JsonParser.new('testfile.json').parse_users
user = users[0]
assert_equal 'Bill', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'bill@example.com', user.email
assert_equal '123 Main Street', user.address
# new code submitted: xml_parser_test.rb
users = XmlParser.new('testfile.xml').parse_users
user = users[0]
assert_equal 'Tom', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'tom@example.com', user.email
assert_equal '123 Main Street', user.address
First review
A little duplication in tests is often not a problem, and sometimes desired. In this case, if we start to parse a new field like “phone number”, then there will be 3 test files to update with the new assertion. Most likely the extra assertion will be forgotten about for at least one of the test files.
I write a comment on the PR:
Introduce test helper method. Example assert_user(expected_first_name: 'Bill', ..., user: user)
Second review
The code is resubmitted and looks like:
module ParserTestHelper
def assert_user(expected_first_name: 'Bill', expected_last_name: 'Simpson', expected_email: 'bill@example.com', expected_address: '123 Main Street', user:)
assert_equal 'Bill', user.first_name
assert_equal 'Simpson', user.last_name
assert_equal 'bill@example.com', user.email
assert_equal '123 Main Street', user.address
end
end
# csv_parser_test.rb
users = CsvParser.new('testfile.csv').parse_users
assert_user(user: users[0])
# json_parser_test.rb
users = JsonParser.new('testfile.json').parse_users
assert_user(user: users[0])
# xml_parser_test.rb
users = XmlParser.new('testfile.xml').parse_users
assert_user(expected_first_name: 'Tom', expected_email: 'tom@example.com', user: users[0])
This has future proofed the tests in the event that a new field is introduced. It is good to ship.
4. Is there any extra code?
Example #1. Extra branches
The job_components_description
has been updated to have safety operators
def job_components_description(job_components)
job_components&.map(&:description)&.join(', ')
end
First review
The safety operators are short-hand for this code:
def job_components_description(job_components)
if job_components
descriptions = job_components.map(&:description)
if descriptions
descriptions.join(', ')
end
end
end
Two new branches have been added to the code base. I notice that no new tests have been added and the build is green. My suspicion is that job_components
will always be
an ActiveRecord query and will never be nil. I add a question to the PR:
Will `job_components` ever be nil?
Second review
The owner of the PR verifies all callers of job_components_description
and concludes that the input parameter job_components
will never be nil. The code to this method gets reverted.
Summary:
Kent Beck’s criteria for a passing user story are:
- Tests pass
- Reveals Intention
- No duplication
- No extras
These questions are a variant of that.