Bugs that bug
Since joining Basecamp a few months ago, I have done my fair share of HEY bug fixes. Not everyone enjoys fixing bugs, but I do. It’s a great way to get to know a project: each fix is a little adventure that takes you to remote corners of the codebase and lets you dig into the context of why the product was built the way it was.
What I don’t enjoy is bugs that I can’t reproduce. If I can’t reproduce it, I can’t fix it, and the product’s users suffer. The longer a bug like this sticks around, and the more people it affects, the less happy I become.
Two such bugs were bugging me lately (actually one bug with two symptoms, it turns out).
I tried and tried to reproduce them, but to no avail. Others had similar luck:
Can anyone reproduce this reliably? I’ve tried scrolling fast, weird, up, and down, and I never get dupes.
And others were able to reproduce the bugs, but not consistently. We were left guessing, without any real clues about why this was happening. Meanwhile we had at least 35 people write in to support about the problem.
What finally unlocked this bug for us was a customer writing in to say that they only saw this issue after not having loaded the Feed for multiple days. Aha, a clue! It has to do with new messages in the Feed. So I stopped looking at my Feed for a few days, and sure enough, when I came back I saw duplicate messages. Before I got a chance to dig too deeply, I accidentally refreshed the page and the duplicates went away, but this was still enough of a clue to get to work solving this thing.
The next step was to reproduce the bug locally. Since it seemed to have to do with new messages in the Feed, I figured I’d run a script to load up my test account’s Feed with lots of new messages:
from = "designated_to_feed@example.com"
to = "test@hey.com"

(1..40).each do |n|
  body = "Message #{n}"
  mail = Mail.new(to: to, x_original_to: to, from: from, subject: body, body: body)
  inbound = ActionMailbox::InboundEmail.create_and_extract_message_id!(mail.encoded)
  Receipt::Inbound.process(inbound, wait: true)
end
I visited the feed and… oh gosh, it loaded super slowly, jumped all over the place, and showed duplicate messages after I scrolled down to load in additional pages.
Now with a way to consistently reproduce the bug, fixing it was much like fixing any other bug—printing things out, checking logs, viewing requests and responses in the browser, testing out theories, and gradually isolating the source of the problem.
This one ended up being related to marking messages in the Feed as seen. Whenever you visit the Feed, we send a POST request to mark the whole box as seen. The controller action for the request looks like this:
class Boxes::ObservationsController < ApplicationController
  def create
    @box.mark_as_seen_later
    head :created
  end
end
That kicks off a background job which eventually calls Box#seen! and marks each message (actually each posting, but you can treat them as roughly equivalent in this context) as seen:
class Box < ApplicationRecord
  def seen!
    postings.unseen.each(&:seen!)
  end
end
Saving each posting then broadcasts a Turbo stream (via turbo-rails) to update any postings already on the page, and insert any that aren’t:
module Posting::Broadcasting
  extend ActiveSupport::Concern, Suppressible

  included do
    after_save_commit :broadcast_upsert_later, unless: -> { Posting::Broadcasting.suppressed? }
  end

  private

  def broadcast_upsert_later
    broadcast_render_later_to box, user.account
  end
end
But wait, that means we will insert every unread posting onto the page whenever visiting the Feed, even if we haven’t scrolled down to load in additional pages. Oops, that’s not what we want here.
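To make the problem concrete, here’s a rough plain-Ruby model of that flow, with no Rails or Turbo involved. `FakePosting`, `FakeBox`, the `BROADCASTS` log, and the page size of 15 are all made up for illustration; none of them are HEY’s actual classes or settings.

```ruby
# Hypothetical stand-ins for illustration only; not HEY's actual classes.
BROADCASTS = [] # ids of every posting that would be pushed to the page

class FakePosting
  attr_reader :id

  def initialize(id)
    @id = id
    @seen = false
  end

  def seen?
    @seen
  end

  # Mimics a save whose after-commit callback broadcasts an upsert.
  def seen!
    @seen = true
    BROADCASTS << id
  end
end

class FakeBox
  attr_reader :postings

  def initialize(count)
    @postings = (1..count).map { |n| FakePosting.new(n) }
  end

  # Same shape as Box#seen!: mark every unseen posting, one by one.
  def seen!
    postings.reject(&:seen?).each(&:seen!)
  end
end

box = FakeBox.new(40)
on_page = (1..15).to_a # assumed: ids the browser has actually loaded so far

box.seen!

not_on_page = BROADCASTS - on_page
puts "broadcasts: #{BROADCASTS.size}, for postings not yet on the page: #{not_on_page.size}"
```

In this toy model, 40 unseen postings produce 40 upserts, 25 of them for postings the browser never asked for. When the user later scrolls and the next page loads, those same postings arrive a second time.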
The solution was to suppress the Turbo update when marking a whole box as seen:
class Box < ApplicationRecord
  def seen!
    Posting::Broadcasting.suppress do
      postings.unseen.each(&:seen!)
    end
  end
end
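For anyone curious how a `suppress` block like this can work, here’s a minimal sketch in plain Ruby. It shows the general technique only and is an assumption, not HEY’s actual `Suppressible` module: `SketchSuppressible` and `Notifier` are made-up names, and storing the flag in `Thread.current` is just one reasonable way to keep suppression from leaking into other threads.

```ruby
# Hypothetical sketch of a suppress/suppressed? pair; not HEY's Suppressible.
module SketchSuppressible
  # Runs the block with suppression switched on, then restores the previous
  # state (even if the block raises), so nested suppress calls still work.
  def suppress
    previous = Thread.current[suppression_key]
    Thread.current[suppression_key] = true
    yield
  ensure
    Thread.current[suppression_key] = previous
  end

  def suppressed?
    Thread.current[suppression_key] == true
  end

  private

  # One flag per extending module. Thread.current[] is fiber-local storage.
  def suppression_key
    :"#{name}_suppressed"
  end
end

module Notifier
  extend SketchSuppressible

  SENT = []

  # Stands in for a broadcast callback guarded by `unless: -> { suppressed? }`.
  def self.notify(message)
    SENT << message unless suppressed?
  end
end

Notifier.notify("first")
Notifier.suppress { Notifier.notify("skipped") }
Notifier.notify("second")

p Notifier::SENT
```

The `ensure` is the important part: if marking a posting as seen raised halfway through, broadcasts would otherwise stay suppressed for the rest of that thread’s life.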
I don’t really have any larger lesson here—I’m only sharing my little adventure. Many thanks to the customer who wrote in to support with the clue that turned this from a bug I despised to a bug I enjoyed. It feels great to have it fixed.