How to think about technical debt
Deciding where it exists. Deciding what to do about it. Convincing non-devs that it's worth doing. And then actually fixing it.
Work on a codebase for long enough and inevitably someone will tell you that you have technical debt.[1] This typically means you’ve built so much code that some of it sucks (but hopefully you got some customers along the way). Figuring out how to deal with your technical debt is a challenge that all first-time founders and CTOs struggle through.
The reason it’s hard to deal with is that sometimes when a developer shouts “technical debt” they are actually shouting “I don’t like this code and I can’t be bothered mastering it”. This is not a good reason to change it! But it’s also not a good reason to ignore the complaint. Really valid technical debt complaints often start off sounding just as whiny as illegitimate ones. You need to find the signal in the noise.
There are 4 parts to thinking about technical debt:
Deciding where technical debt exists
Deciding if it should be fixed
Convincing all the non-devs it should be fixed
Doing the work
If it’s all too much, there’s a tl;dr at the bottom!
Deciding where technical debt exists
This is the hardest part. Technical debt doesn’t always stand out. Part of being a great senior (and higher) engineer is being able to identify parts of the codebase that are hard to work on and result in more bugs than they should.
I think it is important to encourage everyone to speak up about code that is hard to write, but at the same time, not set any guarantees that raising an issue will lead to action. Common problems will come up a lot, but there will also be big problems that only one person has encountered… yet.
Think of it like product feedback from customers. You should encourage every customer to give you all of their feedback,[2] but that doesn’t mean you are committing to responding to or acting on each piece of feedback. A lot will never get done because there aren’t enough hours in the day, some big time issues or brilliant ideas will get urgently actioned, and things that keep coming up will eventually get ticked off. In all cases, the deciding factor for whether something gets acted on should not be how loud the complaint is or how important the complainer is; it should be based on your judgement as to what’s best for the product.
It’s the same with tech debt. Ultimately as few people as possible should apply their judgement on what’s best for the codebase to decide on which problems are worth trying to solve. Not based on which devs complained or how important those devs are, but on how bad the issues are and how practical they are to solve.
The trap with technical debt is that everyone thinks that if they label something as technical debt, then that’s what it is. By creating a methodology around defining tech debt, you enable juniors to learn from seniors in a controlled way that doesn’t require refactoring everything. This will help them improve their sense for what is/isn’t tech debt.
Deciding if it should be fixed
Once you’ve agreed that you have technical debt, the next problem is deciding what to do about it. Fixing technical debt is not always easy, and might come in place of working on new features. This means that over the short term you might not ship the next marginal feature if you are working on refactoring or rewriting code instead.
This is made more complex because not everyone will agree on the magnitude of the problem. A naive summary of the situation:
If you ask developers how much tech debt reduction should be done, they’ll only do that and never ship new features.
If you ask go-to-market teams how much reduction should be done, they’ll only want features and you’ll never cut tech debt.
I think this is a naive summary because most people are sensible. Most developers understand that without features there are no customers and they need new jobs. And you can explain tech debt reduction in a way that GTM teams will embrace.
Still, even if the summary isn't true, it's directionally accurate. Someone needs to decide if the team will work on tech debt, or on new features, or on something else.[3] If you're reading this, maybe that someone is you. How should you decide?
Alex’s rules for refactoring
First, the technical debt needs to be in a part of the codebase that gets iterated on often (or would if it were fixed). Don’t go to the effort of refactoring code that will never get touched again! With that in mind:
If the change will take less than an hour, don’t ask, just do it.
You’ll be surprised how many changes meet this criterion.
If you can convincingly argue the change will make you 5x more productive, consider it.
You’ll be surprised how few changes meet this criterion.
If you can’t convince anyone of the 5x benefit, don’t do it.
If the pain is really that bad, eventually the 5x justification will be there.
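The rules above can be sketched as a small decision function. This is only an illustration: the names RefactorIdea and decide, the three fields, and the idea of boiling a "convincing argument" down to a single number are all assumptions made for the sketch, not part of the rules themselves.

```python
from dataclasses import dataclass


@dataclass
class RefactorIdea:
    """Hypothetical description of a proposed refactor."""
    touched_often: bool        # is this code iterated on often (or would it be, if fixed)?
    estimated_hours: float     # from "this could be neater" to committed with tests passing
    convincing_speedup: float  # productivity multiple other people actually believe (1.0 = none)


def decide(idea: RefactorIdea) -> str:
    # Precondition: never refactor code that will not be touched again.
    if not idea.touched_often:
        return "skip"
    # Rule 1: under an hour, don't ask, just do it.
    if idea.estimated_hours < 1:
        return "just do it"
    # Rule 2: a convincing 5x productivity argument makes it worth considering.
    if idea.convincing_speedup >= 5:
        return "consider it"
    # Rule 3: nobody is convinced of the 5x benefit, so leave it alone for now.
    return "skip"


print(decide(RefactorIdea(touched_often=True, estimated_hours=0.5, convincing_speedup=1.0)))
print(decide(RefactorIdea(touched_often=True, estimated_hours=200, convincing_speedup=2.0)))
```

In practice the inputs are judgement calls rather than measurements; the point of the sketch is only the order in which the questions get asked.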
If the change will take less than an hour, don’t ask, just do it.
Lots of refactors can be done in less than an hour. Most of them won’t be groundbreaking, but it adds up, and they are typically pretty easy to review. The key point is that it needs to be done - that means you need to go from thinking “hmmm, this could be a bit neater” to code committed with all relevant tests passing in under an hour. Some examples of this:
Adding type signatures to important methods.
Adding documentation to public classes and methods.
Lifting state up so it can be shared between classes.
Removing unused code.
Enabling helpful opt-in configurations.[4]
All these are helpful, but they aren’t very glamorous.
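As a rough illustration of the first two items, here is what an under-an-hour change might look like, sketched in Python for brevity. The function invoice_total and its parameters are hypothetical; the only point is that the behaviour is untouched while the signature and docstring now say what the method expects and returns.

```python
from decimal import Decimal


# Before (hypothetical): def invoice_total(line_items, tax_rate): ...
# After: same behaviour, but the types and documentation are explicit.
def invoice_total(line_items: list[tuple[Decimal, int]], tax_rate: Decimal) -> Decimal:
    """Return the total owed for an invoice, including tax.

    line_items: (unit_price, quantity) pairs.
    tax_rate: fractional rate, e.g. Decimal("0.10") for 10%.
    """
    subtotal = sum((price * qty for price, qty in line_items), Decimal("0"))
    return subtotal * (Decimal("1") + tax_rate)
```

A change like this is easy to review because the diff contains no logic changes at all.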
If you can convincingly argue the change will make you 5x more productive, consider it.
A few years ago, our mobile app was built in React Native. I thought it would be a good idea to rebuild it in Hotwire + Turbo Native. My argument was that we improved the app very rarely, and the main reason for that was that it was hard to work on - we didn’t have any React Native expertise. Hotwire would decrease the stack-switching that came with working on the app, which would make it easier for more people to work on it. I believed that if we rebuilt the app in a way that was easier to work on, we’d pick more ambitious projects for the app that would make it much more useful to customers. I didn’t have a way to prove this but my instinct was that it was true.
Fast forward a year, and we’re shipping a lot more improvements to the Hotwire app than we ever did to the React Native app. The old app was stagnating so much that we are now shipping 5-10x more app features! Even better, we’ve been able to launch entire new features and products on it that we wouldn’t otherwise have done.
This was a big project! It was really hard,[5] both technically and for the whole company. But we got through it and the payoff was big. If you can convince yourself - and a few other people - that you'll get a similar payoff, that's worth considering.
If you can’t convince anyone of the 5x benefit, don’t do it.
The reality is that most ideas for refactoring will either be glamorous, or useful. Useful ideas are fine (if you can do them in under an hour). Glamorous ideas are the trap, because it is very rare that they’ll actually add 5x improvements.
Here are some examples of glamorous ideas you should be skeptical of. Don’t take this list as gospel; some of it will be domain specific. The mark of a really great senior engineer is their ability to identify the middle of the Venn diagram above and accept the status quo everywhere else.
Rewriting a frontend from React to Vue (or vice versa).
Refactoring a user-facing feature “to improve performance” (without strong benchmarks and customer complaints this is probably pointless).
Breaking a monolith apart into microservices.
Merging microservices into a monolith.
Upgrading dependencies, particularly for internal tools.
Exceptions: security updates, critical/major frameworks. Basically, you should keep Rails up to date, but unless you need a specific feature from the newest version you don’t need to bump rubocop every week.
Only weird people like me find this one glamorous.
Convincing all the non-devs that technical debt should be fixed
If you use Shape Up, or any other methodology where you pick what gets worked on before it’s worked on, then at some point you will need to commit to your tech debt reduction work. If you’ve defined it and agreed that it is worth fixing, now’s a good time.
The problem with tech debt reduction is that it’s difficult for non-engineers to quantify. And the problem with any prioritisation technique is that it generally requires non-engineers to also agree to what work is going to get done. (I joke, that’s not actually a problem.)
This can result in this quote:
“Sorry, we can’t work on technical debt reduction because it will be hard to sell to the rest of the team. We need a good case for business value.” - well-intentioned product manager
The problem with this quote is that it’s pretty hard to put a “business value” case on most product development work. Asking to do it for behind the scenes changes is asking for the impossible. This means that what it’s really saying is “we don’t reduce tech debt around here”, but in a more convoluted way.
Flat out refusing to do technical debt reduction is not a good idea. If you remove the ability to improve code quality, then the only way you can ever move faster is to hire more people. This is much harder to do (often more people = slower progress!) and it costs a lot more than fixing tech debt. (I guess this is the business case?)
Most crucially, it robs engineers of the agency to improve their tools and craft. The implication is that engineers aren’t smart enough to prioritise their work sensibly. To understand how this is perceived by engineers, imagine if the roles were reversed.
Would you tell the sales team that they aren’t allowed to change CRM, ever? Would you tell the marketing team they are never allowed to change how their ad tracking is set up, or the customer success team that they aren’t allowed to build an internal knowledge base?
Of course not, because those are all reasonable sounding ideas that will probably make the team more productive. Just like with tech debt, the key is what impact they’ll actually have. Changing CRM when your headcount is 10x higher than last time you bought one might make sense; changing CRM every week is a bad idea. But you don’t need to tell the sales team that; we trust that they’re smart enough to work it out for themselves.
And yet, when technical debt work gets blocked because it’s a “hard sell” that’s effectively what’s happening. It’s condescending to engineers and unproductive for everyone else. But you hear it a lot, because it’s a really easy position to justify. “We’re building more features! Yay!” is always well received if nobody’s aware of what alternatives didn’t get picked.
In short: don’t ask for a business case before you remove technical debt. Just make sure the right tech debt is being removed (this is up to the engineers to work out). You pay your engineers eye-popping salaries for their intellect; let them use it.
Doing the work
By now the engineers are probably frothing with excitement at the opportunity to finally fix that code they hate.[6] They probably have a good idea of exactly how they are going to fix it. Maybe it involves complete system rewrites and extreme unification, standardisation, and de-shit-ification.
Before changing any code, decide who will do the work. The most common cause of technical debt is that over an extended period of time, lots of people have contributed to an architecture, bolting on features and requirements as they went, such that it’s no longer a well designed piece of code. So when it comes to fixing your tech debt, you should pick one single person - the most senior person you can find - and put them exclusively in charge of rewriting the code.
I emphasise this, because often senior engineers will suggest that the fixing of tech debt should be done by junior developers, or by a group of people. This is well intentioned - they think they are giving the juniors a great experience.[7] But it's actually a bad idea. If the architecture is so bad you have gone to all this effort to commit to fixing it, you should get the most capable person you can to fix it, not the least! And you certainly shouldn’t give it to a group of people. If you do, the outcome will be just as inconsistent as what you started with.
The goal in this step is to change as little as possible. You want to do the bare minimum required to make the code or system in question less bad. If you haven’t already, do the work of writing down what the end state will look like, in as much detail as you can. Then take out everything that is not absolutely necessary.[8]
Finally, check your tests. If you are rewriting code but not changing functionality, then you should expect most of your existing tests to pass even with the changes you make. But this only applies if you have tests! If you don’t, it’s better to write the tests and commit them before even starting your refactor.
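One way to do this is to write tests that simply pin down what the code does today, commit them, and only then start changing things. Below is a minimal sketch in Python; legacy_shipping_cost is a hypothetical stand-in for whatever code is about to be rewritten, and the expected values were read off its current behaviour rather than from any spec.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical messy code that is about to be refactored.
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2
    return cost


# Tests that record current behaviour. Commit these first; they should
# pass both before and after the refactor if functionality is unchanged.
def test_light_parcel_pays_base_rate():
    assert legacy_shipping_cost(2, express=False) == 5.0


def test_heavy_parcel_pays_surcharge():
    assert legacy_shipping_cost(14, express=False) == 7.0


def test_express_doubles_the_cost():
    assert legacy_shipping_cost(14, express=True) == 14.0
```

These are written as plain functions with assert statements so that a runner such as pytest can collect them; any test framework the codebase already uses would serve just as well.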
tl;dr
Encourage everyone to talk about what code is hard to work on. Use judgement to filter out the serious problems.
Use Alex’s rules for refactoring to decide if problems are worth fixing. Make improvements that can go from 0 to done in less than an hour, or that will increase productivity by 5x. Let the rest go.
Don’t demand business cases for technical debt removal. If you trust your engineers’ judgement then let them fix problems that are real.
Do the bare minimum work required to make the code less bad. Never have more than one person work on the same problem.
If you’ve made it this far, good luck on your technical debt quest. I hope you have great test coverage!
Thanks to Adam, Austin, Dan, Harry, Hugo, Jared, and Leon for reading drafts and providing lots of feedback. If there are still any typos after all that, that’s on me.
[1] If you haven’t been so unlucky yet, here’s a primer on the topic: https://martinfowler.com/articles/is-quality-worth-cost.html
[2] Systems like Canny are neat for this. Wow that gives me an idea… we should use Canny for tech debt.
[3] Theoretically this happens at the Betting Table, practically (and unfortunately) it happens much more before betting.
[4] Here’s one I turned on today: https://guides.rubyonrails.org/configuring.html#config-action-view-annotate-rendered-view-with-filenames
[5] I’ll write the full story one day.
[6] The best tech debt fixes come when someone really wants the improvements being offered. If everyone has lost enthusiasm for the change by this point, that’s a good sign you shouldn’t proceed.
[7] Usually it’s well intentioned. But sometimes it’s because seniors think writing code is beneath them. Either way, it’s a bad idea.
[8] This is another thing that’s much easier for 1 person to decide on than for a group.