If you’re a software engineer reading this, you’ve probably had to work with your fair share of crazy code bases over the years. Poorly built, architected and maintained code bases are a big issue faced by many tech teams. They cause huge inertia when building new features, are an endless source of frustration and reduce overall confidence in engineering. This post outlines a few strategies that I use to consistently produce extremely reliable and scalable services backed by readable and elegant code bases.
I didn’t come up with any of these ideas but I do consistently use them!
1 - Prototypes
When an engineering team starts a new project or microservice, all too often I’ve seen otherwise talented engineers dive straight in like headless chickens and begin building the final product. In the end, all this achieves is getting them into a bit of a pickle, quickly.
A few years ago, I realised the fundamental issue I had when building a new service was that I never seemed to fully understand the challenges surrounding a particular project until I was about halfway through building it. How often have you found yourself saying something along the lines of “If I did this again, I’d do X instead”?
It seems that as the complexity of a project’s requirements increases, a human’s ability to see and predict future pitfalls and edge cases rapidly diminishes. So if you start building the final product straight away, you’re likely to only see these issues once you’ve already committed yourself to a certain architecture. At that point it won’t matter what development strategy (TDD, intensive peer reviews, etc.) you’re using: when an unforeseen issue, edge case or incompatibility comes along, you end up either inserting a nasty hack or doing a non-trivial amount of work to overcome it.
This is where prototypes can save you a ton of time and many headaches! If you take 10% to 15% of your total project time and use it to create a prototype, you’ll do two things:
- Expose your solution to the actual problem.
- Get that engineering brain of yours into gear in a risk free environment and realise everything you’ve not thought of.
The code for this prototype doesn’t have to be perfect; hell, it can even be in a different language! BUT you’ll get hands-on experience solving the actual problem and will encounter serious issues and pitfalls earlier, giving you a chance to plan a solution into your final project ahead of time.
In virtually all scenarios, I’ve found a prototype saves a serious amount of time in the long run.
You can also deploy the prototype to a development environment to get feedback from end users, or load test it to see whether the assumptions you’ve made hold up, further helping you avoid any nasty last-minute surprises.
2 - Submodules are your friend
One of the things that has historically frustrated me the most as a software engineer is what I call “Herpes” bugs, the ones that just seem to keep on coming back! They can come back for many reasons, but mainly it’s down to the double-edged sword that is CTRL+C & CTRL+V.
Consider the following situation:
An engineer has to integrate some technology/API into a service and may even have a deadline looming. Lucky for them, this sort of thing has been done before, either in another project or on Stack Overflow. They decide to copy a code snippet, a few functions or maybe even an entire utility library over to save time. The only problem is that this code probably contains its fair share of bugs, and now they’ve been replicated too!
This is how some bugs can be spread across an entire organisation, crop up on multiple occasions and take a long time to permanently eliminate. Fortunately, this entire situation can be easily resolved with the use of git submodules, which provide numerous benefits:
Firstly, instead of copying code, you add it to your project as a submodule. Even if individual projects need different features or functionality of a technology, after just a little bit of thought and effort you will find some pretty elegant abstractions. Furthermore, the total amount of code being used in your organisation will decrease, as many projects will share large chunks of common functionality. This alone will massively limit the amount of unnecessary complexity and boilerplate in projects across your organisation whilst giving a significant boost to their long-term maintainability.
Secondly, this essentially forces you into a service pattern on all your projects, meaning you will restrict all interactions with the chosen technology (databases, message brokers, random APIs, etc.) to this one module. If any bugs are discovered, they can be easily located, fixed, pushed up and quickly pulled down wherever else the module is in use. Bear in mind that each project pins its submodule to a specific commit, so a fix only lands in a project once that project updates its pin.
Thirdly, you encourage engineers to write more unit tests, as the services should be fairly simple to mock (use dependency injection and adapter patterns where necessary). You also encourage them to keep this code very well abstracted, because so many projects rely on it.
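To make the dependency injection and adapter idea a bit more concrete, here is a minimal sketch in Python. Every name in it (MessageBroker, InMemoryBroker, OrderService, the "orders.placed" topic) is made up for illustration; in practice the interface and the real adapter would live in the shared submodule, and each project would only bring its own business logic.

```python
from typing import Protocol


class MessageBroker(Protocol):
    """Hypothetical narrow interface that the shared module would expose."""

    def publish(self, topic: str, payload: str) -> None:
        ...


class InMemoryBroker:
    """Test double: records messages instead of talking to a real broker."""

    def __init__(self) -> None:
        self.sent = []

    def publish(self, topic: str, payload: str) -> None:
        self.sent.append((topic, payload))


class OrderService:
    """Business logic that receives its broker through the constructor."""

    def __init__(self, broker: MessageBroker) -> None:
        self._broker = broker

    def place_order(self, order_id: str) -> None:
        if not order_id:
            raise ValueError("order_id must not be empty")
        # The service only knows about the interface, never a concrete client.
        self._broker.publish("orders.placed", order_id)
```

A unit test can then hand OrderService an InMemoryBroker and inspect its sent list, with no network and no real broker involved. A production adapter would wrap whichever client library you actually use behind the same publish method.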
Lastly, creating new microservices becomes easy as a huge portion of the initial boilerplate is just a simple clone away! You’ll find building new services feels a bit like playing with LEGO and you can spend your valuable engineering time producing the unique functionality requirements of individual projects. Costs will go down and quality, confidence and reliability all go through the roof!
3 - Use a development environment
One of the most common underestimations I’ve seen during my career is how long it’s going to take to get a system deployable and production ready. Little and seemingly insignificant jobs that were put off, such as thorough HTTP response handling or connecting to a database using SSL, may still need to be done.
These little pieces of “deployment” debt add up over time and the longer the build time, the more debt you’re going to have to pay off at once, probably in a manic “let’s just get this out of the door” style push, which is exactly the sort of environment where corners get cut and long term issues creep in.
This is where a development environment that closely resembles production comes in handy. If used early enough in a project’s lifetime it’s going to force your team to pay off the deployment debt earlier and in more manageable chunks. This allows them the freedom to build much more reliable code and will therefore make it far less likely that they’ll feel the pressure to cut corners.
Furthermore, it means testing and feedback opportunities can begin sooner, allowing bugs to be squashed earlier, and if a project starts going off the rails it can be corrected almost immediately. If you don’t have a development environment, I’d highly recommend you set some time aside and build one as a matter of priority.
4 - Clean up after yourself
Stop reading this post for a minute and quickly grep “TODO” on a code base that you frequently work on. How many hits did you get? 10? 100? ...1000!?
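If you’d rather have a per-file breakdown than a raw grep, something along these lines would do it. This is only a rough sketch: the skipped directories and the file extensions are assumptions you’d adjust for your own code base, and the function names are made up.

```python
import os
import re

# Matches TODO as a whole word, so "TODO:" and "TODO(alice)" count but "TODOS" does not.
TODO_PATTERN = re.compile(r"\bTODO\b")

# Assumptions: adjust the skipped directories and extensions for your project.
SKIPPED_DIRS = {".git", "node_modules", "vendor"}
SOURCE_EXTENSIONS = (".py", ".js", ".ts", ".go", ".java")


def count_todo_lines(text):
    """Return the number of lines in text that mention TODO."""
    return sum(1 for line in text.splitlines() if TODO_PATTERN.search(line))


def scan_tree(root):
    """Return a dict mapping file path to TODO line count for files under root."""
    counts = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIPPED_DIRS]
        for name in filenames:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                found = count_todo_lines(handle.read())
            if found:
                counts[path] = found
    return counts
```

Calling sum(scan_tree(".").values()) from the root of a repository gives the total, and sorting the dictionary by value shows which files have collected the most TODOs.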
Engineers have a tendency to leave these in their wake and legitimately believe they’ll get around to addressing whatever issue the TODO alludes to. However, due to the realities of life and running a business, time is hardly ever set aside to actually go and do this. Furthermore, a mentality of “if it’s not broken, don’t fix it” can understandably set in; we all remember a handful of times where well-intentioned engineers have caused production fires attempting to improve existing functionality.
BUT every TODO represents something that an engineer thought needed addressing at one point in time and, as such, needs to be treated as a bug. Just like bugs, TODOs can be fairly innocuous in nature or represent a hidden and nefarious force in your code base, just waiting for the correct permutation of events to cause a cataclysmic issue.
A TODO has one of the following states:
- It’s not actually an issue
- It needs to be addressed
- It should be addressed but you can live with the consequences.
And therefore, to address each of these respectively, you need to:
- Remove the TODO
- Address/fix the TODO, test and deploy
- Own the issue: replace the TODO with a few comments explaining why you can live with the consequences. If you can’t, treat it as something that needs to be addressed and fix it.
I highly recommend either spending an entire sprint fixing as many TODOs as you can or spending a couple of points per sprint slowly addressing them over an extended period of time.
Furthermore, if you see a TODO in a pull request, ask the author to fix it there and then, unless they have a valid reason not to. Having TODOs in a code base is like playing Russian roulette: each one represents a concern at least one of your engineers had at one point in time, and if you play the game too often, eventually you will blow your brains out!
5 - BURN IT TO THE GROUND
We’ve all had to work with one at some point in our careers: a service that seems to be the literal incarnation of the spaghetti code anti-pattern. Nobody in the team likes working with it and it’s a giant pain in the arse to maintain, but you have to keep it around because it performs some crucial business function.
Nobody really knows how it got so bad either. There might be rumours that it was built in a weekend, or everybody blames the ghost of bad engineers past. Whatever the reason, it doesn’t matter; it’s up to you and your team to deal with it. You may be tempted to treat it like your great aunt’s antique tiara collection (don’t touch it, don’t break it and hopefully everything will be fine), but a couple of brutal experiences have taught me that whilst you might not change it, the world around it will. Requirements, the environment or even your company will change, and at some point this service will need work or will just burst into flames as a result.
In the long run, services like these will cost you more time and money in maintenance and fire fighting than biting the bullet early on and rebuilding them from the ground up. Odds are, you already know exactly how you’d do it too. If you need buy-in from the commercial team, then arguing from a monetary point of view can help get your point across, especially if the service provides some critical business functionality.
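One simple way to frame the monetary argument is a break-even calculation. The sketch below is a back-of-the-envelope model, not a proper cost analysis: the function name is made up, and it assumes you can put a rough monthly figure on maintaining the old service, on running its replacement and on the one-off cost of the rebuild.

```python
def break_even_months(rebuild_cost, legacy_monthly_cost, rebuilt_monthly_cost):
    """Months until a rebuild has paid for itself through lower running costs.

    All three arguments are in the same currency unit. Returns None when the
    rebuilt service would not be cheaper to run, because the rebuild then
    never pays off on running costs alone.
    """
    monthly_saving = legacy_monthly_cost - rebuilt_monthly_cost
    if monthly_saving <= 0:
        return None
    return rebuild_cost / monthly_saving
```

With purely made-up figures of 60,000 for the rebuild, 10,000 a month to keep the old service alive and 4,000 a month to run its replacement, break_even_months(60000, 10000, 4000) returns 10.0, i.e. the rebuild pays for itself in ten months. It ignores the cost of outages entirely, which is usually where spaghetti services hurt the most.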
Keeping these things around is never worth it and when the shit hits the fan (...and it will), you’ll regret not replacing them sooner. Don’t put yourself in that situation, burn spaghetti services to the ground and replace them with what you know you are capable of building!
In this post I’ve outlined a few of the strategies I’ve used to produce several extremely stable and scalable production services. These strategies may seem simple and obvious but having the discipline to use them day in and day out will result in reliable services that you can be proud of. Like most things in life, the key to software engineering success is consistency of approach with a few eureka moments along the way!