1st October 2022
Gergely Orosz started a Twitter conversation asking about recommended “software engineering practices” for development teams.
(I really like his rejection of the term “best practices” here: I always feel it’s prescriptive and misguiding to announce something as “best”.)
I decided to flesh some of my replies out into a longer post.
The most important characteristic of internal documentation is trust: do people trust that documentation both exists and is up-to-date?
If they don’t, they won’t read it or contribute to it.
The best trick I know of for improving the trustworthiness of documentation is to put it in the same repository as the code it documents, for a few reasons:
When you work on large products, your customers will inevitably find surprising ways to stress or break your system. They might create an event with over a hundred different types of ticket for example, or an issue thread with a thousand comments.
These can expose performance issues that don’t affect the majority of your users, but can still lead to service outages or other problems.
Your engineers need a way to replicate these situations in their own development environments.
One way to handle this is to provide tooling to import production data into local environments. This has privacy and security implications—what if a developer laptop gets stolen that happens to have a copy of your largest customer’s data?
A better approach is to have a robust system in place for generating test data, that covers a variety of different scenarios.
You might have a button somewhere that creates an issue thread with a thousand fake comments, with a note referencing the bug that this helps emulate.
Any time a new edge case shows up, you can add a new recipe to that system. That way engineers can replicate problems locally without needing copies of production data.
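Here is a minimal sketch of what such a recipe system could look like in Python. Everything in it is hypothetical: the `recipe` decorator, the `Comment` and `IssueThread` classes and the recipe name are illustrative, and a real system would write to the application's database rather than build objects in memory.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical in-memory models standing in for real database tables.
@dataclass
class Comment:
    author: str
    body: str


@dataclass
class IssueThread:
    title: str
    comments: List[Comment] = field(default_factory=list)


# Registry mapping a recipe name to the function that builds that scenario.
RECIPES: Dict[str, Callable[[], IssueThread]] = {}


def recipe(name: str):
    """Register a test-data recipe under a short, memorable name."""
    def decorator(func):
        RECIPES[name] = func
        return func
    return decorator


@recipe("thread-with-1000-comments")
def huge_thread() -> IssueThread:
    """Emulates a (hypothetical) bug about slow rendering of very long threads."""
    rng = random.Random(0)  # fixed seed so every engineer gets identical data
    thread = IssueThread(title="Thread with a thousand comments")
    for i in range(1000):
        thread.comments.append(
            Comment(author=f"user{rng.randint(1, 50)}", body=f"Fake comment {i}")
        )
    return thread


thread = RECIPES["thread-with-1000-comments"]()
print(len(thread.comments))
```

The docstring on each recipe is a natural place for the note referencing the bug it helps emulate.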
The hardest part of large-scale software maintenance is inevitably the bit where you need to change your database schema.
(I’m confident that one of the biggest reasons NoSQL databases became popular over the last decade was the pain people had associated with relational databases due to schema changes. Of course, NoSQL database schema modifications are still necessary, and often they’re even more painful!)
So you need to invest in a really good, version-controlled mechanism for managing schema changes. And a way to run them in production without downtime.
If you do not have this, your engineers will become fearful of schema changes. They’ll come up with increasingly complex hacks to avoid them, which piles on technical debt.
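As a rough illustration of the core idea, here is a sketch of a tiny migration runner using Python's built-in `sqlite3` module. The `MIGRATIONS` list, the `applied_migrations` table and the `apply_migrations` function are names I made up for this example; in a real project each migration would be a file checked into the repository, and a mature tool handles much more than this.

```python
import sqlite3

# Hypothetical ordered list of migrations. In a real project each entry would
# live in its own version-controlled file alongside the application code.
MIGRATIONS = [
    ("0001_create_tickets", "CREATE TABLE tickets (id INTEGER PRIMARY KEY, name TEXT)"),
    ("0002_add_price", "ALTER TABLE tickets ADD COLUMN price INTEGER"),
]


def apply_migrations(conn, migrations):
    """Apply migrations not yet recorded in the database; return their names."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    newly_applied = []
    for name, sql in migrations:
        if name in applied:
            continue  # the database already knows about this one
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        conn.commit()
        newly_applied.append(name)
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = apply_migrations(conn, MIGRATIONS)
second_run = apply_migrations(conn, MIGRATIONS)
print(first_run, second_run)
```

Because the database itself records what has been applied, running the command a second time is a no-op.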
This is a deep topic. I mostly use Django for large database-backed applications, and Django has the best migration system I’ve ever personally experienced. If I’m working without Django I try to replicate its approach as closely as possible:
Even harder is the challenge of making schema changes without any downtime. I’m always interested in reading about new approaches for this; GitHub’s gh-ost is a neat solution for MySQL.
An interesting consideration here is that it’s rarely possible to have application code and database schema changes go out at the exact same instant in time. As a result, to avoid downtime you need to design every schema change with this in mind. The process needs to be:
This process is a pain. It’s difficult to get right. The only way to get good at it is to practice it a lot over time.
My rule is this: schema changes should be boring and common, as opposed to being exciting and rare.
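To make the ordering constraint concrete, here is a sketch of one common way to sequence a change so that the currently deployed code keeps working while the schema moves underneath it. The table, the columns and the `old_code_insert` / `new_code_insert` functions are all invented for the example, and the step numbering is my assumption about a reasonable order rather than a definitive procedure. It uses SQLite in memory so it can be run anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


def old_code_insert(conn, name):
    # The currently deployed application only knows about the "name" column.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_code_insert(conn, name, display_name):
    # The next release of the application also writes "display_name".
    conn.execute(
        "INSERT INTO users (name, display_name) VALUES (?, ?)",
        (name, display_name),
    )


old_code_insert(conn, "alice")

# Step 1: ship a schema change the old code can tolerate (a new nullable column).
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# The old code is still running at this point and must keep working.
old_code_insert(conn, "bob")

# Step 2: ship the new application code that uses the new column.
new_code_insert(conn, "carol", "Carol C.")

# Step 3: backfill rows that were written before the new code went out.
conn.execute("UPDATE users SET display_name = name WHERE display_name IS NULL")

# Step 4 (not run here): once nothing depends on the old shape,
# ship a separate cleanup migration.
rows = conn.execute("SELECT name, display_name FROM users ORDER BY id").fetchall()
print(rows)
```

The important property is that every intermediate state is one that both the old and the new application code can live with.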
If you’re working with microservices, your team will inevitably need to build new ones.
If you’re working in a monorepo, you’ll still have elements of your codebase with similar structures—components and feature implementations of some sort.
Be sure to have really good templates in place for creating these “the right way”—with the right directory structure, a README and a test suite with a single, dumb passing test.
I like to use the Python cookiecutter tool for this. I’ve also used GitHub template repositories, and I even have a neat trick for combining the two.
These templates need to be maintained and kept up-to-date. The best way to do that is to make sure they are being used—every time a new project is created is a chance to revise the template and make sure it still reflects the recommended way to do things.
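To show how little a useful template needs to contain, here is a standard-library sketch that scaffolds a new component. This is not how cookiecutter works internally; the `TEMPLATE` dictionary and `create_project` function are hypothetical stand-ins for a real template directory.

```python
import tempfile
from pathlib import Path

# Hypothetical template: relative path -> file contents, with {name} placeholders.
TEMPLATE = {
    "README.md": "# {name}\n\nDescribe what this component does.\n",
    "{name}/__init__.py": "",
    "tests/test_{name}.py": "def test_placeholder():\n    assert True\n",
}


def create_project(root: Path, name: str) -> list:
    """Write the template files under root/name and return their relative paths."""
    project_dir = root / name
    created = []
    for relative_path, contents in TEMPLATE.items():
        target = project_dir / relative_path.format(name=name)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(contents.format(name=name))
        created.append(target.relative_to(project_dir).as_posix())
    return sorted(created)


# Scaffold into a temporary directory so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp:
    files = create_project(Path(tmp), "billing")
    print(files)
```

Even this much gives every new component the same layout, a README and a passing test on day one.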
This one’s easy. Pick a code formatting tool for your language, like Black for Python or Prettier for JavaScript (I’m so jealous of how Go has gofmt built in), and run its “check” mode in your CI flow.
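For Black, check mode is the `--check` flag: it exits with a non-zero status if any file would be reformatted, without modifying anything. Here is a small sketch of a wrapper a CI job could call; the function names are my own, and it assumes Black is installed in the same Python environment.

```python
import subprocess
import sys


def format_check_command(paths):
    """Build the command that runs Black in check mode on the given paths."""
    return [sys.executable, "-m", "black", "--check", *paths]


def run_format_check(paths):
    """Run the check and return Black's exit code (0 means nothing would change)."""
    return subprocess.run(format_check_command(paths)).returncode


# Show the command a CI step would execute for the whole repository.
print(format_check_command(["."]))
```

A CI step would then fail the build whenever `run_format_check` returns anything other than zero.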
Don’t argue with its defaults, just commit to them.
This saves an incredible amount of time in two places:
The most painful part of any software project is inevitably setting up the initial development environment.
The moment your team grows beyond a couple of people, you should invest in making this work better.
At the very least, you need a documented process for creating a new environment—and it has to be known-to-work, so any time someone is onboarded using it they should be encouraged to fix any problems in the documentation or accompanying scripts as they encounter them.
Much better is an automated process: a single script that gets everything up and running. Tools like Docker have made this a LOT easier over the past decade.
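The shape of such a script is usually simple: run a fixed list of steps in order and stop loudly at the first failure. Here is a sketch of that shape. The step names and commands are placeholders I invented; a real script would install dependencies, create a database, load test data and so on.

```python
import subprocess
import sys

# Hypothetical setup steps. The commands are harmless placeholders
# standing in for real installation and configuration commands.
STEPS = [
    ("check interpreter", [sys.executable, "--version"]),
    ("install dependencies (placeholder)", [sys.executable, "-c", "print('ok')"]),
]


def run_steps(steps):
    """Run each step in order, stopping at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"Step failed: {name}\n{result.stderr}")
            break
        completed.append(name)
    return completed


completed = run_steps(STEPS)
print(completed)
```

Keeping the steps in one list also gives new team members a single place to fix things when a step stops working.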
I’m increasingly convinced that the best-in-class solution here is cloud-based development environments. The ability to click a button on a web page and have a fresh, working development environment running a few seconds later is a game-changer for large development teams.
Gitpod and Codespaces are two of the most promising tools I’ve tried in this space.
I’ve seen developers lose hours a week to issues with their development environment. Eliminating that across a large team is the equivalent of hiring several new full-time engineers!
Reviewing a pull request is a lot easier if you can actually try out the changes.
The best way to do this is with automated preview environments, directly linked to from the PR itself.
These are getting increasingly easy to offer. Vercel, Netlify, Render and Heroku all have features that can do this. Building a custom system on top of something like Google Cloud Run or Fly Machines is also possible with a bit of work.
This is another one of those things which requires some up-front investment but will pay for itself many times over through increased productivity and quality of reviews.
This is Software engineering practices by Simon Willison, posted on 1st October 2022.
Part of series: My open source process