Common bus factor pitfalls in serverless projects
The bus factor is a measurement of the risk resulting from information and capabilities not being shared among team members, derived from the phrase “in case they get hit by a bus”.
Knowledge silos within a software development team bring a lot of risk. While I do think the nature of serverless architectures encourages a broader set of responsibilities and knowledge for individual developers on a team, there are still areas where such silos can emerge. Here are a few that I’ve noticed:
- Infrastructure-as-Code. Some developers avoid IaC because they see their job as writing only the application code, and expect someone else to provision and configure the dependent infra for them.
- Complex cross-cutting features that use less common cloud services or architectural patterns that deviate from the more common synchronous CRUDL commands, e.g. choreographed async event flows or Step Functions workflows.
- IAM configuration.
- CI/CD pipeline configuration and deployment scripts.
- AWS account setup.
So how can you mitigate these bus factor risks in your serverless team? Here are a few suggestions:
- Encourage developers to deliver end-to-end vertical feature slices rather than splitting the application code and the infrastructure code between separate developers.
- Continuously deploy to production so that developers are pushed to become familiar with the CI/CD pipeline and release process from an early stage, as well as learning observability tools (if their feature breaks something in production, they’ll be responsible for debugging it there).
- Minimise the number of different tools, languages and frameworks that are used across the application code as well as IaC and deployment scripts.
- Practise code reviews or pair programming, especially on more complex features that use less commonly touched cloud services.
- Document key areas such as your high-level architecture, AWS org setup and delivery process, which everyone should be able to understand or reference if they need to.
- Create design documents and diagrams for complex use cases such as choreographed event fan-outs where even after reading through the code it would still be difficult to understand all the touchpoints in the flow.
- Write automated tests. Even if I’ve never edited or even read a certain part of the codebase, I’ll be much less reluctant to change it if I know it is covered by tests which are run regularly.
- If possible, use Infrastructure-as-Code even for baseline cloud landing zone configuration (using a tool like OrgFormation). This acts as a form of documentation, as engineers needing to debug an access control issue can always check this code to see how a particular account has been set up or locked down (e.g. SSO Permission Sets and SCPs).
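To make the last two suggestions a bit more concrete, here’s a rough sketch of what “account lockdown as code, covered by a test” could look like. It’s written in plain Python purely for illustration; tools like OrgFormation declare this in their own template format, and the policy contents, the `BASELINE_SCP` name and the `denied_by` helper are all hypothetical. The matching logic is deliberately simplified and is not how AWS actually evaluates SCPs.

```python
import fnmatch

# Hypothetical baseline SCP kept in version control with the rest of the
# landing zone code. The statements are illustrative, not a recommendation.
BASELINE_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyLeavingOrganization",
            "Effect": "Deny",
            "Action": ["organizations:LeaveOrganization"],
            "Resource": "*",
        },
        {
            "Sid": "DenyIamUserCreation",
            "Effect": "Deny",
            "Action": ["iam:CreateUser", "iam:CreateAccessKey"],
            "Resource": "*",
        },
    ],
}


def denied_by(policy, action):
    """Return the Sids of Deny statements whose Action patterns match `action`.

    Simplified on purpose: Condition, NotAction and Resource are ignored,
    so treat the result as a hint for where to look, not a verdict.
    """
    matches = []
    for statement in policy["Statement"]:
        if statement["Effect"] != "Deny":
            continue
        patterns = statement["Action"]
        if isinstance(patterns, str):  # "Action" may be a single string
            patterns = [patterns]
        # IAM action names are case-insensitive and may contain wildcards.
        if any(fnmatch.fnmatchcase(action.lower(), p.lower()) for p in patterns):
            matches.append(statement["Sid"])
    return matches


def test_baseline_scp_blocks_iam_users():
    # Pins the intended lockdown so an accidental edit fails in CI.
    assert denied_by(BASELINE_SCP, "iam:CreateUser") == ["DenyIamUserCreation"]
    assert denied_by(BASELINE_SCP, "s3:GetObject") == []
```

An engineer chasing an unexpected access-denied error, who has never touched the org setup, can read the policy straight from the repo or call `denied_by(BASELINE_SCP, "iam:CreateUser")` to see which statement to look at, and the test makes them less nervous about changing it.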
— Paul
P.S. I’m still looking for startup CTOs to interview for some research I’m doing. So if you know any early-stage tech startup teams who: 1) Have seed funding AND 2) Are NOT using a serverless architecture for their product, then I’d really appreciate you introducing me to their CTO.
Paul Swail
Indie Cloud Consultant helping small teams learn and build with serverless.